Patch-ID# 116468-04
Keywords: disk queue rdc remote mirror logical host sndr logging
Synopsis: Availability Suite 3.2 SNDR Patch
Date: Sep/02/2004

Install Requirements: Install in Single User Mode
                      Reboot after installation

Solaris Release: 8 9

SunOS Release: 5.8 5.9

Unbundled Product: Sun StorEdge Availability Suite

Unbundled Release: 3.2

Xref:

Topic:

Relevant Architectures: sparc

NOTE: After applying patch 116468-03 on both primary and secondary
      servers and rebooting, you must perform a full synchronization
      on all Availability Suite Remote Mirror asynchronous sets to
      ensure that the data on the secondary volumes is consistent with
      the primary data volumes.

      For instructions on performing a full synchronization
      (sndradm -m), refer to the Sun StorEdge Availability Suite 3.2
      Remote Mirror Software Administration and Operations Guide
      (817-2784-10).

      For configurations where network latency and dataset size make
      a full synchronization prohibitive, the secondary may be
      synchronized with the primary via a tape-based backup/restore
      coupled with an sndradm -E.

NOTE:

Problem Statement:

      In a Sun Cluster OE, when using Remote Mirror in combination
      with Point-in-Time Copy to establish an ndr_ii pair for use
      during auto-synchronization, the Point-in-Time Copy set should
      be pre-enabled by the system administrator rather than
      dynamically enabled by the SNDR auto-synchronization daemon.
      Failure to do so may cause the SNDR-configured Sun Cluster
      resource group to hang during failover processing.

      See BugId:5094206 or SRDB:77917 for a detailed description.

Resolution:

      To prevent the Sun Cluster resource group hang, the
      Point-in-Time Copy set that is to be used by the SNDR
      synchronization daemon must be pre-enabled before turning on
      SNDR's auto-synchronization (sndradm -a on) and before enabling
      an SNDR ndr_ii pair (sndradm -I a ....).

Repair:

      If an existing Sun Cluster configuration containing an SNDR
      light-weight resource group with an ndr_ii pair appears to be
      hung, the Solaris process running the following script needs to
      be identified and terminated:

      /usr/opt/SUNWesm/cluster/sbin/reconfig

BugId's fixed with this patch:
      4892753 4914957 4930424 4938202 4940318 4942385 4942997 4943413
      4950370 4950802 4952176 4952178 4952920 4957445 4962068 4967629
      4970042 4974911 4976889 4977645 4981223 4993281 4995602 4997398
      5000951 5004765 5007944 5009144 5010349 5013414 5013757 5014238
      5014239 5015987 5018806 5022892 5027558 5034369 5037654 5038271
      5038552 5040685 5041365 5049952 5050438 5077630

Changes incorporated in this version:
      4976889 5022892 5027558 5034369 5037654 5040685 5049952 5050438
      5077630

Patches accumulated and obsoleted by this patch:

Patches which conflict with this patch:

Patches required with this patch: 116466-02 or greater

Obsoleted by:

Files included with this patch:

/usr/kernel/drv/rdc-5.8
/usr/kernel/drv/rdc-5.9
/usr/kernel/drv/sparcv9/rdc-5.8
/usr/kernel/drv/sparcv9/rdc-5.9
/usr/kernel/misc/rdcsrv-5.8
/usr/kernel/misc/rdcsrv-5.9
/usr/kernel/misc/sparcv9/rdcsrv-5.8
/usr/kernel/misc/sparcv9/rdcsrv-5.9
/usr/lib/mdb/kvm/rdc.so
/usr/lib/mdb/kvm/sparcv9/rdc.so
/usr/opt/SUNWesm/SUNWrdc/man/man1rdc/sndradm.1m
/usr/opt/SUNWesm/SUNWrdc/sbin/sndradm
/usr/opt/SUNWesm/SUNWrdc/sbin/sndrboot
/usr/opt/SUNWrdc/lib/sndrd-5.8
/usr/opt/SUNWrdc/lib/sndrd-5.9
/usr/opt/SUNWrdc/lib/sndrsyncd
/usr/opt/SUNWscm/lib/librdc.so.1-5.8
/usr/opt/SUNWscm/lib/librdc.so.1-5.9

Problem Description:

4976889 unable to delete SNDR set when logical host can't be found
5022892 enhance sndradm ds.log entries for TUNABLES and HEALTH
5027558 sndradm man page missing -R r (role reverse) usage and description
5034369 sndradm (-u) (-m) entries missing from ds.log
5037654 sndr dropped into logging with almost empty queue
5040685 deleting an ndr_ii config entry via sndradm -I d is not recorded in ds.log file
5049952 sndradm -h set: usage statement missing diskq parameter
5050438 sndradm -C not checking validity of cluster tag when adding disk queue to set
5077630 Deadlocks when {sndr/ii/sv}adm and {sndr/ii/sv}boot are invoked in Sun Cluster

(from 116468-03)

4940318 Add logic to support the use of aliases for host or logical host
5010349 sndr bitmaps in one to many not getting updated
5013414 failed enable of a sync set with a disk q not atomic
5013757 diskq block/noblock operations not reported in ds.log
5014238 sndr should dump diskq if queue is full + link down
5014239 sndradm man page needs info on queuing state
5015987 update sync of async sets can drop network writes leaving secondary out of sync
5018806 cmn_err() needed when ref count is maxed out
5038271 diskq failure causes application to hang
5038552 disk queue not getting written when queueing
5041365 SNDR 3.2 Unit tests fail (GroupOrderedWrites)

(from 116468-02)

4914957 lock contention for disk queues limit performance
4930424 enabling sndr with a diskqueue of 1TB or greater should fail
4938202 sndradm can be very slow when enabling more than 1500 RM sets
4942385 Long volume names cause warning messages to be cut off
4942997 sndr: sndradm unknown host:vol printed in ds.log
4943413 cluster failover during reverse sync makes mounted volume unusable
4950370 sndradm -A #threads[sndr-set] fails to report # to /var/opt/SUNWesm/ds.log
4950802 sndr bitmap count does not show that bits are set until sync or reboot
4952178 misleading disable message on timeout
4952176 iokstats broken
4952920 NHAS bitmap api can panic with 8k bitmaps
4957445 r_net_writeN should negative ack if secondary is logging
4967629 rdc_error_str is local, should be global
4970042 BAD TRAP: panic AVS 3.2 patch testing
4974911 sndradm help output missing a space for diskq removal
4977645 sndradm -e fails on 2'nd logical host
4981223 sndr async mode with many sets sharing a disk queue eats up cpu
4892753 flusher get stuck with diskq set to blocking mode and heavy I/O
4993281 Availability Suite 3.2 using sndr causes system hangs
4995602 double dec in _rdc_remote_flush() can access freed mem
4997398 failure removing diskq from multiple resource groups in SunCluster
5004765 writes to RM vols with full diskq causes incoming threads to be block
5000951 _rdc_async_throttle needs to print disk queue full message
5007944 Data replication on middle hop of multihop config fails due to overlapping i/o
5009144 one to many with diskq and memory queue may not queue

(from 116468-01)

4962068 disk queue upgrade results in 'WARNING: disk queue alloc failed(28)'

Patch Installation Instructions:
--------------------------------

Because this patch updates kernel modules, boot the system in
single-user mode to apply the patch, and then reboot the system.

Special Install Instructions:
-----------------------------

None.

README -- Last modified date: Thursday, September 2, 2004
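
Prerequisite check (illustrative sketch):

This patch lists 116466-02 or greater as a required patch. The sketch
below shows one way the revision check might be scripted before
installation. The function name has_min_patch is hypothetical and is
not part of Availability Suite. It assumes input in the style of
"showrev -p" output, where each installed patch appears on a line
beginning with "Patch: <id>-<rev>", and it assumes an awk that accepts
the -v option (on Solaris 8 and 9, nawk or /usr/xpg4/bin/awk rather
than /usr/bin/awk).

```shell
#!/bin/sh
# has_min_patch BASE MINREV   (hypothetical helper, not shipped with the patch)
# Reads "showrev -p"-style text on stdin. Exits 0 if some line of the
# assumed form "Patch: BASE-NN ..." has NN >= MINREV, otherwise exits 1.
has_min_patch() {
    awk -v base="$1" -v min="$2" '
        $1 == "Patch:" {
            # Split "116466-02" into the base ID and the revision.
            n = split($2, id, "-")
            if (n == 2 && id[1] == base && id[2] + 0 >= min + 0)
                found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Intended use on the target host (not executed here):
#   showrev -p | has_min_patch 116466 2 || echo "116466-02 or greater required"
```

The comparison is numeric, so a later revision of the same base patch
ID also satisfies the check. A patch that has been obsoleted by a
different patch ID would not be detected by this sketch.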