Patch-ID# 109115-17
Keywords: t3 t300 t310 raid storage firmware loopcard eprom interconnect
Synopsis: T3 1.18.04: System Firmware Update
Date: Sep/22/2004

Install Requirements: Additional instructions may be listed below

Solaris Release: 2.6 7 8 9

SunOS Release: 5.6 5.7 5.8 5.9

Unbundled Product: T3 Firmware

Unbundled Release: 1.18.04

Xref:

Topic:

Relevant Architectures: sparc

NOTE: This patch release is for the T3 Array only and not the T3+ Array.
      If you have a T3+ Array, and are looking to update with the latest
      patch release, please download patch 112276-09.

The Sun StorEdge T3 disk tray was formerly known as the Sun StorEdge T300
prior to final product shipment. Most of the Sun StorEdge T3 disk tray user
documentation has been updated to reflect the new name; however, there are
some related software components (such as the Sun StorTools diagnostic
package) that still reference this product as the Sun StorEdge T300. Users
should be aware that both the Sun StorEdge T3 and Sun StorEdge T300 names
refer to the same product and are equivalent in terms of product features
and functionality.

BugId's fixed with this patch:

Changes incorporated in this version:

Patches accumulated and obsoleted by this patch: 110760-02 111175-02

Patches which conflict with this patch:

Patches required with this patch:

Obsoleted by:

Files included with this patch:

T3extender.tar.Z     # script to extend battery life span to 36 months
disk/                # T3 disk drive firmware
disk/CHANGELISTS     # T3 disk drive firmware release note
docs/                # Release Notes, pdf manuals
ep2_10.bin           # Controller EPROM Flash
files.tar            # T310 System Files Tar Image
lpc_05.02            # Unit Interconnect Card Firmware
patchtoc             # t3.sh: Files to Upload To Target T3
previous/            # Recent T3 Firmware Versions
t3.sh                # T3 File Upload Utility
v1184.bin            # Controller firmware
README.109115-17     # README file for the patch

Problem Description:

---------------------------------------------------
109115-17 (1.18.04 Firmware)
---------------------------------------------------
4843539 vol verify in 118.2 does not ask to verify and continue plus no message in log.
4938052 T3A does not report vol verify start and end in syslog like the T3B
4938889 vol verify confuses people by reporting parity mismatch for media error (3/11)
4957659 T3A vol verify run from web interface (storade) does not show in cmdlog or syslog
4957717 T3A vol verify differences when run from web interface vs cli.
4922403 T3 patch readme should include drive fw bug fixes
4974618 T3B/313_2: syslog data prints to console instead of log file
4916219 T3A/T3B: Data miscompare while running test on Pirus boxes with T3A PP
4624156 T3B can cause data corruption during failovers
5038486 T3B/3.1.4.11: data miscompare - 128 words in error (0x0)
4850100 T3A/T3B/6120 READ CAPACITY opcode (0x25) goes through the wrong mpxio path during boot
4970431 "Rush Limbaugh" & "High atop the EIB building" should be removed as defaults
4810817 Backport fix http tokens fails to report standby when 2 vols share standby

---------------------------------------------------
109115-16 (1.18.03 Firmware)
---------------------------------------------------
latest disk FW updates

---------------------------------------------------
109115-15 (1.18.03 Firmware)
---------------------------------------------------
4844948 T3A/T3B/T4:disk management command typo drops master controller in T3
4899498 T3A/T3B/T4: loss of nvram data results in default settings for logto and loglevel
4848907 T3A: syslog"qlcf_i_read_response: Debug Code..." shoud give more explicit logic
4733406 T3 battery errors with "SHELF LIFE EXPIRED" even after PCU replacement
4881214 Wrong size drive inserted into t3 caused u2 controller to fail see 4312061
4808154 Firmware 1.18.1, version string not indicating patch level via http interface
4818207 T3A/B: background reconstruction doesn't report any info about completion
4869992 Need command to re-enable drive from offline (threshold exceeded) to online
4832135 Bad block correction and predictive drive failing mechanisms are insufficient
4810681 T3B: disk download FW A538 fails t3B
4836800 Enhance T3 FW to inject 01/5D errors for test purpose
4889551 6120/T3B: syslog (INFO) shows wrong Cmd code when doing LongTransacTask operatio

---------------------------------------------------
109115-14 (1.18.02 Firmware)
---------------------------------------------------
latest disk FW updates

---------------------------------------------------
109115-13 (1.18.02 Firmware)
---------------------------------------------------
4823585 T3 FW version 1.18.02 keeps generating LIP's
4821006 T3 FW 1.18.02 loglevel does not work correctly
4785593 T3/T4: array may crash after 497 days of uptime.
4729107 upgrade from 1.16 to 1.18 - after reset -y the logto and loglevel change
4683597 Typo in schd.conf, causes continuous battery refreshes.
4388177 Need to hardcode Battery Scheduler Refresh Times

---------------------------------------------------
109115-12 (1.18.01 Firmware)
---------------------------------------------------
4707617 Unrecovered Read errors during 'vol verify fix' operation not corrected
4505867 T3: data miscompare error while running madrw
4648055 Raid 5 'vol disable uxdx to_standby' fails
4664674 Raid 1 'vol disable uxdx to_standby' fails
4675955 Data miscompared in T3 running Madrw with Fujitsu disk drives
4697868 Disk in raid 5 on T3+ failed and Oracle database crashed in clustered config

---------------------------------------------------
109115-11 (1.18.00 Firmware)
---------------------------------------------------
This patch includes updated drive firmware in the disk directory for the
ST373405FC Seagate 73GB 10K rpm drive and the ST336605FC Seagate 36GB 10K
rpm drive. This firmware fixes the following bugs:

Bug ID 4472275  Fixes errant drive "NOT READY" status while servo is in
                recovery operation.
Bug ID 4519031  Fixes merge code problem to prevent performance degradation
                on queued-sequential writes (known as the Slow Vol Init
                problem). Fixes Marvell register ESD partial-reset condition
                that causes corruption of the g-list area and format utility
                "drive-type unknown" condition.

---------------------------------------------------
109115-10 (1.18.00 Firmware)
---------------------------------------------------
4457184 1.17 Performance Drop
4407776 T3 not properly initialize disk paths after disk hot swap
4463349 Change the default setting of the ondg mode from passive to off
4452948 Modified Syslog Messages Get Wrong Serial #
4426801 Return value checking after HostPortFacToLun() call.
4460277 Executing ofdg -y health_check on T3B units causes system to be unavailable
4519958 Received data exception during bootup and controller reset.
4435227 Cache Should go to write-behind during battery refresh cycle
4432375 lpc/serial communication bug results in ctr hang
4376511 Purple doesnot generate lip when mp_support is changed.
4431121 format entry lingers after removing volume with
4455999 Need error handling when can't read midplane
4448563 Reboot/halt active node causes other node to panic w/ Reservation Conflict
4374724 Multiple Non-Adjacent Disk Failures in a RAID 1 stripe causes LUN to unmount.
4478381 T3B: Master encounters "Unfixable" file system error on alt. master.
4474779 Purple returns ASC 90 00 for path that is becoming ACTIVE
4480665 disabling a loopcard and then re-enabling the loopcard-loopcards disables again
4298510 Implementation of SCSI Read and Write Buffer commands.
4410516 cache stays write through even when forcing write behind
4473669 SCSI-3 testsuite causes T3 reset and test failures hait
4474321 TDL/PGR issue can prevent T3 from booting
4474325 commands fail on Purple II
4496053 tail without args resets controller
4411749 too deep a cd leads to instruction access exception and fs messages
4473002 - T3B: 1k data corruption
4488964 silently shut down.
4498487 Firmware reports wrong buffer size for SCSI read/write buffer commands.
4493006 T3 can not clear the key requested by a new PGR command REGISTER_AND_IGNOREKEY.
4470439 Partner Pair testing with 1.17b fails SCSI-3 Cluster TS reserve_r{1,2} test
4497511 PGR does not work correctly when connected via a python TL port.
4388855 Change to Controller firmware version format to dot-dot format
4430163 ISP2100 firmware v1.19.108 Target Mode Hang during MPXIO Unit Test
4495720 Purple 1 incorrectly implements EOFni handling
4414235 syslog priority in a T3 can't be changed
4500452 T3 slave controller PGR data base sometimes is not clean.
4383854 Write-behind cache disabled when RAID 5 disk fails
4504356 PGR: T3 needs to take PGR IN/OUT commands on its standby paths.
4483245 u1 hotplug fails to boot
4458768 line break handling is broken
4473389 1.17a Exception reset, warm cache startup code seems to modify cached data.
4505027 Need to support cross-domain builds
4486603 alternate master reboot while the master pull out
4486451 LUN becomes unavailable while doing DMP test.
4490884 Cache mirror failed after T3 Master ctr disabled (.probe) then enabled
4486568 Disable adjacent drives, followed by replacement and recon = bad data
4480200 Firmware should stop battery discharge at 6 minutes.
4412662 cleanup cmd_1 if cmd_2 cannot be allocated in qlcf_start_gauntlet()
4476044 Correct the residual count and ROR/RUR flags in FCP_STATUS for Inquiry cmd.
4449891 fix Clear function with cache stats for sys and vol HTTP tokens.
4457439 Bad System Area on disk is not fixed when possible during controller takeover
4502467 Testsuite misc_r{11} fails --- T3 - for Single brick.
4504882 hist command after long command line can crash system
4407897 1.17-T3B snmp sysDate and sysTime incorrect
4398786 Misleading token/SNMP information if loop card missing
4398788 When disk removed token/SNMP show "fault"
4396005 Misleading token/SNMP information if pcu missing
4351210 less than 256 mb cache detected should be warning not notice
4478907 Disabling of standby drive does not work correctly
4426026 sf2 sf_cmd_callback: Lost Frame (read) received 0x8 expected 0x1f4 target 0x1
4508212 T3WG and HP A5158 HBA locks up HPUX SAM utility
4385004 fw 1.16 long hostname gains copy of root password.
4511648 Prism debugger on purple-1 hit program exception with a break point set
4508802 SCSI r/w buffer cmds should ignore SCSI reservation
4459659 diag syslog messages need controller identification
4506732 Diskomizer stops with data compare error, when run with attenuator
4513285 While doing .probe test, ISP dumps
4434826 Dead T3 -config'ed w/ 2 LUNs, used as quorum panic'ed & killed Cluster
4506206 T3WG sets MultiP bit in standard SCSI inquiry data
4435279 Status for battery "shelf life expired" shows "battery low"
4436728 Data compare error, 2 block fails during Controller pull test
4418601 syslog states Please Send ISPDEBUGDUMP to Development Engineering
4520739 Existence of hidden 'psh' commands, should be removed
4475447 sys stat shows controller booting but role is reported as Altmaster
4521979 sub rev printing problem
4511149 cmdlog - keyboard buffer overflow
4509295 REGISTER operation fails on one initiator
4509300 T3 with PGR reservations active fails under IO load
4386434 Stale FCP port database(Loop Failure)
4524258 P1 and P2 return Success when the host tries to preemptandabort an unexistent Key.
4497687 sysdiag data compare failure is counted as passed
4509888 Bad drive causes vol verify to hang
4348580 two loop mode causes performance impact
4372821 fru stat now gets all drive temperatures

---------------------------------------------------
109115-09 (1.17b Firmware)
---------------------------------------------------
4448563 Halt of active node in cluster causes T3 reservation conflict

Note: 1.17b contains a single fix required for T3 partner pairs used in
Sun Cluster or multi-initiator configurations using SCSI-2 reservations.
Customers using 1.17a firmware in a non cluster/non multi-initiator
configuration need not upgrade to this firmware release. For further
details regarding supported Sun Cluster configurations with T3, please
refer to the corresponding Sun Cluster product documentation as well as
the 1.17b release notes (available in this patch).

---------------------------------------------------
109115-08 (1.17a Firmware + Patch Respin)
---------------------------------------------------
Functionally equivalent to the -07 release, with the inclusion of the
previously released 1.16a and 1.16c firmware versions.

---------------------------------------------------
109115-07 (1.17a Firmware)
---------------------------------------------------
Note: This firmware release supersedes all previous T3 firmware releases.
SunCluster single brick configurations (formerly supported only with 1.16a
firmware) are fully supported with this release. See the 1.17a release
notes included in this patch for further details.

4286243 Need active channel bit from T300
4302763 Controller Temperature token data incorrect
4303429 Token for Volume WWN appears incorrect
4306672 failover issue with multi-initiator configuration
4342154 Assertion reset with DMP and expansion units
4354565 Queue full messages received for less than 64 in-transit IOs
4356418 Cybercop test causes controller data access exception
4365734 Power supply status from HTTP server doesn't match actual state
4366223 Some failed disk errors can start graceful system shutdown
4367372 fru stat reports ready for disk ports when loop cards are missing
4367622 boot -i w/telnet can hang due to lack of memory
4369314 http token problem if ctr status is "absent" or "disabled"
4372831 Cache mirroring needs to sync with volume recreation
4373406 Need ability to bind a host WWN to the initiator ID.
4373801 SNMP: mib out of sync with http tokens
4374059 SNMP: MIB Description Fields Are Out Of Date
4374280 Slow Failover w/ RAID 0 and VxVM Host Based Mirroring
4376303 IPI-3 to SCSI mapping incorrect causing Invalid Command Results
4377075 Offline host sometimes not detected
4377097 instruction access exception from t3 ftpd
4377279 Request Sense CDB should get Unit Attention Info in a Data Frame
4377484 Removing hot spare w/ one bad RAID 1 drive, offlines LUN
4377795 w/u2 as master pulling FC cables in DMP test causes probe break
4379816 Fan status doesn't update after fan fault and fru replacement
4381510 New ELF code causes ctr disable to not cause a failover
4382859 Attempting to disable a ctr using the CLI disable u1 fails
4384970 httpd gets an exception reset on long filenames
4385877 Request Sense CDB should get sense data returned in data fram
4386836 Drive firmware download can hang
4391664 Need to force T3 to be always be a PLDA private loop device
4393080 Port listmap is not updated properly after reboot
4394863 enabling a controller can take up to 16 minutes.
4397959 Extreme system load test can cause 128 byte shift
4398347 Host loses T3 paths when isp2100 chip reset was performed
4402518 disk replacement results in invalid system area
4403519 Uninitialized variable cleanup
4405603 internal disk utility changes needed for IBM disk support
4406749 Lun reconstruction can be slow with light IO from host
4406960 SCSI-3 Testsuite: tc_mhiocgrp_reserve{2} failed
4406971 SCSI-3 Testsuite: /tests_reboot/tc_misc/tc_misc{5} failed.
4406977 SCSI-3 Testsuite: /tests_reboot/tc_misc/tc_misc{11} failed.
4411125 Fatal drive timeouts w/Seagate Cheetah 4 disks causes slow I/O
4413938 boot -i command sometimes does not return
4414569 Excessive controller failovers can cause file system issues
4416530 "refresh -s" command shows inconsistent refresh date values
4419265 Alt master disables
4419795 Data miscompare results from bad ISP response
4420494 Halt on Alternate controller can cause system halt on master
4421372 New 1.17 PGR3 feature causes data consistency issue
4421398 Detected bad accumulator should stop boot cycle
4422437 Need to expand variables used to monitor luns during failover
4426777 Monitoring code needed to detect slowly degrading drives
4426848 System reboot in reconstruction can make the system unavailable
4427539 Accumulator Failures during POST don't get logged
4428961 w/host multi-pathing, INACTIVE to ACTIVE state changes cause hangs
4429293 Too Many "Lun is becoming active" syslog messages with mpxio
4429474 Battery syslog to provide better refresh cycle messaging
4429265 Battery scheduler file needs to revert to 28 days
4429830 tzset/date change during battery refresh causes false hold time
4431187 vol mode doesn't change to writebehind when cache/mirror=auto
4431302 local ftp client connection issues
4431621 Bad detected disks should be fenced off
4431999 Vicom Router (Tachyon) and T3 (Qlogic) incompatibility
4432631 Reconstruction does not start simultaneously with mp_support=rw
4434141 Default Value for ondg 1.17 must be 'passive'
4448563 Reboot/halt active node reservation conflict
4455315 Firmware 1.17a shows up incorrectly in ver output

---------------------------------------------------
109115-06 (1.16c Firmware) + Drive Firmware Update
---------------------------------------------------
Seagate Cheetah 4 Drive Firmware Update

---------------------------------
109115-05 (1.16c Firmware)
---------------------------------
4419265: altmaster disables in production test
4403519: Uninitialized variables being used

NOTE: This patch firmware release (1.16c) does not include changes made
in 1.16a. For SunCluster 2.2 support, please continue to use patch
110760-01. The issues resolved in 1.16c only occur with partner pair
configurations. As SunCluster 2.2 is currently supported with single T3
tray configurations (WG) only, the fixes contained in 1.16c are not
relevant. In future patch releases, the fixes in patch 110760-01 will be
integrated with patch 109115.

---------------------------------
109115-04 (1.16 Firmware)
---------------------------------
4345011 Failed disk drive can caused partner group to go offline
4345036 Assert Reset (3000):qlcf.c line 3289 Assert(ccb->cam_ccb_list
4363630 Probe break in qlcf.c due to unexpected ISP chip status
4337653 Enabling a disabled controller through CM sometimes fails
4348500 panic on cluster master results in reservation conflict on T3
4353469 battery refresh stuck in a discharge loop
4361866 during drive hot plug test , dex io errors and probe break
4296017 cache stays at writethrough after controller/loopcard failover
4339555 ofdg run with a bad u2l1 causes T300 reboot loop
4341193 Upgrade/Downgrade lpc fr/to 4.13 & 5.01 caused amber LED
4342142 EU: Multiple drives bypassed when using .probe in fail-over t
4346447 Tokens don't provide accurate status after a ctr enable
4350267 Disk is being bypassed during volume initialization
4353101 rack configuration fails FC cable to hub failure test
4353660 MADRW errors because T3 is busy handling drive disable
4353667 Interrupted more command results in performance loss
4353731 Incorrect value in LPC 5.01 fw download disables T3
4359814 Multi drive hot plugs cause ISP chip dumping data
4363780 Target Reset Not Issued To Clear Reservations
4322151 Missing controller doesn't start overtemp shutdown timer
4338553 Incorrect display of lpc SN's in fru list
4351027 tftpboot of nb113.bin on systems w/new drives drops to ADMIN
4352277 Battery refresh does not properly maintain interval specified
4371858 README.109115-03 states t3.sh prompts with T3 login prompt
4371912 README.109115-03 identifies wrong ftp step
4371926 README.109115-03 does not identify how to determine disk drive type
4371933 README.109115-03 should tell user to remount volumes
4383266 t3.sh script fails if executed from directory other than "."

---------------------------------
109115-03 (1.14 Firmware + Disk Drive Firmware)
---------------------------------
T3 Disk Drive Firmware included in patch.

---------------------------------
109115-02 (1.14 Firmware)
---------------------------------
4344316 Stray IOCB's cause probe break
4350265 Back-end cache mirroring enable problem
4346571 EU: Probe break on controller exhausted mbox commands
4326248 t3.sh patch installation script overwrites .netrc
4290158 Reconstruction doesn't start after boot with a failed disk
4304266 No drive reconstruct if drive fails when volume is unmounted
4306056 More robust DRAM parity error reporting/handling
4302850 Cluster: rsv100.bin fails scsi2 reservation test suite
4309906 volume create/delete/mount/unmount POST operation returns early
4309901 CM unable to detect or notify failed volume initializations
4311688 Drives are offlined when unmount and recreate a LUN with one lun
4317086 Separating a partner group will cause single unit boot cycle
4306345 sysAutoDisable token doesn't give enough info about autodisable
4292138 System needs manual loop failure isolation facility
4326147 multi-controller disable during heavy load/stress test
4280101 ISP2100 f/w panic during PCI/DMA errors
4305281 T3 trying to use AL_PA 0x01, can cause config problems
4262121 Need http token for ondg output
4290161 Excessive LIPs can cause SCSI 'Report Lun' issues
4290677 ISP2100:Outgoing Mailbox 0 register reports 0x8002 (sys err)
4293211 T3 should immediately execute scsi "report lun" & inquiry
4293252 Qlogic firmware LIP'ing prematurely
4297464 Master Controller disable during drive hot plug simulation
4311922 T3 syslog date stamp often off by timezone offset
4313092 Controller failover test needs a second reseat to boot
4317148 elemprop.htm page shows wrong values for fruPowerBatLife
4326190 Miscellaneous sys command changes
4326920 running ofdg 'health_check' with u1l1 disabled resets T3
4282275 Some system default settings need to be removed
4291723 Fan failure injection Doesn't Correctly Initiate LED
4307139 Some syslog entries to be moved from warning to notice
4308583 Back-end loop failure test causes stale FC-AL port
4317596 Copyright login banner should be updated
4335062 Loop 1 split should not be allowed in Non-HPC enviroments
4323931 SGI:T3 locks up in a hang status until SYNC cmd issue
4329797 http traffic during ofdg -y fast_find causes probe break
4331699 T3 with one lun can Qfull and stop all i/o for 60 seconds
4335848 Drives being spun down by controller firmware
4329876 infinite loop after restart from TDL probe break
4331817 token for controller temperature should be celsius
4260918 A bad interconnect cable causes repeated LIPs
4299474 T3 POST should be able to detect cache memory parity errors
4325782 Intercept all exception vectors for CPU
4331682 OFDG Abort tokens should be removed
4332462 Probe break after reset -y, following boot -i file
4335250 Drive component failure caused probe break during ofdg run
4323937 Reservation not cleared with SCSI Reset
4335070 CM2.1 after "Disable" of u2ctr, "Disable" goes inactive
4324402 CM showed wrong loop card as disabled
4332909 Upgrade from 0.95 to 1.00 results in probe break
4333385 Excessive characters to passwd command causes probe break
4333439 File system check got a verify volume failure
4333611 Double disk pull (w/standby) on RAID 1/5 causes boot loop
4334609 fru stat loop cable display is reversed
4328814 battery/pcu LED should go amber on battery hold test failures
4335870 New HPC Vendor ID breaks T3 back-end cache mirroring
4336121 firmware shuts off all token access during OFDG tests
4334945 ONDG error token indicates prematurely testing is complete
4336123 CM2.0 and T3 FW1.11 result in number format exception
4336487 Loop card firmware 5.01 doesn't show the vendor/model fields
4333578 Progress indicator does not work for OFDG
4311701 OEM:With cache mirroring enabled need to support write caching
4317995 T3 volumes got unmounted during heavy load/stress
4330350 ofdg uses inconsistent loop card device names
4332202 When in writebehind cache mode, enabling failed controller fails
4334419 Battery Scheduler File To Be Moved to 14 Days
4328000 sys command option 'vendor' is available through useage
4338106 OFDG diag status never updates to show it completed
4337620 Controller Firmware Should reset hung Qlogic Chips
4342258 Patch readme to install firmware incorrect

--------------------------------------
109115-01
--------------------------------------
(1.01a RR Firmware Release)

Patch Pre-Install Instruction:
------------------------------

1) Using ftp, retrieve the 'syslog' file (or whichever file the system
   log is directed to) from each T3A on which patch 109115-17 is to be
   installed.

2) Keep this 'syslog' file in a local directory on the host system and
   run the following command:

   egrep -i '0x5D|Threshold|0x15|0x4|Mechanical|Positioning|Exceeded|Disk Error' syslog

   If any of the error messages shown in the example appear, back up the
   data on the affected volume, replace the drive reporting the errors,
   and ensure the volume is in an optimal working state with no drives
   disabled. Then install the patch.

An example follows. Here 'u2d5' and 'u1d3' show the locations of the drives.

   test_host% egrep -i '0x5D|Threshold|0x15|0x4|Mechanical|Positioning|Exceeded|Disk Error' syslog
   Jun 05 06:16:14 ISR1[2]: W: u2d5 SCSI Disk Error Occurred (path = 0x0)
   Jun 05 06:16:14 ISR1[2]: W: Sense Key = 0x4, Asc = 0x15, Ascq = 0x1
   Jun 05 06:16:14 ISR1[2]: W: Sense Data Description = Mechanical Positioning Error
   Jul 31 16:19:22 ISR1[1]: N: u1d3 SCSI Disk Error Occurred (path = 0x1)
   Jul 31 16:19:22 ISR1[1]: N: Sense Key = 0x1, Asc = 0x5d, Ascq = 0x0
   Jul 31 16:19:22 ISR1[1]: N: Sense Data Description = Failure Prediction Threshold Exceeded

Patch Installation Instructions:
--------------------------------

This patch includes a firmware uploading utility (t3.sh) that simplifies
transferring the contents of this patch to a StorEdge T3 system. This
script is intended for use on Solaris host systems only. To manually
install the contents of this patch see the pertinent section below.

*** Warning *** Warning *** Warning *** Warning *** Warning ***

BEFORE attempting to load firmware on a StorEdge T3 system, be sure to
stop all IO activity from all attached host systems. This procedure
requires a T3 system reboot, so make all host preparations needed to
tolerate the reboot before starting. It is recommended that all T3
volumes be unmounted on Solaris before proceeding with this patch
installation.

Note: To verify the current firmware version running on a target T3
system, use the 'ver' command at the T3 command line as follows:

   t3:/:<3>ver

   T300 Release 1.14 2000/07/12 19:22:50 (192.168.209.123)
   Copyright (C) 1997-2000 Sun Microsystems, Inc.
   All Rights Reserved.

To update the T3 system with the entire contents of this patch, follow
the steps below after quiescing IOs from the host:

1. System Preparation

   Once the patch has been downloaded to a Solaris host, extract (if
   necessary) the contents of the patch to a temporary working directory.

2. Verify the T3 system to be upgraded is reachable on the network:

   $ ping t3
   t3 is alive

3. Verify the T3 system has a root password (the 't3.sh' patch utility
   uses ftp to transfer the files to the T3, which requires a root
   password):

   $ telnet t3
   Trying 129.150.47.115...
   Connected to t3.
   Escape character is '^]'.

   pSOSystem (129.150.47.115)

   Login: root
   Password:     <---- Must type password here

   T300 Release 1.14 2000/07/12 19:22:50 (192.168.209.123)
   Copyright (C) 1997-2000 Sun Microsystems, Inc.
   All Rights Reserved.

   t3:/:<1>

   If no root password is set on the system, be sure to set one by
   logging into the T3 system and using the 'passwd' command.

4. Transfer patch contents to T3 system.

   Note: There is limited space available in the T3's reserved system
   area. Therefore, it is important to be sure there is adequate space
   on the T3 before proceeding with the procedure to ftp firmware images
   to the unit. It is not necessary to keep old images of controller
   firmware, unit interconnect card firmware, or eprom binaries on the
   T3 once those images have been loaded per the instructions provided
   in this readme.

   The recommended way to install the contents of this patch on T3
   systems is to use the included t3.sh script from a Solaris host that
   has network access to the target T3 system being upgraded. This
   utility will transfer the required files in this patch to the target
   system, depositing the files in the correct directories. If a Solaris
   host isn't available, the contents of this patch can be manually
   uploaded to the target T3 system using the following method.

   Note: It is not necessary to transfer all files contained in this
   patch to a target T3 system. For example, the docs subdirectory in
   the patch provides reference documentation and is not required by a
   T3 system.
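
   Sketch (not part of the patch): when the manual method is used, the
   ftp commands can be generated rather than typed by hand. The shell
   function below is a minimal sketch under stated assumptions. The name
   gen_t3_ftp_cmds is hypothetical, files.tar is assumed to have been
   extracted so that etc/ and web/ sit under the working directory given
   as the first argument, and the ftp session is assumed to be started
   from the patch directory. It only prints standard ftp client commands
   (binary, put, bye) for the files named under Manual File Installation;
   it does not contact the T3.

```shell
#!/bin/sh
# Sketch only: print an ftp command list for the manual upload.
# gen_t3_ftp_cmds is a hypothetical helper, not shipped with the patch.
# $1 = directory where files.tar was extracted (assumed to hold etc/ and web/)
gen_t3_ftp_cmds() {
    workdir=${1:-.}

    # Firmware images must be sent in binary mode.
    echo "binary"

    # Images from the top level of the patch directory.
    for f in ep2_10.bin lpc_05.02 v1184.bin; do
        echo "put ./$f /$f"
    done

    # System files taken from the extracted files.tar image.
    echo "put $workdir/etc/schd.conf /etc/schd.conf"
    for f in "$workdir"/web/*.htm; do
        # Skip the unexpanded pattern when no .htm files are present.
        if [ -f "$f" ]; then
            echo "put $f /web/$(basename "$f")"
        fi
    done
    echo "put $workdir/web/snmp/t300.mib /web/snmp/t300.mib"

    echo "bye"
}
```

   For example, "gen_t3_ftp_cmds /tmp/t3work > t3.ftp" writes the list to
   a file for review before it is used in an ftp session to the T3. The
   path /tmp/t3work and the file name t3.ftp are illustrative only. Note
   that this overwrites /etc/schd.conf on the T3 without the backup copy
   that t3.sh makes.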

   Manual File Installation
   ------------------------

   -> Extract the contents of the 'files.tar' image to a temporary
      working directory.

   -> Manually ftp the following files contained in this patch to the
      corresponding directory on the T3 system:

      Patch Source Location        T3 Destination
      -------------------------    ------------------------
      ./ep2_10.bin                 /ep2_10.bin
      ./lpc_05.02                  /lpc_05.02
      ./v1184.bin                  /v1184.bin

      (From temporary Working Directory)
      ./etc/schd.conf              /etc/schd.conf
      ./web/*.htm                  /web
      ./web/snmp/t300.mib          /web/snmp/t300.mib

   Automated File Installation
   ---------------------------

   Note: The 't3.sh' script installs all controller and system files but
   does not overwrite the /etc/hosts and /etc/syslog.conf files on the
   target T3 system, as these files are typically customized per local
   operating environment requirements. One exception to this is the
   battery refresh scheduler file on the T3 (/etc/schd.conf). The t3.sh
   script will make a backup copy of this file on the T3 (to
   /etc/sch_old.conf) before copying over the new /etc/schd.conf file.
   Factory default versions of these files exist in the accompanying
   files.tar image in this patch, should they be required.

   To start the installation script, first verify the target T3 system
   can be reached through the local network (use ping to verify the
   target T3 system is reachable). Once this has been confirmed, the
   installation script can be started as follows:

      ./t3.sh

   (Note: Be sure the t3.sh script is executable.)

   The t3.sh utility will prompt for information as follows:

   Please Enter Hostname or IP Address Of T3 To Be Ugpraded:

   -> Enter Hostname or IP address of the target T3 system.

   Please Enter Patch Location Pathname [.]:

   -> Enter path to where the files.tar image exists. Typically the
      default current working directory is sufficient (hitting enter
      will accept the current working directory).

   Please Enter Your Home Directory Path [/home/joe_user]:

   -> The home directory of the user ID used when launching the t3.sh
      script (this is used by ftp to automatically load the patch file
      contents to the T3).

   At this point, the automatic ftp login/upload process should begin.
   The user will be prompted with an ftp login prompt, at which point
   the user should respond using the default T3 root login and password.

5. Load new Unit Interconnect Card Firmware:

   Run 'lpc version' on the T3. If the loop cards are running downrev
   firmware, upgrade all loop cards in the partner group as follows from
   the T3 command line:

   :/:<1>lpc version
                  LOOP A        LOOP B
   Enclosure 1    5.01 Flash    5.01 Flash
   Enclosure 2    5.01 Flash    5.01 Flash

   :/:<2>lpc download u1l1 lpc_05.02

   Repeat the above step for the remaining loop cards in the partner
   group (i.e. u1l2, u2l1 and u2l2):

   :/:<3>lpc download u1l2 lpc_05.02
   :/:<4>lpc download u2l1 lpc_05.02
   :/:<5>lpc download u2l2 lpc_05.02

   Note: It is possible to string t3 commands together using the
   semicolon command. This allows a single command line session to
   launch several commands without waiting for each command to complete.
   To upgrade both loop cards on one controller, for example, one could
   type the following:

   :/:<2>lpc download u1l1 lpc_05.02; lpc download u1l2 lpc_05.02

   From the command line, verify the correct unit interconnect card
   versions are loaded as follows:

   :/:<6>lpc version
                  LOOP A        LOOP B
   Enclosure 1    5.02 Flash    5.02 Flash
   Enclosure 2    5.02 Flash    5.02 Flash

6. Boot the T3 controller boot code.

   From the T3 command line, type the following to install the boot code:

   :/:<7>boot -i v1184.bin
   file header: size 274ad4, checksum 57345395, start 20010, base 20000

   (Caution: be sure all IOs have been quiesced and no host IO activity
   is scheduled to start until the upgrade procedure has completed)

7. Verify And Download New EPROM Code If Necessary.
   Note: This step can be skipped if the controller revision string
   displayed by the 'fru list' command matches the EPROM binary image
   included in this patch. The following example shows 'fru list'
   output for a controller running 2.10 level EPROM code, which
   matches an 'ep2_10.bin' EPROM binary:

   :/:<10>fru list u1c1
   ID     TYPE              VENDOR   MODEL        REVISION    SERIAL
   ------ ----------------- -------- ------------ ----------- ------
   u1ctr  controller card   SCI-SJ   375-0084-01- 0210/011803 000689

   To download new EPROM code, from the T3 command line, type the
   following:

   :/:<8>ep download ep2_10.bin
   Done with writing EPROM code of controller 1
   Start writing EPROM code of controller 2
   Done with writing EPROM code of controller 2

8. Verify the system boot mode is set to auto:

   From the T3 command line, type the following:

   :/:<9>set bootmode auto

9. Reset the T3 System as follows:

   :/:<10>reset
   Reset the system, are you sure? [N]: y

10. Once the system has booted successfully, log into the system and
    verify the boot code is properly loaded using the 'ver' command
    as outlined earlier. Output should be similar to the following
    (the release string should reflect the firmware just loaded):

    T300 Release 1.18.02 2003/02/28 11:30:28 (10.4.57.53)
    Copyright (C) 1997-2001 Sun Microsystems, Inc.
    All Rights Reserved.

11. Type "port list" and verify the ports:

    :/:<1>port list
    port   targetid   addr_type   status   host   wwn
    u1p1   1          hard        online   sun    50020f2300000f61
    u2p1   2          hard        online   sun    50020f230000297d

12. Verify volumes are visible and mounted. From the command line,
    enter the following:

    :/:<2>vol list
    volume   capacity   raid   data     standby
    v0       71.6 GB    1      u1d1-8   u1d9

13. Congratulations, the upgraded T3 is ready to use again.

Drive Firmware Upgrade Instructions
------------------------------------

Note: A disk drive firmware upgrade may not be necessary.
To verify the most up to date drive firmware versions are installed,
run the 'fru list' command on the StorEdge T3 system and compare the
results of that output with the information found in the README.disk
file (see the disk subdirectory of this patch).

During a disk drive firmware download, the functionality of the disk
tray is limited. To avoid system problems, verify the following:

o  A current backup copy of the data on the T3 exists.

o  The data path between the T3 and the host has been quiesced.
   There must not be any IO activity during the disk drive firmware
   download.

o  The ethernet connection to the T3 is not being used for any other
   operation during this procedure. If Component Manager is being
   used to monitor the T3, automatic polling must be disabled. Refer
   to the Component Manager Users Guide for instructions to disable
   T3 polling.

o  No unnecessary command line program interaction with the T3 system
   is performed during disk drive firmware downloads.

Note: The disk firmware download will take approximately 20 minutes
for 9 drives. Do not attempt to interrupt the download or perform
other command line functions during the process. The command prompt
will return after the download process has completed.

Disk Firmware Upgrade Instructions
------------------------------------

1. Using ftp, transfer the appropriate disk drive firmware from the
   disk/ subdirectory contained in this patch to the T3 root
   directory. Be sure the file is transferred in binary mode.

   Note: The T3 system limits the filename length of files being
   transferred to the local disks. Be sure the file name is 12
   characters or fewer and that it starts with an alphabetic
   character (not a numeric one). It is recommended the file names
   provided in the disk/ subdirectory not be changed.

2. Establish a telnet connection with the T3 (see T3 product
   documentation for specific details if necessary). Log into the
   system as 'root'.

3.
   Verify all T3 disks are in an optimal state as follows:

   -> Confirm all disks are ready and enabled using the T3 'fru stat'
      command.

   -> Confirm all disks configured into volumes are in an optimal
      state using the 'vol stat' command. All drives should report a
      drive state of zero. If there are drive issues reported,
      correct these problems before proceeding with the disk drive
      firmware download procedure.

4. Verify no volume operations are in progress using the 'proc list'
   command. If a volume operation is in progress, this operation must
   be allowed to complete before proceeding.

5. Verify no battery refresh operations are in progress using the
   'refresh -s' command. If a battery refresh is in progress, it is
   recommended the refresh operation be allowed to complete before
   proceeding with disk drive firmware downloads.

6. Unmount T3 volumes

   To ensure no host IOs are active, unmount all T3 volumes from the
   host system. In addition, it is recommended the internal T3
   volumes be unmounted as follows (using volume v0 as an example):

   :/:<1>vol unmount v0

7. Install the drive firmware using the T3 'disk download' command as
   follows (this example assumes a download is being performed on
   drives on a master tray. Substitute FILENAME with the file name of
   the actual disk drive firmware image ftp'd to the tray in step 1):

   :/:<2>disk download u1d1-9 FILENAME

   Note:

   -> In a partner group, the disk download command can only specify
      one set of 9 drives at a time.

   -> All drives specified on the command line *must* be of the same
      drive type. If individual drives require different firmware
      versions, multiple invocations of the 'disk download' command
      must be used to download firmware.

   -> If the wrong firmware type is specified for a given drive, the
      disk drive will reject the erroneous file download request and
      revert back to the disk firmware that was running at the time
      of the download request.
   -> It is possible to invoke multiple calls to the download utility
      by separating the commands with a semicolon as in the following
      example:

      disk download u1d1-9 FILENAME; disk download u2d1-9 FILENAME

8. Verify the drive firmware download was successful using the T3
   command 'fru list'.

9. Reboot the Sun StorEdge T3 array after all drives have been
   upgraded. The T3 system can be rebooted using the T3 'reset'
   command.

   Note: In some cases after a drive firmware download, older
   firmware version strings may still display in the 'fru list'
   command. A reset of the T3 after the download ensures the version
   information is updated correctly in internal T3 tables.

10. Once the tray has come back online, log into the array and verify
    optimal FRU states by doing the following:

    -> Confirm all disks are ready and enabled using the T3
       'fru stat' command.

    -> Confirm all disks correctly report model number and new
       firmware version information using the T3 'fru list' command.

    -> Confirm all disks configured into volumes are in an optimal
       state using the 'vol stat' command. All drives configured in
       volumes should report a drive state of zero.

11. Remount the unmounted volumes using the T3 'vol mount' command.
    In addition, on all attached host systems, remount any T3 volumes
    that were unmounted in step 6.

---------------
Appendix A
---------------

The following sections describe major functionality changes in this
1.18.03 firmware level.

----------------------------------------------------------------
1.
Predictive Failure Error (or SCSI code 1/5d/xx) Handling Changes
----------------------------------------------------------------

When a disk reaches its error threshold, it reports a SCSI error
1/5d/xx to the T3 controller. The T3 syslog will contain the
following messages for the disk (using u1d4 as an example):

Aug 06 15:22:06 ISR1[1]: N: u1d4 SCSI Disk Error Occurred (path = 0x0)
Aug 06 15:22:06 ISR1[1]: N: Sense Key = 0x1, Asc = 0x5d, Ascq = 0xff
Aug 06 15:22:06 ISR1[1]: N: Sense Data Description = Failure Prediction Threshold Exceeded (FALSE)
Aug 06 15:22:06 ISR1[1]: N: u1d4 SVD_CHECK_ERROR: prediction err: 01/5D

When the T3 controller receives this error, if there is a spare disk
assigned to the volume to which the disk belongs, and if the
volume/slice is RAID 1 or 5, the T3 will automatically start a
background volume reconstruction of the failing disk to the spare
disk. The T3 syslog will show messages like:

Aug 06 15:22:06 ISR1[1]: W: u1d4 disk will fail soon.
Aug 06 15:22:06 ISR1[1]: N: u1d4 vol recon is going to start if autorecon is on.
Aug 06 15:22:06 ISR1[1]: N: u1d4 please replace this disk after it has been subst'd.

Note: "autorecon" is an internal default setting and it should be set
to "on" in all cases.

When the volume reconstruction starts, the following message will be
recorded in the T3 syslog:

Aug 06 15:22:11 LT00[1]: N: u1d4 Copy drive to standby disk started

When the volume reconstruction ends, the following message will be
recorded in the T3 syslog file:

Aug 08 00:03:27 LT00[1]: N: u1d4 Copy drive to standby drive completed

When this reconstruction process ends successfully, the failing disk
will be marked as "substituted" and the user can replace it.

If the LUN is not RAID 1 or 5, or if no spare disk is available for a
RAID 1 or 5 LUN, or if the reconstruction fails, no action will be
taken by the T3, and the reporting disk will be left "operational" as
it is.

---------------------------------------------
2.
Disk firmware download functionality changes.
---------------------------------------------

The disk firmware download process has been made more robust. The
user interface and functionality remain the same except for the
following case.

When a disk download command is issued for a range of disks, for
example:

   disk download u1d1-9 'firmware_file'

and one of the disks in the range fails, the disk download operation
terminates immediately and does not attempt to download firmware to
the remaining disks. This is done to avoid multiple disk failures.

---------------
Appendix B
---------------

The following section describes useful tips.

----------------------------------------------------------------
1. How to start and stop automatic device monitoring under StorADE.
----------------------------------------------------------------

Using the 'ras_admin' command, one can stop and start the StorADE
cron job that monitors all devices. The command syntax is:

   /opt/SUNWstade/bin/ras_admin stop/start_cron

Also, using the 'ras_admin' command, one can view the polling
activity status of each device defined in StorADE. The command syntax
is:

   /opt/SUNWstade/bin/ras_admin device_list

The output of this command contains a column labeled 'Active', with
'Y' denoting active monitoring and 'N' denoting inactive monitoring.

For more details about StorADE, please refer to the StorADE User's
Guide documentation.

---------------
Appendix C
---------------

About T3extender.tar.Z

T3 units were originally designed with a battery expiration/
obsolescence strategy to protect customers' data integrity in the
event of battery failure. Historical and installed base data has
established that the original 24 month battery expiration did not
provide an adequate service life for the battery/PCU units. The
result was unnecessary replacement of PCUs and batteries.
This premature replacement increased outage opportunities that could
be caused by inadvertent damage to other internal parts and/or
reinitializing the unit during battery/PCU replacements.

A file T3extender.tar.Z is included with this T3 patch.
T3extender.tar.Z contains a simple perl script that extends the
battery life span of T3 units from 24 to 36 months (1095 days 18
hours) via their Ethernet connection. To install and execute the
program, perform the following steps:

1. # zcat T3extender.tar.Z | tar xf -

2. # cd T3extender

   There are four files under the T3extender directory:

   I.   batxtender      -- script that extends the battery life span.
   II.  perlx           -- perl library
   III. t3hosts.example -- syntax example for editing the t3hosts file
   IV.  README          -- detailed procedure on how to run the
                           script; please read this before you run
                           the script

3. (optional) edit the t3hosts file (see t3hosts.example)

4. # ./batxtender

Special Install Instructions:
-------------------------------------------

If you plan to run several automated file installation scripts
(t3.sh) at the same time on one host to update many T3A systems, you
must enter a different "Home Directory Path" for each run, since the
t3.sh script uses $HOME/.netrc for ftp. Please make sure that you
have full permissions in those directories.

README -- Last modified date: Wednesday, September 22, 2004
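
Example (sketch only, not part of the patch): the Special Install
Instructions above call for a separate "Home Directory Path" for each
concurrent t3.sh run. One possible way to prepare such directories is
shown below. The function name make_t3_homes, the base path, and the
host names are hypothetical; the script only creates private
directories and prints their paths, which would then be typed at the
t3.sh "Home Directory Path" prompt.

```shell
#!/bin/sh
# Sketch only: create one private directory per target T3 so that
# concurrent t3.sh runs do not share a single $HOME/.netrc file.
# The function name, base path and host names are hypothetical.

make_t3_homes() {
    base=$1
    shift
    for host in "$@"; do
        dir="$base/$host"
        mkdir -p "$dir" || return 1   # create the per-host directory
        chmod 700 "$dir" || return 1  # owner has full permissions, others none
        echo "$dir"                   # path to enter at the t3.sh prompt
    done
}

# Example use (hypothetical names):
#   make_t3_homes /var/tmp/t3homes t3a-1 t3a-2
```

Each printed path is used for exactly one t3.sh session; the
directories can be removed once all uploads have completed.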