Patch-ID# 112068-02
Keywords: dlt, dlt8000, dlt 8000, quantum, tape drive, firmware, storedge l20
Synopsis: Hardware, Quantum DLT8000 Tape Drive, Firmware Download Program, V88
Date: Apr/29/2004

Install Requirements: Additional instructions may be listed below

Solaris Release:

SunOS Release:

Unbundled Product: Hardware/Tape

Unbundled Release: N/A

Xref:

Topic:

Relevant Architectures:

BugId's fixed with this patch: 4659443

Changes incorporated in this version: 4659443

Patches accumulated and obsoleted by this patch:

Patches which conflict with this patch:

Patches required with this patch:

Obsoleted by:

Files included with this patch:

DLT8000_V88_oml_3.img
README.112068-02
tload

Problem Description:

NOTE: A highly intermittent error is reported by VERITAS NetBackup during backup with the DLT8000 tape drive installed in the Sun StorEdge L20 tape library. The application reports 'error 174' with the associated message, '1 byte written'. The drive reports an incorrect number of bytes written. In addition, there are some changes to the code which are anticipated to make an insignificant improvement in AFR and reliability.

Patch Installation Instructions:
--------------------------------

None.

Special Install Instructions:
-----------------------------

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Contents
--------
A.0 Firmware File Names & Utility Descriptions
B.0 Precautionary Statements
C.0 Patch Installation and Utility Usage Instructions

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A.0 Firmware File Names & Utility Descriptions
-----------------------------------------------

A.1 DLT8000_V88_oml_3.img (1,187,840 bytes) --> The firmware binary image V88 (Rev 0258)
A.2 tload (57,176 bytes) --> The firmware download utility
A.3 README.112068-02 --> This file

A.4 Firmware Changes

Release Notes for revision changes from V59 to V87
===================================================

Last Release Revision: V59

Release Revision V60

A4.1 Problem Description: The 7000 emulation mode feature enabled through library command or SCSI Mode Select EEROM page may be reset when Code Update is performed. The emulation mode is retained over power cycles and resets.
Root Cause: The EEROM value used to select the 7000 Emulation mode is not preserved across Code Updates.
Corrective Action: Code changed to preserve the 7000 Emulation mode EEROM setting when code updating to V54 and future releases.

A4.2 Problem Description: The tape drive could hang or unexpectedly go SCSI bus free (C800 bugcheck) if a SCSI Locate command was issued after a hard read error.
Root Cause: The internal position could be lost after a hard read error. This causes the drive to hang without moving tape, or go bus free.
Corrective Action: Ensure that the drive has a valid position after a hard read error.

A4.3 Problem Description: If a write-protected tape is unloaded from the drive, the WP (Write Protect) bit in Mode Sense Data Header will continue to be set.
Root Cause: After unloading a tape, the internal write protect flag was not updated.
Corrective Action: Clear the internal write protect flag after a tape is unloaded.

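For readers checking the behavior in A4.3 from the host side, the sketch below shows where the WP bit sits. It assumes the standard SCSI MODE SENSE(6) mode parameter header for sequential-access devices (byte 2 is the device-specific parameter: bit 7 = WP, bits 6-4 = buffered mode, bits 3-0 = speed). The function name and the sample bytes are illustrative only; they are not part of this patch or of tload.

```python
def decode_mode_header6(header: bytes) -> dict:
    """Decode a 4-byte MODE SENSE(6) mode parameter header (sequential-access)."""
    if len(header) < 4:
        raise ValueError("MODE SENSE(6) header is 4 bytes long")
    dev_specific = header[2]  # device-specific parameter byte
    return {
        "mode_data_length": header[0],
        "medium_type": header[1],
        "write_protected": bool(dev_specific & 0x80),  # WP, bit 7
        "buffered_mode": (dev_specific >> 4) & 0x07,   # bits 6-4
        "speed": dev_specific & 0x0F,                  # bits 3-0
        "block_descriptor_length": header[3],
    }


# Made-up header bytes, for illustration only (not captured from a drive).
sample_protected = bytes([0x0B, 0x85, 0x90, 0x08])
sample_writable = bytes([0x0B, 0x85, 0x10, 0x08])

if __name__ == "__main__":
    print(decode_mode_header6(sample_protected))
    print(decode_mode_header6(sample_writable))
```

With the fix in A4.3, a header read after a write-protected tape has been unloaded would be expected to decode with write_protected False.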
A4.4 Problem Description: The drive does not return to normal operation after a CUP is initiated with the front Unload button and no cartridge is loaded into the drive before the drive times out.
Root Cause: After the Unload button is pushed to initiate the CUP process via a tape, the servo waits 60 seconds for a tape to be inserted, after which the drive should default back to its normal operating mode. The drive appears to have gone back to normal mode, but if a tape is then inserted the drive will try to perform the CUP process. The controller code is not recognizing that the CUP timeout has occurred.
Corrective Action: Added checking for the case that a CUP is being aborted, and if so, restore the drive to normal operating mode.

A4.5 Problem Description: The drive reports the cartridge has been ejected but then for a very brief period reports that a cartridge is in the drive. The failure occurs during the ejection of a cartridge. This behavior was first introduced in code revision T60-1.
Root Cause: The misreporting of the drive status was introduced in the implementation of a feature specific to one drive personality.
Corrective Action: The implementation has been changed to only exhibit this behavior when the firmware personality is equal to the personality requiring the specific feature.

A4.6 Problem Description: The drive may take excessively long to perform a SPACE/LOCATE command. Commanding the drive to position itself at the first block on a track after the BOT hole is required to trigger this behavior. For the specific case of positioning to block 0 (the first block on tape and track), the use of a SPACE or LOCATE command may result in this behavior. The use of a REWIND command will not cause this behavior.
Root Cause: Stopping and starting of tape may result in a small amount of tape slippage at the tachometer roller. This slippage creates an error in actual location on tape relative to computed location on tape.
Because this slippage is cumulative, the overall location error grows with the number of start/stops performed. For the case of a SPACE or LOCATE command to the first block on a track after the BOT hole, the code does not expect to encounter the BOT hole. As the positioning error increases, so does the likelihood of encountering the BOT hole when locating to this block. If the BOT hole is encountered, the current working tape directory is invalidated, causing future SPACE or LOCATE commands to no longer have quick positioning times.
Corrective Action: Internally the firmware now uses the BOT hole to position the head to the first block on a track after the BOT hole, when a SPACE or LOCATE command to that block is received. This is how the REWIND command is implemented. This has the effect of reinitializing the tape position counter every time BOT is encountered, and eliminates the invalidation of the tape directory.

A4.7 Problem Description: There have been rare occurrences of Hall Switch drive errors (event log A402 with error code 00AB Hex) during library qualification testing. The error occurs during the tape loading process, i.e. within seconds of inserting a cartridge.
Root Cause: The error is caused by a "lockup" condition on one of the internal features of the Intel micro-controller. The servo firmware uses a special function provided by the micro-controller to reduce the processing overhead associated with handling the hall switch transitions being produced by the supply and take-up reel motors. The special function can lock up, i.e. ignore incoming hall switches, if very specific conditions exist. An errata sheet provided by Intel very specifically defines these conditions. During load/unload operations there is a small window of opportunity to enter this lockup condition. The failure data indicates that the "lockup" condition occurs on the take-up reel motor.
When loading a tape, the firmware uses the hall switch transitions from the take-up and supply motors to control speed. With the take-up motor hall switch transitions being ignored, the firmware compensates by making the tape go erroneously fast and eventually reports the Hall Switch drive error (00AB Hex). The error is reporting that supply reel hall switch transitions are occurring at a dangerously high rate.
Corrective Action: The "lockup" condition is caused by overflowing holding buffers within the micro-controller. These holding buffers are typically emptied by the process of servicing interrupts associated with motor hall switch transitions. The overflow can only happen when servicing of hall switch interrupts has been disabled. The interrupts were being disabled at specific times during the load/unload process. The disabling of motor hall switch interrupts is not necessary and provides the opportunity for the overflow/lockup condition. The firmware has been changed to leave the hall switch interrupts enabled at all times as a method of preventing overrun conditions. In addition, a specific register of the micro-controller is accessed prior to attempting a cartridge load operation. Accessing the register will clear an overflow condition.

A4.8 Enhancement: An additional command was added to the DLT Interactive Library Interface. This command (Clear File Protect, 0x19) gives the library the capability via the library port to modify the Write Protect status of the drive. Issuing this command will cause bit 7 of byte 3 in the Tape Data Packet 3 to be cleared, the write protect light on the drive bezel to be turned off, and will affect the functionality of the PROTECTDIRONWP mode select parameter. The only valid time this command can be issued is when no tape is loaded in the drive. If the command is issued at any other time, it will be ignored.

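The bit-level effect described in A4.8 can be pictured with a short sketch. It assumes, purely for illustration, that a library packet is held as a byte sequence, that byte numbering starts at 0, and that bit 0 is the least significant bit; the DLT Interactive Library Interface specification is the authority on the real packet layout. The helper names and the sample packet contents are hypothetical.

```python
def bit_is_set(packet: bytes, byte_index: int, bit: int) -> bool:
    """Return True if the given bit (0 = least significant) is set."""
    return bool(packet[byte_index] & (1 << bit))


def clear_bit(packet: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of the packet with the given bit cleared."""
    updated = bytearray(packet)
    updated[byte_index] &= ~(1 << bit) & 0xFF
    return bytes(updated)


# Made-up packet contents with bit 7 of byte 3 set (assumed zero-based index).
packet3 = bytes([0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00])
cleared = clear_bit(packet3, 3, 7)

if __name__ == "__main__":
    print(bit_is_set(packet3, 3, 7), bit_is_set(cleared, 3, 7))
```

The same bit_is_set helper applies to any other single-bit status flag reported over the library port, under the same indexing assumptions.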
A4.9 Enhancement: A positive feedback mechanism was added to the library interface that would allow the library to monitor if the drive has engaged the cartridge hub via a new status bit, Hub Locked (bit 4 of byte 7 in the General Status Packet). When this bit is set, it indicates that a positive engagement has occurred. The Hub Locked status is determined prior to determining the Cartridge Rejected status.

A4.10 Problem Description: Library Tape Data Packet 4 (Media ID) could display the wrong media ID.
Root Cause: The media ID was displayed as zero when a cartridge was removed, but would remain the same if a different cartridge was inserted until the new cartridge completed calibration.
Corrective Action: Zero the media ID on cartridge ejection and keep it zero until the new cartridge has completed calibration.

A4.11 Enhancement: Commands implemented to return error information over the library port. Command 24h will return write error packet information, and command 25h will return read error packet information. The information returned will be equivalent to the cumulative values of Log Sense pages 2 and 3, parameters 02, 03, 05, and 06. For a more detailed description refer to the DLT(TM) Interactive Library Interface Spec.

A4.12 Enhancement: Remove the Cartridge Reject Retry feature. Added the EEROM_B_CARTREJRETRY parameter to enable cartridge reject retries. This will modify bit 05 of the Drive Options command enabling the servo to perform cartridge reject retries or not. The parameter is defaulted to true for personalities CPA-2 and CPL-1 and set to false for all other personalities.

Release Revision V80

A4.13 Problem Description: There exists the potential for the drive to report an error code 29 in the logged CA01 event log during a code update. Additionally the Right side bezel LEDs may not be sequencing correctly under this condition. This is purely a case in which the drive was incorrectly logging an error condition that did not exist.
The code update still succeeded even with this error logged.
Root Cause: This is caused by the drive (servo) failing to respond to communication after it has done its CUP and responded to initialization. The drive fails to respond while rewinding the tape, and the policy code immediately gave up waiting for the rewind and tried to send the flash LED string, which also failed.
Corrective Action: The servo code will now respond to the policy communication following a code load operation, preventing this error from being logged.

A4.14 Problem Description: The drive would log a Bug check BF01 (Unknown DAM (Drive Attention Message)) and go bus free when the servo code sent a servo linear accumulation error (DAM34) to the policy code. The issue is that the drive is bug checking when it should be reporting a read or write failure via SCSI.
Root Cause: The policy code did not know how to handle a servo linear accumulation error.
Corrective Action: Now the drive will report a Hard Write Error (SK: 03H ASC: 0CH ASCQ: 00H) or Hard Read Error (SK: 03H ASC: 11H ASCQ: 00H) under the servo linear accumulation error.

A4.15 Problem Description: When encountering a Hard Write Error during a write operation, the drive does not report the correct residual count to the host.
Root Cause: Bytes/blocks that were transferred to the drive from the host, but had not yet made it to tape, were not accounted for when calculating the residual count.
Corrective Action: The drive now keeps a running count of the number of bytes, filemarks, and objects in cache to accurately report the correct residual count to the host.

A4.16 Problem Description: Drive reports an E00A bug check during read or write operations. Specific servo errors immediately after initialization can cause this bugcheck if the number of tape motion hours since last cleaning is zero.
Root Cause: Multiple event logging routines were vying for the same memory locations and causing the drive to report an E00A bug check.
Corrective Action: Modify event logging to ensure that shared memory access is correctly arbitrated.

A4.17 Problem Description: If a SPACE sequential file mark command was issued to the drive with a value of greater than 2 file marks when a blank tape is loaded, the drive would report an invalid CDB error.
Root Cause: In one particular circumstance, blank tape was being incorrectly recognized as a tape written in a legacy (pre-DLT2000) format.
Corrective Action: The drive will always correctly report a blank check after a SPACE sequential file mark command is received and the tape is blank.

A4.18 Enhancement: Provide enhancements to the SCSI FIFO CRC checking and reporting mechanisms:

Enable SCSI FIFO CRC checking on write commands. The SCSI check condition reported if a FIFO CRC error is detected on a write command is (SK: 04H, ASC: 0CH, ASCQ: 80H) Write SCSI FIFO CRC Error. The drive will persist in returning the Write SCSI FIFO CRC Error check condition on all subsequent write commands. A power cycle is required to attempt further write commands.

Change the reporting method if a FIFO CRC error is detected during a read command from a 9049 bug check to a SCSI check condition with sense information of (SK: 04H, ASC: 11H, ASCQ: 80H) Read SCSI FIFO CRC Error. The drive will allow the application to re-read data after encountering a Read CRC error (i.e., the check condition is not persistent).

A4.19 Problem Description: When doing ISV backup application regression testing with T80-6 code, a drive hang condition could occur when the host requested sense data that would have indicated the SCSI FIFO CRC error.
Root Cause: Transfer of sense data to the host was held up waiting for the SCSI FIFO CRC error processing to complete. A timing window sometimes allowed the firmware to lose a control message, so the error processing would not complete and the drive would hang.
Corrective Action: The SCSI FIFO error sequencing was corrected so that it can never lose the control messages. The rest of the system can then proceed normally with SCSI command handling.

A4.20 Enhancement: Calculation of backward track offset can be made more accurate by using opposite azimuth amplitude as offset rather than simply using amplifier DC offset. This will lessen the effect of wide/narrow track conditions of drives. This is proven to reduce potential recoverable read/write errors and has been proven using Ongoing Reliability Testing to show more consistent track widths of drives using multiple tapes.

A4.21 Enhancement: Change the cleaning light behavior during calibration retries. In the event that the drive experiences difficulties while calibrating to a tape, the firmware will attempt up to two additional calibration retries prior to reporting a hard calibration failure via a cleaning required (SK: 03, ASC: 80H, ASCQ: 01H). Previously the cleaning light would be turned on by the firmware if any one of the calibration attempts failed, even if additional retries remained. If the subsequent calibration attempt succeeded the cleaning light would be turned off. This behavior could cause confusion to users as they could see the cleaning light on for an extended time and then the cleaning light would go out. The new behavior is to only turn on the cleaning light if all calibration retry attempts fail. (There will be a short flash of the cleaning light, under 2 seconds, between retry attempts.) There are no changes to the actual calibration routines. The only change is to turn off the cleaning light at the start of any calibration attempt.

A4.22 Problem Description: The labeling process for a blank tape significantly increased in time with the introduction of V52 firmware.
Root Cause: The majority of the time difference is due to the introduction of a head buffing feature that was implemented with V52 firmware.
This head buffing operation was always occurring on the first calibration of a blank tape.
Corrective Action: Do a head buffing operation only when the calibration fails. For the DLT8000ECN never do a head buffing operation. This will keep the normal case performance at an acceptable level for all drives while preserving the head buffing functionality for the error cases.

A4.23 Problem Description: Drive reports load command needed after a drive cartridge reject status. Currently the requested load command will cause the drive to hang. This condition occurs when a handle is manually altered (raised or lowered) while drive exists in any library type application. This condition hinders the ability to manually interact with the drive when the library port is being exercised.
Root Cause: Incorrect SCSI sense information reported.
Corrective Action: Change reported SCSI sense information from a (SK: 02, ASC: 04H, ASCQ:02) Load Command Required to (SK: 02, ASC: 04H, ASCQ:03) Manual Intervention Required.

A4.24 Enhancement: Report personality information over the library port. This implementation mirrors what was done in the SDLT220 firmware. The firmware personality major code and firmware personality minor code are available via bytes 5 and 6 respectively in Tape Data Packet 3.

Release Revision V83

A.4.25 Problem Description: When a Hard Write Error occurs, a subsequent command may end with a SCSI Check Condition with a 0B/44/89 which indicates an internal error condition.
Root Cause: The 0B/44/89 sense data is an indirect result of aborting the processes associated with buffered data not yet written to tape.
If a SCSI command is received by the drive and this command is terminated with another SCSI Check Condition before the previous abort process is completed, the drive will then check the next command with a
Corrective Action: The command reporting Check Condition for a Hard Write Error now disconnects and waits for abort processing to complete before reselecting and sending status.

Release Revision V85

A.4.26 Enhancement: Added a low power mode used while waiting in the unlocked state for the door to be opened. If after 10 minutes the door has not been opened, the door will lock and the current holding the tension on the tape will be reduced. After entering this state, the eject button, SCSI unload, or library unload must take place to remove the cartridge. A SCSI load or library load will load the cartridge and remove the drive from the low power state.

A.4.27 Problem Description: After the first un-buffered write command, all subsequent write commands would be in buffered mode.
Root Cause: When optimizations were made to improve the write performance, un-buffered write mode was broken.
Corrective Action: Do not allow the drive to enter optimized write mode when set to un-buffered mode.

A.4.28 Problem Description: A space forward after a hard write error could take many hours to complete.
Root Cause: The space was starting from beginning of tape.
Corrective Action: Retain the track number after the write error so it may be used on a Space retry.

A.4.29 Enhancement: To prevent the library port commands from filling the ring buffer, the information is only retained if it is different from the previous request.

Release Revision V87

A.4.30 Problem Description: A particular software application can report a generic medium error (Error 174) when writing highly compressible data into the EOM area of tape. The drive correctly reports EOM/Early Warning/LEOT was encountered but incorrectly reports a residual of the block size minus one.
All data is actually written to tape and the bug is in the reported residual.
Root Cause: There is an implicit assumption in LEOT PBN check that physical blocks are in PBN order. With envelopes and block interleaving this assumption is not valid. If an envelope straddles the LEOT position, it may have blocks that are early in user-order being written after LEOT along with blocks that are later in user-order being written before LEOT.
Corrective Action: The LEOT checker now uses the same PBN for every physical block in the envelope. The PBN used is the highest PBN encountered in the previous envelope. The envelope that actually straddles the LEOT position never reports EOM. Instead, EOM is reported when any block of the next envelope is examined.

Release Revision V88

A.4.31 Problem Description: Drives can report an incorrect error count in LOG SENSE Page 3, parameters 2, 3, and 8000h. If the drive's partially-adjusted read error rate is consistently between 20 and 40 errors per gigabyte, the LOG SENSE values should be around 2 or 3 times the number of GB in the sample, but it may be reported as low as 0.
Root Cause: Incorrect range constant in the Log Sense read firmware.
Corrective Action: Use the correct constant.

A.4.32 Enhancement: Add ability to write diagnostics information into a small area of the EEROM.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

B.0 Precautionary Statements
-----------------------------

**WARNING:****WARNING:****WARNING:****WARNING:****WARNING:****WARNING:**

B.1 The system MUST BE IDLE during the firmware download process! No other programs should be running while this utility is being used. Failure to do so may cause the devices being upgraded to fail or the system to crash. Any other computers sharing the same I/O bus as the host system must be either disconnected or offline.

B.2 If any upgrade failures occur, do not continue upgrading devices.
For example, loss of power during download will result in damaged peripherals and require replacement. If any failures occur, please collect the following log file: "/var/adm/messages", and an explorer dump. Please forward these files to your service provider for analysis.

B.3 This package will only function on Quantum DLT8000 Tape Drives which were shipped/used in Sun StorEdge L20 DLT8000 Tape Libraries.

B.4 Please READ the instructions below completely BEFORE starting the download procedure. Follow the procedures carefully. You may program multiple drives at the same time; however, you may not exit the utility until all drives have completed the download process.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

C.0 Patch Installation and Utility Usage Instructions
-----------------------------------------------------

C.1 General guidelines for upgrading:

EJECT MEDIA FROM DEVICE TO BE UPGRADED. Download utility will eject media from the device if it is found to be loaded. Do NOT attempt to force media back into the drive. Media present in a device while firmware is being downloaded to it may result in data loss from media or damage to device.

STOP ALL APPLICATIONS. The system must be idle during the firmware upgrade process.

DISCONNECT or take OFFLINE any other computers sharing the same I/O bus as the host.

UPGRADE the tape device. Follow the given instructions in the procedure section below.

In case of any disruption or unforeseen events happening on the relevant bus during the firmware download process, it may be that the upgraded device becomes non-functional. In this event, it will be necessary to replace the device. This would happen as a result of an incomplete or corrupted firmware file being downloaded. Loss of power during the upgrade process would also damage the device.

**NOTE** If you cannot upgrade devices due to software application interference, try booting off of the Solaris release CD.

**NOTE** After the firmware download is completed, it may be necessary to power cycle the device to ensure fully resetting the device. In turn, this may also require a successive reboot of the host system to ensure all functionality is restored.

C.2 Procedure for Tape Firmware Download:

The procedure to be used for upgrading the device's firmware is explained below. Upgrade time will be approximately 3-5 minutes for each device. You must have root/super-user privileges in order to perform this operation.

a). Unpack the patch (through tar) into any directory, e.g. /var/spool/patch (Note, if the patch ends in a ".Z" suffix, you will need to first uncompress it.)

    Example:
    # uncompress
    # tar xvf

b). In the patch directory, as root, type the "tload" command:

    # ./tload DLT8000_V88_oml_3.img

c). Select the tape device to be upgraded (see example below).

    **NOTE** This upgrade can result in error messages in the console window and/or the terminal "tload" window. It is normal for a SCSI bus reset message to appear in the console window for each device that is upgraded.

d). Ensure that the device to be upgraded is the correct one and answer the question:

    Do you want to download firmware to this tape device [N]?

    with a 'y' for yes or anything else for no. Default answer is no.

e). After each device has been upgraded, the displayed tape device list will be refreshed. Device(s) upgraded should reflect having the new code level, "0258", in the "Rev" field (see example below).

f). If there is an additional device to be upgraded (same device type and desire to upgrade to the latest firmware), select that device as previously done in C.2.c). & C.2.d). above. Continue in this fashion until all desired devices have been upgraded.

g). Quit the "tload" program by typing '0' (see example below).

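The check in step e) can be double-checked with a small script. The sketch below is an illustration under stated assumptions, not part of the patch: it parses device rows in the column order shown in the C.3 example (Device, Supplier, Product, Rev, Serial Number, Device State) from text you have captured yourself, and lists any device whose Rev is not 0258. It assumes the Supplier, Product and Serial Number fields contain no spaces. tload is interactive, so the sketch does not run it; the function names are hypothetical.

```python
import re

EXPECTED_REV = "0258"  # V88 code level, per this README

# A device row looks like:
#   1:0ln  QUANTUM  DLT8000  0258  PMC22P0704  F/W Upgraded
ROW = re.compile(
    r"^\s*(?P<device>\d+:\S+)\s+(?P<supplier>\S+)\s+(?P<product>\S+)\s+"
    r"(?P<rev>[0-9A-Fa-f]{4})\s+(?P<serial>\S+)\s+(?P<state>.+?)\s*$"
)


def parse_device_rows(text):
    """Return one dict per device row; header and ruler lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append(match.groupdict())
    return rows


def devices_not_at(text, rev=EXPECTED_REV):
    """Return the Device column of every row whose Rev differs from rev."""
    return [r["device"] for r in parse_device_rows(text)
            if r["rev"].upper() != rev.upper()]


# Text copied by hand from a tload session (taken from the C.3 example).
captured = """\
Device         Supplier Product          Rev  Serial Number Device State
-------------- -------- ---------------- ---- ------------- -------------
1:0ln          QUANTUM  DLT8000          0258 PMC22P0704    F/W Upgraded
"""

if __name__ == "__main__":
    pending = devices_not_at(captured)
    print("all devices at", EXPECTED_REV if not pending else pending)
```

An empty result means every listed device reports the new code level; any device still listed should be investigated before returning the drives to service.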
**************************************************************************

C.3 EXAMPLE

# #########################################
#
# Launch Tape Firmware Download Utility:
# #########################################
#
# ./tload DLT8000_V88_oml_3.img

ATTACHED DEVICES:

Device         Supplier Product          Rev  Serial Number Device State
-------------- -------- ---------------- ---- ------------- -------------
1:0ln          QUANTUM  DLT8000          023B PMC22P0704    Available

Select Device (0 to quit) [1]: 1

1:0ln          QUANTUM  DLT8000          023B PMC22P0704    Selected

Do you want to download firmware to this device [N]? y

ATTACHED DEVICES:

Device         Supplier Product          Rev  Serial Number Device State
-------------- -------- ---------------- ---- ------------- -------------
1:0ln          QUANTUM  DLT8000          023B PMC22P0704    Downloading

Downloading /dev/rmt/0ln... please wait.

ATTACHED DEVICES:

Device         Supplier Product          Rev  Serial Number Device State
-------------- -------- ---------------- ---- ------------- -------------
1:0ln          QUANTUM  DLT8000          023B PMC22P0704    Writing Flash

Select Device(s) (ex: 1,3-4) or 0 to quit) [2]: 0

One or more devices has not yet completed flash update and/or download recovery time. Devices must not be utilized until after this recovery period has expired, or permanent damage may result.

This program will release all devices and terminate in 3 minutes, 28 seconds.

NOTE: Countdown timer will decrement all the way to 0 seconds.

ATTACHED DEVICES:

Device         Supplier Product          Rev  Serial Number Device State
-------------- -------- ---------------- ---- ------------- -------------
1:0ln          QUANTUM  DLT8000          0258 PMC22P0704    F/W Upgraded

# #########################################
#
# The example above only upgrades one
# device. You do not have to exit with
# a "0" and initiate the 'tload' utility
# again. You may continue instead and
# directly upgrade the next tape device,
# following the same steps as before
# for each device until all devices
# have been upgraded
# #########################################
#
# #########################################
#
# After devices are upgraded, the Rev
# will be 0258
# #########################################
#
# #########################################
#
# To Quit, enter '0'. System prompt
# will return.
# #########################################
#

**************************************************************************
**************************************************************************

C.4 tload (ABOUT THE UTILITY):

tload - Firmware Download utility for tape devices.

SYNOPSIS
tload [ filename ] [ -v ]

filename    firmware/microcode filename

DESCRIPTION
tload is a firmware download utility for Sun supported tape devices. If the firmware_file is specified, then it will display the list of tape devices present on the host system and ask the user to select the tape device which is to be upgraded. If the firmware_file is not specified, then it will display the list of tape devices present on the host system along with their FIRMWARE revision levels. tload will exit upon completion; please do not attempt to halt or stop it prior to the utility's menu exit option being presented.

The command can be run only as a super-user.

DISCLAIMER
This utility is ONLY supported for downloading, to Sun supported tape devices, the Sun supported firmware binary (firmware_file) which has officially been released via the official Sun Patch Process. This utility is only supported with the release of firmware (binary) bundled with said patch.

Do not attempt to use any other version of 'tload' that may have been acquired previously, or device damage may occur. Use only the version provided with this patch.

Use of tload to load non-Sun supported tape devices is at the user's own risk, and is not supported.
Use of tload to load Sun supported tape devices with firmware NOT bundled with the utility in an officially released Sun Patch is at the user's own risk, and is not supported.

PROBLEMS
Any problems encountered with this utility while following proper procedures should be reported to the user's service provider along with the following items:

1) /var/adm/messages file
2) explorer dump
3) ./tload DLT8000_V88_oml_3.img -v output

README -- Last modified date: Thursday, April 29, 2004