Patch-ID# 112481-15
Keywords: sun fire 15k security libesmd hwad esmd fomd
Synopsis: SMS 1.2: fomd, hwad, esmd, pcd patch
Date: Dec/01/2003

Install Requirements: See Special Install Instructions

Solaris Release: 8 9

SunOS Release: 5.8 5.9

Unbundled Product: System Management Services

Unbundled Release: 1.2

Xref:

Topic: SMS 1.2: fomd, hwad, esmd, pcd patch libESMD

Relevant Architectures: sparc

BugId's fixed with this patch:
    4474780 4508204 4520603 4527712 4527943 4529783 4533677 4533704
    4548862 4549088 4558869 4559093 4592929 4593197 4614577 4615163
    4616742 4617507 4618415 4618608 4618703 4620694 4622245 4624502
    4629480 4632095 4632832 4633197 4634326 4638290 4638828 4639694
    4641487 4641930 4643724 4647053 4649819 4654482 4655774 4657218
    4659393 4666954 4667446 4670348 4670861 4671526 4671529 4671531
    4671914 4673087 4673276 4674171 4674350 4674732 4675137 4676477
    4680427 4684413 4699827 4708077 4715221 4717185 4719394 4722175
    4722664 4724088 4724771 4724773 4725093 4730224 4732475 4733114
    4734391 4734620 4734935 4735720 4735779 4743072 4747373 4748819
    4750572 4751924 4754250 4755000 4759977 4760870 4765472 4795711
    4799899 4803161 4822409 4825986 4830870 4834081 4844780 4844866
    4848931 4865054 4865526

Changes incorporated in this version: 4830870

Patches accumulated and obsoleted by this patch:
    112483-05 112484-02 112547-01 112599-01 112633-01 112768-01 112827-01

Patches which conflict with this patch:

Patches required with this patch:

Obsoleted by:

Files included with this patch:

/etc/opt/SUNWSMS/SMS1.2/config/esmd_tuning.txt
/etc/opt/SUNWSMS/SMS1.2/config/fomd.cf
/etc/opt/SUNWSMS/SMS1.2/startup/sms
/etc/opt/SUNWSMS/SMS1.2/startup/ssd_start
/opt/SUNWSMS/SMS1.2/bin/esmd
/opt/SUNWSMS/SMS1.2/bin/fomd
/opt/SUNWSMS/SMS1.2/bin/frad
/opt/SUNWSMS/SMS1.2/bin/hwad
/opt/SUNWSMS/SMS1.2/bin/marginvoltage
/opt/SUNWSMS/SMS1.2/bin/pcd
/opt/SUNWSMS/SMS1.2/bin/poweroff
/opt/SUNWSMS/SMS1.2/bin/showenvironment
/opt/SUNWSMS/SMS1.2/bin/showfailover
/opt/SUNWSMS/SMS1.2/bin/smsconfig
/opt/SUNWSMS/SMS1.2/lib/libESMD.so.1
/opt/SUNWSMS/SMS1.2/lib/libFOMD.so.1
/opt/SUNWSMS/SMS1.2/lib/libFRAD.so.1
/opt/SUNWSMS/SMS1.2/lib/libI2cCommProxy.so.1
/opt/SUNWSMS/SMS1.2/lib/libKeyswitch.so.1
/opt/SUNWSMS/SMS1.2/lib/libPower.so.1

Problem Description:

4830870 Requirement to add offset values for Cheetah+ and Cheetah++

(from 112481-14)

4834081 Workaround 4790336; latch mode boards will latch off
4844866 Inappropriate component replacement procedure can lead to a global dstop
4848931 hwad should be hardened against i2c timeouts while initializing to become MAIN

(from 112481-13)

4865054 Ch++ temperature thresholds are incorrect
4865526 Patch to adjust VCore threshold for Cheetah+ and Cheetah++

(from 112481-12)

4548862 ESMD ambient temperature thresholds should be updated to match Env. Document
4825986 marginvoltage command blocks BAD exb and csb D116 power status
4844780 fomd forked processes hang

(from 112481-11)

4666954 hwad does not set default value to HPCI cassette's slot_condition
4803161 console bus errors when maxcat in an expander with no SB

(from 112481-10)

4708077 fomd error messages: RPC client create failed, Propagation/retrieval of x failed
4760870 if FOMD tries to fork/exec while logging a message, the forked process will hang
4795711 fomd references freed memory during data propagation for deleted files
4799899 esmd incorrectly reports which CP it powered off
4822409 in SMS1.2, fomd wastes system resources by creating many zombie threads

(from 112481-09)

4730224 When switching between SMS 1.2 & SMS 1.3 need to relocate start-up script
4732475 SC failover must be disabled until bugid 4518965 implemented.
4750572 sms startup fails to correctly determine remote SC and requires .rhosts/.shosts
4751924 SUNWSMSr preinstall doesn't set proper primary GID for passwd entries on upgrade

(from 112481-08)

4520603 frad traps SIGTERM and waits for fru update completion
4657218 setdatasync backup command may improperly overwrite files on the spare SC
4715221 Domain can't get valid TOD when multiple domains boot up at the same time
4722175 Unable to margin proc voltage
4724088 FRAD needs to handle ChkptProxy::chkptRead() error code 6 differently
4724773 sending poweron/off fru events always returns an error.
4725093 Checkpoint mechanism should clean-up list when initially starts up
4734391 frad fails the handle checkpoint of CP0 at startup
4734620 LED's on SC PCI slots not reflecting SC status correctly
4734935 LED's on SC CSB (carrier-plate) conflict with other LED's on SC in same slot
4735720 Expander power latch bits get overwritten during poweron command
4735779 SMS needs to increase the wait time during its auto-connect sequence
4743072 CH++ MCPU core nominal voltage is incorrect upon margining.
4747373 fomd needs to implement fix for MAN Ether hub configuration
4748819 On main SC overlimit temp esmd powers off entire box even if other SC is avail
4754250 esmd doesn't turn off SC LEDs when it powers off local SC due to env. condition
4759977 esmd should poweron SC peripheral board when SC board is inserted

(from 112481-07)

4620694 when SMS starts and cannot find expected domains, setkeyswitch will hang
4733114 hpci boards showing up as "Unknown" in showboards output
4765472 Unable to read fruid container on SCPER.

(from 112481-06)

4527712 poweroff other SC should check if it provides clk src to any running domain brds
4533704 esmd can be out of step w/ actual power state of rev 2 MCPU boards for ~30 secs
4559093 ESMD needs to monitor clock phase lock
4617507 ESMD should set/clear MAN_OVERRIDE based on component's clock inputs
4622245 Not all fomd resets attempt a graceful shutdown first
4632832 hwad will return timeout error reading temps, even though it has valid data
4649819 request operator to confirm poweroff of bulk power supply; poweron of BPS is NOO
4654482 hwad deadlocks because of 4647410, need to implement workaround
4684413 takeover in response to env condition on main SC/SCPER does not seem to work
4699827 Deconfigure L1 boards should reset Darb ports if necessary
4724771 LibPower should send events synchronously
4755000 sms startup script does not fully handle return code 255 from ssh.

(from 112481-05)

4508204 fomd shouldn't use R* services
4529783 fomd core dumps logging an error message after failing to copy a file
4647053 sms start fails with no reason given when the gchip/echip/consbus fail scpost
4719394 "fomd" logs the message " rsh enabled " when "rsh" and "ssh" are both disabled
4722664 fomd should check UID in .fomd_uids.cf when verifying ssh.

(from 112481-04)

4667446 setdatasync backup fails - internal error 8551
4671914 misleading message in platform log: Failover is in a failed state because...
4673276 Domains unstable after forced main SC to panic to test SC failover
4674732 Propagation/retrieval of xxx failed - unable to create transfer file
4675137 failover may not fully activate when setfailover on is run (after deactivation)
4676477 showfailover -v and log file output (showlog) do not indicate same status

(from 112481-03)

4629480 "attach ready" state of system boards must be cleared when no domains are active
4670861 New board which is inserted gets old board's board states. Can prove confusing.

(from 112481-02)

4634326 Power should set PCI cassette LEDs at board power on time
4643724 Cmd reissue timeout/AMX flow control skew when doing DBR after SC failover
4659393 DStop: Slot0 target slot transgression error when MCPU powered off in split-slot
4671526 libPower needs to clear board test status when boards are reset
4671529 deconfiguring boards in a domain with split-slot mcpu can dstop other domain
4674350 MCPU boards were powered off causing domain DStop after forced failover

(from 112481-01)

4527943 AMX0=32768 and AMX1=0 on Centerplane 1 are not equal
4618608 Stale test status for MCPU board causes cpu/mem board on same exp to fail POST
4618703 SMS power control needs to preserve "attach-ready" state of L1 boards

(from 112599-01)

4614577 PCD not being propagated to SPARE
4615163 fomd uses significant amount of CPU resources on s8u6

(from 112827-01)

4671531 libKeyswitch needs to deconfigure L1 boards before the expander.

(from 112483-05)

4670348 Power mode bit on the expander should be in latch mode
4680427 Support for Ch++
4717185 Showenvironment used the wrong core voltage threshold in maxcat board

(from 112483-04)

4592929 marginvoltage hPci should report margin status based on status bit not volt read
4673087 stopping sms on both SCs simultaneously causes global domain dstop

(from 112483-03)

4474780 dxs hwad error message seen in domain logfile when pci device disconnected
4638290 enable/disable 66MHz methods in hot plug go to the wrong board
4638828 setPCIStatus fails for a set 33MHz request for 33MHz slots

(from 112483-02)

4593197 hwad core dumped
4618415 SMS deadlocked as one of the components did not release a global lock
4624502 Occasional lock timeouts still seen

(from 112483-01)

4549088 hwad gethwadClientId() can return wrong id, cause hwad to hang
4616742 Degrade bus to some exps (not all) could cause INTR not to work for these boards

(from 112484-02)

4655774 showenvironment cannot see remote SCPER ambient temperature

(from 112484-01)

4558869 esmd does not unregister w/ hwad in all transient threads

(from 112633-01)

4641930 Env. values need to be modified for new fan tray

(from 112768-01)

4639694 add segment fails in read-only section (write jumper enabled)
4641487 fru_write_segment() fails.
4674171 static data written to dynamic section.

(from 112547-01)

4533677 smsrestore destroys acl for directories, disabling many or all sms commands
4632095 smsrestore does not translate backed up file to current version
4633197 SUNWSMSr package fails to copy over the .fomd_uids.cf file during upgrade

Patch Installation Instructions:
--------------------------------

For Solaris 2.0-2.6 releases, refer to the Install.info file and/or
the README within the patch for instructions on using the generic
'installpatch' and 'backoutpatch' scripts provided with each patch.

For Solaris 7-8 releases, refer to the man pages for instructions
on using 'patchadd' and 'patchrm' scripts provided with Solaris.
Any other special or non-generic installation instructions should be
described below as special instructions.  The following example
installs a patch to a standalone machine:

       example# patchadd /var/spool/patch/104945-02

The following example removes a patch from a standalone system:

       example# patchrm 104945-02

For additional examples please see the appropriate man pages.

See Also:
  System Management Services (SMS) 1.2 Installation Guide and Release Notes,
    Part No. 816-4957-10 (Solaris 8)
    Part No. 816-3269-10 (Solaris 9)
    Chapter 1, Patches
  Available at:
    http://www.sun.com/products-n-solutions/hardware/docs/Servers/High-End_Servers/Sun_Fire_15K/SW_FW_Documentation/SMS/index.html

Special Install Instructions:
-----------------------------

Follow these steps when installing on the SC:

  1. Record which SC is the main SC.
  2. Disable failover on MAIN SC (setfailover off).
  3. Stop the SMS processes on both SC's simultaneously.
       /etc/init.d/sms stop
  4. Install the patch on both SC's.
  5. Start the SMS processes on the previous main SC first.
       /etc/init.d/sms start
  6. After all the sms processes have started (i.e. you're able to run the
     showenvironment command and get all the system's status), start the
     SMS processes on the Spare SC next.
  7. Enable failover on MAIN SC (setfailover on).

Special Installation instructions if patch 112481-05 or higher has not yet
been installed.  Once this installation has completed, future revs of this
patch can be installed per the instructions above.

Patch 112481-05 (or higher) enables the ability to use ssh/scp
(secure shell/secure copy).  If ssh/scp are to be used, the patch must
be applied first, then the ssh/scp packages.  For example:

Reference:
  [1] Securing the Sun Fire 12K and 15K System Controllers
      http://www.sun.com/blueprints

Fresh Install OS, SMS 1.2, and the security fomd patch
------------------------------------------------------

  1. Install OS
  2. Install SMS 1.2
  3. Install secure fomd patch on both SCs
  4. Configure the MAN network and SMS user groups
  5. Reboot
  6. Disable failover
       % setfailover off
  7. Install openssh on both SCs (optional for Solaris 9)
       [1]: "To Download OpenSSH Software"
  8. Configure Secure Shell on both SCs
       [1]: "To Use fomd With Secure Shell Instead of r*"
  9. Create softlinks under /opt/SUNWSMS/SMS/bin/ to identify the
     absolute locations of ssh and scp binaries on both SCs:
       % cd /opt/SUNWSMS/SMS/bin/
       % ln -s ssh
       % ln -s scp
     Note: Fomd falls back to rsh/rcp when not finding ssh/scp in
     /opt/SUNWSMS/SMS/bin/, /usr/bin/, and /opt/OBSDssh/bin/
 10. Enable failover
       % setfailover on

Install the secure fomd patch:
------------------------------

  1. Record which SC is the main SC
  2. Disable failover on MAIN SC
       % setfailover off
  3. Stop SMS on both SC's simultaneously
  4. Install the secure fomd patch on both SC's
  5. Install openssh on both SCs (optional for Solaris 9)
       [1]: "To Download OpenSSH Software"
  6. Configure Secure Shell on both SCs
       [1]: "To Use fomd With Secure Shell Instead of r*"
  7. Create softlinks under /opt/SUNWSMS/SMS/bin/ to identify the
     absolute locations of ssh and scp binaries on both SCs:
       % cd /opt/SUNWSMS/SMS/bin/
       % ln -s ssh
       % ln -s scp
     Note: Fomd falls back to rsh/rcp when not finding ssh/scp in
     /opt/SUNWSMS/SMS/bin/, /usr/bin/, and /opt/OBSDssh/bin/.
  8. Reboot the previous MAIN SC first
  9. Reboot the previous SPARE SC next
 10. Enable failover on the MAIN SC
       % setfailover on

-------------------------------------
Post Install instructions:

After applying patch 112481-02 (or higher), all expanders not servicing
active domains must be power cycled.  This is to ensure that bug 4671526
is cleared from the system.  The simplest method to do this is:

       % poweron -n ; poweroff -n

The '-n' will answer NO to any query presented by the power commands,
thus avoiding redundant power on actions, and protecting active domains
from interruption.

--------------------------------------
After applying patch 112481-09 or higher:

Run "smsconfig -m" and accept all of the default network settings.

--------------------------------------
NOTE RE: UltraSPARC III++

In order for the UltraSPARC III++ to work fully with this SMS version,
please also install POST patch 112488-07 or higher.  The POST changes
recognize the new processor type.  Please refer to RFE 4661795,
"POST support for new UltraSPARC III+ processors", for more info.

README -- Last modified date:  Monday, December 1, 2003
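
Example: scripting the per-SC steps
-----------------------------------

The following is a minimal sketch, not part of the patch, of how steps
3 through 5 of the Special Install Instructions might be wrapped for use
on each SC. The run, patch_this_sc, and start_this_sc helpers, the
DRY_RUN and PATCH_DIR variables, and the patch unpack location are
assumptions made for illustration; only the commands themselves
(/etc/init.d/sms stop, patchadd, /etc/init.d/sms start) come from the
instructions above. By default the script only prints what it would run.
Recording the main SC, the setfailover steps, and the ordering between
the two SCs remain manual.

```shell
#!/bin/sh
# Sketch only: wraps the per-SC commands from "Special Install Instructions".
# DRY_RUN and PATCH_DIR are hypothetical knobs, not part of SMS or the patch.

DRY_RUN=${DRY_RUN:-1}                               # 1 = print only (default), 0 = execute
PATCH_DIR=${PATCH_DIR:-/var/spool/patch/112481-15}  # assumed unpack location

run() {
    # Echo the command; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

patch_this_sc() {
    # Steps 3-4: stop the SMS processes, then install the patch on this SC.
    run /etc/init.d/sms stop &&
    run patchadd "$PATCH_DIR"
}

start_this_sc() {
    # Steps 5-6: start SMS. Run on the previous main SC first, and on the
    # spare only after showenvironment works on the main SC.
    run /etc/init.d/sms start
}

# Optional dispatch when invoked as a script; does nothing without arguments.
case "${1:-}" in
    patch) patch_this_sc ;;
    start) start_this_sc ;;
esac
```

Saved under a hypothetical name such as sc_patch.sh, "sh sc_patch.sh patch"
prints the stop and patchadd commands for review, and
"DRY_RUN=0 sh sc_patch.sh patch" executes them. Failover must already be
disabled on the MAIN SC (setfailover off) before either SC is patched.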