Patch-ID# 116658-02
Keywords: security qstat qhost scheduler execd pe jobs qacct i18n l10n ssl
Synopsis: Sun Grid Engine, Enterprise Edition 5.3 _x86: maint./security patch
Date: Apr/06/2004

Install Requirements: None

Solaris Release: 8_x86 9_x86

SunOS Release: 5.8_x86 5.9_x86

Unbundled Product: Sun Grid Engine Enterprise Edition

Unbundled Release: 5.3

Xref: This patch is available for SPARC 32-bit as patch 113139 and for SPARC 64-bit as patch 113140

Topic:

Relevant Architectures: i386

BugId's fixed with this patch: 4930786 4930789 4930793 4949917 4952236 4952767 4957760 4969825 5018669 5018695 5018726 5018733 5018757 5018884 5019595 5019601 5019624 5019635 5020131 5020134 5020139 5020141 5020143 5020153 5020278 5020371 5021405

Changes incorporated in this version: 4969825 5018669 5018695 5018726 5018733 5018757 5018884 5019595 5019601 5019624 5019635 5020131 5020134 5020139 5020141 5020143 5020153 5020278 5020371 5021405

Patches accumulated and obsoleted by this patch:

Patches which conflict with this patch:

Patches required with this patch:

Obsoleted by:

Files included with this patch:

/bin/solaris86/qacct
/bin/solaris86/qalter
/bin/solaris86/qconf
/bin/solaris86/qdel
/bin/solaris86/qhost
/bin/solaris86/qmake
/bin/solaris86/qmod
/bin/solaris86/qmon
/bin/solaris86/qsh
/bin/solaris86/qstat
/bin/solaris86/qsub
/bin/solaris86/qtcsh
/bin/solaris86/sge_commd
/bin/solaris86/sge_coshepherd
/bin/solaris86/sge_execd
/bin/solaris86/sge_qmaster
/bin/solaris86/sge_schedd
/bin/solaris86/sge_shadowd
/bin/solaris86/sge_shepherd
/bin/solaris86/sgecommdcntl
/lib/solaris86/libXltree.so
/utilbin/solaris86/adminrun
/utilbin/solaris86/checkprog
/utilbin/solaris86/checkuser
/utilbin/solaris86/filestat
/utilbin/solaris86/gethostbyaddr
/utilbin/solaris86/gethostbyname
/utilbin/solaris86/gethostname
/utilbin/solaris86/getservbyname
/utilbin/solaris86/infotext
/utilbin/solaris86/loadcheck
/utilbin/solaris86/now
/utilbin/solaris86/openssl
/utilbin/solaris86/qrsh_starter
/utilbin/solaris86/rlogin
/utilbin/solaris86/rsh
/utilbin/solaris86/rshd
/utilbin/solaris86/testsuidroot
/utilbin/solaris86/uidgid

Problem Description:

5021405 CSP reconnect problem of scheduler and execd
5020371 sge_shepherd creates world writable files
5020278 a colon in a job name breaks qacct
5020153 mail bomb upon abort with tightly integrated par jobs
5020143 qdel XXX.YY- will delete the first array task of job XXX
5020141 qsh and qlogin accepted the options -h and -hold_jid and ignored them later
5020139 a stored job template in qmon sets -hold_jid to a wrong value
5020134 qhost output broken for global consumables
5020131 renaming a user deletes the user
5019635 schedd_job_info=true causes large delays with parallel job scheduling
5019624 qselect/qstat -l selection wrongly considers load and utilization
5019601 "vmem" in qstat -j keeps the max value
5019595 Dateformat YYMMDDhhmm was interpreted wrong (qacct, qsub, qalter,...)
5018884 SSL vulnerabilities stated in Sun Alert 57524
5018757 HPCT jobs may fail - add variable to job environment which point to SGE binaries
5018733 Empty parameters crashes qstat and qhost
5018726 qalter lacks -dl option!
5018695 loadsensor doing output to stderr can block
5018669 qrsh/qlogin: "Connection refused" due to race condition in shepherd
4969825 not supported array task dependencies are not rejected

(from 116658-01)

4930786 global load values are ignored
4930789 An overwritten string attribut was ignored in the scheduler
4930793 minor issues with the sgeee ticket update interval
4949917 qmon seg faults with a user hold job from qtcsh qtask file
4952236 Broken mail option with SGE 5.3p4 qrsh
4952767 qrsh -notify doesn't work
4957760 Fix needed for CERT CA-2003-26 Multiple Vulnerabilities in SSL/TLS

Patch Installation Instructions:
--------------------------------

For Solaris 2.0-2.6 releases, refer to the Install.info file and/or
the README within the patch for instructions on using the generic
'installpatch' and 'backoutpatch' scripts provided with each patch.

For Solaris 7, 8, and 9 releases, refer to the man pages for
instructions on using 'patchadd' and 'patchrm' scripts provided
with Solaris.

Any other special or non-generic installation instructions should be
described below as special instructions.

The following example installs a patch to a standalone machine:

    example# patchadd /var/spool/patch/104945-02

The following example removes a patch from a standalone system:

    example# patchrm 104945-02

For additional examples please see the appropriate man pages.

Special Install Instructions:
-----------------------------

Important note if Sun Grid Engine has been installed with openSSL support
-------------------------------------------------------------------------

If Sun Grid Engine has been installed with openSSL support ("CSP mode")
prior to SGEEE 5.3p3 (which was linked with openSSL 0.9.6.c), the
certificates which have been installed with these versions are
incompatible with certificates installed with SGEEE 5.3p4 or later.

All such certificates will need to be recreated after installing this
patch and before restarting Sun Grid Engine.
Please refer to the Sun Grid Engine Administration and User Manual for
how to create new certificates with the utility script "sge_ca", which
comes with the distribution.

The reason for the incompatibility is a certificate field name that
changed between openSSL versions 0.9.6 and 0.9.7: "uniqueIdentifier"
has been renamed to "userId".

Patch Installation
-------------------

These installation instructions assume that you are running a
homogeneous Sun Grid Engine, Enterprise Edition cluster where all hosts
share the same directory for the binaries. If you are running Sun Grid
Engine, Enterprise Edition in a heterogeneous environment (a mix of
32-bit and 64-bit binaries for Solaris and/or other operating systems),
it is only necessary to shut down the daemons for the architecture to
which the patch is applied.

If you installed the binaries on a local partition, you only need to
stop the SGEEE daemons on the host on which you are installing the
patch.

By default, there must be no running jobs when the patch is installed.
There may be pending batch jobs, but no pending interactive jobs (qrsh,
qmake, qsh, qtcsh).

It is possible to install the patch with running batch jobs. To avoid a
failure of the active "sge_shepherd" binary, it is necessary to move the
old shepherd binary (and copy it back before installing the patch).

Installing the patch is never supported with running interactive jobs,
'qmake' jobs, or running parallel jobs which use the tight integration
support (control_slaves=true is set in the PE configuration).
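
The control_slaves condition above can be checked before starting. The
following is a minimal sketch, not part of the patch: the function name
pe_uses_tight_integration is hypothetical, and it assumes that the PE
configuration printed by 'qconf -sp' contains a line of the form
"control_slaves <value>". It reads the configuration text from standard
input, so it can be tried on saved output first.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SGEEE): succeeds if the PE
# configuration read from stdin has control_slaves enabled.
# Assumption: the attribute appears as "control_slaves <value>".
pe_uses_tight_integration() {
    # Take the value of the last "control_slaves" line, if any.
    value=$(awk '$1 == "control_slaves" { v = $2 } END { print v }')
    case "$value" in
        TRUE|true|1) return 0 ;;
        *)           return 1 ;;
    esac
}

# Possible usage on a host with SGEEE commands in the PATH
# (left commented out so the sketch has no side effects):
#
#   for pe in $(qconf -spl); do
#       qconf -sp "$pe" | pe_uses_tight_integration &&
#           echo "PE $pe uses tight integration"
#   done
```

Any PE reported this way must have no running jobs before the patch is
installed.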

Preventing the Sun Grid Engine, Enterprise Edition cluster from starting jobs
-----------------------------------------------------------------------------

Disable all queues so that no new jobs are started:

    # qmod -d '*'

Optional (only needed if there are running jobs which should continue
to run when the patch is installed):

    # cd $SGE_ROOT/bin
    # mv /sge_shepherd /sge_shepherd.sge53
    # cp -p /sge_shepherd.sge53 /sge_shepherd

It is important that the binary is first moved and then copied back to
the original location using the "-p" switch of the cp command.

Shutting down Sun Grid Engine, Enterprise Edition qmaster and scheduler
-----------------------------------------------------------------------

You need to shut down (and restart) the qmaster and scheduler daemons
and all execution daemons on all SGEEE hosts.

First stop the daemons on all your execution hosts. Log in to each
execution host and stop 'sge_execd' and 'sge_commd':

    # /etc/init.d/rcsge stop

Then log in to your qmaster machine and stop 'sge_qmaster',
'sge_schedd', 'sge_commd' and, if the machine is also an execution
host, 'sge_execd':

    # /etc/init.d/rcsge stop

Now verify with the 'ps' command that all Sun Grid Engine, Enterprise
Edition daemons on all hosts are stopped. If you decided to rename the
shepherd binary so that running batch jobs continue to run during the
patch installation, you must not kill the 'sge_shepherd' processes.

Installing the patch and restarting Sun Grid Engine, Enterprise Edition
-----------------------------------------------------------------------

Now install the patch with 'patchadd'.

After installing the patch you need to restart your SGEEE cluster. Log
in to your qmaster machine and enter:

    # /etc/init.d/rcsge

Then repeat this step on all your execution hosts.

After restarting SGEEE you may enable your queues again:

    # qmod -e '*'

If you renamed the shepherd binary, you may safely delete the old binary
once all jobs that were running before the patch installation have
finished.
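
The 'ps' verification in the shutdown step can be sketched as a small
filter. This is an illustration only: the function name
remaining_sge_daemons is hypothetical, and it simply looks for the four
daemon names listed above in 'ps' output read from standard input. It
deliberately does not match 'sge_shepherd', since shepherds of running
batch jobs may legitimately remain.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SGEEE): print every line of
# "ps" output (read from stdin) that mentions one of the daemons which
# must be stopped before the patch is installed.
remaining_sge_daemons() {
    awk '{
        for (i = 1; i <= NF; i++) {
            # Compare only the last path component of each field.
            n = split($i, parts, "/")
            name = parts[n]
            if (name == "sge_qmaster" || name == "sge_schedd" ||
                name == "sge_execd"   || name == "sge_commd") {
                print
                break
            }
        }
    }'
}

# Possible usage on each host (left commented out):
#
#   if [ -n "$(ps -ef | remaining_sge_daemons)" ]; then
#       echo "SGEEE daemons still running; do not run patchadd yet"
#   fi
```

An empty result on every host suggests that it is safe to proceed with
'patchadd'.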
README -- Last modified date: Tuesday, April 6, 2004