Patch-ID# 112265-01
Keywords: SGE qsub qhost qmon scheduling
Synopsis: Sun Grid Engine 5.2.3 maintenance patch
Date: Nov/20/2001

Solaris Release: 2.6 7 8

SunOS Release: 5.6 5.7 5.8

Unbundled Product: Sun Grid Engine

Unbundled Release: 5.2.3

Xref:

Topic: suggested maintenance release for Sun Grid Engine 5.2.3

Relevant Architectures: sparc

BugId's fixed with this patch: 4496678 4521393 4521398 4521400 4525977 4527558 4527559 4528905

Changes incorporated in this version: 4496678 4521398 4521393 4521400 4525977 4527558 4527559 4528905

Patches accumulated and obsoleted by this patch:

Patches which conflict with this patch:

Patches required with this patch:

Obsoleted by:

Files included with this patch:

/bin/solaris/cod_commd
/bin/solaris/cod_coshepherd
/bin/solaris/cod_execd
/bin/solaris/cod_qmaster
/bin/solaris/cod_schedd
/bin/solaris/cod_shadowd
/bin/solaris/cod_shepherd
/bin/solaris/codcommdcntl
/bin/solaris/qacct
/bin/solaris/qalter
/bin/solaris/qconf
/bin/solaris/qdel
/bin/solaris/qhost
/bin/solaris/qmake
/bin/solaris/qmod
/bin/solaris/qmon
/bin/solaris/qsh
/bin/solaris/qstat
/bin/solaris/qsub
/bin/solaris/qtcsh
/examples

Problem Description:

4496678 segmentation fault of qstat/qhost/qmon in combination with huge
        fragmented array
4521398 qmaster crashes if load report from execd of prior versions is sent
        to qmaster
4521393 qsub with empty job arguments causes qmaster crash
4521400 SGE scheduler performance decreases with huge clusters
4525977 non-stat()'able output directory causes queue error instead of job
        error
4527558 Interactive job submitted with the "-N" flags may show wrong job
        name in qstat
4527559 setting array job in qmon submit dialog in hold state is wrong
4528905 loadvalue "cpu" in Solaris(sparc) 7, 8 32 bit kernel not correct

Patch Installation Instructions:
--------------------------------

For Solaris 2.0-2.6 releases, refer to the Install.info file and/or the
README within the patch for instructions on using the generic 'installpatch' and
'backoutpatch' scripts provided with each patch. For Solaris 7-8 releases,
refer to the man pages for instructions on using 'patchadd' and 'patchrm'
scripts provided with Solaris. Any other special or non-generic installation
instructions should be described below as special instructions.

The following example installs a patch to a standalone machine:

       example# patchadd /var/spool/patch/104945-02

The following example removes a patch from a standalone system:

       example# patchrm 104945-02

For additional examples please see the appropriate man pages.

Special Install Instructions:
-----------------------------

Please visit our home page at

   http://www.sun.com/gridware

for more information about the patches which update your Sun Grid Engine
release 5.2.3 to 5.2.3.1.

Make sure to install all patches for this maintenance release, including
the patches for the "doc" and "common" package and all binary sets (Solaris
32-bit and 64-bit binaries) as needed.

Shutting down Sun Grid Engine
-----------------------------

You can upgrade from 5.2.3 with pending jobs, so you only need to drain
your cluster of running jobs by disabling all queues:

   # qmod -d '*'

Shut down your cluster with the following commands:

   # qconf -kej     (shutdown execd and kill running jobs)
   (wait 1-2 minutes)
   # qstat -f       (verify the status of the cluster)
   # qconf -ks      (kill scheduler)
   # qconf -km      (kill qmaster)
   # $CODINE_ROOT/util/shutdown_commd.sh -all    (kill cod_commd's)
   (kill all cod_shadowd's)

Now verify that all Sun Grid Engine daemons (cod_qmaster, cod_schedd,
cod_execd, cod_commd, cod_shepherd, cod_shadowd) on all hosts have
terminated. If not, terminate them with the 'kill' command.

Remove your execd spool directories
-----------------------------------

This is a safe method to make sure that no hung jobs can cause any problems
after the upgrade. The execd spool directory is configured through the
global cluster configuration and has the unqualified host name appended.
By default it is located in $CODINE_ROOT/default/spool/

You can recursively delete all these directories, but please make sure NOT
to delete the qmaster spool directory.

After installing the patches, read the file 'doc/UPGRADE' for more
information on how to update your startup script and restart Sun Grid Engine.

README -- Last modified date:  Tuesday, November 20, 2001
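
The spool directory cleanup described under "Remove your execd spool
directories" could be sketched as the shell functions below. This is only an
illustration, not part of the patch: it assumes that the qmaster spool
directory is named "qmaster" and sits beside the per-host execd spool
directories, which may not match your cluster configuration. The function
names and the SPOOL_ROOT and QMASTER_DIR variables are hypothetical. Nothing
is deleted unless the second argument is "delete".

```shell
#!/bin/sh
# Sketch: list (and optionally remove) execd spool directories while
# leaving the qmaster spool directory untouched.
# ASSUMPTION (not from this README): the qmaster spool directory is named
# "qmaster"; set QMASTER_DIR if your configuration differs.

SPOOL_ROOT="${SPOOL_ROOT:-${CODINE_ROOT:-/nonexistent}/default/spool}"
QMASTER_DIR="${QMASTER_DIR:-qmaster}"

list_execd_spool_dirs() {
    # Print every immediate subdirectory of $1 except the qmaster one.
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        [ "$name" = "$QMASTER_DIR" ] && continue
        printf '%s\n' "${dir%/}"
    done
}

clean_execd_spool_dirs() {
    # $1 = spool root; $2 = "delete" to really remove, anything else
    # (or nothing) only reports what would be removed.
    list_execd_spool_dirs "$1" | while IFS= read -r dir; do
        if [ "$2" = "delete" ]; then
            rm -rf -- "$dir"
        else
            echo "would remove: $dir"
        fi
    done
}
```

A cautious run would first call clean_execd_spool_dirs "$SPOOL_ROOT" and
review the reported directories, and only then repeat the call with "delete"
as the second argument, after all daemons have been shut down.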