Patch-ID# 107523-04
Keywords: y2000 max_total_procs RTE MPI large-proc large-task manpage procbind
Synopsis: HPC2.0: Critical fixes and updates for RTE and MPI
Date: Feb/04/00

Solaris Release: 2.5.1 2.6

SunOS Release: 5.5.1 5.6

Unbundled Product: HPC

Unbundled Release: 2.0

Relevant Architectures: sparc
                        NOTE: sun4u

BugId's fixed with this patch: 4108989 4105643 4102618 4102426 4102082
                               4127300 4196459 4134004 4106335 4104172
                               4098183 4126940 4120847 4131087 4224423
                               4249505 4308243

Changes incorporated in this version: 4308243

Patches accumulated and obsoleted by this patch: 106066-03 106082-02 106346-01

Patches which conflict with this patch:

Patches required with this patch:

Obsoleted by:

Files included with this patch:

/opt/SUNWhpc/HPC2.0/bin/tminfo
/opt/SUNWhpc/HPC2.0/bin/tmkill
/opt/SUNWhpc/HPC2.0/bin/tmps
/opt/SUNWhpc/HPC2.0/bin/tmrun
/opt/SUNWhpc/HPC2.0/bin/tmsub
/opt/SUNWhpc/HPC2.0/etc/in.tmproxyd
/opt/SUNWhpc/HPC2.0/etc/tm.mpmd
/opt/SUNWhpc/HPC2.0/etc/tm.omd
/opt/SUNWhpc/HPC2.0/etc/tm.rdb
/opt/SUNWhpc/HPC2.0/etc/tm.spmd
/opt/SUNWhpc/HPC2.0/etc/tm.watchd
/opt/SUNWhpc/HPC2.0/etc/tmadmin
/opt/SUNWhpc/HPC2.0/lib/librte.so.1
/opt/SUNWhpc/HPC2.0/man/man8/tmadmin.8
/opt/SUNWhpc/HPC2.0/lib/libfmpi.so.1
/opt/SUNWhpc/HPC2.0/lib/libmpi.so.1
/opt/SUNWhpc/HPC2.0/lib/libmpi_mt.so.1
/opt/SUNWhpc/HPC2.0/lib/libpmpi.so.1
/opt/SUNWhpc/HPC2.0/man/man3/MPI.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Abort.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Address.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Allgather.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Allgatherv.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Allreduce.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Alltoall.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Alltoallv.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Attr_delete.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Attr_get.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Attr_put.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Barrier.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Bcast.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Bsend.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Bsend_init.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Buffer_attach.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Buffer_detach.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Cancel.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Cart_coords.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Cart_create.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Cart_get.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Cart_map.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Cart_rank.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Cart_shift.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Cart_sub.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Cartdim_get.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Comm_compare.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Comm_create.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Comm_dup.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Comm_free.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Comm_group.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Comm_rank.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Comm_remote_group.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Comm_remote_size.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Comm_size.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Comm_split.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Comm_test_inter.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Dims_create.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Error_class.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Errhandler_create.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Errhandler_free.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Errhandler_get.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Errhandler_set.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Error_string.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Finalize.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Gather.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Gatherv.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Get_count.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Get_elements.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Get_processor_name.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Graph_create.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Graph_get.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Graph_map.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Graph_neighbors.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Graph_neighbors_count.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Graphdims_get.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Ibsend.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Group_compare.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Group_difference.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Group_excl.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Group_free.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Group_incl.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Group_intersection.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Group_range_excl.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Group_range_incl.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Group_rank.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Group_size.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Group_translate_ranks.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Group_union.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Init.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Initialized.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Intercomm_create.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Intercomm_merge.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Iprobe.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Irecv.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Irsend.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Isend.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Issend.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Keyval_create.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Keyval_free.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Op_create.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Op_free.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Pack.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Pack_size.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Pcontrol.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Probe.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Recv.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Recv_init.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Reduce.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Reduce_scatter.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Request_free.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Rsend.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Rsend_init.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Scan.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Scatter.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Scatterv.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Send.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Send_init.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Sendrecv.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Sendrecv_replace.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Ssend.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Ssend_init.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Start.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Startall.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Test.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Test_cancelled.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Testall.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Testany.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Testsome.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Topo_test.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Type_commit.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Type_contiguous.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Type_extent.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Type_free.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Type_hindexed.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Type_hvector.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Type_indexed.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Type_lb.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Type_size.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Type_struct.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Type_ub.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Type_vector.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Unpack.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Wait.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Waitall.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Waitany.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Waitsome.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Wtick.3
/opt/SUNWhpc/HPC2.0/man/man3/MPI_Wtime.3

Problem Description:

4308243 Patchid 107523-03 missing librte

(from 107523-03)

4249505 max_total_procs is ineffective in limiting jobs per Node

(from 107523-02)

4224423 RTE task database clean up is not Y2K compliant

(from 107523-01)

4134004 Setuid executables may not run (error: Device busy)

(from 106066-03)

4196459 page locks causing DR drain and detach ioctl failures
4102082 TMRTE_mlock performance problem
4127300 mpi pbind changes will need admin override enhancement in tmadmin
4108989 tmrun cannot execute with long hostnames
4105643 tm.rdb runs out of file descriptors
        Note: To get the full fix for this large task [procs] problem,
        you should also install patch 106082.
4102618 tm.mpmd gets bus error when tmsub is executed
4102426 tm.mpmd can get bus error on tmrun or tmsub
4079858 tm.rdb enforces strict nodename compliance for master name

(from 106082-02)

4126940 performance within a large SMP can be improved by binding to processors
4120847 MPI broadcast optimizations for single SMP
4098183 unpacking of partial receives of hvectors is done incorrectly
4104172 truncating error not reported when unpacking a non-contiguous datatype
4106335 RTE file descriptor fix for large tasks benefits from mpi changes

(from 106346-01)

4131087 large number of man page edits/changes are needed

Patch Installation Instructions:
--------------------------------
Refer to the Install.info file for instructions on using the generic
'installpatch' and 'backoutpatch' scripts provided with each patch.
Any other special or non-generic installation instructions should be
described below as special instructions.

Special Install Instructions:
-----------------------------
This patch should not be applied while any MPI, HPF, or S3L tasks are
active on the system (because of the MPI .so changes).
"/opt/SUNWhpc/bin/tmps -Ae" should not report any active tasks. If you
install this patch while tasks are active, those tasks may die with a
SEGV or BUS error.

Processor binding functionality was introduced with the fix for
4126940. Users must set MPI_PROCBIND in the shell environment from
which they run tmrun for their HPC tasks (a sketch appears after the
tmadmin examples below). The HPC administrator must also set the
allow_pbind attribute, through tmadmin, on the partition that will be
used for the task.

Warning: performance gains from setting MPI_PROCBIND have been
demonstrated only on a dedicated single-SMP partition; performance on
other configurations is less likely to improve and may degrade.

If you wish to allow the use of MPI_PROCBIND (bugid: 4127300), you
need to set the "allow_pbind" attribute on a partition using tmadmin.
For example, to set this attribute in a partition called "marc-ded",
use the following command on the master node:

    # tmadmin -c "partition marc-ded set allow_pbind"

You can deny users access to MPI_PROCBIND functionality by using the
unset command:

    # tmadmin -c "partition marc-ded unset allow_pbind"
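To illustrate the user-side setup, here is a minimal sketch of
launching a task with processor binding enabled. The value "1" and the
program name a.out are illustrative assumptions; this README only
requires that MPI_PROCBIND be set in the environment from which tmrun
is invoked:

    In the Bourne or Korn shell:

        $ MPI_PROCBIND=1; export MPI_PROCBIND  (value is an assumption)
        $ tmrun a.out                          (a.out is a placeholder)

    In the C shell:

        % setenv MPI_PROCBIND 1
        % tmrun a.out

Remember that binding takes effect only if the administrator has also
set allow_pbind on the target partition, as shown above.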
Note for BugId 4196459: the fix for this DR problem takes effect only
after the domain has been rebooted following patch installation.
Merely restarting the daemons leaves the page locked.

To apply this patch you must stop the HPC system (because of the RTE
.so library change). You can stop the HPC system as follows:

1a) Stop the RTE daemons.

    On each node (including the master):

        # /etc/init.d/rte.node stop

    Then on the master node:

        # /etc/init.d/rte.master stop

    If you have an existing RTE database and have experienced problems
    such as bugid 4108989, you will need to delete the RTE rdb
    database first. You can do this with the following command on the
    master:

        # rm /var/hpc/rdb-old /var/hpc/rdb-save

    Again, you only have to remove the RTE rdb database if you were
    experiencing hostname-related startup problems before.

1b) If you are using PFS, stop the PFS daemons on each node (including
    the master):

        # /etc/init.d/sunhpc.pfs stop

2)  Add the patch to each node according to the Install.info file
    included with this patch; a sketch of an install-and-verify
    session appears at the end of these instructions. For clusters
    this may be easiest with Cluster Console (cconsole or ctelnet).

3a) Start the RTE daemons.

    On the master node:

        # /etc/init.d/rte.master start

    Then on each node (including the master):

        # /etc/init.d/rte.node start

3b) If you are using PFS, start the PFS daemons on each node
    (including the master):

        # /etc/init.d/sunhpc.pfs start
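Finally, here is a sketch of the per-node patch application for step 2
followed by a post-install check. The unpack location
/var/spool/patch/107523-04 is an assumption (use wherever you unpacked
the patch), and Install.info remains the authoritative reference for
the exact installpatch invocation:

    # cd /var/spool/patch/107523-04            (assumed unpack location)
    # ./installpatch /var/spool/patch/107523-04
    # showrev -p | grep 107523                 (verify the patch is listed)

Once the daemons are restarted in steps 3a/3b, running
"/opt/SUNWhpc/bin/tmps -Ae" again should show the RTE responding with
no active tasks before users resume submitting work.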