MPI without tight integration

Job script:
    #PBS -l select=3:ncpus=2:mpiprocs=2
    …
    mpirun -hostfile $PBS_NODEFILE a.out

Node 01: <pbs_mom> tracks the job session, which runs the job script and two a.out processes.
Node 02: "ssh/rsh node02" reaches <sshd/rshd>, which starts two a.out processes. The local <pbs_mom> is not involved.
Node 03: "ssh/rsh node03" reaches <sshd/rshd>, which starts two a.out processes. The local <pbs_mom> is not involved.

PBS does not know about the processes on nodes 02 and 03, because they are started outside the scope of PBS.
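As a rough illustration of what mpirun reads from the host file, the helper below counts how many entries each host has in a PBS node file. The function name slots_per_host is hypothetical, and the sketch assumes the node file holds one host name per line, repeated once per MPI process, so that select=3:ncpus=2:mpiprocs=2 would list each of the three hosts twice.

```shell
# Print "<count> <host>" for each distinct host in a PBS node file.
# Assumption: the file has one host name per line, one line per MPI process.
slots_per_host() {
    # $1: path to the node file; defaults to $PBS_NODEFILE inside a job
    sort "${1:-$PBS_NODEFILE}" | uniq -c | awk '{ print $1, $2 }'
}
```

Inside a job, calling slots_per_host with no argument summarizes the node file that PBS generated for that job.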
MPI with pbs_remsh/pbs_attach

Job script:
    #PBS -l select=3:ncpus=2:mpiprocs=2
    …
    mpirun -r pbs_remsh -hostfile $PBS_NODEFILE a.out

Node 01: <pbs_mom> tracks the job session, which runs the job script and two a.out processes.
Node 02: "ssh/rsh node02" reaches <sshd/rshd>, which starts each a.out through pbs_attach. pbs_attach reports the process to the local <pbs_mom>.
Node 03: "ssh/rsh node03" reaches <sshd/rshd>, which starts each a.out through pbs_attach. pbs_attach reports the process to the local <pbs_mom>.

pbs_remsh (see inside it) launches something like "ssh nodeXX pbs_attach -j JOBID a.out", which informs pbs_mom on that machine that the a.out process being launched belongs to JOBID.
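The command line quoted above can be sketched as a small shell function. This is a hypothetical illustration, not the actual pbs_remsh source: build_attach_cmd is an invented name, and the sketch only prints the command rather than running it. It assumes the job ID is available in $PBS_JOBID, as it is inside a PBS job.

```shell
# Hypothetical sketch: print the command a pbs_remsh-like wrapper might run.
# Not the real pbs_remsh; it only builds and prints the command line.
build_attach_cmd() {
    host=$1
    shift
    # pbs_attach -j <jobid> <cmd> runs <cmd> and attaches it to the given job
    printf 'ssh %s pbs_attach -j %s' "$host" "$PBS_JOBID"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}
```

Calling build_attach_cmd node02 a.out inside a job prints the ssh command with that job's ID filled in for JOBID.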
MPI with pbs_tmrsh

Job script:
    #PBS -l select=3:ncpus=2:mpiprocs=2
    …
    mpirun -r pbs_tmrsh -hostfile $PBS_NODEFILE a.out

Node 01: <pbs_mom> tracks the job session, which runs the job script and two a.out processes.
Node 02: "pbs_tmrsh node02" contacts <pbs_mom>, which starts two a.out processes.
Node 03: "pbs_tmrsh node03" contacts <pbs_mom>, which starts two a.out processes.

pbs_tmrsh talks directly to pbs_mom through the PBS task management library, so the a.out processes are launched by pbs_mom itself.
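A job script that should run on clusters with different PBS setups could pick the remote launcher at run time. The sketch below is one possible approach, not something shown on the slides: pick_launcher is a hypothetical helper that prints the first command from its arguments that is found on PATH.

```shell
# Hypothetical helper: print the first command name found on PATH.
pick_launcher() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1  # none of the candidates is available
}

# Possible use in a job script (assumes an mpirun that accepts -r, as on the slide):
#   launcher=$(pick_launcher pbs_tmrsh pbs_remsh ssh) || exit 1
#   mpirun -r "$launcher" -hostfile "$PBS_NODEFILE" a.out
```

Listing pbs_tmrsh first prefers the launcher that keeps all processes under pbs_mom, and falls back to plain ssh only when neither PBS wrapper is installed.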