
Hands on Special Jobs

Wu Wenjing, IHEP – Beijing. Grid tutorial for users, Beijing, 25-26 Nov 2006. Topics: MPI job; job with data.


Presentation Transcript


  1. Hands on Special Jobs. Wu Wenjing, IHEP – Beijing. Grid tutorial for users, Beijing, 25-26 Nov 2006

  2. Outline: MPI job; job with data

  3. An example of an MPI job

  4. You can copy the source code of the jobs from the /tmp directory:
      [gilda07] /home/beijing03 > cp -r /tmp/hello/ .
      [gilda07] /home/beijing03 > ls hello/
      hello_mpi.c  hello_mpi.jdl  hello_mpi.o  hello_mpi.sh  hello_mpi.X  makefile
      [gilda07] /home/beijing01 > cp -r /tmp/data .
      [gilda07] /home/beijing01 > ls data/
      data.jdl  WMS-text2.conf  WMS-text.conf
     Good luck! Have a try!

  5. An example of an MPI job: hello_mpi.c (compiled to the executable program hello_mpi.X)
      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>   /* declares gethostname() */
      #include "mpi.h"

      int main(int argc, char **argv)
      {
          int rank, size;
          char host_name[20];   /* host names longer than this are truncated */

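  6. An example of an MPI job (cont.)
          MPI_Init(&argc, &argv);
          MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* index of this process */
          gethostname(host_name, sizeof(host_name));
          /* gethostname() may leave a truncated name unterminated */
          host_name[sizeof(host_name) - 1] = '\0';
          printf("I am processor: %d at %s\n", rank, host_name);
          MPI_Finalize();
          return 0;
      }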

  7. An example of an MPI job (cont.). An MPI job usually consists of three parts:
      • a C program, or the executable compiled from it
      • a wrapper script that invokes the MPI application by calling the mpirun command
      • the JDL file

  8. An example of an MPI job (cont.): hello_mpi.sh
      #!/bin/sh -x
      # Binary to execute
      EXE=$1
      # Number of processes to start
      CPU_NEEDED=$2
      echo "***********************************************************************"
      echo "Running on: $HOSTNAME"
      echo "As: " `whoami`
      # Detect LSF: use the broker info if present, otherwise look for the sbatchd daemon
      if [ -f "$PWD/.BrokerInfo" ] ; then
          TEST_LSF=`edg-brokerinfo getCE | cut -d/ -f2 | grep lsf`
      else
          TEST_LSF=`ps -ef | grep sbatchd | grep -v grep`
      fi

  9. An example of an MPI job (cont.)
      if [ "x$TEST_LSF" = "x" ] ; then
          # prints the name of the file containing the nodes allocated for parallel execution
          echo "PBS Nodefile: $PBS_NODEFILE"
          # prints the names of the nodes allocated for parallel execution
          cat $PBS_NODEFILE
          echo "*************************************"
          HOST_NODEFILE=$PBS_NODEFILE
      else
          # prints the names of the nodes allocated for parallel execution
          echo "LSF Hosts: $LSB_HOSTS"
          # loops over the nodes allocated for parallel execution
          HOST_NODEFILE=`pwd`/lsf_nodefile.$$
          for host in ${LSB_HOSTS}
          do
              host=`host $host | awk '{ print $1 } '`
              echo $host >> ${HOST_NODEFILE}
          done
      fi
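
The LSF branch above turns the space-separated host list in LSB_HOSTS into a file with one host per line, which is the layout the wrapper later hands to mpirun through -machinefile. A minimal sketch of just that step follows. The function name write_nodefile is hypothetical, and the sketch deliberately skips the DNS lookup (`host $host`) that the original script performs.

```shell
#!/bin/sh
# write_nodefile HOSTS FILE  (hypothetical helper, not part of hello_mpi.sh)
# Writes each whitespace-separated host in HOSTS on its own line of FILE.
write_nodefile() {
    hosts=$1
    file=$2
    : > "$file"              # create or truncate the file
    for h in $hosts; do      # unquoted on purpose: split on whitespace
        echo "$h" >> "$file"
    done
}
```

A host that appears twice in the list is written twice, so a node offering two slots keeps two lines in the file.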

  10. An example of an MPI job (cont.)
      cat ${HOST_NODEFILE}
      echo "*************************************"
      # prints the working directory on the master node
      echo "Current dir: $PWD"
      echo "*************************************"
      for i in `cat $HOST_NODEFILE` ; do
          echo "Mirroring via SSH to $i"
          # creates the working directories on all the nodes allocated for parallel execution
          ssh $i mkdir -p `pwd`
          # copies the needed files to all the nodes allocated for parallel execution
          /usr/bin/scp -rp ./* $i:`pwd`
          # checks that all files are present on all the nodes allocated for parallel execution
          echo `pwd`
          ssh $i ls `pwd`

  11. An example of an MPI job (cont.)
          # sets the permissions of the files
          ssh $i chmod 755 `pwd`/$EXE
          ssh $i ls -alR `pwd`
          echo "@@@@@@@@@@@@@@@"
      done
      echo "***********************************************************************"
      echo "Executing $EXE with mpirun"
      chmod 755 $EXE
      /opt/mpich/bin/mpirun -np $CPU_NEEDED -machinefile $HOST_NODEFILE `pwd`/$EXE
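
The wrapper starts mpirun with -np $CPU_NEEDED whatever the nodefile contains, so the number of processes and the number of listed slots can differ. The sketch below shows one way to compare the two before launching. Both function names are hypothetical, and treating each non-empty line as one slot is an assumption about the nodefile layout.

```shell
#!/bin/sh
# count_slots FILE  (hypothetical helper)
# Prints the number of non-empty lines in FILE; each line counts as one slot.
count_slots() {
    grep -c . "$1"
}

# check_slots REQUESTED FILE  (hypothetical helper)
# Returns 0 if FILE lists at least REQUESTED slots, 1 otherwise.
check_slots() {
    requested=$1
    available=$(count_slots "$2")
    if [ "$requested" -gt "$available" ]; then
        echo "note: $requested processes requested, $available slots listed" >&2
        return 1
    fi
    return 0
}
```

In the wrapper this could be called as `check_slots "$CPU_NEEDED" "$HOST_NODEFILE"` just before the mpirun line. A mismatch is reported only as a note, because requesting more processes than listed slots is not necessarily an error.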

  12. An example of an MPI job (cont.): hello_mpi.jdl
      Type = "Job";
      JobType = "MPICH";
      NodeNumber = 2;
      Executable = "hello_mpi.sh";
      Arguments = "hello_mpi.X 4 ";
      StdOutput = "hello.out";
      StdError = "hello.err";
      InputSandbox = {"hello_mpi.sh","hello_mpi.X"};
      OutputSandbox = {"hello.err","hello.out"};
      Requirements = (other.GlueCEInfoLRMSType == "PBS") || (other.GlueCEInfoLRMSType == "LSF")
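
The JDL ships the wrapper and the executable through InputSandbox, so both files have to be present when the job is submitted. The sketch below lists the sandbox entries of a JDL file and reports the ones that are missing. The function names are hypothetical. It assumes the InputSandbox attribute sits on a single line, as in hello_mpi.jdl, and that the file names are relative to the directory holding the JDL.

```shell
#!/bin/sh
# list_input_sandbox JDL  (hypothetical helper)
# Prints one InputSandbox entry per line, without the surrounding quotes.
list_input_sandbox() {
    sed -n 's/^[[:space:]]*InputSandbox[[:space:]]*=[[:space:]]*{\(.*\)}.*/\1/p' "$1" |
        tr ',' '\n' |
        sed 's/^[[:space:]]*"//; s/"[[:space:]]*$//'
}

# missing_sandbox_files JDL  (hypothetical helper)
# Prints every InputSandbox entry that does not exist next to the JDL file.
missing_sandbox_files() {
    jdl=$1
    dir=$(dirname "$jdl")
    list_input_sandbox "$jdl" | while read -r f; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}
```

Running `missing_sandbox_files hello_mpi.jdl` in the hello/ directory prints nothing when both sandbox files are in place.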

  13. How to submit an MPI job
      [gilda07] /home/liuag/EuChinaGrid/hello > edg-job-submit --vo gilda hello_mpi.jd
      Selected Virtual Organisation name (from --vo option): gilda
      Connecting to host gilda05.ihep.ac.cn, port 7772
      Logging to host gilda05.ihep.ac.cn, port 9002
      *********************************************************************************************
      JOB SUBMIT OUTCOME
      The job has been successfully submitted to the Network Server.
      Use edg-job-status command to check job current status. Your job identifier (edg_jobId) is:
      - https://gilda05.ihep.ac.cn:9000/xxu2ktFg3ChY2K-KTnxrIQ
      ********************************************************************************
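
The job identifier printed at submission is needed by every later command, so it is convenient to capture it instead of copying it by hand. The sketch below pulls the first https:// URL out of text shaped like the submit output above. The function name extract_job_id is hypothetical, and the sketch only assumes that the identifier is the first such URL in the output.

```shell
#!/bin/sh
# extract_job_id  (hypothetical helper)
# Reads submit output on stdin and prints the first https:// URL it finds.
extract_job_id() {
    sed -n 's|.*\(https://[^[:space:]]*\).*|\1|p' | head -n 1
}
```

One possible use is `JOB_ID=$(edg-job-submit --vo gilda hello_mpi.jdl | extract_job_id)`, after which `$JOB_ID` can be passed to edg-job-status and edg-job-get-output.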

  14. Get the status of the submitted job
      [gilda07] /home/liuag/EuChinaGrid/hello > edg-job-status https://gilda05.ihep.ac.cn:9000/xxu2ktFg3ChY2K-KTnxrIQ
      *************************************************************
      BOOKKEEPING INFORMATION:
      Status info for the Job : https://gilda05.ihep.ac.cn:9000/xxu2ktFg3ChY2K-KTnxrIQ
      Current Status: Done (Success)
      Exit code: 0
      Status Reason: Job terminated successfully
      Destination: grid-ce.bio.dist.unige.it:2119/jobmanager-lcgpbs-long
      reached on: Tue Nov 21 07:27:03 2006
      *************************************************************
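
The output can only be retrieved once the job has finished, so the "Current Status" line is the one worth checking between polls. The sketch below extracts that line and decides whether the job has reached a final state. Both function names are hypothetical, and the final state names other than Done are an assumption, since the slides only show "Done (Success)".

```shell
#!/bin/sh
# parse_status  (hypothetical helper)
# Reads status output on stdin and prints the text after "Current Status:".
parse_status() {
    sed -n 's/^[[:space:]]*Current Status:[[:space:]]*//p' | head -n 1
}

# is_finished STATUS  (hypothetical helper)
# Returns 0 when STATUS names a final state, 1 otherwise.
# Only "Done" appears in the slides; the other names are assumed.
is_finished() {
    case $1 in
        Done*|Aborted*|Cancelled*|Cleared*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A polling loop could run `edg-job-status "$JOB_ID" | parse_status` every minute or so and stop as soon as is_finished succeeds.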

  15. Get the output of the finished job
      [gilda07] /home/liuag/EuChinaGrid/hello > edg-job-get-output https://gilda05.ihep.ac.cn:9000/xxu2ktFg3ChY2K-KTnxrIQ
      Retrieving files from host: gilda05.ihep.ac.cn ( for https://gilda05.ihep.ac.cn:9000/xxu2ktFg3ChY2K-KTnxrIQ )
      *********************************************************************************
      JOB GET OUTPUT OUTCOME
      Output sandbox files for the job:
      - https://gilda05.ihep.ac.cn:9000/xxu2ktFg3ChY2K-KTnxrIQ
      have been successfully retrieved and stored in the directory:
      /tmp/jobOutput/liuag_xxu2ktFg3ChY2K-KTnxrIQ
      ********************************************************************************

  16. The output of the job
      [gilda07] /home/liuag/EuChinaGrid/hello > more /tmp/jobOutput/liuag_xxu2ktFg3ChY2K-KTnxrIQ/hello.out
      .........
      ***************************************************************
      Executing hello_mpi.X with mpirun
      I am processor: 2 at grid-wn03.bio.dist.u
      I am processor: 3 at grid-wn03.bio.dist.u
      I am processor: 1 at grid-wn03.bio.dist.u
      I am processor: 0 at grid-wn03.bio.dist.u

  17. An example of a job with data
      VirtualOrganisation = "gilda";
      Executable = "/bin/echo";
      Arguments = "Hello everyone";
      StdOutput = "std.out";
      StdError = "std.err";
      OutputSandbox = {"std.out","std.err"};
      DataCatalog = "http://lfc-gilda.ct.infn.it:8085";
      InputData = {"lfn:/grid/gilda/provaK"};
      DataAccessProtocol = {"gridftp","rfio","gsiftp"};

  18. An example of a job with data (cont.)
      VirtualOrganisation = "gilda";
      Executable = "/bin/echo";
      Arguments = "hello everyone";
      StdOutput = "std.out";
      StdError = "std.err";
      OutputSandbox = {"std.out","std.err"};
      DataCatalog = "http://lfc-gilda.ct.infn.it:8085";
      InputData = {"lfn:/grid/gilda/scardaci/BoxData.txt"};
      DataAccessProtocol = {"gridftp","rfio","gsiftp"};

  19. Submit the job. Because an LCG-flavor RB cannot accept a job with data, we need to send the job to a gLite-flavor RB. Submission differs in a few ways:
      • prepare a configuration file that designates the RB
      # more WMS-text2.conf
      [
      VirtualOrganisation = "gilda";
      NSAddresses = "glite-rb2.ct.infn.it:7772";
      LBAddresses = "glite-rb2.ct.infn.it:9000";
      ]
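
The configuration file names the same RB host twice, once for the Network Server and once for the Logging and Bookkeeping service. The sketch below generates such a file from a single host name, so the two addresses cannot drift apart. The function name write_rb_conf is hypothetical. The ports 7772 and 9000 and the VO name gilda are copied from WMS-text2.conf and may differ on other installations.

```shell
#!/bin/sh
# write_rb_conf RB_HOST CONF_FILE  (hypothetical helper)
# Writes a VO configuration file that points both services at RB_HOST.
# Ports and VO name are taken from the tutorial's WMS-text2.conf.
write_rb_conf() {
    rb_host=$1
    conf_file=$2
    cat > "$conf_file" <<EOF
[
VirtualOrganisation = "gilda";
NSAddresses = "${rb_host}:7772";
LBAddresses = "${rb_host}:9000";
]
EOF
}
```

Calling `write_rb_conf glite-rb2.ct.infn.it WMS-text2.conf` reproduces the file shown on the slide.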

  20. Submit the job (cont.). We are using a gLite RB.
      • submit the job with the command "glite-job-submit" and the option "--config-vo"
      # glite-job-submit --config-vo WMS-text2.conf newdata.jdl
      Selected Virtual Organisation name (from proxy certificate extension): gilda
      Connecting to host glite-rb2.ct.infn.it, port 7772
      Logging to host glite-rb2.ct.infn.it, port 9002
      *********************************************************************************************
      JOB SUBMIT OUTCOME
      The job has been successfully submitted to the Network Server.
      Use glite-job-status command to check job current status. Your job identifier is:
      - https://glite-rb2.ct.infn.it:9000/6H6MxCv4gDHQh6To0y32fA
      *********************************************************************************************

  21. Query the status of the job
      # glite-job-status https://glite-rb2.ct.infn.it:9000/6H6MxCv4gDHQh6To0y32fA
      *************************************************************
      BOOKKEEPING INFORMATION:
      Status info for the Job : https://glite-rb2.ct.infn.it:9000/6H6MxCv4gDHQh6To0y32fA
      Current Status: Done (Success)
      Exit code: 0
      Status Reason: Job terminated successfully
      Destination: grid010.ct.infn.it:2119/jobmanager-lcgpbs-long
      Submitted: Tue Nov 21 18:04:06 2006 CST
      *************************************************************

  22. Get the output
      glite-job-output https://glite-rb2.ct.infn.it:9000/6H6MxCv4gDHQh6To0y32fA
      Retrieving files from host: glite-rb2.ct.infn.it ( for https://glite-rb2.ct.infn.it:9000/6H6MxCv4gDHQh6To0y32fA )
      *********************************************************************************
      JOB GET OUTPUT OUTCOME
      Output sandbox files for the job:
      - https://glite-rb2.ct.infn.it:9000/6H6MxCv4gDHQh6To0y32fA
      have been successfully retrieved and stored in the directory:
      /tmp/liuag_6H6MxCv4gDHQh6To0y32fA
      *********************************************************************************

  23. The output of the job
      more /tmp/liuag_6H6MxCv4gDHQh6To0y32fA/std.out
      hello everyone
