INFN - Ferrara, BaBar Meeting
SPGrid: status in Ferrara
Enrica Antonioli - Paolo Veronesi
Ferrara, 12/02/2003
Topics • The DataGrid project • Ferrara Farm Configuration • First SP submissions through the Grid • Work in Progress • Future Plans • Conclusions
INFN-GRID
• Special project of INFN, 2001 - 2003
• To manage and use computing resources distributed on GARR-B sites
• Deployment of Testbed sites, in order to validate the EDG software releases and adapt them to High Energy Physics requirements
• Current prototype of the INFN DataGrid testbed connected to the EDG testbed (US and Asia)
[Map of testbed sites: Manchester, R.A.L., CERN, PD, MI, FE, BO, TO, ROMA, CA, CT; links to Russia/Japan and to the USA]

European DataGrid (EDG) and INFN-GRID
• 2001 - 2003
• Funded by the European Union
• Computing Grids permit:
  • High Throughput Computing
  • Analysis of large data volumes
  • Sharing of resources and data
• Applications involved:
  • Biomedical Sciences
  • Earth Observation
  • High Energy Physics
DataGrid Architecture (EDG Architecture and Services)
• APPLICATION layer: ALICE, ATLAS, CMS, LHCb, BaBar
• High-level GRID middleware
• GLOBUS toolkit (basic services)
• OS & Net services
Grid Elements in Ferrara
• The DataGrid Testbed consists of different types of machines (Grid Elements).
• In Ferrara the farm is composed of one Computing Element (CE), three Worker Nodes (WN), one User Interface (UI) and one Storage Element (SE).
• All these machines are managed by an LCFGng (Local ConFiGuration system new generation) server and are configured automatically.
User Interface
• UI (User Interface): the component for accessing the workload management system.
• Users can submit jobs and retrieve their output; they should have an account and a personal certificate installed in their home directory.
• To access the Grid you have to request a certificate from a Certificate Authority. INFN-GRID users can obtain a certificate from the INFN CA (http://security.fi.infn.it/).
• To use the BaBar Grid, you must register that certificate with the BaBar Virtual Organisation (BaBar VO): http://www.slac.stanford.edu/BFROOT/www/Computing/Offline/BaBarGrid/registration.html
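Before any submission, the user needs a valid proxy on the UI. A minimal sketch of the preliminary steps, assuming only the standard Globus toolkit commands shipped with an EDG User Interface (nothing Ferrara-specific):

# the personal certificate is expected in ~/.globus (usercert.pem, userkey.pem)
grid1> grid-proxy-init        # create a temporary proxy (asks for the certificate pass phrase)
grid1> grid-proxy-info        # check that the proxy exists and is still valid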
Job Submission
• Components involved: User Interface (UI), Resource Broker (RB), Job Submission Service (JSS), Information Service (IS), Replica Catalogue, Logging & Book-keeping (LB), Computing Element, Storage Element.
• The UI sends the JDL and the Input Sandbox to the RB; the JSS submits the job to the Computing Element, close to the Storage Element holding the data; the Output Sandbox comes back to the UI through the RB.
• Job status sequence (tracked by the LB): submitted, waiting, ready, scheduled, running, done, outputready, cleared.
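Driven from the UI, the same cycle reduces to three EDG commands. A minimal sketch for a single job (the JDL file name is the one used in the following slides; the scripts actually used in Ferrara are shown later):

grid1> dg-job-submit -o job.id Moose.jdl         # submit and save the dg_jobId in job.id
grid1> dg-job-status -i job.id                   # follow the job through the states listed above
grid1> dg-job-get-output -i job.id --dir $PWD    # retrieve the Output Sandbox once OutputReady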
SPGrid Farm in Ferrara (EDG 1.4.3)
[Farm layout: UI, CE-WN, SE/data server (250 GB SCSI, RAID 0), LCFGng server (management) and lock server in Ferrara; Resource Broker at CNAF - Bologna; link to CERN]
Configuration
• INFN Grid Testbed status: EDG 1.4.3 (RedHat 6.2).
• A special release of the BaBar software (12.3.2y) has been built and installed in order to:
  • write Kanga files
  • run Moose on RH 6.2
• A special tag of ProdTools has been installed to perform tests.
• A pool of BaBar accounts (babar000, babar001, …) has been created on the EDG farm in Ferrara.
• Each member of the BaBar VO is able to submit jobs to the Ferrara farm through the RB located at CNAF (grid009g.cnaf.infn.it), as in the sketch below.
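Before submitting, a VO member can ask the RB which resources match a given job description. A sketch, assuming the dg-job-list-match command of the EDG User Interface (not used in the scripts below):

grid1> dg-job-list-match Moose.jdl      # the Ferrara CE (grid0.fe.infn.it) should appear among the matching resources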
Current Status
• Created a JDL file to run Moose on Grid resources.
• Created scripts containing EDG commands to submit jobs, check their status and retrieve the output files.
• A user can submit a range of runs.
• For each run a job is created and submitted to the Resource Broker, which sends it to the Ferrara CE (grid0.fe.infn.it).
• The output file is then transferred to the closest SE (grid2.fe.infn.it).
Moose.jdl (similar to the SP standard scripts, Job.Xsh):

grid1> more Moose.jdl
Executable    = "Moose.csh";
InputSandbox  = {"Moose.csh",".cshrc","config.csh"};
StdOutput     = "Moose.txt";
StdError      = "Moose.log";
OutputSandbox = {"Moose.txt","Moose.log"};

Moose.csh is similar to the SP standard scripts; .cshrc provides the general environment configuration and config.csh is the configuration file for BaBar. At the end of the job a Globus command copies the output files from the WN to the SE:

[…]
tar -czvf run${RUNNUM}.tar.gz *.root
globus-url-copy -vb file://`pwd`/run${RUNNUM}.tar.gz \
    gsiftp://grid2.fe.infn.it/flatfiles/SE00/paolo/run${RUNNUM}.tar.gz
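Once the job has copied its tarball, the file can be checked or fetched directly from the UI with the same Globus commands used later in the retrieve script. A small sketch (the run number is just the example used in these slides):

grid1> globus-job-run grid2.fe.infn.it /bin/ls /flatfiles/SE00/paolo      # list the tarballs sitting on the SE
grid1> globus-url-copy gsiftp://grid2.fe.infn.it/flatfiles/SE00/paolo/run1962016.tar.gz \
           file://`pwd`/run1962016.tar.gz                                 # copy one tarball back by hand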
The launch script builds, for each run in the requested range, a run directory and a config.csh with the appropriate environment variables, then submits one job per run with the EDG command dg-job-submit:

grid1> more launch
#!/bin/tcsh -v
@ num_f = $1                # range of runs to submit
@ fin   = $2
while ( $num_f <= $fin )
    #### build the run directories
    […]
    #### build a config.csh with the appropriate environment variables
    echo "#\!/bin/tcsh -v" > config.csh
    […]
    #### now run the job (EDG job submission command)
    dg-job-submit -o run$num_f.jobid -r \
        grid0.fe.infn.it:2119/jobmanager-pbs-long Moose.jdl
    cd ..
    @ num_f++
end
Job Submission: submit a range of runs.

grid1> ./launch 1962016 1962017
[…]
dg-job-submit -o run$num_f.jobid -r grid0.fe.infn.it:2119/jobmanager-pbs-long Moose.jdl
Connecting to host grid009g.cnaf.infn.it, port 7771        (the CNAF RB)
Logging to host grid009g.cnaf.infn.it, port 15830
================== dg-job-submit Success ==================
The job has been successfully submitted to the Resource Broker.
Use dg-job-status command to check job current status.
Your job identifier (dg_jobId) is:
https://grid009g.cnaf.infn.it:7846/193.206.188.102/104224188091275?grid009g.cnaf.infn.it:7771
The dg_jobId has been saved in the following file:
/home/enrica/stress/1962016/run1962016.jobid
[…]

grid1> ls
1962016/  1962017/  Moose.csh  Moose.jdl  config.csh  launch  monitor  retrieve

grid1> ls 1962016/
Moose.csh  Moose.jdl  config.csh  run1962016.jobid

grid1> ls 1962017/
Moose.csh  Moose.jdl  config.csh  run1962017.jobid
The monitor script uses the EDG command dg-job-status to check the status of every submitted run:

grid1> more monitor
#!/bin/tcsh
@ num_f = $1
@ fin   = $2
while ( $num_f <= $fin )
    echo Run $num_f is `dg-job-status -i \
        $num_f/run$num_f.jobid | grep Status`
    @ num_f++
end

grid1> ./monitor 1962016 1962017
Run 1962016 is Status = Ready Status Reason = job accepted
Run 1962017 is Status = Ready Status Reason = job accepted

grid1> ./monitor 1962016 1962017
Run 1962016 is Status = Scheduled Status Reason = initial
Run 1962017 is Status = Scheduled Status Reason = initial

grid1> ./monitor 1962016 1962017
Run 1962016 is Status = Running
Run 1962017 is Status = Running

grid1> ./monitor 1962016 1962017
Run 1962016 is Status = OutputReady Status Reason = terminated
Run 1962017 is Status = OutputReady Status Reason = terminated
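If a run ends up in an Aborted or otherwise unexpected state, the Logging & Book-keeping service keeps the full event history. A sketch, assuming the standard EDG command dg-job-get-logging-info is available on the UI:

grid1> dg-job-get-logging-info -i 1962016/run1962016.jobid      # event-by-event history of the job from the LB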
The retrieve script uses the EDG command dg-job-get-output to fetch the log files, a Globus command to copy the root files directly from the SE to the UI, and another Globus command to delete them from the SE:

grid1> more retrieve
#!/bin/tcsh -v
@ num_f = $1
@ fin   = $2
while ( $num_f <= $fin )
    cd $num_f
    #### get logfiles (EDG command)
    dg-job-get-output -i run$num_f.jobid --dir $PWD
    #### get rootfiles (direct copy from the SE to the UI)
    globus-url-copy \
        gsiftp://grid2.fe.infn.it/flatfiles/SE00/paolo/run$num_f.tar.gz \
        file://`pwd`/run$num_f.tar.gz
    tar -xzvf run$num_f.tar.gz
    rm -f run$num_f.tar.gz
    #### delete rootfiles from the SE
    globus-job-run grid2.fe.infn.it /bin/rm \
        /flatfiles/SE00/paolo/run$num_f.tar.gz
    cd ..
    @ num_f++
end
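The three scripts can also be chained into a single unattended pass. A minimal sketch of a hypothetical wrapper (not part of the SPGrid scripts; the polling interval and the grep on the status string are arbitrary choices):

grid1> more runall
#!/bin/tcsh
@ first = $1
@ last  = $2
@ total = $last - $first + 1
./launch $first $last
# poll until every run in the range reports OutputReady, then fetch everything
set done = 0
while ( $done != $total )
    sleep 600
    set done = `./monitor $first $last | grep -c OutputReady`
end
./retrieve $first $last

Usage: ./runall 1962016 1962017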
Retrieving Output

grid1> ls
1962016/  1962017/  Moose.csh  Moose.jdl  config.csh  launch  monitor  retrieve

grid1> ls 1962016/
150546318633191/  Moose.csh  Moose.jdl  config.csh  run1962016.jobid
rootdef-tru.root  rootdef-tag.root  rootdef-aod.root

grid1> ls 1962017/
150551318931039/  Moose.csh  Moose.jdl  config.csh  run1962017.jobid
rootdef-tru.root  rootdef-tag.root  rootdef-aod.root

grid1> ls 1962016/150546318633191/
Moose.log  Moose.txt

grid1> ls 1962017/150551318931039/
Moose.log  Moose.txt
Future Plans: integration of the Moose application with the EDG software releases.
1) Use of the IC (UK) RB and others
2) MOOSE in RPM format
3) Install the Objectivity DB on the SE
[SPGrid Farm in Ferrara: UI, CE-WN, SE/data server with Objectivity DB, LCFGng server (management), lock server]
Documentation • The DataGrid Project: • http://eu-datagrid.web.cern.ch/eu-datagrid/default.htm • EDG tutorials Archive Web Site: • http://hep-proj-grid-tutorials.web.cern.ch/hep-proj-grid-tutorials/loginex.html • INFN-Grid Testbed: http://server11.infn.it/testbed-grid/ • BaBar-Grid: • http://www.slac.stanford.edu/BFROOT/www/Computing/Offline/BaBarGrid/ • Status of the Farm in Ferrara: http://print.fe.infn.it/status/