Explore the grid computing resources and tools utilized by the University of Manchester's Babar team for high-energy physics research. Learn about human resource strategy, job handling, software packages, dataset analysis, and submission processes.
Enabling Grid Computing for HEP
Babar Team at University of Manchester
Resources: www.hep.man.ac.uk/u/jamwer
jamwer@hep.man.ac.uk

Human resource strategy
* Jobs with 5 events instead of millions.

Resources Strategy
Grid Test Bed
Software: 850 packages.
Tau datasets: range between 60 files of 1 GB and 150 files of 1 GB.
Total: 4,000 GB, ~10,000 files.

Analysis Submission to Grid (Prototype)
• Single command: ./easygrid dataset_name
• Performs handler management and submission
• Software based on a state machine:
  • Verify that skimData is available:
    • If not available, run BbkDatasetTCL to generate skimData. Each file will be a job.
  • Verify whether there are handlers pending:
    • If not, generate the script (gera.c) with edg-job-submit and ClassAds, then execute it. Nest for submission policy and optimisation.
    • If yes, verify job status. When all jobs have ended, recover the results into the user folder.

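The workflow on the slide above can be sketched as a small state machine. This is a minimal Python sketch under stated assumptions, not the easygrid source: the state names, `next_state`, and `run_once` are hypothetical and only mirror the bullets, with the three inputs standing in for the checks the real tool performs.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical state names that mirror the slide's bullets.
    CHECK_SKIMDATA = auto()
    GENERATE_SKIMDATA = auto()
    CHECK_HANDLERS = auto()
    SUBMIT = auto()
    CHECK_STATUS = auto()
    RECOVER = auto()
    WAIT = auto()


def next_state(state, skimdata_available, handlers_pending, jobs_unfinished):
    """Return the state that follows `state` for the given observations."""
    if state is State.CHECK_SKIMDATA:
        return State.CHECK_HANDLERS if skimdata_available else State.GENERATE_SKIMDATA
    if state is State.GENERATE_SKIMDATA:
        return State.CHECK_HANDLERS
    if state is State.CHECK_HANDLERS:
        return State.CHECK_STATUS if handlers_pending else State.SUBMIT
    if state is State.CHECK_STATUS:
        return State.RECOVER if jobs_unfinished == 0 else State.WAIT
    # SUBMIT, RECOVER and WAIT end a single invocation of the command.
    return state


def run_once(skimdata_available, handlers_pending, jobs_unfinished):
    """Walk the machine for one invocation and return the states visited."""
    state = State.CHECK_SKIMDATA
    visited = [state]
    while True:
        following = next_state(state, skimdata_available,
                               handlers_pending, jobs_unfinished)
        if following is state:
            return visited
        visited.append(following)
        state = following


if __name__ == "__main__":
    # First call: skimData present, no handlers yet, so the jobs are submitted.
    print([s.name for s in run_once(True, False, 0)])
```

Running the same command repeatedly corresponds to calling `run_once` with updated observations: the first call ends in `SUBMIT`, later calls end in `WAIT` until no job is unfinished, and the final call ends in `RECOVER`.
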
Generation and submission
[jamwer@bfb babar]$ ./easygrid SP-1005-Tau11-R14
Invalid configuration filename: /opt/edg/etc/vomses
Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
Enter GRID pass phrase for this identity:
Creating temporary proxy ......................................................... Done
Creating proxy .................................................... Done
Searching pre selected skimdata.
Searching previous handlers.
Handlers not found.
Submiting to GRID . Wait end of process...

Job Status
[jamwer@bfb babar]$ ./easygrid SP-1005-Tau11-R14
Invalid configuration filename: /opt/edg/etc/vomses
Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
Enter GRID pass phrase for this identity:
Creating temporary proxy ............................ Done
Creating proxy ............................... Done
Searching pre selected skimdata.
Searching previous handlers.
Checking if jobs finished.
### Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/foRHhWyeDBnbqA9JkDADLg
Current Status: Scheduled
https://lcgrb01.gridpp.rl.ac.uk:9000/foRHhWyeDBnbqA9JkDADLg still pendent.
### Handle -> https://lxn1188.cern.ch:9000/8DdK3xruxtevNpei3zZbaA
Current Status: Scheduled
https://lxn1188.cern.ch:9000/8DdK3xruxtevNpei3zZbaA still pendent.
4 jobs did not finished ! Try again later.

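The status output above has a regular shape (a `### Handle ->` line followed by a `Current Status:` line), so it can be summarised mechanically. The following is a minimal Python sketch, assuming one handle and one status per line as in the transcript; `parse_status` and `unfinished` are hypothetical helper names, and treating only `Done` as finished is an assumption based on the statuses shown on the slides.

```python
import re

# Patterns taken from the output format shown on the slide.
HANDLE_RE = re.compile(r"### Handle -> (\S+)")
STATUS_RE = re.compile(r"Current Status:\s*(\w+)")


def parse_status(text):
    """Return a dict mapping each job handle URL to its reported status."""
    statuses = {}
    current = None
    for line in text.splitlines():
        handle = HANDLE_RE.search(line)
        if handle:
            current = handle.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            statuses[current] = status.group(1)
            current = None
    return statuses


def unfinished(statuses):
    """List handles whose status is not 'Done' (assumption: only Done is final)."""
    return [handle for handle, status in statuses.items() if status != "Done"]


SAMPLE = """\
### Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/foRHhWyeDBnbqA9JkDADLg
Current Status: Scheduled
### Handle -> https://lxn1188.cern.ch:9000/8DdK3xruxtevNpei3zZbaA
Current Status: Done
"""

if __name__ == "__main__":
    print(unfinished(parse_status(SAMPLE)))
```
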
Job Status and recovery
[jamwer@bfb babar]$ ./easygrid SP-1005-Tau11-R14
Invalid configuration filename: /opt/edg/etc/vomses
Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
Enter GRID pass phrase for this identity:
Creating temporary proxy .......................................... Done
Creating proxy ........................................................... Done
Searching pre selected skimdata.
Searching previous handlers.
Checking if jobs finished.
### Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/foRHhWyeDBnbqA9JkDADLg
Current Status: Done
Exit code: 0
### Handle -> https://lxn1188.cern.ch:9000/8DdK3xruxtevNpei3zZbaA
Current Status: Done
Exit code: 0
0 jobs did not finished ! Try again later.
All jobs done. Recovering results in your folder.
Results in the following folders:
/home/jamwer/grid_sub/babar/jamwer_foRHhWyeDBnbqA9JkDADLg
/home/jamwer/grid_sub/babar/jamwer_8DdK3xruxtevNpei3zZbaA

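In the transcript above, each results folder is named after the user and the last component of the job handle. A minimal Python sketch of that naming, inferred only from the paths shown (the helper `result_dir` is hypothetical, and the real tool may build the path differently):

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def result_dir(base, user, handle):
    """Build the results folder for a job handle, as <base>/<user>_<job id>."""
    # The job id is the path component of the handle URL, without the slash.
    job_id = urlparse(handle).path.lstrip("/")
    return PurePosixPath(base) / f"{user}_{job_id}"


if __name__ == "__main__":
    print(result_dir(
        "/home/jamwer/grid_sub/babar",
        "jamwer",
        "https://lcgrb01.gridpp.rl.ac.uk:9000/foRHhWyeDBnbqA9JkDADLg",
    ))
```
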
Monte Carlo Submission to Grid (Prototype)
• Single command: ./mcgrid JobName num_copies
• Performs handler management and submission.
• Software based on a state machine:
  • Verify whether there are handlers pending:
    • If not, generate the script (geramc.c) with edg-job-submit and ClassAds for each copy, then execute it. Nest for submission policy and optimisation.
    • If yes, verify job status. When all jobs have ended, recover the results into the user folder.

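The "for each copy" step above implies one job description per copy. The slides do not show the files that geramc.c generates, so the following is only a sketch under assumptions: it writes one ClassAd-style JDL text per copy using common JDL attribute names (`Executable`, `Arguments`, `StdOutput`, `StdError`, `InputSandbox`, `OutputSandbox`). The executable name `run_mc.sh`, the argument layout, and the file naming are all hypothetical.

```python
def make_jdl(job_name, copy_index, executable="run_mc.sh"):
    """Return JDL text for one copy; executable and arguments are hypothetical."""
    out = f"{job_name}_{copy_index}.out"
    err = f"{job_name}_{copy_index}.err"
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{job_name} {copy_index}";',
        f'StdOutput = "{out}";',
        f'StdError = "{err}";',
        f'InputSandbox = {{"{executable}"}};',
        f'OutputSandbox = {{"{out}", "{err}"}};',
    ]
    return "\n".join(lines) + "\n"


def make_all(job_name, num_copies):
    """Return a dict of {jdl_filename: jdl_text}, one entry per copy."""
    return {
        f"{job_name}_{i}.jdl": make_jdl(job_name, i)
        for i in range(1, num_copies + 1)
    }


if __name__ == "__main__":
    for name, text in make_all("MCteste", 3).items():
        print(name)
        print(text)
```

Each generated file would then be handed to edg-job-submit in turn; how the prototype actually batches or paces those submissions is not stated on the slides.
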
MC Submission
[jamwer@bfb mcgrid1]$ ./mcgrid MCteste 3
Invalid configuration filename: /opt/edg/etc/vomses
Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
Enter GRID pass phrase for this identity:
Creating temporary proxy ................................. Done
Creating proxy ....................................................... Done
Searching previous handlers.
Handlers not found.
Submiting to GRID . Wait end of process...

Job Status
[jamwer@bfb mcgrid1]$ ./mcgrid MCteste 3
Invalid configuration filename: /opt/edg/etc/vomses
Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
Enter GRID pass phrase for this identity:
Creating temporary proxy ........................................ Done
Creating proxy ....................................... Done
Searching previous handlers.
Checking if jobs finished.
### Handle -> https://lxn1188.cern.ch:9000/9WzceoIMEQoTK24a-UvOmw
Current Status: Scheduled
https://lxn1188.cern.ch:9000/9WzceoIMEQoTK24a-UvOmw still pendent.
### Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/c4iCB8vioozaGteI9hybIg
Current Status: Ready
https://lcgrb01.gridpp.rl.ac.uk:9000/c4iCB8vioozaGteI9hybIg still pendent.
### Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/L5BD1OE--eckTm5RXkp2nA
Current Status: Ready
https://lcgrb01.gridpp.rl.ac.uk:9000/L5BD1OE--eckTm5RXkp2nA still pendent.
3 jobs did not finished ! Try again later.

Job status and recovery
[jamwer@bfb mcgrid1]$ ./mcgrid MCteste 3
Invalid configuration filename: /opt/edg/etc/vomses
Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
Enter GRID pass phrase for this identity:
Creating temporary proxy .................................................. Done
Creating proxy .................................................... Done
Searching previous handlers.
Checking if jobs finished.
### Handle -> https://lxn1188.cern.ch:9000/9WzceoIMEQoTK24a-UvOmw
Current Status: Done
Exit code: 0
### Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/c4iCB8vioozaGteI9hybIg
Current Status: Done
Exit code: 0
0 jobs did not finished ! Try again later.
All jobs done. Recovering results in your folder.
Results in the following folders:
/home/jamwer/grid_sub/mcgrid1/jamwer_9WzceoIMEQoTK24a-UvOmw
/home/jamwer/grid_sub/mcgrid1/jamwer_c4iCB8vioozaGteI9hybIg
/home/jamwer/grid_sub/mcgrid1/jamwer_L5BD1OE--eckTm5RXkp2nA

Testing Submission Script
• Load range: worker load x number of files
• 16 x 60 files = 960 jobs pending
• 16 x 150 files = 2400 jobs pending
• Test with submission script
* sslv3 alert handshake failure
** Please wait job enter the “Done” status. This never happens!
The Resource Broker is not reliable or robust. It sometimes fails 3 days a week, or takes hours to submit/dispatch to a CE (empty!).

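The load range figures on the slide above are a simple product of worker load and files per dataset. A small Python check of that arithmetic, assuming "worker load" acts as a plain multiplier of 16 as on the slide (`pending_jobs` is a hypothetical helper name):

```python
def pending_jobs(worker_load, files_per_dataset):
    """Number of pending jobs when each file becomes one job per unit of load."""
    return worker_load * files_per_dataset


if __name__ == "__main__":
    # Smallest and largest Tau datasets quoted in the slides: 60 and 150 files.
    for files in (60, 150):
        print(f"16 x {files} files = {pending_jobs(16, files)} jobs pending")
```
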
Pending Infrastructure => Course of action
• Babar software know-how is not available at Manchester => web page & network skills.
• Quality assurance => We are OK! from benchmark (E x P)
• A real application to perform the complete cycle, acquire know-how, and provide a grid proof-of-concept is missing => partnership with physicists
• CERN does NOT recognise the Babar community => Let's reduce their priority!
• RB at Manchester => 60 MB binaries and freedom over policies.
• SE/RC at Manchester => freedom over policies and job submission.
• Mass storage (10 TB) for Babar purposes => CAP!
• UI in the AFS => wide access to Manchester farms.
• Apprenticeship at RAL and later at SLAC, in production and experiment => improve where others fail
• Configuration for optimal job performance/submission at Tier 2 (1 CE x 50 WN? Performance of dCache with Babar software? Why 10 TB if Liverpool bought 80 TB? Electricity bill?) => analyse procedures to improve QoS and achieve a better site configuration
• Update (software and data) and operational policies => operational standards to achieve high QoS

Aimed Hardware Architecture (Redundant RB with alternate access)
Aimed Software Architecture
Production Job Submission Package
• Operational policies/integration with RB (application level).
• Recovery of aborted status.
• Resource optimisation.
• Integration with RC (application level) for replica policy development.
• Interactive data visualisation (Useful?)
• Integration with GridSite (data visualisation, analysis, performance monitor, and submission)
• Professional version.

Summary
Integrate LCG2 and job submission with Babar/CM2 at University of Manchester for Tau physics modelling, analysis and MC generation.
We aim to be soon…
• The largest site in UK.
• Leader in grid computing and HEP

Conclusion
Babar CM2 is running at Manchester!
LCG2 Grid is running with a real-world experiment!
The Babar submission prototype to Grid is running!
LCG is not LHC software only! It is Babar’s.
We are doing today what will take you years to achieve.
Let's work together!