Physics of LHC Experiments: Utilizing OSG Resources

This presentation explores how LHC experiments will use OSG resources and the role of the Open Science Grid in particle physics research.

Presentation Transcript


  1. LHC Physics, or How LHC experiments will use OSG resources
     • OSG Consortium Meeting
     • 08/21/06

  2. Particle Physics
     • Probe the innermost structure of matter and explain it from first principles

  3. Standard Model
     • Current description of matter:
       • 12 elementary particles and their anti-particles
       • 4 force-carrying particles mediating the 3 forces that dominate at the scales investigated so far

  4. LHC, the discovery machine
     • What can the LHC add to the current picture?
       • Higgs: explain the origin of mass
       • Supersymmetry: physics beyond the Standard Model

  5. LHC
     • Large Hadron Collider
     • Proton-proton collisions
     • Beam energy: 7 TeV
     • Circumference: 27 km
     • Bunch crossing rate at the interaction regions: 40 MHz

  6. Outline
     • Example of a Higgs discovery channel
     • Analysis and requirements
     • Computing
     • CMS metrics
     • CMS computing model
     • The role of the T2 centers
     • OSG contribution
     • User use case
     • Service Challenges and current status
     • Summary & Outlook
     Apologies to the other 3 experiments; the following talk concentrates mainly on CMS.

  7. Example Signal: H→ZZ(*)→4μ
     • Signal reconstruction strategy (sketched after this slide):
       • Reconstruct 4 μ (2 μ+, 2 μ-)
       • Combine each pair of muons into a Z candidate
       • Combine the 2 Z candidates into an H candidate
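     A minimal sketch of this pairing logic, in plain Python rather than CMS software (the (E, px, py, pz) four-vector convention and the use of the nominal Z mass to pick the better pairing are assumptions of the example):

        import math
        from itertools import permutations

        Z_MASS = 91.19  # GeV, nominal Z mass, used only to choose the better pairing

        def add_p4(a, b):
            """Add two four-vectors given as (E, px, py, pz)."""
            return tuple(x + y for x, y in zip(a, b))

        def inv_mass(p4):
            """Invariant mass of a four-vector (E, px, py, pz)."""
            e, px, py, pz = p4
            return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

        def reconstruct_higgs(mu_plus, mu_minus):
            """Given two mu+ and two mu- four-vectors, try both opposite-charge
            pairings, keep the one whose dimuon masses are closest to the Z mass,
            and return (m_4mu, m_Z1, m_Z2)."""
            best = None
            for mm1, mm2 in permutations(mu_minus, 2):
                z1, z2 = add_p4(mu_plus[0], mm1), add_p4(mu_plus[1], mm2)
                m1, m2 = inv_mass(z1), inv_mass(z2)
                score = abs(m1 - Z_MASS) + abs(m2 - Z_MASS)
                if best is None or score < best[0]:
                    best = (score, inv_mass(add_p4(z1, z2)), m1, m2)
            return best[1:]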

  8. Event display

  9. Backgrounds and Analysis Strategy
     • The signal is overlaid by background events that produce a similar signature in the detector
     • 3 main backgrounds: tt, Zbb, ZZ
     • Analysis strategy:
       • Record data with the detector
       • Simulate background and signal events
       • Extract the signal from the recorded data using the background simulation

  10. Data taking vs. MC production
     • Data taking: High Level Trigger → Reconstruction → primary datasets
     • MC production: Generation and Simulation → Reconstruction → MC datasets

  11. Data tiers
     • Data: RAW, RECO, AOD; FEVT (full event)
     • Simulation: SimEvent, RECO-SimEvent, AOD; SimFEVT (full simulated event)

  12. Disk requirements
     • Detector output rate after trigger: 150 Hz for all running conditions (low luminosity, high luminosity); see the rough estimate after this slide
     • Extrapolated beam time:
     • Events from data taking:
     • Events from MC production:
     • Total:
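     A back-of-the-envelope version of the event count, assuming a nominal LHC year of roughly 1e7 seconds of effective beam time (an assumption of this sketch, not a figure quoted on the slide):

        # Rough event count per year from the 150 Hz trigger output rate.
        trigger_rate_hz = 150      # from the slide: detector output rate after trigger
        beam_time_s = 1e7          # assumed effective beam time per year (not from the slide)
        events_per_year = trigger_rate_hz * beam_time_s
        print(f"events from data taking: ~{events_per_year:.1e} per year")  # ~1.5e9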

  13. CPU requirements
     Reconstruction and MC production
     • Assume a core in 2007 delivers 2000 SI2K (optimistic); a single-core Pentium IV at 3 GHz ≈ 1300 SI2K
     • Time to reconstruct an event: 78 kSI2K·s/event
       • On-demand reconstruction at 150 Hz: 11.7 MSI2K ≈ 5850 cores
     • Time to simulate and reconstruct an event: 234 kSI2K·s/event
       • 1 core ≈ 270,000 events/year (see the check after this slide)
     Analysis
     • A single analysis has to access one primary dataset and the MC samples
     • Assume:
       • The analysis needs access to the AOD every 3 days; the selection has to finish within 3 days
       • The analysis needs access to the RECO every 7 days; the analysis has to finish within 7 days
     • Selection time: 0.25 kSI2K·s/event
     • Analysis time: 0.25 kSI2K·s/event
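     The two headline numbers on this slide follow from simple arithmetic; a short check in Python, using only the inputs quoted above:

        # Reproduce the slide's CPU arithmetic from its own inputs.
        core_si2k = 2000            # assumed 2007 core: 2000 SI2K
        reco_cost = 78e3            # SI2K*s per event, reconstruction
        sim_reco_cost = 234e3       # SI2K*s per event, simulation + reconstruction
        trigger_rate_hz = 150       # events per second after the trigger
        seconds_per_year = 3.15e7

        reco_power = reco_cost * trigger_rate_hz           # 11.7e6 SI2K = 11.7 MSI2K
        reco_cores = reco_power / core_si2k                # ~5850 cores
        mc_events_per_core_year = seconds_per_year * core_si2k / sim_reco_cost  # ~270,000

        print(f"on-demand reconstruction: {reco_power / 1e6:.1f} MSI2K ~ {reco_cores:.0f} cores")
        print(f"MC production: ~{mc_events_per_core_year:,.0f} events per core per year")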

  14. CMS Grid Tier Structure
     • Diagram: a single T0 connected to 7 T1 centers (USA, Italy, France, Germany, Spain, UK, Taiwan), each with several associated T2 centers

  15. Data flow
     • The T0 distributes RAW and reconstructed data to the T1s (a subset of the primary datasets, plus a full AOD copy); there are 7 T1 and 25 T2 centers
     • Each T2 is associated with a specific T1, which provides support and distributes data (simulated MC is transferred back to the associated T1)
     • Substantial computing resources are provided by the T1s and T2s
     • The CMS-CAF performs latency-critical activities such as detector problem diagnosis, trigger performance services, and the derivation of calibration and alignment data

  16. US contribution to CMS Tier structure
     • T1 at FNAL
     • 7 associated T2 sites (OSG): Wisconsin, MIT, Nebraska, Purdue, Caltech, San Diego, Florida

  17. CMS Computing Infrastructure
     • Core infrastructure
       • DBS: Data Bookkeeping System
         • Official catalog of available datasets and MC samples
         • Organizes datasets and samples in fileblocks containing files
       • DLS: Data Location Service
         • Catalog of which storage elements hold specific datasets and MC samples
       • TFC: Trivial File Catalog
         • Files are stored in a CMS-specific namespace which is reproduced at the storage elements of every site
         • Access to files in the CMS namespace is resolved by the CMS applications, which automatically prepend the site-specific location and access protocol using a local site configuration (sketched after this slide)
     • Architecture diagram: CRAB, PhEDEx and ProdAgent sit on top of the grid middleware (DBS, DLS) and the site resources (batch system, TFC, CPU, disk)
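     A minimal illustration of the TFC idea (the site names, storage prefixes and the example logical file name are invented for the example; the real rules live in each site's local configuration):

        # Resolve a logical, CMS-wide file name to a site-specific physical file name
        # by prepending the site's access protocol and local storage prefix.
        SITE_PREFIXES = {                                 # stand-in for the per-site TFC rules
            "T1_US_FNAL":     "dcap://dcache.example.gov/pnfs/cms",
            "T2_US_Nebraska": "dcap://storage.example.edu/cms",
        }

        def lfn_to_pfn(site, lfn):
            """Map a logical file name such as '/store/mc/...' to a physical file name."""
            return SITE_PREFIXES[site] + lfn

        print(lfn_to_pfn("T1_US_FNAL", "/store/mc/HiggsToZZTo4Mu/file001.root"))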

  18. CMS Computing Infrastructure
     • Computing systems
       • Data transfer system (PhEDEx)
         • Transports any dataset or sample registered in DBS between sites
       • MC production system (ProdAgent)
         • Produces MC samples in a centralized way
       • User analysis tool (CRAB)
         • Enables users to execute their analysis code on any sample registered in DBS/DLS (a toy sketch of the job-splitting step follows after this slide)
     • Architecture diagram: as on the previous slide, CRAB, PhEDEx and ProdAgent sit on top of the grid middleware (DBS, DLS) and the site resources (batch system, TFC, CPU, disk)
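     What CRAB does for the user can be illustrated with a toy version of its splitting step (the dictionary below stands in for a DBS/DLS lookup; the dataset name, file names and event counts are invented and this is not the real CRAB interface):

        # Split the files of a dataset into grid jobs of roughly N events each.
        DATASET_FILES = {   # stand-in for DBS: dataset -> [(file name, events in file), ...]
            "/HiggsToZZTo4Mu/RECO": [("file001.root", 500), ("file002.root", 500), ("file003.root", 250)],
        }

        def split_into_jobs(dataset, events_per_job):
            jobs, current, n = [], [], 0
            for fname, nev in DATASET_FILES[dataset]:
                current.append(fname)
                n += nev
                if n >= events_per_job:
                    jobs.append(current)
                    current, n = [], 0
            if current:
                jobs.append(current)
            return jobs

        print(split_into_jobs("/HiggsToZZTo4Mu/RECO", 500))
        # -> [['file001.root'], ['file002.root'], ['file003.root']]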

  19. Site requirements
     • CMS T2 site
       • CPU: 1 MSI2K
       • Disk: 200 TB
       • Network: 2.5-10 Gbps
       • OSG middleware stack: Computing Element (CE), Storage Element (SE)
       • Batch system (OSG sites: Condor and PBS)
       • Mass storage system (OSG sites: dCache, tapeless)
       • CMS software installed by a central instance (OSG sites: under $OSG_APP), including the TFC setup (see the sketch after this slide)
     • Opportunistic usage (OSG)
       • Usable for MC production
       • OSG middleware stack: Computing Element (CE)
       • Outbound connectivity of the worker nodes (stage-in and stage-out)
       • CMS software installed by a central instance (in $OSG_APP)
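     On OSG worker nodes the centrally installed CMS software is reached through the $OSG_APP environment variable mentioned above; a small sketch of how a job might locate it (the "cmssoft" sub-directory is an assumption for illustration, not taken from the slide):

        import os

        def find_cms_software():
            """Return the assumed CMS software area under $OSG_APP, or None if absent."""
            app_dir = os.environ.get("OSG_APP")
            if app_dir is None:
                raise RuntimeError("$OSG_APP is not set: not an OSG worker node?")
            candidate = os.path.join(app_dir, "cmssoft")   # assumed layout
            return candidate if os.path.isdir(candidate) else None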

  20. User analysis
     • Task: discovery of the Higgs in the channel H→ZZ(*)→4μ
     • Approach:
       • Use recorded and reconstructed data from the detector
       • Use the produced MC samples
       • Reconstruct the 4μ signature and extract the Higgs mass

  21. Ideal MC workflow
     • MC samples:
       • Produced by the MC production system at the T2 level
       • Archived at the T1 that stores the corresponding datasets
     • MC production requests:
       • Initiated by the physics groups
       • Samples are probably more general than needed
     • Physics groups and users:
       • Prepare skims of the MC samples at the T1s for specific physics purposes (a toy skim is sketched after this slide)
       • Complete skims are transported to dedicated T2s
       • Users analyze the MC samples by processing the skims at the T2s
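     A toy version of such a skim, to make the idea concrete (the event structure and the pT threshold are invented for the example; a real skim would run the CMS framework over the full sample):

        def four_muon_skim(events, pt_min=5.0):
            """Keep only events with at least four muons above a pT threshold,
            so users ship and analyze a small subset instead of the full sample."""
            selected = []
            for event in events:   # event assumed to look like {"muons": [{"pt": ...}, ...]}
                good_muons = [m for m in event["muons"] if m["pt"] > pt_min]
                if len(good_muons) >= 4:
                    selected.append(event)
            return selected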

  22. Ideal data workflow
     • Data:
       • Recorded by the detector
       • Triggered by the HLT
       • Reconstructed at the T0
       • Split into primary datasets
       • Distributed among the T1 centers
     • Most analyses only need to access one of the primary datasets
     • The AOD is sufficient for 90% of the analyses
     • Physics groups and users:
       • Skim the AOD of the primary datasets at the T1s
       • Complete skims are transported to dedicated T2s
       • Users analyze the data samples by processing the AOD skims at the T2s
     (Data taking hasn't started; for now, just an event display of a cosmic muon)

  23. Special tasks
     • Alignment/calibration
       • In general, alignment/calibration is calculated at the T0 or T1 for immediate use during reconstruction
       • Parts of the RAW data samples can in addition be transported to dedicated T2s to calculate improved alignment/calibration constants
     • Reprocessing of data
       • Data samples are planned to be re-reconstructed with improved alignment/calibration and reconstruction algorithms 3 times a year at the T0 and T1 level
       • Reprocessing requires re-skimming of the samples

  24. Startup scenario
     • During the startup of the LHC, no experience with the detector and its output will exist
     • The first calibration/alignment calculations will not be sufficient for analysis
     • RAW data samples will be transported to the T2s to:
       • Understand the detector
       • Improve the reconstruction algorithms, calibration and alignment
       • Extract the first physics messages
     • T1 resources
       • Used more for analysis than for skimming and re-reconstruction (fewer resources needed by the default operations)
       • Reprocessing will be more frequent but with shorter running time, since not much data has been produced yet

  25. Service Challenges
     • To prepare for the startup, the experiments exercise their systems in dedicated service challenges
     • CMS currently runs "Service Challenge 4" (SC4), which exercises:
       • The dataset transport system PhEDEx, with emphasis on all possible transport endpoint combinations (T0, T1, T2)
       • The MC production system, by running ~12,500 MC production jobs per day on LCG and OSG resources to produce MC samples for the upcoming challenge
       • The analysis infrastructure, by running ~12,500 analysis jobs per day on samples distributed to every center
     • CMS' next challenge is CSA2006 in the fall, which exercises the full workflow:
       • Feeding MC samples to the HLT and reconstruction at the T0
       • Reprocessing and skimming at the T1 centers
       • Analysis and MC simulation at the T2 centers
       • Sample transport between the individual centers

  26. First MC production run: OSG summary
     • First MC production run: ~2 weeks of August
     • Total: 45 million events; OSG (incl. FNAL): 17 million events
     • More details in Ajit Mohapatra's talk on Wednesday

  27. Analysis infrastructure: grid-wide performance
     • Automated systems (JobRobots) achieve grid-wide performance close to the goal of 12,500 jobs per day
     • Scale of ~10,000 jobs per day sustained for 5 days, and improving
     (Plots: overall job statistics, job success statistics, grid success statistics)

  28. Analysis infrastructure: OSG-wide performance
     • Scale of ~3,000 jobs per day sustained for 5 days
     • OSG sites contribute a large percentage of the overall analysis performance
     (Plots: overall job statistics, job success statistics, grid success statistics)

  29. Analysis infrastructure: OSG site performance
     • Example: Nebraska
       • Scale between 450 and 850 jobs per day
       • High success rates
     • Other OSG sites show similar performance
     (Plots: overall job statistics, job success statistics, grid success statistics)

  30. Summary & Outlook
     • Computing for the LHC experiments is large in scale and globally organized
     • LHC computing marks the final shift in high energy physics from host-laboratory-centered data analysis to a global approach
     • Grid resources, and OSG resources in particular, will be used efficiently to perform standard production and analysis as well as special tasks
     • Analysis will be a challenge, not only from the physics point of view but also because of its computational requirements and peculiarities

  31. In the end ...
     • We hope to make many discoveries in particle physics with LHC data using OSG resources

  32. The end
