An overview of the computing model of the LHCb experiment, which uses the b-quark to study CP violation and the dominance of matter over antimatter. The model involves various applications, algorithms, and frameworks, and relies on distributed computing resources.
The LHCb Computing Model
Philippe Charpentier, CERN
ICFA workshop on Grid activities, Sinaia, Romania, 13-18 October 2006
LHCb in brief • Experiment dedicated to studying CP-violation • Responsible for the dominance of matter over antimatter • Matter-antimatter difference studied using the b-quark (beauty) • High precision physics (tiny difference…) • Single arm spectrometer • Looks like a fixed-target experiment • Smallest of the 4 big LHC experiments • ~500 physicists • Nevertheless, computing is also a challenge… LHCb Computing Model, PhC
LHCb data processing software [Diagram: applications built on Gaudi, sharing the event model / physics event model and the Conditions Database. Application labels: Simul. (Gauss), Digit. (Boole), Trigger (Moore), Recons. (Brunel), Analysis (DaVinci). Data labels: GenParts, MCParts, MCHits, Raw Data, (r)DST, DST, AOD]
LHCb software stack • Uses CMT for build and configuration (handling dependencies) • LHCb projects: • Applications • Gauss (simulation), Boole (digitisation), Brunel (reconstruction), Moore (HLT), DaVinci (analysis) • Algorithms • LBCOM (common packages), Rec (reconstruction), Phys (physics), Online • Event model • LHCb • Software framework • Gaudi • LCG Applications area • POOL, ROOT, COOL • Lcg/external • External SW: boost, xerces… also middleware client (lfc, gfal,…)
[Diagram labels: Applications (Gauss, Boole, Brunel, DaVinci, Panoramix, Moore); Component projects (Lbcom, Rec, Phys, Online); LHCb Event Model; Gaudi Framework; LCG and external layer (POOL, SEAL, COOL, CORAL, GENSER, Root, Geant4, Ext. Libs)]
LHCb Basic Computing principles • Raw data shipped in real time to Tier-0 • Registered in the Grid (File Catalog) • Raw data provenance in a Bookkeeping database (query-enabled) • Resilience enforced by a second copy at Tier-1’s • Rate: ~2000 evts/s (35 kB each), i.e. 70 MB/s • 4 main trigger sources (with little overlap) • b-exclusive; dimuon; D*; b-inclusive • All data processing up to final Tuple or histogram production distributed • Not even possible to reconstruct all data at Tier0… • Part of the analysis is not data-related • Extracting physics parameters on CP violation (toy-MC, complex fitting procedures…) • Also using distributed computing resources
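The quoted raw-data rate follows directly from the trigger rate and the event size. A minimal check, assuming decimal units (1 MB = 1000 kB); the function name is illustrative and not part of any LHCb software:

```python
def raw_data_rate_mb_per_s(events_per_s: float, event_size_kb: float) -> float:
    """Return the raw-data throughput in MB/s (decimal units, 1 MB = 1000 kB)."""
    return events_per_s * event_size_kb / 1000.0


# Figures from the slide: ~2000 events/s at 35 kB per event.
rate = raw_data_rate_mb_per_s(2000, 35)
print(f"{rate:.0f} MB/s")  # 70 MB/s
```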
Basic principles (cont’d) • LHCb runs jobs where the data are • All data are placed explicitly • Analysis made possible by reduction of datasets • many different channels of interest • very few events in each channel (from 10^2 to 10^6 events / year) • physicist dealing with maximum 10^7 events • small and simple events • final dataset manageable on physicist’s desktop (100’s of GBytes) • Calibration and alignment performed on a selected part of the data stream • Alignment and tracking calibration using dimuons (~200/s) • PID calibration using D* (~100/s)
LHCb dataflow [Diagram labels: Online, Tier0, Tier1 (several), Tier2, MSS-SE; Simulation (several sites); data: Raw, Digi, Raw/Digi, rDST, rDST+Raw, DST; processing steps: Recons., Stripping, Analysis]
Comments on the LHCb Distributed Computing • Only last part of the analysis is foreseen to be “interactive” • Either analysing ROOT trees or using GaudiPython/pyRoot • User analysis at Tier1’s - why? • Analysis is very delicate, needs careful file placement • Tier1’s are easier to check, less prone (in principle) to outages • CPU requirements are very modest • What is LHCb’s concept of the Grid? • It is a set of computing resources working in a collaborative way • Provides computing resources for the collaboration as a whole • Recognition of contributions is independent of what type of jobs are run at a site • There are no noble and less noble tasks. All are needed to make the experiment a success • Resources are not made available for nationals • Resource high availability is the key issue
How to best achieve Distributed Computing? • Data Management is paramount • It was almost completely absent from EDG R&D • R&D took place but didn’t deliver anything usable • (Too) few resources are allocated in EGEE • Successful packages were developed in close collaboration with VOs • LFC, FTS: very close contacts with users • SRM v2.2 specification: done after experiments’ request and with their participation • Infrastructure is vital • Resource management • 24x7 support coverage • Reliable and powerful networks (OPN) • Resource sharing is a must • Less support needed • Best resource usage (less idle CPUs, empty tapes, unused networks…) • … but opportunistic resources should not be neglected…
How to best achieve Distributed Computing (cont’d) • Workload Management • This received most development effort (EDG, EGEE) • Developments were not (and still are not) done in such close collaboration with users • Experiments participate in TCG meetings, but their experience is not sufficiently taken into account • Experiments had to develop their own solutions to implement what they needed • A bit of history… • 2000-2004: EDG (R&D) • 2004: LCG ARDA RTAG - generated great hopes… • 2004-: EGEE WMS re-engineering - still not fully exposed to experiments and not at the expected level (although more stable) • Analysis tasks require 99% efficiency • In parallel, experiments developed their solutions to cope with these inefficiencies: AliEn, DIRAC • They also allow them to deal with heterogeneous Grids • … and take advantage of opportunistic resources
LHCb Distributed Computing software • Integrated WMS and DMS: DIRAC • Presentations by Andrei and Andrew on Sunday • Distributed analysis portal: GANGA • Presentation by Ulrik on Friday • Uses DIRAC W&DMS as back-end • Main characteristics • Implements late job scheduling • Overlay network (pilot agents, central task queue) • Allows LHCb policy to be enforced • Alleviates the level of support required from sites • LHCb services designed to be redundant and hence highly available (multiple instances with failover, VO-BOXes)
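The idea of late job scheduling with pilot agents and a central task queue can be sketched in a few lines. This is a toy model under stated assumptions, not DIRAC's actual API: the class and function names (`Job`, `CentralTaskQueue`, `run_pilot`) are hypothetical, the "policy" is reduced to a single integer priority, and data placement is reduced to one site name per job.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    job_id: int
    priority: int    # higher runs first (stand-in for the experiment's policy)
    input_site: str  # site that holds the job's input data


class CentralTaskQueue:
    """Holds all waiting jobs centrally; nothing is bound to a site at submission."""

    def __init__(self) -> None:
        self._jobs: List[Job] = []

    def submit(self, job: Job) -> None:
        self._jobs.append(job)

    def match(self, site: str) -> Optional[Job]:
        """Hand out the highest-priority waiting job whose data are at `site`."""
        candidates = [j for j in self._jobs if j.input_site == site]
        if not candidates:
            return None
        best = max(candidates, key=lambda j: j.priority)
        self._jobs.remove(best)
        return best


def run_pilot(queue: CentralTaskQueue, site: str) -> Optional[int]:
    """A pilot already running on a worker node at `site` pulls its payload.

    The job is chosen only now (late binding), so a pilot that never starts
    or lands on a broken node costs no user job.
    """
    job = queue.match(site)
    return None if job is None else job.job_id


queue = CentralTaskQueue()
queue.submit(Job(job_id=1, priority=1, input_site="CNAF"))
queue.submit(Job(job_id=2, priority=5, input_site="CNAF"))
queue.submit(Job(job_id=3, priority=9, input_site="RAL"))

print(run_pilot(queue, "CNAF"))  # 2 (highest priority among CNAF jobs)
print(run_pilot(queue, "RAL"))   # 3
```

Because the choice of job is made centrally at pull time, priorities can be changed for the whole collaboration in one place, which is the sense in which such a scheme lets policy be enforced.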
The LHCb Tier1s • 6 Tier1s • CNAF (IT, Bologna) • GridKa (DE, Karlsruhe) • IN2P3 (FR, Lyon) • NIKHEF (NL, Amsterdam) • PIC (ES, Barcelona) • RAL (UK, Didcot) • Contribute to • Reconstruction • Stripping • Analysis • Keep copies on MSS of • Raw (2 copies shared) • Locally produced rDST • DST (2 copies) • MC data (2 copies) • Keep copies on disk of • DST (7 copies)
LHCb Computing: a few numbers • Event sizes • on persistent medium (not in memory) • Processing time • Best estimates as of today • Requirements for 2008 • 4 × 10^6 seconds of beam
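As a rough worked example, the beam time above can be combined with the trigger rate and raw event size quoted on the "Basic Computing principles" slide (~2000 evts/s, 35 kB) to estimate one year of raw data. The result is derived here for illustration and is not a figure stated on the slides; it covers a single copy, assumes decimal units, and assumes the trigger rate is sustained for the whole beam time.

```python
BEAM_SECONDS_PER_YEAR = 4e6  # "4 x 10^6 seconds of beam" (slide)
TRIGGER_RATE_HZ = 2000       # ~2000 evts/s (slide)
RAW_EVENT_SIZE_KB = 35       # 35 kB per raw event (slide)


def annual_raw_events(beam_seconds: float, rate_hz: float) -> float:
    """Number of raw events written in one year of running."""
    return beam_seconds * rate_hz


def volume_tb(n_events: float, event_size_kb: float) -> float:
    """Total size in TB for one copy (decimal units, 1 TB = 1e9 kB)."""
    return n_events * event_size_kb / 1e9


events = annual_raw_events(BEAM_SECONDS_PER_YEAR, TRIGGER_RATE_HZ)
print(f"{events:.1e} events")  # 8.0e+09 events
print(f"{volume_tb(events, RAW_EVENT_SIZE_KB):.0f} TB")  # 280 TB
```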
Reconstruction requirements • 2 passes per year: • 1 quasi real time over ~100 day period (2.8 MSI2k) • re-processing over 2 month period of shutdown (4.3 MSI2k) • Make use of Filter Farm at pit (2.2 MSI2k) - data back to the pit
Stripping requirements • Stripping 4 times per year - 1 month production outside of recons • Stripping has at least 4 output streams • Only rDST+RAW stored for “non-b” channels, i.e. 55 kB • RAW+full DST for “b” channels, i.e. 110 kB • Output on disk SE at all Tier-1 centres
Simulation requirements • studies to measure performance of detector & event selection in particular regions of phase space • use large statistics dimuon & D* samples for systematics - reduced Monte Carlo needs
Simulation storage requirements • Simulation still dominates LHCb CPU needs • Current evt size for Monte Carlo DST (with truth info) is ~400 kB/evt • Total storage needs 64 TB in 2008 • Output at CERN and another 2 copies distributed over Tier-1 centres
Analysis requirements • user analysis accounted in model predominantly batch - ~30k jobs/year • predominantly analysing ~10^6 events • CPU of 0.3 kSI2k.s/evt • Analysis needs grow linearly with year in early phase of expt
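The three analysis figures above multiply into a yearly CPU budget. The sketch below is only an order-of-magnitude illustration: it assumes every job processes 10^6 events, that the load is spread evenly over a 365-day calendar year, and that CPU efficiency is 100%. None of these assumptions comes from the slides, and the summary tables (which include efficiencies) may quote different numbers.

```python
JOBS_PER_YEAR = 30_000        # ~30k jobs/year (slide)
EVENTS_PER_JOB = 1e6          # ~10^6 events per job (slide)
CPU_PER_EVENT_KSI2K_S = 0.3   # 0.3 kSI2k.s/evt (slide)
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: flat load over a calendar year


def total_cpu_ksi2k_s(jobs: float, events_per_job: float, cpu_per_event: float) -> float:
    """Integrated CPU work per year, in kSI2k.s."""
    return jobs * events_per_job * cpu_per_event


def average_power_ksi2k(total_work_ksi2k_s: float, seconds: float) -> float:
    """Average installed power needed to deliver the work in `seconds`."""
    return total_work_ksi2k_s / seconds


work = total_cpu_ksi2k_s(JOBS_PER_YEAR, EVENTS_PER_JOB, CPU_PER_EVENT_KSI2K_S)
print(f"{work:.2e} kSI2k.s per year")  # 9.00e+09 kSI2k.s per year
print(f"{average_power_ksi2k(work, SECONDS_PER_YEAR):.0f} kSI2k on average")  # 285 kSI2k on average
```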
Summary (incl. efficiencies) for 2008
Summary & evolution of requirements
Conclusions • LHCb has proposed a Computing Model adapted to its specific needs (number of events, event size, low number of physics candidates) • Reconstruction, stripping and analysis resources located at Tier1s (and possibly some Tier2s with enough storage and CPU capacities) • CPU requirements dominated by Monte-Carlo, assigned to Tier2s and opportunistic sites • With DIRAC, even idle desktops / laptops could be used ;-) • LHCb@home ? • Requirements are modest compared to other experiments • DIRAC is well suited and adapted to this computing model • Integrated WMS and DMS • GANGA is being used more and more for submitting user analysis to the Grid • LHCb’s Computing should be ready when the first data arrive
16 October 21:32 CET Hot news! Stop press! 16 October 22:29 CET Test jobs running successfully at NIPNE!