
A Proposal for the MEG Offline Systems



Presentation Transcript


  1. A Proposal for the MEG Offline Systems Corrado Gatto Lecce 1/20/2004

  2. Outline • Choice of the framework: ROOT • Offline Requirements for MEG • ROOT I/O Systems: is it suitable? • LAN/WAN file support • Tape Storage Support • Parallel Code Execution Support • Architecture of the offline systems • Computing Model • General Architecture • Database Model • Monte Carlo • Offline Organization • Already discussed (phone meeting Jan 9th, 2004): Why ROOT? Offline Software Framework Services (experiment independent)

  3. Coming Soon… • Next Meeting (PSI Feb 9th, 2004) • Data Model (Object Database vs ROOT+RDBMS) • More on ROOT I/O • UI and GUI • GEANT3 compatibility

  4. Dataflow and Reconstruction Requirements • 100 Hz L3 trigger • evt size: 1.2 MB • Raw data throughput: (10+10) Hz × 1.2 MB/phys evt × 0.1 + 80 Hz × 0.01 MB/bkg evt ≈ 3.5 MB/s • <evt size>: 35 kB • Total raw data storage: 3.5 MB/s × 10^7 s = 35 TB/yr

  5. Framework Implementation Constraints • Geant3 compatible (at least at the beginning) • Written and maintained by a few people • Low level of concurrent access to reco data • Scalability (because of the uncertainty in the event rate)

  6. Compare to BABAR

  7. BaBar Offline Systems >400 nodes (+320 in Pd) >20 physicists/engineers

  8. Experiments Using ROOT for the Offline

  9. Incidentally… • All the above experiments use the ROOT framework and I/O system • BaBar's former Offline Coordinator (T. Wenaus), now at STAR, moved to ROOT • BaBar I/O is switching from Objy to ROOT • Adopted by Alice Online+Offline, which has the most demanding requirements for raw data processing/storage: • 1.25 GB/s • 2 PB/yr • The large number of experiments (>30) using ROOT world-wide ensures open-source-style support

  10. Requirements for a HEP software architecture or framework • Easy interface with existing packages: Geant3, Geant4, event generators • Simple structure to be used by non-computing experts • Portability • Experiment-wide framework • Use a world-wide accepted framework, if possible: a collaboration-specific framework is less likely to survive in the long term

  11. ROOT I/O Benchmark • Phobos: 30 MB/s - 9 TB (2001) • Event size: 300 kB • Event rate: 100 Hz • NA57 MDC1: 14 MB/s - 7 TB (1999) • Event size: 500 kB • Event rate: 28 Hz • Alice MDC2: 100 MB/s - 23 TB (2000) • Event size: 72 MB • CDF II: 20 MB/s - 200 TB (2003) • Event size: 400 kB • Event rate: 75 Hz

  12. ROOT I/O Performance Check #1: Phobos: 30 MB/s-9 TB (2001) • Event size: 300 kB • Event rate: 100 Hz

  13. Real Detector • ROOT I/O used between the event builder (ROOT code) and HPSS • RAID (2 disks × 2 SCSI ports, used only to balance CPU load) • 30 MB/s data transfer, not limited by the ROOT streamer (CPU limited) • With additional disk arrays, estimated throughput > 80 MB/s • rootd File I/O Benchmark: farm of 10 nodes running Linux • 2300 events read in a loop (file access only, no reco, no selection) • Result: 17 evt/s • rootd (the ROOT daemon) transfers data to the processing node at 4.5 MB/s • Inefficient design: an ideal situation for PROOF

  14. Raw Performance (Alice MDC2) • Pure Linux setup • 20 data sources • Fast Ethernet local connection

  15. Experiences with Root I/O • Many experiments are using Root I/O today or planning to use it in the near future: • RHIC (started last summer) • STAR 100 TB/year + MySQL • PHENIX 100 TB/year + Objy • PHOBOS 50 TB/year + Oracle • BRAHMS 30 TB/year • JLAB (starting this year) • Halls A, B, C, CLAS >100 TB/year • FNAL (starting this year) • CDF 200 TB/year + Oracle • MINOS

  16. Experiences with Root I/O • DESY • H1 moving from BOS to Root for DSTs and microDSTs, 30 TB/year DSTs + Oracle • HERA-B extensive use of Root + RDBMS • HERMES moving to Root • The TESLA test beam facility has decided on Root, expects many TB/year • GSI • HADES Root everywhere + Oracle • Pisa • VIRGO > 100 TB/year in 2002 (under discussion)

  17. Experiences with Root I/O • SLAC • BABAR >5 TB microDSTs, upgrades under way + Objy • CERN • NA49 > 1 TB microDSTs + MySQL • ALICE MDC1 7 TB, MDC2 23 TB + MySQL • ALICE MDC3 83 TB in 10 days at 120 MB/s (DAQ -> CASTOR) • NA6i starting • AMS Root + Oracle • ATLAS, CMS test beams • ATLAS, LHCb, Opera have chosen Root I/O over Objy • Plus several thousand people using Root like PAW

  18. LAN/WAN files • Files and Directories • a directory holds a list of named objects • a file may have a hierarchy of directories (a la Unix) • ROOT files are machine independent • built-in compression • Support for local, LAN and WAN files (a usage sketch follows below): • TFile f1("myfile.root") : local file • TFile f2("http://pcbrun.cern.ch/Renefile.root") : remote file access via a Web server • TFile f3("root://cdfsga.fnal.gov/bigfile.root") : remote file access via the ROOT daemon (rootd) • TFile f4("rfio://alice/run678.root") : access to a file in a mass store (HPSS, CASTOR) via RFIO
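A minimal usage sketch (not from the original slides) of the remote-access feature above; the server, file and tree names are illustrative assumptions:

   // Open a file served by rootd and inspect a tree; TFile::Open dispatches
   // on the URL prefix (local path, http://, root://, rfio://).
   #include "TFile.h"
   #include "TTree.h"

   void read_remote()
   {
      TFile *f = TFile::Open("root://cdfsga.fnal.gov/bigfile.root");
      if (!f || f->IsZombie()) return;      // connection or open failure
      TTree *t = (TTree*) f->Get("T");      // "T" is an assumed tree name
      if (t) t->Print();                    // list branches and entry count
      f->Close();
      delete f;
   }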

  19. Support for HSM Systems • Two popular HSM systems are supported: • CASTOR • developed by CERN, file access via the RFIO API and a remote rfiod • dCache • developed by DESY, file access via the dCache API and a remote dcached • TFile *rf = TFile::Open("rfio://castor.cern.ch/alice/aap.root") • TFile *df = TFile::Open("dcache://main.desy.de/h1/run2001.root")

  20. Parallel ROOT Facility • Data access strategies • Each slave gets assigned, as far as possible, packets representing data in local files • If there is no (more) local data, it gets remote data via rootd and rfio (needs a good LAN, like Gb Ethernet) • The PROOF system allows: • parallel analysis of trees in a set of files • parallel analysis of objects in a set of files • parallel execution of scripts on clusters of heterogeneous machines

  21. Parallel Script Execution (diagram) • A local PC running ROOT talks to a remote PROOF cluster: one master server and several slave servers, listed in #proof.conf (slave node1 ... slave node4), each holding local *.root files; results (stdout/objects) come back through TNetFile/TFile • Typical session: $ root ; root [0] .x ana.C ; root [1] gROOT->Proof("remote") ; root [2] gProof->Exec(".x ana.C")

  22. A Proposal for the MEG Offline Architecture • Computing Model • General Architecture • Database Model • Monte Carlo • Offline Organization • Corrado Gatto, PSI 9/2/2004

  23. Computing Model: Organization • Based on a distributed computing scheme with a hierarchical architecture of sites • Necessary when software resources (like the software groups working on the subdetector code) are deployed over several geographic regions and need to share common data (like the calibration). • Also important when a large MC production involves several sites. • The hierarchy of sites is established according to the computing resources and services the site provides.

  24. Computing Model: MONARC • A central site, Tier-0 • Will be hosted by PSI • Regional centers, Tier-1 • Will serve a large geographic region or a country • Might provide a mass-storage facility, all the GRID services, and an adequate quantity of personnel to exploit the resources and assist users • Tier-2 centers • Will serve part of a geographic region, i.e., typically about 50 active users • Are the lowest level accessible by the whole Collaboration • These centers will provide important CPU resources but limited personnel • They will be backed by one or several Tier-1 centers for mass storage • In the case of small collaborations, Tier-1 and Tier-2 centers could be the same • Tier-3 centers • Correspond to the computing facilities available at the different Institutes • Conceived as relatively small structures connected to a reference Tier-2 center • Tier-4 centers • Personal desktops are identified as Tier-4 centers

  25. Data Processing Flow • STEP 1 (Tier-0) • Prompt calibration of raw data (almost real time) • Event reconstruction of raw data (within hours of the prompt calibration) • Enumeration of the reconstructed objects • Production of three kinds of objects for each event: • ESD (Event Summary Data) • AOD (Analysis Object Data) • Tag objects • Update of the calibration database (calibration constants, monitoring data and calibration runs for all the MEG sub-detectors) • Update of the Run Catalogue • Posting of the data for Tier-1 access

  26. Data Processing Flow • STEP 2 (Tier-1) • Some reconstruction (probably not needed for MEG) • Reprocessing, when needed • Mirror the reconstructed objects locally • Provide a complete set of information on the production (run #, tape #, filenames) and on the reconstruction process (calibration constants, version of the reconstruction program, quality assessment, and so on) • Monte Carlo production • Update the Run Catalogue

  27. Data Processing Flow • STEP 3 (Tier-2) • Monte Carlo production • Creation of DPD (Derived Physics Data) objects • DPD objects will contain the information specifically needed for a particular analysis • DPD objects are stored locally or remotely and might be made available to the collaboration

  28. Data Model • ESD (Event Summary Data) • contain the reconstructed tracks (for example, track pt, particle ID, pseudorapidity and phi, and the like), the covariance matrix of the tracks, the list of track segments making up a track, etc. • AOD (Analysis Object Data) • contain information on the event that will facilitate the analysis (for example, centrality, multiplicity, number of electrons/positrons, number of high-pt particles, and the like) • Tag objects • identify the event by its physics signature (for example, a Higgs electromagnetic decay and the like) and are much smaller than the other objects; Tag data would likely be stored in a database and be used as the source for the event selection • DPD (Derived Physics Data) • are constructed from the physics analysis of AOD and Tag objects • they are specific to the selected type of physics analysis (e.g., mu -> e gamma, mu -> e e e) • typically consist of histograms or ntuple-like objects • these objects will in general be stored locally on the workstation performing the analysis, thus adding no constraint to the overall data-storage resources (a sketch of such objects follows below)
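A minimal, header-style sketch (not from the original slides) of what an ESD-like object could look like in ROOT; the class and member names (MEGESDTrack, MEGESDEvent) are illustrative assumptions, and a real class would also need a ROOT dictionary (ClassImp/rootcint) to be stored in trees:

   #include "TObject.h"
   #include "TClonesArray.h"

   class MEGESDTrack : public TObject {
   public:
      Double_t fPt;          // transverse momentum
      Int_t    fParticleId;  // particle hypothesis
      Double_t fCov[15];     // packed covariance matrix of the track fit
      ClassDef(MEGESDTrack, 1)
   };

   class MEGESDEvent : public TObject {
   public:
      MEGESDEvent() : fTracks("MEGESDTrack", 10) {}
      TClonesArray fTracks;  // reconstructed tracks for this event
      ClassDef(MEGESDEvent, 1)
   };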

  29. Building a Modular System Use ROOT’s Folders

  30. Folders • A Folder can contain: • other Folders • an Object or multiple Objects. • a collection or multiple collections.

  31. Folder Types • Two kinds: Task folders and Data folders • They interoperate: • Data folders are filled by Tasks (producers) • Data folders are used by Tasks (consumers) (a sketch of this pattern follows below)
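A minimal sketch (not from the original slides) of the producer side of this pattern, using ROOT's TTask and TFolder classes; the folder layout and the MEGDigitizer name are illustrative assumptions:

   #include "TROOT.h"
   #include "TFolder.h"
   #include "TTask.h"
   #include "TObjString.h"

   // A producer task: its Exec() posts a result into the data folder it was given
   class MEGDigitizer : public TTask {
   public:
      MEGDigitizer(const char *name, TFolder *out)
         : TTask(name, "toy digitizer"), fOut(out) {}
      void Exec(Option_t * = "")
         { if (fOut) fOut->Add(new TObjString("digits for this event")); }
   private:
      TFolder *fOut;   // data folder this producer fills
   };

   void folders_demo()
   {
      // Build a small data-folder hierarchy under the ROOT root folder
      TFolder *meg   = gROOT->GetRootFolder()->AddFolder("MEG", "MEG whiteboard");
      TFolder *event = meg->AddFolder("Event", "current event");
      TFolder *dch   = event->AddFolder("DCH", "drift chamber data");

      // Build a task tree and execute it; consumer tasks would read the same folders
      TTask *reco = new TTask("Reconstructioner", "top-level task");
      reco->Add(new MEGDigitizer("DCHDigitizer", dch));
      reco->ExecuteTask();
   }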

  32. Folder Types: Tasks (diagram) • Reconstructioner (one per detector, 1…3) with per-detector Digitizer {user code} and Clusterizer {user code} • DQM with per-detector Fast Reconstructioner (1…3) • Analyzer • Alarmer

  33. Folder Types: Tasks (diagram, continued) • Calibrator and Aligner tasks, one per detector (1…3), each with a per-detector Histogrammer

  34. Folder Types: Tasks (diagram, continued) • Vertexer and Trigger tasks, one per detector (1…3), each with a per-detector Histogrammer

  35. Data Folder Structure (diagram) • Folder tree: Constants, Run Header, Conditions, Configuration, Event(i) • Event(i) holds Raw Data (1…n: DC Hits, TOF Hits, EMC Hits/Digits), Reco Data (DC, EMC, TOF, Tracks), Particles/Kinematics and Track References (MC only) • The folders map onto files, each with one tree per event (Event #1, Event #2, ...): Main.root (Header), DCH.Hits.root (TreeH), DCH.Digits.root (TreeD), Kine.root (TreeK)

  36. File Structure of "raw" Data • One common TFile plus one TFile per detector, one TTree per event • main.root: TTree0,…,i,…,n holding the kinematics (TClonesArray of TParticles) and the MEG run info • DCH.Hits.root: TTree0,…,n of hits, with a DCH TBranch holding a TClonesArray • EMC.Hits.root: TreeH0,…,n of hits, with an EMC TBranch holding a TClonesArray (a writing sketch follows below)
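A minimal sketch (not from the original slides) of the per-detector "raw" file layout described above: one TFile per detector, one TTree per event, hits stored in a TClonesArray branch. File, tree and branch names follow the slide; TParticle is used only as a stand-in for a real MEG hit class (which would need its own ROOT dictionary):

   #include "TFile.h"
   #include "TTree.h"
   #include "TClonesArray.h"
   #include "TParticle.h"
   #include "TString.h"

   void write_dch_hits(int eventNumber)
   {
      TFile f("DCH.Hits.root", "UPDATE");              // one file for the DCH detector

      TString treeName = TString::Format("TreeH%d", eventNumber);
      TTree tree(treeName, "DCH hits");                // one tree per event

      TClonesArray *hits = new TClonesArray("TParticle", 100);
      tree.Branch("DCH", &hits);                       // hits branch as a TClonesArray

      new ((*hits)[0]) TParticle();                    // placeholder hit
      tree.Fill();

      tree.Write();
      f.Close();
      delete hits;
   }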

  37. File Structure of "reco" Data • Detector-wise splitting: each task generates one TFile (DCH.Digits.root, EMC.Digits.root, SDigits.root, DCH.Reco.root, EMC.Reco.root, Reco.root) • One event per TTree (TTree0,…,n) • Task versioning in the TBranch (TBranch v1,…,vn), with the data held in TClonesArrays

  38. Run-time Data Exchange • Post transient data to a whiteboard • Structure the whiteboard according to the detector sub-structure and task results • Each detector is responsible for posting its data • Tasks access data from the whiteboard • Detectors cooperate through the whiteboard (a consumer-side sketch follows below)
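A minimal sketch (not from the original slides) of the whiteboard pattern from the consumer side: a task retrieves what another detector posted without knowing who produced it. The folder and object names ("MEG", "Event", "DCH", "DCHDigits") are illustrative assumptions:

   #include "TROOT.h"
   #include "TFolder.h"
   #include "TObjArray.h"

   void consume_from_whiteboard()
   {
      // Navigate the folder hierarchy built at initialisation time
      TFolder *meg = (TFolder*) gROOT->GetRootFolder()->FindObject("MEG");
      if (!meg) return;
      TFolder *event = (TFolder*) meg->FindObject("Event");
      if (!event) return;
      TFolder *dch = (TFolder*) event->FindObject("DCH");
      if (!dch) return;

      // Fetch the digits posted by the DCH digitizer (producer) for this event
      TObjArray *digits = (TObjArray*) dch->FindObject("DCHDigits");
      if (digits) {
         // ... reconstruction code would loop over the digits here ...
      }
   }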

  39. Whiteboard Data Communication (diagram) • Classes 1-8 arranged around a central whiteboard: they exchange data only through it rather than calling each other directly

  40. Coordinating Tasks & Data • Detector stand-alone (Detector Objects) • Each detector executes a list of detector actions/tasks • On-demand actions are possible but not the default • Detector-level trigger, simulation and reconstruction are implemented as clients of the detector classes • Detectors collaborate (Global Objects) • One or more Global objects execute a list of actions involving objects from several detectors • The Run Manager • executes the detector objects in the order of the list • Global trigger, simulation and reconstruction are special services controlled by the Run Manager class • The Offline configuration is built at run time by executing a ROOT macro (a sketch follows below)
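A self-contained, hypothetical sketch (not from the original slides) of "the Run Manager executes the detector objects in the order of the list"; MEGDetector and MEGRunManager are illustrative stubs, not actual MEG classes, and a real implementation would hang tasks, geometry and I/O off these objects:

   #include "TNamed.h"
   #include "TObjArray.h"
   #include <iostream>

   class MEGDetector : public TNamed {
   public:
      MEGDetector(const char *name, const char *title) : TNamed(name, title) {}
      virtual void ProcessEvent() { std::cout << "processing " << GetName() << "\n"; }
   };

   class MEGRunManager {
   public:
      void AddDetector(MEGDetector *d) { fDetectors.Add(d); }      // ordered list
      void Run(int nEvents) {
         for (int ev = 0; ev < nEvents; ++ev)
            for (int i = 0; i < fDetectors.GetEntriesFast(); ++i)  // in list order
               ((MEGDetector*) fDetectors.At(i))->ProcessEvent();
      }
   private:
      TObjArray fDetectors;
   };

   // Configuration macro, built at run time, e.g. executed with: root -l Config.C
   void Config()
   {
      MEGRunManager *run = new MEGRunManager();
      run->AddDetector(new MEGDetector("DCH", "Drift chamber"));
      run->AddDetector(new MEGDetector("EMC", "LXe calorimeter"));
      run->AddDetector(new MEGDetector("TOF", "Timing counter"));
      run->Run(2);
   }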

  41. Global Offline Structure (diagram) • An MC Run Manager and a Global Reco Run Manager (Run Class) drive the Detector Classes (DCH, EMC, TOF), each with its own list of detector tasks • The Run Manager executes the detector objects in the order of the list; each detector executes its list of detector tasks; one or more Global Objects execute a list of tasks involving objects from several detectors • Outputs go to the ROOT database as tree branches • On-demand actions are possible but not the default

  42. Detector Level Structure (diagram) • The DCH Detector Class sits in the list of detectors and owns a list of detector tasks (DetectorTask Class): DCH Simulation, DCH Digitization, DCH Reconstruction, DCH Trigger • Their outputs (Hits, Digits, TrigInfo, local tracks) are stored as branches of a ROOT tree

  43. The Detector Class • (Diagram: in AliRoot the concrete detectors AliTPC, AliTOF, AliTRD, AliFMD derive from AliDetector/Module Class; a Geometry Class such as DCHGeometry provides CreateGeometry, BuildGeometry, CreateMaterials; each detector carries its detector tasks/actions) • Base class for the MEG subdetector modules • Both sensitive modules (detectors) and non-sensitive ones are described by this base class • The class supports the hit and digit trees produced by the simulation and the objects produced by the reconstruction • It is also responsible for building the geometry of the detectors (a sketch follows below)
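A hypothetical sketch (not from the original slides) of the detector base class described above, following the AliRoot-style pattern the slide refers to; MEGModule and MEGDCH are illustrative names, not actual MEG classes:

   #include "TNamed.h"
   #include "TTree.h"

   // Base class for both sensitive and non-sensitive subdetector modules
   class MEGModule : public TNamed {
   public:
      MEGModule(const char *name, const char *title) : TNamed(name, title) {}
      virtual void CreateMaterials() {}                 // define materials and media
      virtual void CreateGeometry()  {}                 // detailed geometry for transport
      virtual void BuildGeometry()   {}                 // simplified geometry for display
      virtual void MakeBranch(TTree & /*eventTree*/) {} // attach hit/digit branches
   };

   // A concrete (sensitive) detector overrides the geometry and I/O hooks
   class MEGDCH : public MEGModule {
   public:
      MEGDCH() : MEGModule("DCH", "Drift chamber") {}
      void CreateMaterials()            { /* gas mixture, wires, ... */ }
      void CreateGeometry()             { /* chamber volumes ... */ }
      void MakeBranch(TTree & /*tree*/) { /* tree.Branch("DCHHits", ...); */ }
   };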

  44. MEG Monte Carlo Organization • The Virtual Monte Carlo • Geant3/Geant4 Interface • Generator Interface

  45. The Virtual MC Concept • The Virtual MC provides a virtual interface to the Monte Carlo • It enables the user to build a Monte Carlo application independent of the actual underlying Monte Carlo implementation • The concrete Monte Carlo (Geant3, Geant4, Fluka) is selected at run time • Ideal when switching from a fast to a full simulation: the VMC allows different simulation engines to be run from the same user code (a selection sketch follows below)
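A hypothetical sketch (not from the original slides) of run-time transport-engine selection with the Virtual MC; it assumes the geant3 (TGeant3) and geant4_vmc (TGeant4) packages are installed, and the Geant4 construction is only indicated in a comment since it depends on a TG4RunConfiguration object from geant4_vmc:

   #include "TVirtualMC.h"
   #include "TString.h"
   #include "TGeant3.h"          // from the geant3 VMC package

   void select_engine(const char *engine = "Geant3")
   {
      if (TString(engine) == "Geant3") {
         new TGeant3("C++ Interface to Geant3");   // constructor sets the global gMC
      } else {
         // new TGeant4("TGeant4", "Geant4 VMC", runConfiguration);  // geant4_vmc
      }

      // From here on, the user code only talks to the abstract interface, e.g.
      // gMC->Init(); gMC->BuildPhysics(); gMC->ProcessRun(nEvents);
   }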

  46. The Virtual MC Concept (diagram) • User code sits on top of the VMC and a Virtual Geometrical Modeller • Behind the VMC are the concrete engines and their transport (G3/G3 transport, G4/G4 transport, FLUKA/FLUKA transport) • The Geometrical Modeller also serves reconstruction and visualisation, and generators feed the user code

  47. Running a Virtual Monte Carlo (diagram) • Run Control drives the Virtual MC, with the transport engine (Geant3.21, Geant4, FLUKA or a fast MC) selected at run time • Generators fill a ROOT particle stack • Hits structures, a simplified ROOT geometry and the geometry database are produced, and everything ends up in a ROOT output file

  48. Generator Interface • TGenerator is an abstract base class that defines the interface between ROOT and the various event generators (through inheritance) • Provides the user with: • an easy and coherent way to study a variety of physics signals • testing tools • background studies • Possibility to study: • full events (event by event) • single processes • a mixture of both ("cocktail events") (a usage sketch follows below)
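A minimal usage sketch (not from the original slides) of the TGenerator interface, using TPythia6 (one of the concrete generators inheriting from TGenerator, available when ROOT is built with Pythia6) purely as an example; the beam configuration shown is illustrative, not a MEG setting:

   #include "TPythia6.h"
   #include "TClonesArray.h"
   #include "TParticle.h"

   void generate_events()
   {
      TPythia6 pythia;                               // a concrete TGenerator
      pythia.Initialize("cms", "p", "p", 14000.);    // frame, beam, target, energy

      TClonesArray particles("TParticle", 1000);
      for (int i = 0; i < 10; ++i) {
         pythia.GenerateEvent();                              // one full event
         int np = pythia.ImportParticles(&particles, "All");  // copy to TParticles
         // ... fill the ROOT particle stack / analysis code with the np particles ...
         (void) np;
      }
   }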

  49. Data Access: ROOT + RDBMS Model (diagram) • ROOT files hold the event store: trees, histograms, geometries, calibrations • An RDBMS (Oracle or MySQL) holds the run/file catalog (a query sketch follows below)
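A minimal sketch (not from the original slides) of the ROOT + RDBMS model: event data in ROOT files, the run/file catalog queried through ROOT's TSQLServer interface. The host, database, credentials and table/column names are illustrative assumptions:

   #include "TSQLServer.h"
   #include "TSQLResult.h"
   #include "TSQLRow.h"
   #include "TFile.h"

   void query_run_catalog()
   {
      // Connect to a MySQL run/file catalog (an Oracle URL would work the same way)
      TSQLServer *db = TSQLServer::Connect("mysql://megdb.example.ch/runcatalog",
                                           "reader", "secret");
      if (!db) return;

      // Look up the file that contains a given run
      TSQLResult *res = db->Query("SELECT filename FROM runs WHERE run_number = 1234");
      if (res && res->GetRowCount() > 0) {
         TSQLRow *row = res->Next();
         TFile *f = TFile::Open(row->GetField(0));   // open the ROOT event file
         // ... read trees/histograms from f here ...
         delete row;
      }
      delete res;
      delete db;
   }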

  50. Offline Organization • There is no chance of having enough people at one site to develop the majority of the MEG code • Planning is based on maximum decentralisation • All detector-specific software is developed at outside institutes • Few people are directly involved today • The offline team is responsible for: • central coordination, software distribution and framework development • prototyping
