ATLAS Analysis Model
Introduction
• On Feb 11, 2008 the Analysis Model Forum published a report (D. Costanzo, I. Hinchliffe, S. Menke, ATL-GEN-INT-2008-001) describing the analysis model needed.
• The report sets out guidelines for how analysis should be done in ATLAS.
• Although some things might not go as planned, I think it is very helpful to see the idea behind all the tools.
• Comments during this lecture are very welcome, since this is an evolving subject.
• Heavy-ion needs were not addressed in the report. They are very different from the proton-collision needs.
EDM
• This section of the report emphasizes the need for an event data model:
Data Structure
• RDO – Raw Data Object
  • Content – full information on the detector response.
  • Size – should be ~2 MB/evt.
• ESD – Event Summary Data
  • Content – the detailed output of the detector reconstruction.
  • Derivation – from RDO.
  • Purpose – should hold sufficient information for particle identification and track re-fitting.
  • Size – should be ~500 kB/evt for real data. The current size is ~20% larger.
  • Format – POOL file.
• AOD – Analysis Object Data
  • Content – summary of all the reconstructed objects.
  • Derivation – from ESD.
  • Purpose – provide sufficient information for common analyses.
  • Size – should be ~100 kB/evt for real data; it is currently ~200 kB/evt, most of which is trigger information. MC truth should take ~60 kB/evt, so the truth information in the AOD is not complete (reduction according to ATL-SOFT-INT-2007-002).
  • Format – POOL file.
Data Structure
• DPD – Derived Physics Data
  • D1PD (primary DPD)
    • Content – differs between communities; each community defines its own.
    • Derivation – from AOD (sometimes from ESD).
    • Size – should be small enough to copy to Tier-3 or off-grid disks, ~10 kB/evt.
    • Format – POOL file.
  • D2PD (secondary DPD)
    • Content – specific to a particular analysis (defined by the relevant group). Derived information can be added.
    • Derivation – from D1PD and AOD.
    • Format – POOL file.
  • D3PD (tertiary DPD)
    • Content – should contain all the information needed to produce the final plots for publication.
    • Format – hbook/ntuple/POOL file/other.
• Tags
  • Content – predefined fields for quick event identification.
  • Size – should be ~1 kB/evt.
  • Format – database or ROOT files.
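To make the per-event size targets concrete, the sketch below scales them to a whole sample. The sizes are the nominal targets quoted on the Data Structure slides; the sample of 1M events, the helper name `sample_size_gb`, and the convention 1 GB = 10^6 kB are illustrative assumptions, not numbers from the report.

```python
# Nominal per-event size targets in kB/evt, as quoted on the slides.
TARGET_SIZE_KB = {
    "RDO": 2000.0,  # ~2 MB/evt
    "ESD": 500.0,
    "AOD": 100.0,
    "D1PD": 10.0,
    "TAG": 1.0,
}


def sample_size_gb(fmt, n_events):
    """Approximate size of n_events in the given format, in GB (1 GB = 1e6 kB)."""
    return TARGET_SIZE_KB[fmt] * n_events / 1e6


if __name__ == "__main__":
    n_events = 1_000_000  # hypothetical sample size
    for fmt in TARGET_SIZE_KB:
        print(f"{fmt:>4}: {sample_size_gb(fmt, n_events):8.1f} GB")
```

Under these assumptions each step down the chain shrinks the sample by a factor of 4 to 10, which is what lets a D1PD fit on Tier-3 or off-grid disks.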
Terms
• Skimming – removal of events.
• Thinning – removal of containers.
• Slimming – removal of objects from a container.
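The three operations can be illustrated on a toy event record. This is a minimal sketch using plain Python dictionaries and following the definitions given on the slide; it is not the Athena implementation, and the container names, the `pt` field, and the cut values are all made up for illustration.

```python
def skim(events, keep_event):
    """Skimming: drop whole events that fail the selection."""
    return [ev for ev in events if keep_event(ev)]


def thin(event, containers_to_keep):
    """Thinning (as defined on the slide): drop whole containers from an event."""
    return {name: objs for name, objs in event.items() if name in containers_to_keep}


def slim(event, container, keep_object):
    """Slimming (as defined on the slide): drop objects from one container."""
    out = dict(event)  # shallow copy, the input event is left untouched
    out[container] = [obj for obj in event[container] if keep_object(obj)]
    return out


# Toy events: each event maps a (hypothetical) container name to its objects.
events = [
    {"electrons": [{"pt": 35.0}, {"pt": 8.0}], "jets": [{"pt": 60.0}], "cells": [0.1, 0.2]},
    {"electrons": [], "jets": [{"pt": 25.0}], "cells": [0.3]},
]

skimmed = skim(events, lambda ev: len(ev["electrons"]) > 0)
thinned = [thin(ev, {"electrons", "jets"}) for ev in skimmed]
slimmed = [slim(ev, "electrons", lambda el: el["pt"] > 20.0) for ev in thinned]

if __name__ == "__main__":
    print(slimmed)
```

Here skimming keeps only the first event, thinning drops its `cells` container, and slimming removes the 8 GeV electron from the `electrons` container.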
Computing Model
[Figure: computing-model data-flow diagram. Labels: DAQ + Trigger, BS, RDO, Reconstruction, ESD, AOD, TAGs, Common Analysis, DPD]
Frameworks
• Athena – analysis inside Athena. The analysis is done by writing algorithms and tools that use the full Athena framework.
• Intermediate framework (EventView) – a collection of common tools for creating DPDs.
• ARA (AthenaROOTAccess) – provides C++ and Python code to convert persistent data into transient data. It does not include the Athena services, so analyses that need database services (such as geometry) cannot be done in ARA, for example analyses that involve calorimeter cells or the full vertex and tracking information.
Recommendations in the report
• Official analyses must be done using validated tools only! So work with Athena tools as much as you can, and add your private tools to Athena.
• Many recommendations were made. For completeness I have copied all of them here, but I will discuss only a few of them.
Recommendations in the report
• Storage format of DPDs
  • Only D3PDs can be ntuples.
  • Use official tools for analysis.
  • Put your tools in a public place.
Recommendations in the report
• Distribution of and access to DnPDs
• ARA
  • CINT is not recommended.
  • Python – two times faster than CINT.
  • Compiled C++ – two times faster than Python.
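The two factors of two quoted above compound, so compiled C++ comes out roughly four times faster than CINT. A small sketch of that arithmetic follows; the 60-minute CINT job is a hypothetical baseline, and the function name is illustrative.

```python
# Relative speed-ups quoted on the slide (each roughly a factor of two).
PYTHON_OVER_CINT = 2.0
CPP_OVER_PYTHON = 2.0


def runtime_minutes(cint_minutes):
    """Estimated runtimes of the same ARA job in each mode, given the CINT runtime."""
    python_minutes = cint_minutes / PYTHON_OVER_CINT
    cpp_minutes = python_minutes / CPP_OVER_PYTHON
    return {"CINT": cint_minutes, "Python": python_minutes, "compiled C++": cpp_minutes}


if __name__ == "__main__":
    # Hypothetical job that takes 60 minutes in CINT.
    for mode, minutes in runtime_minutes(60.0).items():
        print(f"{mode:>12}: {minutes:5.1f} min")
```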
Recommendations in the report
• Code distribution and software infrastructure
• Event Data Model
Recommendations in the report
• EDM
  • Back-of-the-envelope calculation: reading 1M events takes ~15 min just to read the information. Ntuples are ten times faster than that.
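The back-of-the-envelope numbers translate into per-event costs as sketched below. The inputs (1M events, ~15 min, a factor of ten for ntuples) are the ones quoted on the slide; the variable and function names are illustrative, and the slide does not say which hardware or file format the 15 minutes refers to.

```python
N_EVENTS = 1_000_000     # events read, from the slide
EDM_READ_MINUTES = 15.0  # quoted time just to read the information
NTUPLE_SPEEDUP = 10.0    # ntuples quoted as ten times faster


def per_event_ms(n_events, minutes):
    """Average read time per event in milliseconds."""
    return minutes * 60.0 * 1000.0 / n_events


edm_ms = per_event_ms(N_EVENTS, EDM_READ_MINUTES)
ntuple_minutes = EDM_READ_MINUTES / NTUPLE_SPEEDUP
ntuple_ms = per_event_ms(N_EVENTS, ntuple_minutes)

if __name__ == "__main__":
    print(f"EDM read:    {edm_ms:.2f} ms/evt, {EDM_READ_MINUTES:.1f} min in total")
    print(f"ntuple read: {ntuple_ms:.2f} ms/evt, {ntuple_minutes:.1f} min in total")
```

That is about 0.9 ms per event for the EDM read against about 0.09 ms per event, or 1.5 minutes in total, for ntuples.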
Recommendations in the report
• Primary DPD content
• Priorities and coordination of Primary DPD production
Recommendations in the report
• Primary DPD production
Recommendations in the report
• Toolkits or analysis frameworks
  • As I understand it, this means that you cannot build primary DPDs with EventView, but I'm not sure I understand it correctly.
Recommendations in the report
• EventView