Dataset Based Physics Analysis
Elizabeth Gallas, Oxford University
From TOB Task Forces to Final Dress Rehearsal, 11 December 2006
Outline
• Start from the End
• Luminosity and cross section
• Storing Luminosity
• Streaming Test (Ayana)
• Online System
• Assumptions:
  • inclusive streaming
• What happens at Pt 1/Tier 0
• What happens at Tier 1/beyond (Jack, Marjorie)
• Comments about Exclusive Streaming
• Summary
• Conclusion
Who needs luminosity (normalization)?
• We all agree: for 'physics' triggers/analysis, some to greater precision than others
• Think about 'other' cases:
  • Calibration – not written to a 'physics' stream !?!
    • Might be luminosity dependent, or worse: BCID dependent (Beam Crossing ID)
  • Calibration – written to a 'physics' stream?
• Create a robust system enabling luminosity normalization for many trigger configurations
  • Must come with 'Operations Rules' to ensure the necessary bookkeeping is recorded
  • Robustness should include the ability to recover a lower precision luminosity should losses occur
Calculating a Physics Cross Section
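The cross-section formula on this slide did not survive the export to text. A minimal reconstruction in standard notation, consistent with the efficiency factor f defined on the next slide:

```latex
\sigma_{\mathrm{physics}} \;=\; \frac{N_{\mathrm{events}}}{f \cdot \int \mathcal{L}\, dt}
```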
Physics and Trigger Cross Sections
Where f includes:
• Trigger efficiency
• Reconstruction efficiency
Assumes:
• the physics dataset is composed of all events recorded satisfying a trigger in a well-defined set of Luminosity Blocks (LBNs)
• we can measure the trigger cross section for each Luminosity Block
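The equation image here is also missing. Given the stated assumptions, a plausible reconstruction, with the sum running over the well-defined set of Luminosity Blocks:

```latex
\sigma_{\mathrm{physics}} \;=\; \frac{N_{\mathrm{events}}}{f \cdot \sum_{\mathrm{LBN}} \mathcal{L}_{\mathrm{LBN}}},
\qquad f = \epsilon_{\mathrm{trigger}} \cdot \epsilon_{\mathrm{reco}}
```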
Measuring a Trigger Cross Section
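This slide's equation is likewise lost, but it can be reconstructed from the definitions given two slides below (the subscripted notation is ours):

```latex
\mathcal{L}_{\mathrm{trigger}} = \frac{\mathcal{L}_{\mathrm{delivered}} \cdot f_{\mathrm{live}}}{\mathrm{prescale}},
\qquad
\sigma_{\mathrm{trigger}} = \frac{N_{L3}}{\mathcal{L}_{\mathrm{trigger}}}
```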
Simple Luminosity DB – 2 Tables

Run_LBN Table:
• Run Number
• LBN
• Start Time
• End Time
• Duration (seconds)
• Luminosity (nb⁻¹)
• Live Fraction
• Quality

Run_LBN_Trigger Table:
• Run Number
• LBN
• Trigger Name
• L1_EVENTS
• L2_EVENTS
• L3_EVENTS
• L1_PRESCALE
• L2_PRESCALE
• L3_PRESCALE
For the Streaming Test
• This is a simple Luminosity Database being implemented for the Streaming Test
• For each (Run_Number, LBN, Trigger) we can calculate:
  • Trigger_Luminosity = DELIVERED_LUM * LIVE_FRACTION / PRESCALE
  • Trigger_Cross_Section = L3_EVENTS / Trigger_Luminosity
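A minimal sketch of how the two tables and the calculation above fit together, using sqlite3 as a stand-in for the real backend. The inserted row is illustrative 'fake' data in the spirit of the Streaming Test placeholders, and treating PRESCALE as the product of the three level prescales is an assumption:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real backend
cur = conn.cursor()

# The two tables from the previous slide (column types are assumptions)
cur.execute("""CREATE TABLE run_lbns (
    run_number INTEGER, lbn INTEGER,
    start_time TEXT, end_time TEXT, duration_sec REAL,
    luminosity_nb REAL,      -- delivered luminosity, in nb^-1
    live_fraction REAL, quality INTEGER,
    PRIMARY KEY (run_number, lbn))""")

cur.execute("""CREATE TABLE run_lbn_triggers (
    run_number INTEGER, lbn INTEGER, trigger_name TEXT,
    l1_events INTEGER, l2_events INTEGER, l3_events INTEGER,
    l1_prescale REAL, l2_prescale REAL, l3_prescale REAL,
    PRIMARY KEY (run_number, lbn, trigger_name))""")

# Illustrative 'fake' rows, as in the Streaming Test
cur.execute("INSERT INTO run_lbns VALUES "
            "(1234, 1, '12:00:00', '12:01:00', 60.0, 2.0, 0.95, 1)")
cur.execute("INSERT INTO run_lbn_triggers VALUES "
            "(1234, 1, 'EM25', 5000, 900, 150, 10.0, 1.0, 1.0)")

# Trigger_Luminosity   = DELIVERED_LUM * LIVE_FRACTION / PRESCALE
# Trigger_Cross_Section = L3_EVENTS / Trigger_Luminosity
name, trig_lum, l3 = cur.execute("""
    SELECT t.trigger_name,
           r.luminosity_nb * r.live_fraction /
               (t.l1_prescale * t.l2_prescale * t.l3_prescale),
           t.l3_events
    FROM run_lbns r JOIN run_lbn_triggers t
      ON r.run_number = t.run_number AND r.lbn = t.lbn""").fetchone()

print(name, "luminosity =", trig_lum, "nb^-1,",
      "cross section =", l3 / trig_lum, "nb")
```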
Lum DB – StreamTest Data Sources
• Columns populated from Ayana's log files:
  • RUN_LBNS.RUN_NUMBER, LBN
  • RUN_LBN_TRIGGERS.TRIGGER_NAME, L3_ACCEPTS
• Other database columns (start/end times, prescales, luminosity, ...):
  • populated with 'fake' data based on reasonable assumptions about what the real data will look like; this will evolve with the Test
  • placeholders for deadtime, prescales, Level 1/2 accepts, data quality, duration
Lum Database – Real Data Sources
• Online quantities:
  • Run, LBN, start/end time – RC (Run Control)
  • Live fraction – Level 1
  • Events accepted – HLT or EventLossMonitor
• Trigger configuration:
  • Prescales at Levels 1, 2, and 3; trigger names
• Luminosity system quantities:
  • Acceptance- and efficiency-corrected luminosity
• Offline – store in the Conditions Database
  • Richard Hawkings – Trigger/Physics week (3/11/06)
Conditions DB – basic concepts
• COOL-based database architecture: data indexed by an interval of validity (IOV)
• A COOL IOV (63 bit) can be interpreted as:
  • an absolute timestamp (e.g. for DCS)
  • a run/event number (e.g. for calibration)
  • a run/LB number (possible to implement)
• COOL payload is defined per 'folder':
  • a tuple of simple types → one DB table row
  • can also be a reference to external data
  • channels (int, soon string) allow multiple instances of data in one folder
• COOL tags allow multiple data versions
• COOL folders are organised in a hierarchy
• Athena interfaces, replication, ...
Architecture stack (from the slide's diagram):
• Application
• COOL (C++ and Python APIs, specific data model)
• Relational Abstraction Layer (CORAL) – SQL-like C++ API
• Backends: Oracle DB (online, Tier 0/1) | MySQL DB (small DB replicas) | SQLite file (file-based subsets) | Frontier web (http-based proxy/cache)
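The run/LB interpretation above implies a packed validity key. A minimal sketch, assuming a (run << 32) | LB layout that mirrors COOL's run/event convention; the exact encoding is an assumption here, not taken from the slide:

```python
# Pack a (run, lumiblock) pair into a 63-bit validity key, and back.
# Assumption: run in the high bits, LB in the low 32 bits, mirroring
# COOL's run/event packing.

MAX_KEY = (1 << 63) - 1  # COOL IOV keys are 63-bit

def to_iov_key(run: int, lb: int) -> int:
    key = (run << 32) | lb
    assert 0 <= key <= MAX_KEY, "key must fit in 63 bits"
    return key

def from_iov_key(key: int) -> tuple[int, int]:
    return key >> 32, key & 0xFFFFFFFF

key = to_iov_key(1234, 42)
assert from_iov_key(key) == (1234, 42)
```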
What happens at Point 1 / Tier 0
• Assume:
  • Run structure as in the Run Structure Report
  • An inclusive streaming model (complications of the exclusive model come later)
• At Point 1 (ATLAS online):
  • 5-10 data loggers open/close files on Luminosity Block boundaries
  • lots of small files in byte-stream format
• At Tier 0, the small files are combined:
  • files contain one or more complete LBNs
• Ignored in this simple model: BCID and stream dependence
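A toy sketch of the Tier 0 merge step described above: group the small data-logger files by (run, LBN) so that merged files span complete LBNs only. The file names and the metadata source are hypothetical:

```python
from collections import defaultdict

# Hypothetical metadata for small byte-stream files written by the
# 5-10 online data loggers: (filename, run, lbn)
small_files = [
    ("daq_sfo01_r1234_lb0001.data", 1234, 1),
    ("daq_sfo02_r1234_lb0001.data", 1234, 1),
    ("daq_sfo01_r1234_lb0002.data", 1234, 2),
]

# Group by (run, lbn): each merged output then respects LBN boundaries
by_lbn = defaultdict(list)
for name, run, lbn in small_files:
    by_lbn[(run, lbn)].append(name)

for (run, lbn), names in sorted(by_lbn.items()):
    print(f"merge -> run {run}, LBN {lbn}: {names}")
```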
What happens at Tier 1 and beyond
• Point 1 / Tier 0 makes files respecting LBN boundaries
• Subsequent processing must allow us to collect and track complete datasets for physics analysis in well-defined sets of LBNs
• Our bookkeeping must track LBNs:
  • by event / TAG (Jack)
  • by metadata / file (Marjorie)
"Exclusive Streaming" – Online
• Online: more (ac)COUNTing
  • Count events by run / LBN / trigger AND by a new index: stream
• Smooth online running depends on accurate predictions of rates and overlaps
  • Trigger rates depend on fully operational triggers and detectors
• Balancing streams:
  • Empty file syndrome (empty, or lost?)
  • Special runs with special prescales require stream-balance analysis
  • How do we keep the rate to one stream from exceeding ??%?
  • How do we keep the rate to one stream from dropping below ?%?
• Heightens the importance of predicting rates and overlaps as trigger configurations evolve:
  • Prescales
  • Thresholds
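A sketch of the extra online accounting this slide calls for: the same event counters as before, but with stream as an additional index. The event fields and stream names are hypothetical:

```python
from collections import Counter

# Hypothetical accepted-event records: (run, lbn, trigger, stream)
accepts = [
    (1234, 1, "EM25", "egamma"),
    (1234, 1, "EM25", "egamma"),
    (1234, 1, "MU20", "muon"),
]

# Exclusive streaming adds 'stream' to the run/lbn/trigger key
counts = Counter(accepts)
for (run, lbn, trig, stream), n in sorted(counts.items()):
    print(f"run {run} lbn {lbn} {trig} -> {stream}: {n} events")

# Empty-file syndrome: a (run, lbn, stream) with zero entries is
# ambiguous from the counts alone - genuinely empty, or lost?
```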
"Exclusive Streaming" – Offline
• Offline: implications for the TAG database (Jack)
• Splits trigger samples into different streams:
  • LBNs are distributed onto one or more file(sets)
  • Note: a 'topological' split, not just a random one
• Complicates tracking of processing history (parentage), should processing for different streams occur at different Tier 1 sites
• Successful analysis depends on Grid coherence
• Robustness should include the ability to recover a lower precision luminosity should losses occur
"Exclusive Streaming" (3)
• In an ideal world this might be manageable, but the infrastructure will be quite complex, and day-to-day situations will conspire to foil the best-laid plans
  • Perhaps feasible after a few 'months?' of data taking, but the bookkeeping is much more complicated
• The Streaming Test will not answer how the system might actually be (ab)used
  • But it does provide experience with 'Grid coherence'
• Make rules, lots of rules. Hope people understand and follow them, or they will invent their own tools...
Summary
• FACT: a physics event dataset composed of (or related to) the complete set of events passing a trigger recorded in an LBN has a corresponding measurable integrated luminosity
  • Any dataset which does not DOES NOT
  • unless its process cross section can be related to a known process cross section measured in the same interval
• We measure luminosity in Luminosity Blocks (the smallest complete fundamental unit of ATLAS data taking)
• Point 1 / Tier 0 makes files respecting LBN boundaries
• Subsequent processing must allow us to collect and track complete datasets for physics analysis in well-defined sets of LBNs
• Our bookkeeping must track LBNs by:
  • event / TAG (Jack)
  • metadata / file (Marjorie)
• Data quality:
  • naturally indexed by Run/LBN (or Run/LBN range)
  • any other index must have a direct relationship to Run/LBN
Conclusion
• Required: a robust operational model, including Luminosity Block based (or aware):
  • file management and bookkeeping
  • data analyses
  • data quality (detector, trigger, reconstruction)
• Having it from day 1 enables us to debug problems faster
  • Trigger cross sections are constant, so measuring their rates helps monitor stability
• Streaming models: the devil is in the details – 'exclusive streaming' by physics/trigger will:
  • initially make it difficult to find problems
  • be an ongoing challenge for operations and bookkeeping