An ODBMS approach to persistency in CMS
Lucia Silvestris, INFN Bari - CERN/EP
CHEP, 7-11 February 2000, Padova, Italy
CMS - Software Components
[Diagram. Central elements: Persistent Object Store Manager; Object Database Management System. Component labels: CMS Detector (Muon, Tracker, Calo); Slow Control; Environment Data; Online Monitoring; Filter Unit / Event Filter; Objectivity Formatter; Quasi-online Reconstruction; Data Quality; Calibrations; Group Physics Analysis; User Analysis on demand; Simulation (G3 and/or G4). Arrow labels: "Request part of event"; "Request asynchronous data"; "Store rec-Obj"; "store"; "calibration store".]
Chep'00 Persistency in CMS L. Silvestris INFN Bari - CERN/EP
CARF Components
[Callout on slide: "Object of this talk"]
• CARF Architecture: on-demand reconstruction
  (see V. Innocente's talk on CARF Architecture, session A)
• Framework Main Services
  • Define the events to be dispatched (events and geometry from simulations or test beams)
  • Manage the “not yet removed” sequential components (coming from Geant3)
  • Run-time dynamic loading is used to configure and build CARF applications
• Framework Persistency Services
• Framework Ancillary Services
  • User interface, error report, logging facilities, ...
  • Timing facility, utility library
CMS Persistency history
[Slide label: ORCA]
• Prototype 1997-98
  • Test-beam DAQ and analysis using Objectivity/DB in different CMS test-beam areas (H2, T9 and X5b).
  • The system was successfully tested.
• Production 1999
  • Test-beam DAQ (from April ‘99)
  • Monte Carlo (GEANT3) reconstruction (from October ‘99)
    • Persistent Digis for Calorimeter, Muon and Trigger
    • Physics generator information (vertices, tracks) persistent
    (see D. Stickland's talk on ORCA, session A)
Persistent Service for High Energy Physics Data
• Environmental data
  • Detector and accelerator status
  • Calibrations, alignments
• Event-Collection Meta-Data (luminosity, selection criteria, …)
• …
• Event Data, User Data
[Diagram labels: Event Collection; Collection Meta-Data; Event; Tracks; Electrons; User Tag (N-tuple); Tracker Alignment; Ecal calibration]
Does a user need a DBMS?
• Do I encode meta-data (run number, version id) in file names?
• How many files and logbooks should I consult to determine the luminosity corresponding to a histogram?
• How easily can I determine whether two events have been reconstructed with the same version of a program and using the same calibrations?
• How many lines of code should I write, and which fraction of the data should I read, to select all events with two μ's with p > 11.5 GeV and |η| < 2.7?
• The same at generator level?
If the answers scare you, you need a DBMS!
Can CMS do without a DBMS?
• An experiment lasting 20 years cannot rely just on ASCII files and file systems for its production bookkeeping, “condition” database, etc.
• Even today at LEP, the management of all real and simulated data sets (from raw data to n-tuples) is a major enterprise.
A DBMS is the modern answer to such a problem and, given the choice of OO technology for the CMS software, an ODBMS (or a DBMS with an OO interface) is the natural solution.
A “BLOB” Model
[Diagram labels, in extraction order: Blob (three times); Event; Event; DataBase Objects; RecEvent; RawEvent]
Blob: a sequence of bytes. Decoding it is a “user” responsibility.
Why should Blobs not be stored in the DBMS?
CMS Raw Event
RawData are identified by the corresponding ReadOutUnit.
The ReadOutUnit object can identify a complete detector or a detector component.
In the test beam, the vector of Digi contains the ADC or TDC values.
[Diagram labels: Raw Event; ReadOutUnit (three times); Raw Data (three times); Vector of Digi (three times)]
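The raw-event layout described on this slide can be sketched as a small data structure. This is a minimal illustration in Python (the CMS software itself is C++), not the CARF class design: the field names `name`, `channel` and `value`, and the helper methods `add` and `raw_data`, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ReadOutUnit:
    """Identifies a complete detector or a detector component."""
    name: str  # hypothetical identifier


@dataclass
class Digi:
    """One digitized channel (field names are assumptions)."""
    channel: int
    value: int  # ADC or TDC value in the test beam


@dataclass
class RawData:
    """Vector of Digi belonging to one read-out unit."""
    digis: List[Digi] = field(default_factory=list)


@dataclass
class RawEvent:
    """Raw event: RawData are identified by their ReadOutUnit."""
    data: Dict[ReadOutUnit, RawData] = field(default_factory=dict)

    def add(self, unit: ReadOutUnit, digi: Digi) -> None:
        # Create the RawData for this unit on first use, then append.
        self.data.setdefault(unit, RawData()).digis.append(digi)

    def raw_data(self, unit: ReadOutUnit) -> RawData:
        return self.data[unit]


event = RawEvent()
event.add(ReadOutUnit("tracker"), Digi(channel=3, value=117))
event.add(ReadOutUnit("tracker"), Digi(channel=4, value=98))
event.add(ReadOutUnit("muon"), Digi(channel=1, value=512))
```

Keying the raw data by read-out unit lets a client request only the part of the event it needs, for example `event.raw_data(ReadOutUnit("muon"))`.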
Persistent Object Management
• Persistent object management is a major responsibility of the CMS Analysis and Reconstruction Framework (CARF)
• CARF manages
  • multi-threaded transactions
  • creation of databases and containers
  • meta-data and event collections
  • physical clustering of event objects
  • the persistent event structure and its relations with the transient one
• Use of the database is transparent to detector developers
  • users access persistent objects through C++ pointers
  • CARF takes care of memory pinning
CMS Event Structure
The Run object contains event-collection conditions such as beam energy, particle type, magnetic field, etc.
In case of re-reconstruction the original structure is kept. Event objects are cloned and new collections are created.
The event header object contains the event number, the spill number and the event number in the spill.
[Diagram labels: Persistent; Transient; Event Collection (twice); Run; Event (four times); Event Header; RawEvent; RecEvent (four times)]
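The re-reconstruction rule on this slide (original structure kept, events cloned, new collection created) can be sketched as follows. This is an illustrative Python model under stated assumptions, not the CARF implementation: all class and field names are hypothetical, and the choice that a cloned event shares the original raw event while receiving a fresh reconstructed object is an assumption about what "cloned" means here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class EventHeader:
    event_num: int
    spill_num: int
    event_num_in_spill: int


@dataclass(frozen=True)
class Run:
    """Conditions shared by an event collection."""
    beam_energy_gev: float
    particle_type: str
    magnetic_field_tesla: float


@dataclass
class Event:
    header: EventHeader
    raw_event: object
    rec_events: List[object] = field(default_factory=list)


@dataclass
class EventCollection:
    name: str
    run: Run
    events: List[Event] = field(default_factory=list)


def rereconstruct(original: EventCollection, new_name: str,
                  reconstruct: Callable[[object], object]) -> EventCollection:
    """Build a new collection of cloned events; the original is untouched."""
    clones = [
        # Assumption: the clone references the same raw event and header.
        Event(ev.header, ev.raw_event, [reconstruct(ev.raw_event)])
        for ev in original.events
    ]
    return EventCollection(new_name, original.run, clones)


run = Run(beam_energy_gev=50.0, particle_type="muon", magnetic_field_tesla=0.0)
first = EventCollection("pass1", run,
                        [Event(EventHeader(1, 1, 1), [10, 20], ["rec-v1"])])
second = rereconstruct(first, "pass2", lambda raw: sum(raw))
```

After the call, `first` still holds its original reconstructed objects, while `second` is a separate collection whose events point at the same raw data.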
CMS Reconstructed Objects
Reconstructed Objects produced by a given “algorithm” are managed by a Reconstructor.
A Reconstructed Object (Track) is split into several independent persistent objects to allow their clustering according to their access patterns (physics analysis, reconstruction, detailed detector studies, etc.). The top-level object acts as a proxy.
Intermediate reconstructed objects (RHits) are cached by value into the final objects.
[Diagram labels: RecEvent; S-Track Reconstructor; “aod”; “esd”; “rec”; S Track (three times); Track SecInfo; Track Constituents; Vector of RHits]
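The idea of a top-level object acting as a proxy for separately clustered parts can be sketched with lazy loading. This is a minimal Python illustration, not the ORCA Track classes: `TrackProxy`, its attributes, and the loader callables (which stand in for fetching a persistent object from its own cluster) are all hypothetical.

```python
from typing import Callable, List, Optional


class TrackProxy:
    """Top-level track; heavier parts are fetched only on first access."""

    def __init__(self, momentum_gev: float,
                 load_sec_info: Callable[[], dict],
                 load_constituents: Callable[[], List[float]]) -> None:
        self.momentum_gev = momentum_gev  # small summary kept in the proxy
        self._load_sec_info = load_sec_info          # hypothetical loader
        self._load_constituents = load_constituents  # hypothetical loader
        self._sec_info: Optional[dict] = None
        self._constituents: Optional[List[float]] = None

    @property
    def sec_info(self) -> dict:
        # Load the secondary information once, then reuse it.
        if self._sec_info is None:
            self._sec_info = self._load_sec_info()
        return self._sec_info

    @property
    def constituents(self) -> List[float]:
        # Load the constituents once, then reuse them.
        if self._constituents is None:
            self._constituents = self._load_constituents()
        return self._constituents


loads: List[str] = []  # records which parts were actually fetched


def fetch_sec_info() -> dict:
    loads.append("sec_info")
    return {"chi2": 1.2}


def fetch_constituents() -> List[float]:
    loads.append("constituents")
    return [0.1, 0.2]


track = TrackProxy(25.0, fetch_sec_info, fetch_constituents)
summary_only = list(loads)      # nothing fetched yet
chi2 = track.sec_info["chi2"]   # triggers a single fetch
chi2_again = track.sec_info["chi2"]  # served from the cached part
```

An analysis that reads only `momentum_gev` never touches the other parts, which is the access-pattern argument made on the slide.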
Test Beam Production in 1999
• Detector performance studies have been the real users of the Test Beams project
• From April 99 to October 99 the test-beam software was in production for the Tracker and the Muon detectors, reading data from VME - FastBus modules and filling one federated database for each beam line (H2b, X5b, T9) and for each data-taking period.
• Some system databases
  • Beam configuration: Read-Out Unit list
  • LogBook: logbook information for each run
  • ListRuns: run list
• Run databases: event collections with the same data-taking conditions
• The DAQ system + Objectivity formatter ran on Solaris
• More than 800 GB of data stored in Objectivity/DB
• Ran without major problems
Test Beam Production in 1999
[Diagram labels: Online; Offline - cmsc01; Clone FD; and, each appearing twice: Prod Boot; Prod FD; BConfDB; RunDB; LogDB; Run1, Run2, Run3, RunN]
Test Beam Data Analysis
• Online (prompt data) monitoring:
  • on the online machine
  • fast feedback on the detector performance.
• Offline analysis:
  • locally on the data server or remotely using the AMS server.
  • During the August Tracker (X5b) test beam, up to 25 concurrent users were accessing data on the offline system without any observable degradation.
• During 1999: HBook histograms and n-tuples
• During 2000: move from HBook histograms and n-tuples to HTL and Tags
[Diagram labels: Persistent Data; TB Analysis Package; HBook n-tuples; HTL]
See I. Gaponenko's talk on IGUANA, session F
Tracker Silicon Detector Performance Studies
• Muon beams, 50 GeV
• Non-irradiated silicon detector
• APV6 chip, deconvolution mode
• FED VME modules
• active area 62.5 mm x 61.5 mm
• thickness 300 μm
• high resistivity
• strip pitch 61 μm
• strip width 14 μm
• implanted strips 1024
• Scl = 31.8, Ncl = 2.9
• Scl/Ncl = 10.9
Muon Drift Tube Detector Performance Studies
• DTBX format
  • bits (0:15): drift time (1.04 ns) [0…65535]
  • bit (16): signal edge [1=falling]
  • bits (17:22): cell number [1..63]
  • bits (23:25): layer number [1…4]
  • bits (26:27): superlayer number [1..3]
[Plots: Beam Profile (axis: Cell Nb); Drift Time (ns). Other labels: Layer; Cell]
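The DTBX bit layout above can be unpacked with shifts and masks. The sketch below is in Python and rests on two assumptions that the slide does not state: bit 0 is the least significant bit of the word, and "1.04 ns" is the time per drift-time count. The function and key names are hypothetical.

```python
NS_PER_COUNT = 1.04  # assumption: one drift-time count corresponds to 1.04 ns


def decode_dtbx(word: int) -> dict:
    """Split one DTBX word into its fields (bit 0 taken as the LSB)."""
    counts = word & 0xFFFF                        # bits 0-15
    return {
        "drift_time_counts": counts,
        "drift_time_ns": counts * NS_PER_COUNT,
        "falling_edge": bool((word >> 16) & 0x1),  # bit 16, 1 = falling
        "cell": (word >> 17) & 0x3F,               # bits 17-22
        "layer": (word >> 23) & 0x7,               # bits 23-25
        "superlayer": (word >> 26) & 0x3,          # bits 26-27
    }


def encode_dtbx(counts: int, falling: bool, cell: int,
                layer: int, superlayer: int) -> int:
    """Inverse of decode_dtbx, used here only to build a test word."""
    return ((superlayer & 0x3) << 26 | (layer & 0x7) << 23 |
            (cell & 0x3F) << 17 | int(falling) << 16 | (counts & 0xFFFF))


word = encode_dtbx(counts=1000, falling=True, cell=17, layer=3, superlayer=2)
hit = decode_dtbx(word)
```

The field widths follow directly from the slide: 16 bits for the drift time, 1 for the edge, 6 for the cell, 3 for the layer and 2 for the superlayer.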
Muon Trigger (BTI) Test Beam Analysis
• The Muon test-beam analysis is fully integrated with the Muon and first-level trigger reconstruction.
• For the Bunch and Track Identifier (BTI), a comparison between real data and simulation is performed.
• See C. Grandi's talk on the CMS Muon Trigger, session B
High Level Trigger Production with ORCA in 1999
[Diagram labels: Signal; MB; HEPEVT ntuples; CMSIM; MC Prod.; Zebra files with HITS; ORCA Digitization; DB pop.; Objectivity Database; ORCA user Analysis; ORCA ntuple production; Analysis ntuples; PAW; User Analysis (twice)]
ORCA High Level Trigger 2000 production
• The first ORCA production in October 99 was very successful (>700 GB in Objy/DB), but the ORCA 2000 production must have much more functionality:
  • All data will be in the database
  • Every CMSIM run will have its objects in many database files
  • A single Db file contains the concatenation of many CMSIM runs (Objectivity limit of 64k files)
  • Many layers of apparently autonomous federations, actually synchronized by enforcing a common schema and unique DbID’s
High Level Trigger Processing 2000
Each box is an independent production running in “parallel”.
[Diagram labels: Minimum Bias; JetMet (four times); Muon; …...; (FZ)User; G3 Hits and Tracks; ORCA Xings & Digis; ORCA RecObjs]
Selective Tracker Digitization
[Diagram labels, in extraction order: Trigger, Calorimetry, Muon, Tracker; Trigger, Calorimetry, Muon; Select; Trigger, Calorimetry, Muon, Tracker]
ORCA 2000 Db Structure
One CMSIM job, oo-formatted into multiple Db’s. For example:
• FZ file (1 CMSIM job): ~300 kB/ev
• ooHit Db's:
  • MC Info (Container #1): few kB/ev
  • Calo/Muon Hits: ~100 kB/ev
  • Tracker Hits: ~200 kB/ev
Multiple sets of ooHits concatenated into a single Db file. For example:
• MC Info Run1, MC Info Run2, MC Info Run3, ...: concatenated MC Info from N runs, ~2 GB/file
Physical and logical Db structures diverge...
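The sizes quoted on this slide allow a rough back-of-the-envelope estimate of how many events fit in one Db file, and of the ceiling implied by the 64k-file Objectivity limit mentioned earlier in the talk. The Python sketch below assumes decimal units (1 kB = 1000 bytes, 1 GB = 10^9 bytes) and reads "64 k" as 65536; both are assumptions, and the results are order-of-magnitude figures, not production numbers.

```python
FILE_SIZE_BYTES = 2_000_000_000   # "~2 GB/file" from the slide, decimal GB assumed
MAX_DB_FILES = 65_536             # "64 k files", read here as 2**16 (assumption)

# Approximate per-event sizes from the slide, in kB per event.
KB_PER_EVENT = {
    "Calo/Muon Hits": 100,
    "Tracker Hits": 200,
}


def events_per_file(kb_per_event: int,
                    file_size_bytes: int = FILE_SIZE_BYTES) -> int:
    """Whole events that fit in one file of the given size."""
    return file_size_bytes // (kb_per_event * 1000)


capacity = {name: events_per_file(kb) for name, kb in KB_PER_EVENT.items()}

# Upper bound on the total size if every allowed file were full.
max_total_bytes = MAX_DB_FILES * FILE_SIZE_BYTES
max_total_tb = max_total_bytes / 1e12

for name, n_events in capacity.items():
    print(f"{name}: about {n_events} events per ~2 GB file")
print(f"Ceiling with {MAX_DB_FILES} files: about {max_total_tb:.0f} TB")
```

Under these assumptions a ~2 GB file holds on the order of 20000 events of Calo/Muon hits or 10000 events of Tracker hits, which illustrates why hits from many CMSIM runs are concatenated into each file rather than kept one run per file.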
Conclusions
• Persistent object management is a major responsibility of the CMS Analysis and Reconstruction Framework
• A DBMS is required to manage the large data set of CMS (including user data)
• An ODBMS is the natural choice if OO is used in all software
• Once an ODBMS is used to manage the experiment data, it is very natural to use it to manage any kind of data related to detector studies and physics analysis
• Objectivity/DB has been evaluated in different prototypes, which successfully stored and retrieved data (test-beam, simulated, reconstructed, statistical, i.e. histograms).
• Since 1999 we have been in production using Objectivity/DB, both for Test Beam and High Level Trigger studies.