A Uniform and Coherent Approach to Object Persistency



  1. A Uniform and Coherent Approach to Object Persistency
     Vincenzo Innocente

  2. HEP Data
     [Diagram labels: User Tag (N-tuple), Tracker Alignment, Ecal calibration, Tracks, Event Collection, Collection Data, Electrons, Event]
     • Environmental data
     • Detector and Accelerator status
     • Calibrations, Alignments
     • Event-Collection Data (luminosity, selection criteria, …)
     • …
     • Event Data, User Data
     Navigation is essential for an effective physics analysis.
     Complexity requires coherent access mechanisms.
     Software Strategy

  3. [Diagram; surviving labels: Later selected DAQ / Not in original design / Later more filters to DVNs and N-tuple]

  4. CMS Experiment-Data Analysis
     [Diagram. Components: Quasi-online Reconstruction, Environmental data, Detector Control, Online Monitoring, Event Filter Object Formatter, Persistent Object Store Manager, Object Database Management System, Data Quality, Calibrations, Group Analysis, Simulation (G3 or G4), User Analysis on demand. Arrow labels: store, Request part of event, Store rec-Obj, Store rec-Obj and calibrations]

  5. Uniform approach
     • Coherent data access model
       • same mechanisms, same language, same transaction model
     • Save effort
       • A single team of experts
       • A single team of administrators
     • Leverage experience
       • developers can easily move from one application to another (from event-data to calibration-data applications)
     • Reuse design and code
       • Basic requirements are often the same
       • We can use the same code to manage event data, calibrations, “n-tuple”
     • Main road in producing better and higher quality software

  6. Reconstruction Sources

  7. CMS Reconstruction Model
     [Diagram labels: Geometry, Conditions, Sim Hits, Raw Data, Detector Element, Event, Digis, Rec Hits, Algorithm (four boxes), Rec Objs (three boxes)]

  8. Software Strategy

  9. Raw Event
     RawData are identified by the corresponding ReadOut.
     RawData belonging to different “detectors” are clustered into different containers. The granularity will be adjusted to optimize I/O performance.
     An index at RawEvent level is used to avoid accessing all containers in search of a given RawData.
     A range index at RawData level could be used for fast random access in complex detectors.
     [Diagram labels: RawEvent, Index, ReadOut, ReadOut, ..., RawData, RawData, Vector of Digi, Vector of Digi. Index implemented as an ordered vector of pairs]
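
The "ordered vector of pairs" index mentioned on this slide can be illustrated with a minimal C++ sketch. It is not the CARF implementation: the names `ReadOutId`, `ContainerSlot` and `RawEventIndex` are hypothetical, and the sketch assumes that a ReadOut can be identified by an integer key and that a RawData container can be addressed by a slot number.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical key and value types (assumptions, not taken from the slides).
using ReadOutId = std::uint32_t;   // identifies a ReadOut
using ContainerSlot = std::size_t; // locates the container holding its RawData

// Index kept as a vector of (key, slot) pairs sorted by key,
// so a lookup is a binary search instead of a scan over all containers.
class RawEventIndex {
    using Entry = std::pair<ReadOutId, ContainerSlot>;

public:
    // Insert a new entry at its sorted position, or overwrite an existing key.
    void add(ReadOutId id, ContainerSlot slot) {
        auto pos = lowerBound(id);
        if (pos != entries_.end() && pos->first == id) {
            pos->second = slot;
        } else {
            entries_.insert(pos, Entry{id, slot});
        }
    }

    // Return a pointer to the slot for this ReadOut, or nullptr if absent.
    const ContainerSlot* find(ReadOutId id) const {
        auto pos = std::lower_bound(
            entries_.begin(), entries_.end(), id,
            [](const Entry& e, ReadOutId key) { return e.first < key; });
        if (pos == entries_.end() || pos->first != id) return nullptr;
        return &pos->second;
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::vector<Entry>::iterator lowerBound(ReadOutId id) {
        return std::lower_bound(
            entries_.begin(), entries_.end(), id,
            [](const Entry& e, ReadOutId key) { return e.first < key; });
    }

    std::vector<Entry> entries_;
};
```

A sorted vector keeps the index contiguous and compact, which suits an index that is written once per event and then mostly read. Insertion in the middle costs a shift of the tail, which is acceptable under that assumption.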

  10. Reconstruction Object Model
      All persistent objects are managed by CARF.
      Physics Modules access them through standard C++ pointers.

  11. CMS Reconstructed Objects
      Reconstructed Objects produced by a given “algorithm” are managed by a Reconstructor.
      A Reconstructed Object (Track) is split into several independent persistent objects to allow their clustering according to their access patterns (physics analysis, reconstruction, detailed detector studies, etc.). The top-level object acts as a proxy.
      Intermediate reconstructed objects (RHits) are cached by value into the final objects.
      [Diagram labels, in extraction order: RecEvent, S-Track Reconstructor, “esd”, Track SecInfo, “rec”, S Track, .., Track Constituents, “aod”, Vector of RHits, S Track]
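
The split of a Track into a top-level proxy plus separately clustered parts can be sketched as follows. This is an illustration under stated assumptions, not CARF code: `LazyRef`, the loader callbacks and all field names are hypothetical, and the loaders stand in for whatever mechanism fetches an object from its own cluster.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

struct RHit { double x, y, z; };

// Detailed part: hits are held by value, as the slide describes for RHits.
struct TrackConstituents { std::vector<RHit> hits; };

// Secondary information (hypothetical fields).
struct TrackSecInfo { double chi2; int ndof; };

// Hypothetical lazy reference: the target is fetched only on first access,
// standing in for an object stored in a different cluster.
template <typename T>
class LazyRef {
public:
    explicit LazyRef(std::function<std::unique_ptr<T>()> loader)
        : loader_(std::move(loader)) {}

    const T& get() const {
        if (!obj_) obj_ = loader_();
        return *obj_;
    }
    bool loaded() const { return obj_ != nullptr; }

private:
    std::function<std::unique_ptr<T>()> loader_;
    mutable std::unique_ptr<T> obj_;
};

// Top-level object acting as a proxy: summary data is held inline,
// the heavier parts are reached through lazy references.
class Track {
public:
    Track(double pt, LazyRef<TrackSecInfo> sec, LazyRef<TrackConstituents> cons)
        : pt_(pt), sec_(std::move(sec)), cons_(std::move(cons)) {}

    double pt() const { return pt_; }
    const TrackSecInfo& secInfo() const { return sec_.get(); }
    const TrackConstituents& constituents() const { return cons_.get(); }
    bool constituentsLoaded() const { return cons_.loaded(); }

private:
    double pt_;
    LazyRef<TrackSecInfo> sec_;
    LazyRef<TrackConstituents> cons_;
};
```

In this sketch an analysis that only reads `pt()` never triggers the loaders, which is the access-pattern argument the slide makes for clustering the parts separately.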

  12. CARF2000 Event Structure

  13. CMS Event Structure
      In case of re-reconstruction the original structure is kept. Event objects are cloned and new collections created.
      [Diagram labels: Persistent, Transient, Event Collection (twice), Run, RawEvent, RecEvent (four times)]
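
One possible reading of the re-reconstruction rule on this slide is sketched below. All type and function names are hypothetical, and the sketch assumes that a cloned event keeps pointing at the same raw data while receiving a newly produced reconstructed part; the slide itself only says that the original structure is kept, that event objects are cloned, and that new collections are created.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct RawEvent { int id; };
struct RecEvent { std::string version; };

// An event links to its raw and reconstructed parts.
struct Event {
    std::shared_ptr<const RawEvent> raw;
    std::shared_ptr<const RecEvent> rec;
};

struct EventCollection {
    std::string name;
    std::vector<Event> events;
};

// Re-reconstruction: the original collection is left untouched. Each event is
// cloned into a new collection; the clone shares the raw data (assumption) and
// gets a new RecEvent tagged with the new version.
EventCollection reReconstruct(const EventCollection& original,
                              const std::string& newName,
                              const std::string& newVersion) {
    EventCollection result{newName, {}};
    result.events.reserve(original.events.size());
    for (const Event& e : original.events) {
        Event clone = e;  // copies the links, not the raw data
        clone.rec = std::make_shared<const RecEvent>(RecEvent{newVersion});
        result.events.push_back(clone);
    }
    return result;
}
```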

  14. Physical clustering

  15. CMS needs a real DBMS
      • An experiment lasting 20 years cannot rely just on ASCII files and file systems for its production bookkeeping, “condition” database, etc.
      • Even today at LEP, the management of all real and simulated data-sets (from raw-data to n-tuples) is a major enterprise
        • Multiple models used (DST, N-tuple, HEPDB, FATMAN, ASCII)
      • A DBMS is the modern answer to such a problem
      • An ODBMS provides a coherent and scalable solution for managing all kinds of data
        • seamless integration with OO languages
        • internal navigation capability

  16. CMS Experience
      • CMS has used Objectivity/DB for the current prototype activity in close contact with IT in the context of the RD45 project
      • Database Developers (just OO and C++):
        • Designing and implementing persistent classes is no harder than for native C++ classes.
      • Physics Software Developers (do not see Objectivity):
        • Persistent objects are accessed using standard C++
        • The same code can access either persistent or transient objects
      • Framework (easy to manage DB):
        • Flexible and transparent distinction between logical associations and physical clustering.
        • Fully transparent I/O with performance essentially limited by the disk speed (random access).
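
The claim that the same code can access persistent or transient objects can be illustrated with a small C++ sketch. It does not use the Objectivity/DB API: `PersistentRef` and `ObjectStore` are hypothetical stand-ins, with an in-memory map playing the role of the database, and the only assumption is that a persistent handle offers pointer-like access.

```cpp
#include <cassert>
#include <cmath>
#include <map>

struct Track { double px; double py; };

// Hypothetical stand-in for an object database: object id -> stored object.
using ObjectStore = std::map<int, Track>;

// Hypothetical persistent handle that behaves like a pointer.
// It resolves the object in the store each time it is dereferenced.
template <typename T>
class PersistentRef {
public:
    PersistentRef(const std::map<int, T>& store, int id)
        : store_(&store), id_(id) {}

    const T& operator*() const { return store_->at(id_); }
    const T* operator->() const { return &store_->at(id_); }

private:
    const std::map<int, T>* store_;
    int id_;
};

// Physics code written once against pointer syntax. It works for a plain
// C++ pointer (transient object) and for a PersistentRef alike.
template <typename TrackPtr>
double transverseMomentum(const TrackPtr& track) {
    return std::hypot(track->px, track->py);
}
```

With these definitions, `transverseMomentum(&someTrack)` and `transverseMomentum(PersistentRef<Track>(store, id))` run the same function body; only the handle type differs.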

  17. CMS Experience
      • Administration (essentially file management):
        • Very flexible file-level management (localization, archival, replication) using AMS features
        • Several tools available to monitor activities and performance
        • File size overhead (5% for realistic CMS object sizes) not larger than for other “products”
      • Physicists (easy to use):
        • Personal Databases are invaluable and in common use
        • Analysis performance and flexibility improved by shallow (link) & deep (data) local copy of selected event sample
        • use same type of event-catalog as production
        • Framework and CMS tools hide all details
      • All our tests show that Objectivity/DB can satisfy CMS requirements in terms of performance, scalability and flexibility for all kinds of data
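
The difference between a shallow (link) copy and a deep (data) copy of a selected event sample can be sketched in C++ as follows. The types and function names are hypothetical, and shared pointers stand in for database links; this is an illustration of the idea, not the CMS tooling.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

struct Event { int id; double energy; };

using EventLink = std::shared_ptr<Event>;      // stands in for a database link
using EventSample = std::vector<EventLink>;
using Selection = std::function<bool(const Event&)>;

// Shallow copy: the personal sample holds links to the production events.
// It is small, and it sees any later change to the production data.
EventSample shallowCopy(const EventSample& source, const Selection& keep) {
    EventSample out;
    for (const EventLink& link : source) {
        if (keep(*link)) out.push_back(link);
    }
    return out;
}

// Deep copy: the selected events are duplicated into the personal sample,
// which is then independent of the production data.
EventSample deepCopy(const EventSample& source, const Selection& keep) {
    EventSample out;
    for (const EventLink& link : source) {
        if (keep(*link)) out.push_back(std::make_shared<Event>(*link));
    }
    return out;
}
```

Both functions return the same sample type, which mirrors the slide's point that a personal selection uses the same kind of event catalog as production.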

  18. Alternatives: other ODBMS
      • Versant is a viable commercial alternative to Objectivity
        • do we have time to build an effective partnership (e.g. MSS interface)?
      • Espresso (by IT/DB) should be able to produce a fully fledged ODBMS in a couple of years once the proof-of-concept prototype is ready
      • Migrate CARF from Objectivity to another ODBMS
        • We expect that it would take about one year
        • Will not affect the basic principles of CMS software architecture and data model
        • Will involve only the core CARF development team
        • Will not disrupt production and physics analysis

  19. Alternatives: ORDBMS
      • ORDBMS (Relational DB with OO interface) are appearing on the market. Up to now they have looked targeted at those who already have a relational system and wish to make a transition to OO
      • A new ORACLE product has all the appearances of a fully fledged ODBMS
      • IT/DB is in the process of evaluating this new product as an event store. If it looks promising, CMS will join this evaluation next year.
      • We will consider the impact of ORDBMS on CMS Data Model and on migration effort before the end of 2001

  20. Fallback Solution: Hybrid Models
      • We believe that this solution could seriously compromise our ability to perform our physics program competitively
      • (R)DBMS for Event Catalog, Calibration, etc.
      • Object-Stream files for event data
      • Ad-hoc networked data-server and MSS interface
      • Less flexible
        • Rigid split between DBMS and event data
        • One-way navigation from DBMS to event data
      • More complex
        • Two different I/O systems
        • More effort to learn
        • More resources for developing and maintaining our application software
      • This approach will be used by several experiments at BNL and Fermilab (RDBMS not directly accessible from user applications)
      • CMS is closely following these experiences.

  21. Conclusion
      • CMS has chosen to follow a uniform and coherent approach for the development of Experiment-Data Analysis Software
      • Today a Functional Prototype exists and includes
        • A modular Object Oriented Framework
        • A Service and Utility Toolkit
        • A Persistent Object Service based on Objectivity/DB
        • Specialized applications for DAQ, Simulation, Reconstruction and Visualization
        • A set of plug-in modules for detector and physics simulation, reconstruction and analysis
      • CMS is currently reviewing the present architecture, the software design and the technical choices to prepare for the next software development cycle
