The CDF Run II Data Catalog and Data Access Modules
P. Calafiura, J. Kowalkowski, S. Lammel, M. Lancaster, F. Ratnikov, E. Sexton-Kennedy, I. Sfiligoi, T. Watts, E. Wicklund
Data Handling Software Components
• Storage Management
  • S. Lammel - C 366
• Data Management
Data Access Hierarchy
• Data view
  • Dataset
  • Run Section
• Storage view
  • (Tape) Stream
  • Fileset/Partition
  • File
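The two views can be pictured as plain data structures. The following is only a sketch under assumptions: the struct layouts, the field names, and the idea that a file covers an inclusive range of run sections are illustrative and are not taken from the CDF code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Storage view: a file, assumed here to cover an inclusive range of run sections.
struct File {
    std::string name;
    int firstRunSection;
    int lastRunSection;
};

// Storage view: a fileset (tape partition) groups files.
struct Fileset {
    std::string name;
    std::vector<File> files;
};

// Storage view: a (tape) stream is an ordered list of filesets.
struct Stream {
    std::string name;
    std::vector<Fileset> filesets;
};

// Data view: a dataset, resolved to the filesets that hold its run sections.
struct Dataset {
    std::string name;
    std::vector<const Fileset*> filesets;
};

// Count the files a dataset resolves to in the storage view.
std::size_t fileCount(const Dataset& ds) {
    std::size_t n = 0;
    for (const Fileset* fs : ds.filesets) n += fs->files.size();
    return n;
}
```

A user works with `Dataset` and run sections only; streams, filesets, and files are what the storage layer moves around.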
Reading Data
• Transparent Storage Management
• Logical Data Selection
Writing Data
• Temporary disk space management
• Fileset Creation
• Log progress
The File Catalog
• Locate file(set)s belonging to a dataset from
  • a time range
  • a run range
  • applying quality cuts, …
• Log output files and filesets info
• Maintain tape management info
• Log job progress (error recovery, checkpoint-restart)
• C++ API
• Command-line and web based tools
• Distributed access
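A lookup of filesets by run range with a quality cut could look roughly like the sketch below. The slides do not show the catalog's real C++ API, so `FilesetRecord`, `locateFilesets`, and the single boolean quality flag are hypothetical names standing in for it, and the catalog is an in-memory vector rather than a database.

```cpp
#include <string>
#include <vector>

// Hypothetical catalog row: one fileset and the inclusive run range it covers.
struct FilesetRecord {
    std::string dataset;
    std::string fileset;
    int firstRun;
    int lastRun;
    bool goodQuality;  // stand-in for a real quality cut
};

// Return the filesets of `dataset` that overlap [firstRun, lastRun],
// optionally keeping only those flagged as good quality.
std::vector<std::string> locateFilesets(const std::vector<FilesetRecord>& catalog,
                                        const std::string& dataset,
                                        int firstRun, int lastRun,
                                        bool requireGoodQuality) {
    std::vector<std::string> result;
    for (const FilesetRecord& r : catalog) {
        if (r.dataset != dataset) continue;
        const bool overlaps = r.firstRun <= lastRun && r.lastRun >= firstRun;
        if (!overlaps) continue;
        if (requireGoodQuality && !r.goodQuality) continue;
        result.push_back(r.fileset);
    }
    return result;
}
```

The selection is purely logical: the caller names a dataset and a run range and never refers to tapes or file paths.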
The File Catalog Clients
[Diagram; labels: DFC, DBManager, Oracle, MSQL, Data Logger, L3 Farm, Offline Farm, Reader, Writer, Raw Data, Filtered Data]
The DBManager Package
• J. Kowalkowski C236 Poster
• DBMS-independent C++ API (calibration, geometry, DFC)
• type-safe mapping between table rows and transient C++ objects
• smart pointers
  • lazy instantiation
  • caching
  • update pointer when new key notified
• pluggable factory to select DBMS at run time
• code generator
  • provide binding (Oracle, MSQL, JDBC, text) for predefined queries
  • Java-based table description language
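The smart-pointer and factory ideas above can be sketched as follows. This is not the DBManager API: `CalibRow`, `Backend`, `BackendFactory`, `LazyHandle`, and `CountingBackend` are all hypothetical names, and the toy backend fabricates rows instead of querying a database.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical transient object built from one table row.
struct CalibRow {
    int key;
    double gain;
};

// DBMS-independent interface: each backend fetches a row by key.
class Backend {
public:
    virtual ~Backend() = default;
    virtual CalibRow fetch(int key) = 0;
};

// Registry of backend creators, so the DBMS is chosen by name at run time.
class BackendFactory {
public:
    using Creator = std::function<std::unique_ptr<Backend>()>;
    void registerBackend(const std::string& name, Creator c) {
        creators_[name] = std::move(c);
    }
    std::unique_ptr<Backend> create(const std::string& name) const {
        auto it = creators_.find(name);
        if (it == creators_.end()) throw std::runtime_error("unknown backend: " + name);
        return it->second();
    }
private:
    std::map<std::string, Creator> creators_;
};

// Smart-pointer-like handle: fetches on first dereference, caches the row,
// and drops the cache when it is told about a new key.
class LazyHandle {
public:
    LazyHandle(Backend& db, int key) : db_(db), key_(key) {}
    void setKey(int key) {
        if (key != key_) { key_ = key; cached_.reset(); }
    }
    const CalibRow& operator*() {
        if (!cached_) cached_.reset(new CalibRow(db_.fetch(key_)));
        return *cached_;
    }
    const CalibRow* operator->() { return &**this; }
private:
    Backend& db_;
    int key_;
    std::unique_ptr<CalibRow> cached_;
};

// Toy backend standing in for a real binding; counts fetches so caching is observable.
class CountingBackend : public Backend {
public:
    int fetches = 0;
    CalibRow fetch(int key) override {
        ++fetches;
        return CalibRow{key, 1.0 + 0.1 * key};
    }
};
```

Client code holds only `Backend&` and `LazyHandle`, so swapping the database means registering a different creator under a different name; no query runs until the handle is first dereferenced.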
Data Handling Input Module
• Module of the BaBar/CDF AC++ framework
• Invisible to users
• Select relevant filesets in a logical fashion
• Iterate over them
  • stage ahead
  • out-of-order
• Maintain state of request for error recovery
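Stage-ahead, out-of-order iteration can be illustrated with a small deterministic simulation. Everything here is an assumption made for the sketch: `StageRequest`, `deliveryOrder`, the per-fileset `latency` in abstract ticks, and the simplification that processing a staged fileset takes no time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct StageRequest {
    std::string fileset;
    int latency;  // hypothetical: ticks between the request and availability on disk
};

// Simulate the order in which filesets are handed to the job when up to
// `lookahead` staging requests are kept outstanding.
std::vector<std::string> deliveryOrder(const std::vector<StageRequest>& selected,
                                       std::size_t lookahead) {
    assert(lookahead > 0);
    struct InFlight { std::string fileset; int readyAt; };
    std::vector<InFlight> inFlight;
    std::vector<std::string> delivered;
    std::size_t next = 0;
    int now = 0;
    while (next < selected.size() || !inFlight.empty()) {
        // Stage ahead: top up the outstanding requests.
        while (inFlight.size() < lookahead && next < selected.size()) {
            inFlight.push_back({selected[next].fileset, now + selected[next].latency});
            ++next;
        }
        // Out-of-order: take the first request that is already staged.
        auto ready = std::find_if(inFlight.begin(), inFlight.end(),
                                  [now](const InFlight& r) { return r.readyAt <= now; });
        if (ready != inFlight.end()) {
            delivered.push_back(ready->fileset);
            inFlight.erase(ready);
        } else {
            // Nothing staged yet: advance time to the earliest completion.
            now = std::min_element(inFlight.begin(), inFlight.end(),
                                   [](const InFlight& a, const InFlight& b) {
                                       return a.readyAt < b.readyAt;
                                   })->readyAt;
        }
    }
    return delivered;
}
```

With a lookahead of one the job waits for each fileset in request order; with a larger lookahead a slow fileset no longer blocks the ones requested after it.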
Data Handling Output Module
• AC++ Module
• Close files at target size, but aligned to run section boundaries (keep events from a section together)
• Log output files info into catalog
• Commit blocks of completed files to the DIM
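The file-closing rule can be written down compactly. The sketch below assumes events arrive grouped by run section and uses hypothetical names (`Event`, `splitIntoFiles`); it only computes how many events land in each output file and does no real I/O.

```cpp
#include <cstddef>
#include <vector>

struct Event {
    int runSection;
    std::size_t bytes;
};

// Return, for each output file, the number of events written to it.
// A file is closed once it has reached `targetBytes`, but only when the
// next event starts a new run section, so a section is never split.
std::vector<std::size_t> splitIntoFiles(const std::vector<Event>& events,
                                        std::size_t targetBytes) {
    std::vector<std::size_t> files;
    std::size_t count = 0;
    std::size_t bytes = 0;
    for (std::size_t i = 0; i < events.size(); ++i) {
        const bool sectionBoundary =
            count > 0 && events[i].runSection != events[i - 1].runSection;
        if (sectionBoundary && bytes >= targetBytes) {
            files.push_back(count);  // close the current file
            count = 0;
            bytes = 0;
        }
        ++count;
        bytes += events[i].bytes;
    }
    if (count > 0) files.push_back(count);  // close the last, possibly short, file
    return files;
}
```

Files can therefore exceed the target size by up to one run section; that is the cost of keeping each section's events together.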
Status and Outlook
• Defined Interfaces between all components
• All components have at least a prototype implementation
• Successful system integration for Mock Data Challenge 1
  • T. Watts C 268 (tomorrow)
• Improve performance and reliability