1 / 1

New Developments in File-based Infrastructure for ATLAS Event Selection

Event Selector. Address Provider. Selector Tool. Algorithm via Converter. StoreGate. MetaDataSvc: Provide TransientAddress. Event Data. Event Data. Event Data. Event Data. Input MetaData Store. MetaDataTools: Read and Combine MetaData objects. MetaData Store. CondOutputStream:

calder
Download Presentation

New Developments in File-based Infrastructure for ATLAS Event Selection

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Event Selector Address Provider Selector Tool Algorithm via Converter StoreGate • MetaDataSvc: • Provide TransientAddress. Event Data Event Data Event Data Event Data Input MetaData Store • MetaDataTools: • Read and Combine MetaData objects MetaData Store • CondOutputStream: • Write MetaData into output event data file. Event Data Event Data Payload (data) File with TAGs … … … … 2) beginFile next() preNext() 2) beginFile TAG 3) endFile: Flush Store TAGs TAGs TAGs • EventSelector: • Read TAG MetaData • Record CollectionMetadata object. • MetaDataTools: • Read and Combine MetaData objects … Tag MetaData Store Collection Tree Read TAG from File TAG Retrieve TAG from StoreGate TAGs • EventSelector: • Opens and closes data and Tag files • Fires Incidents for beginTag, beginFile, endFile and endTag. TAG Output Event Data + Attributes TAG 1) beginTag postNext() 1) beginTag Explicit Collection Data Header Return hint to skip Event alt 4) endTag: Flush Store Collection Tree POOLContainer_DataHeader load Addresses() Event Info EventTree Explicit Collection Implicit Collection ROOT TTree Read DataHeader (and EventInfo) from File Token References TAG DataHeader execute() Object_1 … Foo Container Object_N TAG DataHeader Object_1 Object_N + Attributes TAG DataHeader Object_1 Object_N Read DataObjects from File (uses DataHeader) New Developments in File-based Infrastructure for ATLAS Event Selection Figure 1: The AthenaPOOL framework has been augmented to support both event file and TAG file metadata. David Malon, Peter van Gemmeren (Argonne National Laboratory), Marcin Nowak (Brookhaven National Laboratory) Figure 3: Writing TAGs into event data files allows using TAG event selection capabilities on the data file. Figure 2: Computational processing of TAG attributes enables physicists to reject events using complex queries without reading the event data 1. Abstract In ATLAS software, TAGs are event metadata records that can be stored in various technologies, including ROOT files and relational databases. TAGs are used to identify and extract events that satisfy certain selection predicates, which can be coded as SQL-style queries. TAG collection files support in-file metadata to store information describing all events in the collection. Event Selector functionality has been augmented to provide such collection-level metadata to subsequent algorithms. The ATLAS I/O framework has been extended to allow computational processing of TAG attributes to select or reject events without reading the event data. This capability enables physicists to use more detailed selection criteria than are feasible in an SQL query. For example, the TAGs contain enough information not only to check the number of electrons, but also to calculate their distance to the closest jet--a calculation that would be difficult to express in SQL. Another new development allows ATLAS to write TAGs directly into event data files. This feature can improve performance by supporting advanced event selection capabilities, including computational processing of TAG information, without the need for external TAG file or database access. 2. Details TAG collection support for in-file metadata • Since POOL version 2.8, Collections support in-file Metadata. • Stored as Key/Value map<string, string>. • Recorded in StoreGate. • AthenaPOOL fires incident to alert for end TAG file, begin TAG file. Computational processing of TAG attributes • The ATLAS I/O framework has been extended to allow computational processing of TAG attributes to select/reject events without reading the event data (not even DataHeader or EventInfo). • This capability enables physicists to preselect events with more detailed selection criteria than are feasible in an SQL query. • For example, the TAGs contain enough information to not only check the number of electrons, but also calculate their distance to the closest jet. Iterative and procedural computations are difficult in SQL. • Data stored in TAGs suffice for invariant mass computations. Combinatorics and SQL queries don’t mix well. • Expect deselecting events based upon TAG content to be 10 times faster on average than reading the same information from AOD. • Given that the size of TAGs is only ~1KB/event, we expect substantial improvements for most analyses. • Depends upon how many containers would need to be accessed to compute the query. • In any case the DataHeader and EventInfo object would be read. • Greatest effect for highly selective analysis. TAGs within event data files • Event selection with TAGs is becoming more popular. • It would be nice to be able to apply the TAG tools directly to AOD. • Allows transparent use of TAG functionality (e.g., queries) on AOD files. • TAGs are very small compared to AOD (less than 1%). • First TAGs are produced during AOD merging. • TAGs and event data are both written via POOL using ROOT I/O. • Issues are mainly technical: • Synchronization of multiple writers to the same output file. • Can write TAG collections during finalize, after event data are done. • Distinguishabilityof Payload and TAG Collection containers • Names are configurable • ATLAS event selectors are already capable of iteration via either source • Work in progress.

More Related