Metadata Services, I/O, and Persistence
David Malon
malon@anl.gov
Argonne National Laboratory
20 February 2012

Metadata and metadata services
• Components in support of metadata grew organically
• Use cases that motivated the current Athena metadata service architecture were related to luminosity and cross-section calculation
• There IS a current metadata service architecture; it is not just a bricolage
• BUT its design and implementation were heavily constrained by the (legacy) Gaudi/Athena framework
• We have an opportunity to rethink this
• And we may need to in any case

Metadata and the “objects” with which they are associated
• Recall that our model is to process event collections
• In some senses, files are incidental
  • The collection of events that happen to reside in this file, or in this list of files
  • The collection of events pointed to by these TAGs
  • The collection of events coming via this pipe
  • <multi-process extensions: the collection of events coming from this source …>
  • …
• Metadata are most often associated with collections of events
  • The lumi range from which they were selected
  • Conditions for events in this run
• We need to support this, and maintain/retain/propagate the associations

Current metadata “flow”
• Events like opening a new input file or opening a new TAG collection are asynchronous to Gaudi/Athena global state transitions
• We therefore currently use incidents to notify listeners when these things happen
• Input metadata services make input metadata objects available/retrievable in a transient input metadata store
• Listeners can then do what they want
  • Check which new lumi blocks are being processed, for example
• Listeners typically accumulate data read from input metadata
  • E.g., build a list of all lumi blocks used as input
• At some point, they write this to an output metadata store
• Currently, output metadata are written via a shadow stream, the properties of which mirror those of an output event data stream
  • Same filename, for example
  • But the shadow stream “writes” from the output metadata store at finalize rather than at the end of each event
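
The flow on this slide can be sketched in plain Python. This is a toy model under stated assumptions, not Athena or Gaudi code: `IncidentService`, `LumiBlockAccumulator`, the incident name `"BeginInputFile"`, the dictionary-based stores, and the run and lumi block numbers are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class IncidentService:
    """Toy incident dispatcher: listeners register for a named incident."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def add_listener(self, incident_type, listener):
        self._listeners[incident_type].append(listener)

    def fire(self, incident_type, payload):
        # Notify every listener registered for this incident type.
        for listener in self._listeners[incident_type]:
            listener.handle(incident_type, payload)


class LumiBlockAccumulator:
    """Listener that accumulates the lumi blocks seen in input metadata."""

    def __init__(self, output_store):
        self._seen = set()
        self._output_store = output_store

    def handle(self, incident_type, input_metadata_store):
        # Read whatever the (hypothetical) input metadata store exposes.
        for run, lumi_block in input_metadata_store.get("lumi_blocks", []):
            self._seen.add((run, lumi_block))

    def finalize(self):
        # Written once, at finalize, not at the end of each event.
        self._output_store["lumi_blocks"] = sorted(self._seen)


output_store = {}
incidents = IncidentService()
accumulator = LumiBlockAccumulator(output_store)
incidents.add_listener("BeginInputFile", accumulator)

# Two input files are opened at arbitrary points during the event loop.
incidents.fire("BeginInputFile", {"lumi_blocks": [(1001, 1), (1001, 2)]})
incidents.fire("BeginInputFile", {"lumi_blocks": [(1001, 2), (1002, 7)]})

accumulator.finalize()
print(output_store["lumi_blocks"])
```

The listener deduplicates lumi blocks across files and defers the write to `finalize`, which mirrors the shadow-stream behaviour described above; how real listeners register and what the stores look like is left open here.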

(continued)
• An outstream architecture question
  • Should we really be using separate streams, relying upon job configuration to keep them consistent?
  • Should we consider an approach in which an outstream can have multiple itemlists with different “write at” policies, and coming from different stores?
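
One way to picture the second option is a single stream object that owns several item lists, each with its own source store and “write at” policy. The sketch below is only an illustration of that idea: `ItemList`, `OutStream`, the policy names `"end_event"` and `"finalize"`, the keys, and the filename are hypothetical and do not correspond to existing Athena classes or options.

```python
from dataclasses import dataclass, field


@dataclass
class ItemList:
    store: str      # name of the store the items are read from
    write_at: str   # hypothetical policy name: "end_event" or "finalize"
    keys: list


@dataclass
class OutStream:
    filename: str
    item_lists: list = field(default_factory=list)
    written: list = field(default_factory=list)  # stands in for real output

    def write(self, when, stores):
        """Write every item list whose policy matches `when`."""
        for items in self.item_lists:
            if items.write_at != when:
                continue
            for key in items.keys:
                value = stores[items.store][key]
                self.written.append((when, items.store, key, value))


stream = OutStream(
    filename="out.pool.root",  # one file, so nothing to keep in sync
    item_lists=[
        ItemList(store="event", write_at="end_event", keys=["Tracks"]),
        ItemList(store="metadata", write_at="finalize", keys=["EventCount"]),
    ],
)

stores = {"event": {}, "metadata": {"EventCount": 0}}
for event_number in range(2):
    stores["event"]["Tracks"] = f"tracks-for-event-{event_number}"
    stores["metadata"]["EventCount"] += 1
    stream.write("end_event", stores)
stream.write("finalize", stores)

for record in stream.written:
    print(record)
```

Because the event items and the metadata items belong to the same stream, consistency of properties such as the filename no longer depends on job configuration keeping two streams aligned.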

Incidents make sense, since the arrival of new event collections or the start of new files is asynchronous to Gaudi/Athena concepts of state
• BUT: can we really do this in multiprocess, multithread environments?
  • Maybe, at least in single-reader architectures, but …
• It would be helpful to have a clear model for incident handling, messaging, and error handling in such environments

Peeking
• We use peeking into input files extensively, principally (I believe) for job configuration purposes
• At multiple stages:
  • Before Athena starts, to set Athena job options
  • After Athena starts, while components are initializing
    • E.g., to determine correct conditions
• Before we had in-file metadata, this often involved peeking at data in the first event
  • Now this is less common, since more can be determined from in-file metadata than in the past, but it has not gone away entirely
  • And this is not entirely robust, e.g., when one is skipping events or doing direct navigation to selected events
• Can we (should we) put first-event metadata in in-file metadata, so one never needs to peek at the first event?

Peeking
• Isn’t it true that, in general, when a typical dataset is used as input, the job configuration should be the same for all files in the dataset?
• Shouldn’t we therefore be able to figure out how to configure the jobs from dataset-level metadata alone, without peeking into the data files?
• And mightn’t this be more efficient as well, if, say, at the task-to-job stage, the grid could already configure the jobs?
• What do we need to do to make this possible?
• Can we provide a means for jobs to access task-level or input-dataset-level external metadata?
  • Right now a job can’t even discover the name of the input dataset
  • Though the Event Selector knows the name of its input file

We’re starting to peek at output files, too
• Mainly for event counting
• Should we be emitting metadata instead? (Pros and cons)
• Should we be worried about building too many technology dependencies into our metadata peeking tools?

Metadata output
• Jobs return metadata today
• But where it goes is … complicated.
• This began as metadata.xml files following the POOL file catalog DTD.
  • It began as a way to record the files that were written by the job, and their GUIDs.
  • When additional metadata were needed, the DTD constrained us to writing per-file free metadata strings
    • Often the same metadata for all jobs in the task
    • Often the same metadata for all files produced by a given job
• Alvin Tan worked to improve this in transform infrastructure, but …
• Tier 0 moved some of this to jobReport (pickle) files

Extensible output metadata?
• Should Athena be able to emit a metadatum for return by the job?
• It turns out that there is a hack in place that makes this possible
  • Specific string pattern to look for in grepping log files
• Shouldn’t there be a service for this?
• Separately, when Ilija needed to emit performance statistics and get them to a database, he developed his own machinery (writing to a special file, and not via a general service) and provided his own post-processing to get the information into AMI
• Should we try to think about this problem more generally?
• Or is performance metadata a unique use case, with no others foreseen?
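
To make the log-grepping approach concrete, here is a minimal sketch of the general technique: the job writes specially marked lines to its log, and a post-processing step extracts them with a regular expression. The marker `JOB_METADATUM`, the `key=value` layout, the JSON encoding of values, and the sample log lines are assumptions made for this example; the actual string pattern used in production is not given in these slides.

```python
import json
import re

# Hypothetical marker; the real pattern is not specified in the slides.
MARKER = "JOB_METADATUM"
LINE_RE = re.compile(rf"{re.escape(MARKER)} (\w+)=(.*)$")


def emit_metadatum(key, value):
    """Format one metadatum as a log line fragment (value JSON-encoded)."""
    return f"{MARKER} {key}={json.dumps(value)}"


def harvest(log_lines):
    """Collect every marked metadatum from an iterable of log lines."""
    found = {}
    for line in log_lines:
        match = LINE_RE.search(line)
        if match:
            found[match.group(1)] = json.loads(match.group(2))
    return found


# Made-up log content for illustration only.
log = [
    "INFO  initializing",
    "INFO  " + emit_metadatum("events_written", 1250),
    "INFO  " + emit_metadatum("peak_memory_mb", 1843.5),
    "INFO  finalized",
]

harvested = harvest(log)
print(harvested)
```

The sketch also shows why a dedicated service might be preferable: the producer and the harvester must agree on the marker and the encoding by convention, and nothing checks that agreement.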

Metadata merging
• Merging metadata is often harder than merging event data
• Merging event data may require no semantic knowledge, just a larger “array” of events
  • Like chaining TTrees with the same structure
  • Yeah, it’s not quite that simple, but you know what I mean
• Merging metadata may require semantic knowledge
  • Summing event counts is a trivial example
  • Merging lumiblock ranges is a bit harder, and deciding whether lumiblocks are complete may be harder still
  • And so on
• We do “hybrid” merging now, but what might we do differently, knowing that metadata will often eventually be merged?
• Look at ROOT’s (new) type-specific support for merging?
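
The two examples on this slide can be contrasted in a few lines of Python. This is an illustrative sketch, not the merging code used in Athena: the representation of a lumiblock range as an inclusive `(run, first_lb, last_lb)` tuple and all the numbers are assumptions. It deliberately ignores the harder question of whether a lumiblock is complete.

```python
def merge_event_counts(counts):
    """Trivial case: event counts from several files simply add up."""
    return sum(counts)


def merge_lumiblock_ranges(ranges):
    """Merge inclusive (run, first_lb, last_lb) ranges.

    Ranges in the same run that overlap or are adjacent are coalesced;
    ranges in different runs are never combined.
    """
    merged = []
    for run, first, last in sorted(ranges):
        if merged and merged[-1][0] == run and first <= merged[-1][2] + 1:
            _, prev_first, prev_last = merged[-1]
            merged[-1] = (run, prev_first, max(prev_last, last))
        else:
            merged.append((run, first, last))
    return merged


# Made-up metadata from two input files.
file_a = {"events": 500, "lumiblocks": [(1001, 1, 10), (1002, 5, 8)]}
file_b = {"events": 750, "lumiblocks": [(1001, 11, 20), (1001, 15, 25), (1002, 1, 3)]}

total_events = merge_event_counts([file_a["events"], file_b["events"]])
merged_ranges = merge_lumiblock_ranges(file_a["lumiblocks"] + file_b["lumiblocks"])

print(total_events)
print(merged_ranges)
```

Summing needs no knowledge of what the numbers mean, while the range merge already encodes rules about runs, adjacency, and overlap. Type-specific merge logic of this kind is what a generic file merger cannot supply on its own.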

Bytestream metadata
• In-file metadata is different in bytestream
  • Header information, plus free metadata strings
• What do we need to do to make this more coherent with other in-file metadata and in-file metadata architecture?

Metadata in downstream data products
• Metadata has been gradually added to products downstream of AOD
  • D3PD
  • A bit ad hoc sometimes
  • And work continues here in PAT venues and elsewhere
• Can we make our metadata storage and retrieval strategy and components more coherent?

Miscellany
• An asymmetry: we can read in-file metadata from TAG files and process it “correctly”
  • Example: query a range of runs and lumi blocks within those runs, select events from only some of them, but retain the list of queried {run #, LB#} for cross-section calculation
• But can we write in-file metadata into TAG files from Athena?
  • We can write in-TAG-file metadata from Oracle, and from specific event collection utilities, but …
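
The TAG example can be illustrated with a small numerical sketch of why the queried {run #, LB#} list has to be retained: a lumi block that contributes no selected events still contributes integrated luminosity to the denominator of a cross-section calculation. The run and lumi block numbers and the per-block luminosities below are invented for illustration and are in arbitrary units.

```python
# Made-up per-lumi-block integrated luminosity, arbitrary units.
lumi_per_block = {(1001, 1): 2.0, (1001, 2): 3.0, (1001, 3): 5.0}

# The query covered all three lumi blocks ...
queried_blocks = [(1001, 1), (1001, 2), (1001, 3)]

# ... but the selected events happen to come from only two of them.
selected_events = [(1001, 1), (1001, 1), (1001, 3)]
blocks_with_events = sorted(set(selected_events))

# Luminosity from the retained query list (what the calculation needs).
lumi_from_query = sum(lumi_per_block[block] for block in queried_blocks)

# Luminosity inferred only from blocks that contain selected events.
lumi_from_events = sum(lumi_per_block[block] for block in blocks_with_events)

print(lumi_from_query, lumi_from_events)
```

Reconstructing the luminosity from the selected events alone would drop block (1001, 2) and underestimate the denominator, which is why the queried list travels with the collection as metadata.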