ATLAS Offline Database Architecture for Time-varying Data, with Requirements for the Common Project
David M. Malon
LCG Conditions Database Workshop
CERN, Geneva, Switzerland
8 December 2003
Architectural principles
• All data with a time interval (or run interval) of validity are managed via the same temporal database infrastructure.
  • Sometimes people distinguish between conditions, configurations, and other kinds of detector description, but (offline) users see no difference in the machinery one uses to get the conditions or the configuration in effect when an event was taken.
• We refer to the underlying database infrastructure as an interval-of-validity database (IOV database) rather than a conditions database for two reasons:
  • so as not to prejudge the types of data accessible via this means, and
  • because it is principally a temporal database: conditions data may not reside within this database, but rather may be stored externally to the database hosting the interval-of-validity infrastructure.
David M. Malon, ANL LCG Conditions Database Workshop
Architectural principles
• It must be possible to assign an interval of validity to any data object accessible to standard execution frameworks (Athena, for ATLAS), independent of the technology used to store that object.
• Storing an object and assigning a validity interval to it may be (widely) separated in time.
  • Example 1: Online, a configuration may be chosen from a portfolio of stored configurations, each with no inherent interval of validity.
  • Example 2: The expert who updates the muon geometry does not know the range of simulation runs for which it will be used.
Architectural principles
• It must not be necessary to copy an object in order to assign an interval of validity to it.
  • Example 1: A reference to the configuration selected online, which may reside in a relational database, is registered with a range of test beam runs as the interval of validity.
  • Example 2: A reference to the muon geometry, which may be described in an XML file, is registered with a range of simulation runs as the interval of validity.
  • Example 3: If I use the same configuration as in Example 1 or the same geometry as in Example 2 for a later range of runs, I should not need to write it a second time.
Registration and mediation
• The IOV service is, principally, a registration service and a mediator.
• An object may be stored in any supported technology (ROOT, POOL ROOT, MySQL {NOVA}, plain ASCII or XML files, strings, …) and later registered in the IOV database.
  • This does not mean that all technology choices are equally sensible for all purposes.
• Storing the data object in the temporal database itself is one important possibility, but it is an optimization choice: it must not be a design limitation.
• LHC experiments already know how to store complex objects
  • in ROOT directly, via POOL, …
  • and should not be required to solve this problem again in order to use an IOV service.
• Registration means associating an interval of validity, a tag/version, …, with (a reference to) the object.
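The registration idea above can be sketched in a few lines. This is a minimal illustration, not the project's actual API: the names `IOVDatabase`, `IOVEntry`, `register`, the folder path, and the `xml:` reference convention are all assumptions made up for the example. The key property shown is that registration stores only a string reference, so the same object can be registered twice without being copied.

```python
from dataclasses import dataclass, field

@dataclass
class IOVEntry:
    since: int   # start of validity (run number or timestamp)
    until: int   # end of validity, exclusive
    tag: str     # e.g. "Production"
    ref: str     # technology-neutral reference to the object, as a string

@dataclass
class IOVDatabase:
    # folder name -> list of registered entries
    folders: dict = field(default_factory=dict)

    def register(self, folder, since, until, tag, ref):
        """Associate an interval of validity and a tag with an existing
        object reference; the object itself is never copied."""
        self.folders.setdefault(folder, []).append(
            IOVEntry(since, until, tag, ref))

# An object stored earlier, in any technology, is registered later:
db = IOVDatabase()
db.register("/Muon/Geometry", since=1000, until=2000,
            tag="Production", ref="xml:muon_geometry_v2.xml")
# Reusing the same geometry for later runs registers the same
# reference again; no second copy of the object is written:
db.register("/Muon/Geometry", since=2000, until=3000,
            tag="Production", ref="xml:muon_geometry_v2.xml")
```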
Mediation
• On input, the IOV service mediates access to data, helping to choose the correct instance of a data object: the one with the correct timestamp and tag.
• In Athena, the transient IOV service
  • checks the current event timestamp as appropriate,
  • consults the IOV database to get a reference to the correct version of the data, and
  • invokes standard Athena conversion services to put conditions objects in the transient store.
• “Correct” for non-specialists usually means “the endorsed one corresponding to this event.”
  • Version/tag information is likely supplied in standard job options.
• Both mediated and unmediated access are possible: if one has a “direct” reference to the object of interest, it is not necessary to pass through the IOV database (mediator) to retrieve it.
  • One can get Version P of the ATLAS muon geometry without dealing with interval-of-validity databases; on the other hand, the IOV database would be used to discover that Version P was used for simulation runs [m,n].
  • Similar calibration example…
Conditions Data Writer
1. Store an instance of data that may vary with time or run or …
2. Return a reference to the data.
3. Register the reference, assigning an interval of validity, tag, …
(Diagram: the writer stores to a conditions or other time-varying data store, then registers the reference in the IOV database.)
Conditions data client
1. Query the IOV database with folder (data type), timestamp, tag, <version>.
2. Receive a reference to the data (a string).
3. Dereference via standard conversion services to reach the conditions data.
4. Build the transient conditions object in the Athena Transient Conditions Store.
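Steps 1 and 2 of the client flow amount to a lookup: map a (timestamp, tag) pair within a folder to the string reference of the valid object. A minimal sketch, with entry layout, folder contents, and the `pool:` reference strings all invented for illustration:

```python
# One folder's registered entries as (since, until, tag, ref) tuples;
# these values and the reference-string convention are hypothetical.
entries = [
    (1000, 2000, "Production", "pool:calib_v1"),
    (2000, 3000, "Production", "pool:calib_v2"),
    (1000, 3000, "Algorithm 3", "pool:calib_alg3"),
]

def find_ref(entries, timestamp, tag):
    """Steps 1-2: return the string reference valid at `timestamp`
    under `tag`, or None if no registered interval matches."""
    for since, until, entry_tag, ref in entries:
        if entry_tag == tag and since <= timestamp < until:
            return ref
    return None

# Steps 3-4 (dereferencing the string and building the transient
# object) are left to the framework's standard conversion services.
ref = find_ref(entries, timestamp=2500, tag="Production")
```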
ATLAS development strategy to date
• Employ common solutions wherever possible.
  • ATLAS has contributed requirements and feedback to the common project (CERN IT/DB, ATLAS, LHCb, HARP, COMPASS, …).
  • The Lisbon TDAQ group has implemented this interface in MySQL: this is what ATLAS offline uses for its IOV database.
• The Athena transient interval-of-validity service
  • checks the current event timestamp,
  • compares it to the validity intervals of already-loaded time-varying objects, and
  • triggers loading of referenced time-valid objects when needed.
• IOVDbSvc does the loading, allowing standard conversion services to build the transient object from the persistent data.
Notes on architecture
• The architecture lets one register conditions data stored in ASCII files, ROOT files, MySQL databases, …, in an IOV database.
• Is this all that it takes to be “in” the conditions database: do whatever you want, but register your objects in the IOV database? (I hope we’ll do better.)
  • We still need to manage all those files coherently, and catalog them.
• One can imagine
  • configuration data written to their own files or databases,
  • DCS data written to their own files or databases in a possibly different way,
  • subdetector-specific conditions written to their own files or databases,
  • different simulation geometries in different ASCII (XML?) files,
  • …other partitioning by domain…
  • …all registered in the same IOV service.
• This is possible as long as one can represent a “reference” to the data object as a string.
Technologies
• What about storage technologies for the conditions objects themselves? Anything readable by our frameworks is okay in theory, but what are good choices?
• For complex objects, an obvious alternative is to use the same technology that is used for event data: POOL infrastructure, with ROOT as the storage layer.
• For small amounts of data, one can imagine storing the data themselves, rather than a reference to the data, in the string (blob) managed by the IOV database.
• ATLAS offline has used
  • IOV+NOVA (an ATLAS relational-database-hosted product),
  • IOV+POOL (we expect the common project to support this), and
  • IOV+{XML strings}.
Storage technologies
• Since we are using a relational IOV database implementation, using the same relational database is another “obvious” and attractive alternative.
  • Are schema equally obvious? Perhaps this project can find consensus.
• One can imagine using the LCG SEAL dictionary for conditions object definitions, and POOL/ROOT (or POOL/{relational database}) as a storage layer.
  • This has the advantage that users would describe event and conditions data using exactly the same tools.
• Conversely, it is easy to imagine a standard transient mapping (via POOL?) of simple relational table structures; with reasonable “reference” conventions, these could easily be used for data managed by the IOV database.
• Sometimes the transient object definition has primacy; sometimes the persistent table schema has primacy: we should support both cases.
Beyond intervals of validity
• Is it obvious that intervals of validity are the right model for all temporal LHC data?
• What about alarms, and periodic measurements?
  • If I measure pressure at times t0, t1, t2, …, it is entirely artificial to say that the pressure at t1 has an interval of validity [t1, t2).
  • At a time t in [t1, t2), I am more likely to want the previous and next pressure measurements, or all the pressure measurements in (t-d, t+d).
  • There is no reason to say that the pressure at t1 is the valid one.
• We need an extended API: we would like the common project to think about this.
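The two queries the slide asks for, "previous and next measurements around t" and "all measurements in (t-d, t+d)", are straightforward over a time-sorted list. A sketch of what such an extended API might return, with the reading values invented for the example:

```python
import bisect

# Periodic pressure readings as (time, value) pairs, sorted by time;
# the numbers are illustrative only.
readings = [(0, 1.01), (10, 1.02), (20, 1.00), (30, 1.03)]
times = [t for t, _ in readings]

def bracketing(t):
    """Return the previous and next measurements around time t,
    rather than pretending one reading is 'valid' until the next."""
    i = bisect.bisect_right(times, t)
    prev = readings[i - 1] if i > 0 else None
    nxt = readings[i] if i < len(readings) else None
    return prev, nxt

def window(t, d):
    """All measurements in the open interval (t - d, t + d)."""
    lo = bisect.bisect_right(times, t - d)
    hi = bisect.bisect_left(times, t + d)
    return readings[lo:hi]
```

Neither query is expressible through an interval-of-validity lookup alone, which is the slide's point: the API would need to expose neighbors and windows, not just "the valid instance".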
Appendix: tagging extensions
• Several people (myself included) expressed concern about the limitations of the current tagging model in the common project interval-of-validity (“conditions”) software at the 4–5 February 2003 ATLAS database workshop.
• The following slides describe a modest proposal to change/extend the tagging interface, beginning with a simplified scenario that motivates this proposal.
• Agreed (ATLAS, LHCb: Pere Mato), but the extensions are not yet implemented.
Calibration scenario: Phase 1
• A calibration expert is experimenting with a variety of algorithms and algorithm parameters. After a calibration run, she produces calibration constants using three different algorithms, with an interval of validity that lets her apply them to a range of runs and compare the results.
(Diagram: constants from Algorithms 1, 2, and 3 plotted as “version” against time.)
Calibration scenario: Phase 2
• After looking at the results, she believes that Algorithm 2 is pretty good, but Algorithm 3 is the best.
• After the next calibration run, she therefore computes calibration constants using Algorithm 3, and assigns an interval of validity corresponding to a new range of runs.
(Diagram: “version” vs. time, with a new Algorithm 3 interval following the original three.)
Calibration scenario: Phase 3
• Just to be certain before announcing anything for collaboration-wide use, she computes calibration constants from this latest calibration run using Algorithms 1 and 2, and compares the results when these are applied to the recent runs.
(Diagram: “version” vs. time, with Algorithm 1 and Algorithm 2 intervals now also covering the new range of runs.)
Calibration scenario: Phase 4
• She is slightly surprised when it appears that Algorithm 2 is a better choice, and, after looking at her results from the first calibration run, she decides that the Algorithm 2 results are what should be tagged for Production.
• … but the two Algorithm 2 objects were NEVER the HEAD: there is nothing she could have done (unless she were prescient) with tools that tag only the HEAD.
(Diagram: “version” vs. time; the Algorithm 2 intervals were never the most recently inserted.)
Things could have been easy
• What she would have liked to do was this:
  • When she inserted an object produced by Algorithm N, she wanted to label (tag) it “Algorithm N” at insertion time.
  • She may not be a C++ expert, but she could certainly have added the string “Algorithm N” to her argument list inside her Algorithm N code.
• How would this work with overlapping intervals?
  • Easy: an interval added with a tag splits only intervals with the same tag (and the HEAD, if you like, for those who like to trust the HEAD).
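The splitting rule in the last bullet, a new interval truncates or splits only intervals carrying the same tag, can be sketched as follows. The data layout and function names are invented for illustration; the point is that keeping one interval list per tag makes insertion under "Algorithm 2" invisible to "Algorithm 3".

```python
def insert_interval(intervals, since, until, payload):
    """Insert [since, until) into a list of non-overlapping
    (since, until, payload) intervals, truncating or splitting
    only the intervals it overlaps."""
    out = []
    for s, u, p in intervals:
        if u <= since or s >= until:      # no overlap: keep as-is
            out.append((s, u, p))
        else:
            if s < since:                 # left remnant survives
                out.append((s, since, p))
            if u > until:                 # right remnant survives
                out.append((until, u, p))
    out.append((since, until, payload))
    out.sort()
    return out

# One interval list per tag: inserting under "Algorithm 2" splits
# only "Algorithm 2" intervals, never "Algorithm 3" ones.
tags = {"Algorithm 2": [(0, 100, "v1")], "Algorithm 3": [(0, 100, "x1")]}
tags["Algorithm 2"] = insert_interval(tags["Algorithm 2"], 50, 150, "v2")
```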
Comments on robustness
• It is needlessly risky to build a database infrastructure that relies on the order of insertion into the database.
  • What could the calibration expert have done differently? Waited until the Nth calibration run to begin her comparison, making sure she always ran her algorithms in the same order?
  • If someone makes a mistake, do we need to be so unforgiving?
• Internal versions do not help. Even if she kept a log of everything she did, including the order in which she ran her algorithms, she might be able to guess the version numbers when intervals do not overlap; when they do, the situation is hopeless.
  • …and it’s worse if she has a colleague exploring alternatives (but people assure me that this will never happen…).
• Are we willing to bet our database on this?
A question with no context
• For some conditions, run ranges are the most natural intervals of validity; for others, time ranges are more natural.
• With some work, “real” runs can be associated with time intervals, but for simulation this requires applying some rather arbitrary and artificial conventions (retroactively, in our case).
• Query to the other experiments: would it be useful to have the project support more than one kind of validity “key,” e.g., timestamps and {run, event} ranges, or {run, event}-to-time mapping services?
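A run-to-time mapping service of the kind the question raises could be as simple as a lookup table plus a binary search over run start times. The run numbers and time values below are invented for the example; real run boundaries would come from run bookkeeping.

```python
import bisect

# Run -> (start_time, end_time) table; all values are illustrative.
run_times = {1: (0, 100), 2: (100, 250), 3: (250, 400)}
starts = sorted((t0, run) for run, (t0, t1) in run_times.items())
start_keys = [s for s, _ in starts]

def run_to_interval(run):
    """Map a run number to its time interval of validity."""
    return run_times[run]

def time_to_run(t):
    """Map a timestamp to the run in progress at that time,
    or None if t falls outside every run."""
    i = bisect.bisect_right(start_keys, t) - 1
    if i < 0:
        return None
    run = starts[i][1]
    t0, t1 = run_times[run]
    return run if t0 <= t < t1 else None
```

With such a service behind the IOV database, either kind of key (timestamp or run range) could be translated into the other at query time.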