ATLAS Metadata Interface

Campaign Definition in AMI
S. Albrand

Story so far
Requested at the last software week.
Details of the proposed implementation were circulated in January.
Some examples were received, but not all of my questions were answered.
Implementation started less than two weeks ago.

"Campaigns are defined by:
A short name (30 characters): unique in the database
A dataset project (or set of projects)
A map which associates each member of a set of pairs of productionSteps and dataTypes to a set of AMI configuration tags
A description (1000 chars)
The dataset projects are either dataNN_* or mcNN_*. Thus datasets which do not belong to these groups cannot be part of production campaign (such as valid_*, user*, group*)"
N.B. Nothing was said about streams.
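The definition quoted above can be sketched as a small data structure. This is only an illustration, not AMI's schema: the class name `Campaign`, the method `add_tags`, and both regular expressions are assumptions (in particular, "NN" is read as two digits, and underscores are allowed in names because the examples in this talk use them). Uniqueness in the database is not modelled.

```python
import re
from dataclasses import dataclass, field

# Assumption: short names are 1-30 letters, digits or underscores.
NAME_RE = re.compile(r"^[A-Za-z0-9_]{1,30}$")
# Assumption: "dataNN_*" / "mcNN_*" means two digits after the prefix.
PROJECT_RE = re.compile(r"^(data|mc)\d{2}_\w+$")


@dataclass
class Campaign:
    """Hypothetical in-memory model of a campaign definition."""
    name: str
    projects: set
    # (productionStep, dataType) -> set of AMI configuration tags
    tag_map: dict = field(default_factory=dict)
    description: str = ""

    def __post_init__(self):
        if not NAME_RE.match(self.name):
            raise ValueError(f"bad campaign name: {self.name!r}")
        bad = sorted(p for p in self.projects if not PROJECT_RE.match(p))
        if bad:
            raise ValueError(f"not a dataNN_*/mcNN_* project: {bad}")
        if len(self.description) > 1000:
            raise ValueError("description longer than 1000 characters")

    def add_tags(self, prod_step, data_type, tags):
        """Associate tags with one (productionStep, dataType) pair."""
        self.tag_map.setdefault((prod_step, data_type), set()).update(tags)
```

With this sketch, a project such as valid1_7TeV is rejected at construction time, which mirrors the rule that valid_*, user* and group* datasets cannot belong to a production campaign.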

AddCampaign
Defines a data campaign.
Requires two arguments:
campaignName - a short name (30 alphanumeric characters, no spaces)
projectName - a datasetProject name (a mistake on my part; this needs a separate step)
If no other argument is given, a new empty campaign is created.
Optional arguments (for an MC campaign):
pyDict - a Python dictionary in text format.
(pyAMI only) campaignDictFile=filename (containing the dictionary as described above)
StreamName (equivalent to physicsShort, usually omitted) (I added this because at least one of the examples I was given had a stream wildcard)
description - a long (1000 chars) description of the campaign.
Not yet implemented

Examples:
AddCampaign campaignName=mc11a projectName=mc11_7TeV pyDict="{'MC11c': {'mc11_7TeV': {'*': {'recon': {'AOD': ['r3043', 'r3060', 'r3108', 'r3072', 'r3073', 'r3074', 'r3075', 'r3076', 'r3077', 'r3078', 'r3079', 'r3080', 'r3081', 'r3082', 'r3083', 'r3084', 'r3085', 'r3086', 'r3044', 'r3110', 'r3097', 'r3071', 'r3068', 'r3070', 'r3069', 'a145', 'a146'], 'ESD': ['r3043', 'r3060', 'r3108', 'r3072', 'r3073', 'r3074', 'r3075', 'r3076', 'r3077', 'r3078', 'r3079', 'r3080', 'r3081', 'r3082', 'r3083', 'r3084', 'r3085', 'r3086', 'r3044', 'r3110', 'r3097', 'r3071', 'r3068', 'r3070', 'r3069', 'a145', 'a146']}, 'merge': {'AOD': ['r2993', 'r3109', 'r3063']}, 'digit': {'RDO': ['d621', 'd622', 'd623', 'd619']}}}}}" description='This is an example'
Questions: Once the existing campaigns have been entered in AMI, will anyone need this? If yes, do you need an "overwrite" function?
AddCampaign campaignName=mc11a_empty projectName=mc11_7TeV [description="a description"] /* creates (reserves) an empty campaign */
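Typing such a command by hand is error-prone, so a client script might assemble it from a Python dictionary. The helper below is a hypothetical sketch: `build_add_campaign` is not part of AMI or pyAMI, it only formats a string using the argument names listed for AddCampaign (campaignName, projectName, pyDict, description), and it does not send anything to a server.

```python
def build_add_campaign(campaign_name, project_name, py_dict=None, description=None):
    """Format an AddCampaign command line (hypothetical helper, string only)."""
    parts = [
        "AddCampaign",
        f"campaignName={campaign_name}",
        f"projectName={project_name}",
    ]
    if py_dict is not None:
        # repr() of a dict of strings/lists gives the single-quoted text
        # form used in the example, so it can sit inside double quotes.
        parts.append(f'pyDict="{py_dict!r}"')
    if description is not None:
        parts.append(f"description='{description}'")
    return " ".join(parts)


# Usage: reserve an empty campaign, then one with a small tag map.
print(build_add_campaign("mc11a_empty", "mc11_7TeV", description="a description"))
print(build_add_campaign(
    "mc11a", "mc11_7TeV",
    py_dict={"MC11c": {"mc11_7TeV": {"*": {"merge": {"AOD": ["r2993", "r3109"]}}}}},
))
```

The sketch does no escaping, so a description containing a single quote would need extra handling.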

Problems
I received several examples of the pyDict format from different people, and the formats were all a bit different. I chose Borut's format as it looked "real".
My error: I have mistakenly made a simplification. At the moment one campaignName is associated with exactly one project and stream. This is not too difficult to correct transparently, but I decided to ignore it for the moment so that I could have something to show today.

ListCampaign
ListCampaign -pyDict=true campaignName=solveig_test2
{'solveig_test2': {'mc11_7TeV': {'*': {'recon': {'AOD': ['a146', 'r3000', 'r1235', 'r2346'], 'ESD': ['r1234', 'r2345']}}}}}
Or get it in standard AMI format.
Questions: Does the order of the tags matter? Who reads the dict format? Can I have a copy of the reading code?
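For reference, here is one way a client could read the dict format. It is a minimal sketch, not the actual reading code asked about above: the function name `read_campaign_dict` is made up, and the nesting campaign, project, stream, prodStep, dataType, tags is inferred from the examples in this talk. It uses `ast.literal_eval`, which parses Python literals without executing code.

```python
import ast


def read_campaign_dict(text):
    """Flatten a pyDict string into (campaign, project, stream, prodStep, dataType, tags) rows."""
    data = ast.literal_eval(text)  # safe: literals only, no code execution
    rows = []
    for campaign, projects in data.items():
        for project, streams in projects.items():
            for stream, steps in streams.items():
                for step, types in steps.items():
                    for data_type, tags in types.items():
                        # Tag order is kept as given, in case it matters.
                        rows.append((campaign, project, stream, step, data_type, list(tags)))
    return rows


text = ("{'solveig_test2': {'mc11_7TeV': {'*': {'recon': "
        "{'AOD': ['a146', 'r3000', 'r1235', 'r2346'], 'ESD': ['r1234', 'r2345']}}}}}")
for row in read_campaign_dict(text):
    print(row)
```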

Other functions for filling a campaign
Already available:
AddProdStepGroup: adds a prodStep/dataType couple, and optionally a tagSet, to a campaign. Rejects illegal values.
AddTagSet: adds a tagSet to a prodStep/dataType couple of a campaign. Rejects undeclared tags.
The other functions described in the specification will follow. They are "Updates" and "Removes".
I will of course correct the treatment of projectTags.
Are you sure you really want streams? (Data Prep uses the same super tag for all streams.)

A few remarks & questions
I suppose that there will be a phase of building up a definition, with fairly frequent updates? Borut said: "No notion of 'closed' campaigns."
How do clients want to be informed of changes in a campaign definition? What do they do with the information?
Presumably, if DDM is using regexes to identify datasets as part of a campaign, they can generate them themselves from a pyDict?
It doesn't seem very scalable to me to mark in AMI which datasets are members of a campaign that may change at any moment (or is it always additive?)
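To illustrate how a client could derive regexes from a pyDict, here is a rough sketch. It rests on two assumptions that are not stated in this talk: that dataset names follow the layout project.datasetNumber.physicsShort.prodStep.dataType.tagChain, and that a dataset belongs to the campaign when the last tag of its underscore-separated chain is one of the campaign's tags. The function name `campaign_regexes` is made up, and this is not DDM's actual matching rule.

```python
import re


def campaign_regexes(py_dict):
    """Build one regex per (project, stream, prodStep, dataType) entry of a pyDict."""
    patterns = []
    for projects in py_dict.values():
        for project, streams in projects.items():
            for stream, steps in streams.items():
                # '*' is the stream wildcard: accept any physicsShort field.
                stream_re = r"[^.]+" if stream == "*" else re.escape(stream)
                for step, types in steps.items():
                    for data_type, tags in types.items():
                        if not tags:
                            continue
                        tag_re = "|".join(re.escape(t) for t in sorted(tags))
                        patterns.append(
                            rf"^{re.escape(project)}\.\d+\.{stream_re}"
                            rf"\.{re.escape(step)}\.{re.escape(data_type)}"
                            # Assumption: any earlier tags, then a campaign tag last.
                            rf"\.(?:[a-z]\d+_)*(?:{tag_re})$"
                        )
    return patterns


definition = {"MC11c": {"mc11_7TeV": {"*": {"merge": {"AOD": ["r2993", "r3109", "r3063"]}}}}}
for pattern in campaign_regexes(definition):
    print(pattern)
```

Because the patterns are regenerated from the current definition each time, a client following this approach would not need AMI to mark individual datasets as campaign members.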

Messy Tag  prodstep coupling I would have liked to be able to say to a client "This tag type does not go with the dataType/prodStep you provided". But the use of tags, and even prodSteps is too messy. (Double use of s tags and r tags in particular) So I am only checking that prodstep is declared and that a tag exists at the moment. I will add a warning "This tag is already in another camapign"

Next steps
Make a web interface (cf. the Period definition interface).
Test it. Who?
Document.
Release…
