QCDml Tutorial
How to mark up your configurations
Chris Maynard

Contents
• FAQs on using XML schema
• Defining QCDml
• Namespaces and validation
• Example XML IDs
• Ensemble and config
• Ensemble: actions, algorithms and management metadata
• Config: what goes where
• Babble about BinX
• Metadata catalogue demo

FAQs about XML schema
• What is XML schema?
  • A collection of rules for XML documents
  • An XML schema is itself an XML document
• Why do we need an XML schema?
  • So that computers can read and understand XML instance documents (IDs)
  • <length>16</length>
  • The meaning of length is context dependent
• Do I need to learn XML schema?
  • No. A schema makes it easier to produce XML

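To make the point about context concrete, here is a minimal Python sketch. The element names `run`, `lattice` and `trajectory` and the values are hypothetical and not taken from QCDml; the only point is that the same tag `<length>` means different things under different parents, which is what a schema pins down.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <lattice> and <trajectory> are illustrative names,
# not QCDml elements.
DOC = """
<run>
  <lattice><length>16</length></lattice>
  <trajectory><length>0.5</length></trajectory>
</run>
"""

def lengths_by_context(xml_text):
    """Map each parent tag to the text of its <length> child."""
    root = ET.fromstring(xml_text)
    return {
        parent.tag: parent.find("length").text
        for parent in root
        if parent.find("length") is not None
    }

contexts = lengths_by_context(DOC)
print(contexts)
```

Without knowing the parent, a program cannot tell whether 16 is a lattice extent or something else entirely.
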
QCDml 1.0
• Metadata split into two schemata
  • Ensemble XML <markovChain/>
  • Config XML <gaugeConfiguration/>
  • N.B. element names use the lowerCamelCase convention
• ILDG website for XML schema files
  • http://www.lqcd.org/ildg
  • Go to Metadata and follow links
  • Version 1.0 online and ready to use

Namespaces
• Example XML ID for UKQCD data
• XML namespace, as defined by the W3C:
  • A collection of names identified by a URI reference

First namespace
• URI defines the namespace for QCDml
• This is the default namespace
• All elements of QCDml belong to this namespace

Second namespace
• Namespace of XML schema itself
• Prefix xsi: for names defined by XML schema
• XML ID is valid against W3C XML schema

SchemaLocation
• The namespace of the schema
• The file which contains the schema
• The namespace URI can be the URL of the schema instance, but this is not compulsory

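The two namespaces and the schemaLocation pair can be sketched in Python as follows. The QCDml namespace URI, the schema file URL and the ensemble URI are placeholders invented for this example (substitute the real ones from the ILDG website); the child element name `markovChainURI` is likewise an assumption. Only the XML Schema instance namespace is the standard W3C one.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: placeholder URIs, not the real QCDml namespace or schema file.
QCDML_NS = "http://example.org/QCDml/ensemble"
SCHEMA_FILE = "http://example.org/QCDml/ensemble.xsd"
# Standard W3C namespace for schema-instance attributes such as schemaLocation.
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

DOC = f"""<markovChain xmlns="{QCDML_NS}"
    xmlns:xsi="{XSI_NS}"
    xsi:schemaLocation="{QCDML_NS} {SCHEMA_FILE}">
  <markovChainURI>http://example.org/ildg/some-ensemble</markovChainURI>
</markovChain>"""

root = ET.fromstring(DOC)

# ElementTree expands namespaced names to "{uri}local".
namespace, local_name = root.tag[1:].split("}")

# schemaLocation is a whitespace-separated pair: namespace, then schema file.
schema_ns, schema_file = root.get(f"{{{XSI_NS}}}schemaLocation").split()

# Child elements without a prefix inherit the default namespace.
ensemble_uri = root.find(f"{{{QCDML_NS}}}markovChainURI").text

print(namespace, local_name, schema_ns, schema_file, ensemble_uri)
```

Note that nothing forces the namespace URI and the schema file URL to coincide; the parser treats the namespace purely as a name.
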
Logical filename
• Unique URI for a file in a namespace
• Uniquely identifies this ensemble in the ILDG namespace

Validation
• Verify that the XML ID is valid against a schema
• Schema-aware applications can use the XML ID
• Can write XML in vi, emacs etc.
• CMM uses XMLSpy for schema and ID manipulation
  • Built-in validator; can create an XML ID from a schema
• http://www.w3.org/XML/Schema
• Many different tools

QCDml Ensemble <action/>
UML representation of schema
Split into quark and gluon sections

Ensemble XML - actions
• Inheritance tree: check for your action in the schema

Which elements?
• Schema defines required elements
• UKQCD NP clover

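As a rough illustration of what "required elements" means in practice, the sketch below hand-codes a list of required children and reports which are missing. This is a stand-in only: Python's standard library does not perform XML Schema validation, so a schema-aware tool is still needed for the real check. The element names `cloverQuarkAction`, `kappa` and `cSW` and the numbers are assumptions for the example; the authoritative list lives in the QCDml schema.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: illustrative required children for a clover quark action.
REQUIRED = ("kappa", "cSW")

def missing_elements(xml_text, required=REQUIRED):
    """Return the required child tags that are absent from the root element."""
    root = ET.fromstring(xml_text)
    present = {child.tag for child in root}
    return [tag for tag in required if tag not in present]

# Placeholder values, not real run parameters.
complete = "<cloverQuarkAction><kappa>0.13</kappa><cSW>2.0</cSW></cloverQuarkAction>"
incomplete = "<cloverQuarkAction><kappa>0.13</kappa></cloverQuarkAction>"

print(missing_elements(complete))
print(missing_elements(incomplete))
```
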
UKQCD Ensemble example
Glossary: not computer readable
How cSW was determined
References etc.

NumberOfFlavours
Number of degenerate flavours for which these coupling values apply

MILC 2+1 staggered Ensemble
<couplings/> is array valued
Non-degenerate flavours shown with different couplings
Mass 0.02
Mass 0.05

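One way to picture an array-valued `<couplings/>` is as a list with one entry per set of degenerate flavours. The Python sketch below assumes that "2+1" means two degenerate flavours at mass 0.02 and one flavour at mass 0.05, and the child element names `elem` and `mass` are guesses for illustration, not copied from the schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class QuarkCoupling:
    number_of_flavours: int  # degenerate flavours sharing this coupling
    mass: float

# ASSUMPTION: 2 flavours at 0.02 plus 1 flavour at 0.05.
couplings = [QuarkCoupling(2, 0.02), QuarkCoupling(1, 0.05)]

def to_xml(couplings):
    """Serialise the list as one array element per coupling (names hypothetical)."""
    root = ET.Element("couplings")
    for c in couplings:
        elem = ET.SubElement(root, "elem")
        ET.SubElement(elem, "numberOfFlavours").text = str(c.number_of_flavours)
        ET.SubElement(elem, "mass").text = str(c.mass)
    return ET.tostring(root, encoding="unicode")

total_flavours = sum(c.number_of_flavours for c in couplings)
xml_text = to_xml(couplings)
print(total_flavours)
print(xml_text)
```
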
Management
• Metadata created when Ensemble registered with ILDG
• Yet-to-be-created middleware will do this

Algorithm
• Algorithmic metadata split between ensemble and algorithm
• Most metadata is unconstrained parameter <name/> <value/> pairs
• Relevant information can be found
  • Glossary document for references etc.
• Hierarchical structure for algorithms is
  • difficult to create
  • difficult to make extensible

Algorithm: Example
Glossary for detailed information
Unconstrained parameter <name/> <value/> pairs

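Because the pairs are unconstrained, a reader can only treat them as strings and leave interpretation to the glossary. A minimal Python sketch, with the wrapper element `parameters` and the parameter names and values all invented for illustration:

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: wrapper element, parameter names and values are hypothetical.
DOC = """
<algorithm>
  <parameters><name>stepSize</name><value>0.0125</value></parameters>
  <parameters><name>solverResidue</name><value>1e-7</value></parameters>
</algorithm>
"""

def read_parameters(xml_text):
    """Collect <name/> <value/> pairs into a dict, keeping values as strings."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("parameters")}

params = read_parameters(DOC)
print(params)
```

Values stay as text on purpose: the schema does not say whether a given parameter is an integer, a float or a label.
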
Config XML
Machine and code details
In principle these could be different for configurations in the same ensemble

Config Management
Checksum for config binary
Zeroth <revision/> records generation of the data, as this occurs before submission to ILDG

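A checksum of the binary can be computed in a streaming fashion so that large configurations need not fit in memory. The slide does not say which checksum algorithm is used, so the sketch below uses `zlib.crc32` purely as a stand-in; the function name `crc32_of_stream` is made up for this example.

```python
import io
import zlib

def crc32_of_stream(stream, chunk_size=1 << 20):
    """CRC-32 of a binary stream, read in chunks.

    ASSUMPTION: zlib's CRC-32 is a stand-in; the checksum actually required
    for submission may differ.
    """
    crc = 0
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF

# In-memory bytes stand in for a configuration file opened with open(path, "rb").
checksum = crc32_of_stream(io.BytesIO(b"123456789"))
print(f"{checksum:08x}")
```
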
Precision
Precision (double or float) in which the calculation was done

markovStep
Logical file name of the ensemble in the ILDG namespace

dataLFN
Logical file name of the configuration in the ILDG namespace

The Markov chain
Where the configuration is in the trajectory of the Markov chain

avePlaquette
Very useful metadata: can be used to check that data transformations are correct

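The idea of using the plaquette as a transformation check can be sketched with a toy model. The Python below uses a two-dimensional U(1) lattice rather than a real four-dimensional SU(3) configuration, and both storage layouts are invented for the example. It converts the links from one layout to another, recomputes the average plaquette, and shows that a reader with the site indices swapped gives a different number.

```python
import cmath
import math
import random

L = 4  # toy lattice extent in each of the two dimensions

def average_plaquette(link):
    """Average of Re[U_x(n) U_y(n+x) conj(U_x(n+y)) conj(U_y(n))] over all sites.

    link(x, y, mu) returns the U(1) link leaving site (x, y) in direction mu.
    """
    total = 0.0
    for x in range(L):
        for y in range(L):
            xp, yp = (x + 1) % L, (y + 1) % L
            p = link(x, y, 0) * link(xp, y, 1)
            p *= link(x, yp, 0).conjugate() * link(x, y, 1).conjugate()
            total += p.real
    return total / (L * L)

random.seed(0)
# Original layout (hypothetical): site-major, index = (x * L + y) * 2 + mu.
original = [cmath.exp(1j * random.uniform(-math.pi, math.pi)) for _ in range(2 * L * L)]

# Converted layout (hypothetical): direction-major, index = mu * L * L + y * L + x.
converted = [None] * (2 * L * L)
for x in range(L):
    for y in range(L):
        for mu in range(2):
            converted[mu * L * L + y * L + x] = original[(x * L + y) * 2 + mu]

plaq_before = average_plaquette(lambda x, y, mu: original[(x * L + y) * 2 + mu])
plaq_after = average_plaquette(lambda x, y, mu: converted[mu * L * L + y * L + x])
# A deliberately wrong reader with x and y swapped.
plaq_wrong = average_plaquette(lambda x, y, mu: converted[mu * L * L + x * L + y])

print(plaq_before, plaq_after, plaq_wrong)
```

A correct conversion reproduces the stored value; a mismatch is a cheap early warning that the array order or byte layout was mishandled.
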
Config: UKQCD example
Application codes can either write this info directly as QCDml
Or a tool can convert their IO to QCDml

BinX
• XML markup for binary data
• Library for manipulating marked up data
• Production codes do not use BinX library
  • But easy to mark up data format in BinX style
• ILDG middleware can use BinX for data manipulations
• http://www.edikt.org/binx
• BinX under discussion by Middleware + Metadata WG for file format

Gauge config BinX
Small
Written once per ensemble
Write code on top of BinX library
Change array order
2x3 3x3
Average plaquette
ILDG BinX based gauge config manipulator?

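Reading "2x3 3x3" as conversion between a compressed two-row storage and the full 3x3 matrix (an interpretation, since the slide gives no detail), the reconstruction step can be sketched in plain Python. It relies on the standard property that the third row of a special unitary 3x3 matrix is the complex conjugate of the cross product of the first two rows. The function name is made up, and this does not use the BinX library.

```python
import cmath

def complete_su3(row0, row1):
    """Rebuild a full 3x3 SU(3) matrix from its first two rows.

    The third row is the complex conjugate of the cross product
    of the first two rows.
    """
    a, b = row0, row1
    row2 = [
        (a[1] * b[2] - a[2] * b[1]).conjugate(),
        (a[2] * b[0] - a[0] * b[2]).conjugate(),
        (a[0] * b[1] - a[1] * b[0]).conjugate(),
    ]
    return [list(a), list(b), row2]

# Diagonal SU(3) example: the phases must multiply to one.
alpha, beta = 0.3, 0.5
full = complete_su3(
    [cmath.exp(1j * alpha), 0j, 0j],
    [0j, cmath.exp(1j * beta), 0j],
)
print(full[2])
```
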
Correlator data
Compact
No standard shape to correlators
BinX will read in any shape

Array stripper
BinX + BJ’s XPath reader
Code reads this XML
Produces single slice array in text/XML
From any size/shape array
Schema for correlator channels
ILDG middleware extracts channel from any correlator

Correlator dictionary
• Possible QCDml extension
• Correlator AP code knows channel details
• IO AP writes dictionary
  • Channel n is zero-momentum pion
• User requests pion
• Stripper reads dictionary to find pion
• Pulls channel n from correlator
• Very easy to read other people's data!

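The dictionary lookup described above can be sketched in a few lines of Python. Channel names, indices and the correlator numbers are all invented for illustration, and the in-memory layout `correlator[channel][timeslice]` is an assumption; a real stripper would read the layout from the BinX markup.

```python
# ASSUMPTION: channel names and indices are hypothetical.
dictionary = {"pion_p0": 2, "rho_p0": 5}

# Fake correlator data: 6 channels, 4 timeslices each.
correlator = [[float(c * 10 + t) for t in range(4)] for c in range(6)]

def strip_channel(correlator, dictionary, name):
    """Look up the channel index by name and return that channel's time series."""
    return correlator[dictionary[name]]

pion = strip_channel(correlator, dictionary, "pion_p0")
print(pion)
```

The user never needs to know that the pion lives in channel 2; that knowledge stays with the code that wrote the data.
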
Metadata demonstration
• UKQCD metadata catalogue
• Browser is based on OGSA-DAI
  • Open source
  • You can get it at www.forge.nesc.ac.uk
• Browser reads the schema
  • Build XPath query graphically
• Result handler
  • Display XML and GET data
  • Render web page of results?
  • Create XML IDs?

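The kind of XPath query the browser builds can be imitated with Python's standard library, which supports a subset of XPath. This is not the catalogue's own interface: the toy catalogue below, its wrapper element and its logical file names are made up, and only the query pattern is the point.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: a toy catalogue; wrapper element and values are illustrative.
CATALOGUE = """
<catalogue>
  <gaugeConfiguration>
    <dataLFN>lfn://example/ensembleA/cfg_000100</dataLFN>
    <precision>double</precision>
  </gaugeConfiguration>
  <gaugeConfiguration>
    <dataLFN>lfn://example/ensembleA/cfg_000110</dataLFN>
    <precision>float</precision>
  </gaugeConfiguration>
</catalogue>
"""

root = ET.fromstring(CATALOGUE)

# Select the logical file names of all configurations stored in double precision.
query = "./gaugeConfiguration[precision='double']/dataLFN"
matches = [elem.text for elem in root.findall(query)]
print(matches)
```
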
ILDG metadata
• ILDG proposal:
  • All collaborations publish metadata
• Example method
  • UKQCD metadata catalogue access is not authenticated
  • Anyone can read it
• ILDG aggregation of metadata catalogues
• Mark up data in QCDml
  • No extra effort required