Data Format Description Language (DFDL) WG


Presentation Transcript


  1. Data Format Description Language (DFDL) WG Martin Westhead EPCC, University of Edinburgh M.Westhead@epcc.ed.ac.uk

  2. Overview
     • Background
     • Motivation
     • Approach
     • Current status

  3. Motivation
     • There will never be a standard data format
       • E.g. XML – verbose, tree-based, explicit structure
       • Legacy formats
       • Application specific formats
     • One size will never fit all
     • But could we provide a language for describing formats?
       • Transparency of physical representation
       • Automatic format conversion
       • Unambiguous description of data

  4. There’s more…
     Explicit structure enables:
     • Standard transformation to/from XML representation
       • Could allow application to read/write XML
       • But provide underlying efficient binary representation
     • Data stream/file becomes database
       • Point to parts of the structure
       • Extract parts of the structure
       • Modify parts of the structure
       • Integrate parts of different structures
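The "transformation to/from XML representation" on slide 4 can be illustrated with a small round trip. This is only a sketch under stated assumptions: the record layout (`id`, `temperature`), the `FIELDS` table, and the function names `binary_to_xml` and `xml_to_binary` are hypothetical and do not come from the slides or from any DFDL draft.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record layout (an assumption, not from the slides):
# each field is a name plus a struct format code.
FIELDS = [("id", "i"), ("temperature", "d")]
LAYOUT = "<" + "".join(code for _, code in FIELDS)  # little-endian, no padding


def binary_to_xml(data: bytes) -> str:
    """Unpack one binary record and render it as an XML string."""
    values = struct.unpack(LAYOUT, data)
    root = ET.Element("record")
    for (name, _), value in zip(FIELDS, values):
        ET.SubElement(root, name).text = repr(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_binary(text: str) -> bytes:
    """Parse the XML form and pack it back into the binary layout."""
    root = ET.fromstring(text)
    values = []
    for name, code in FIELDS:
        raw = root.find(name).text
        values.append(int(raw) if code == "i" else float(raw))
    return struct.pack(LAYOUT, *values)


packed = struct.pack(LAYOUT, 7, 21.5)
as_xml = binary_to_xml(packed)
round_trip = xml_to_binary(as_xml)
```

An application could work with the XML view while the 12-byte binary record remains the stored form, which is the trade-off the slide points at.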

  5. And more…
     • Generic tools possible
       • Browsing
       • Conversion and transformation
     • Annotation of data
       • E.g. identify bits that depict hurricane in an image
       • Enables general semantic labels, many ontologies could be developed, e.g.:
         • S.I. units, SQL types, Time
         • Community specific labels, “starClass = whiteDwarf”
         • Application specific labels, “nodeColour = green”
     • Could lead to a standard transformation language

  6. Not fairy tales
     • Based on implemented work
       • BinX http://www.edikt.org/binx/
       • BFD, part of the Scientific Annotation Middleware project (http://www.scidac.org/SAM/)
       • ESML http://esml.itsc.uah.edu/
     • Generalized and extended a little
       • Clear semantics
       • Foundation for extensibility

  7. Layers
     [Layer diagram. Labels: Fortran, C/C++, Java; API; Data Model (Structure, Primitives); Transformations; Binary file, Text file, Data stream]

  8. Approach
     • Data model
       • XML infoset
       • Obvious way to describe it: XSD
     • API
       • DOM/SAX
       • Extended to provide non-string value access
     • Transformations
       • Ontology of predefined transformations (extensible)
       • XML language for:
         • Composition
         • Attaching to file contents
         • Populating the model
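Slide 8 mentions a DOM/SAX style API "extended to provide non-string value access". One way to picture that idea is an event-driven parser whose callbacks receive native numeric values instead of text. The sketch below is an assumption for illustration only: the `TypedHandler` class, its method names, the `parse_samples` function, and the binary layout (a big-endian `uint16` count followed by that many `float32` samples) are all hypothetical and are not part of SAX or of any DFDL draft.

```python
import struct


class TypedHandler:
    """Hypothetical SAX-like handler: values arrive as native Python types."""

    def __init__(self):
        self.events = []

    def start_element(self, name):
        self.events.append(("start", name))

    def value(self, name, val):
        # `val` is an int or float here, not a string as in plain SAX.
        self.events.append(("value", name, val))

    def end_element(self, name):
        self.events.append(("end", name))


def parse_samples(data: bytes, handler) -> None:
    """Walk an assumed layout: uint16 count, then `count` float32 samples."""
    handler.start_element("samples")
    (count,) = struct.unpack_from(">H", data, 0)
    handler.value("count", count)
    offset = 2
    for _ in range(count):
        (sample,) = struct.unpack_from(">f", data, offset)
        handler.value("sample", sample)
        offset += 4
    handler.end_element("samples")


handler = TypedHandler()
parse_samples(struct.pack(">Hff", 2, 1.5, -2.0), handler)
```

The point of the design is that the consumer never has to convert strings back into numbers, which matters when the source is a large binary array.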

  9. Or to put it another way…
     • XSD defines models for XML documents
     • DFDL extends XSD to define models for data in different formats
     • Efficient read/write access to binary and text data sources using DOM/SAX
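To make "DFDL extends XSD to define models for data in different formats" concrete, here is a minimal sketch in which an XSD fragment carries one extra attribute describing the physical representation, and a reader uses the schema to decode bytes. Everything beyond standard XSD is an assumption: the `urn:example:format` namespace, the `byteOrder` attribute, the `STRUCT_CODES` mapping, and the `read_record` function are hypothetical and should not be read as DFDL syntax.

```python
import struct
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
FMT = "urn:example:format"  # hypothetical annotation namespace, not DFDL syntax

SCHEMA = f"""
<xs:schema xmlns:xs="{XS}" xmlns:fmt="{FMT}">
  <xs:element name="reading">
    <xs:complexType>
      <xs:sequence fmt:byteOrder="bigEndian">
        <xs:element name="station" type="xs:short"/>
        <xs:element name="pressure" type="xs:double"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Simplification: type names are matched as literal strings with the "xs:" prefix.
STRUCT_CODES = {"xs:short": "h", "xs:int": "i", "xs:double": "d"}


def read_record(schema_text: str, data: bytes) -> dict:
    """Decode one binary record using the field list found in the schema."""
    root = ET.fromstring(schema_text.strip())
    seq = root.find(f".//{{{XS}}}sequence")
    order = ">" if seq.get(f"{{{FMT}}}byteOrder") == "bigEndian" else "<"
    fields = [
        (elem.get("name"), STRUCT_CODES[elem.get("type")])
        for elem in seq.findall(f"{{{XS}}}element")
    ]
    layout = order + "".join(code for _, code in fields)
    values = struct.unpack(layout, data)
    return dict(zip((name for name, _ in fields), values))


record = read_record(SCHEMA, struct.pack(">hd", 12, 1013.25))
```

The logical model (element names and types) stays ordinary XSD, while the added attribute is the only place that talks about bytes. That separation is the idea the slide describes, shown here in a deliberately reduced form.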

  10. Current status
      • WG status
        • Formed 1 year ago
        • 6 months on a false start
        • First draft expected GGF11
      • Key discussion:
        • Mapping/transformation language
        • Linking mechanisms
        • XML representation
        • Flexibility

  11. Getting involved
      • Webpages: http://forge.gridforum.org/projects/dfdl-wg/
      • Mailing list: dfdl-wg@gridforum.org
      • My address: M.Westhead@epcc.ed.ac.uk
