XML and XML in DLESE
Katy Ginger, November 2003
XML Purpose
• Provide a container for data that is presentation- and platform-independent
• Provide a container that is flexible and extensible; the user defines the tags and content
• Provide a single container for data that serves multiple purposes in a variety of software and web applications
Note: XML databases now exist
What is XML data?
• Is held in instance documents
• Consists of user-defined tags
• Is well-formed and valid
• Has content that can be defined and controlled
See the DLESE Annotation Metadata Record example
Built-in Primitive Types
• String: e.g. string
• Boolean and binary: e.g. boolean, hexBinary, base64Binary
• Numeric: e.g. decimal (from which integer is derived), float, double
• Date/time: e.g. date, dateTime, duration, time
Well-formed and valid XML
Correct
    <car>
      <make>Dodge</make>
      <model>Spirit</model>
      <year>1994</year>
      <owner>
        <name>you</name>
        <plate>CO</plate>
      </owner>
    </car>
Incorrect (<year> is never closed, and </car> closes before </owner>)
    <car>
      <make>Dodge</make>
      <model>Spirit</model>
      <year>1994
      <owner>
        <plate>CO</plate>
        <name>you</name>
      </car>
    </owner>
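A minimal sketch of checking the two car records for well-formedness, using Python's standard `xml.etree.ElementTree` parser. The helper name `is_well_formed` is an assumption made for illustration. Note that this tests well-formedness only: checking validity needs a DTD or schema and a validating parser, which the Python standard library does not provide.

```python
import xml.etree.ElementTree as ET

# The "Correct" record from the slide: every tag is closed and properly nested.
CORRECT = """<car>
  <make>Dodge</make>
  <model>Spirit</model>
  <year>1994</year>
  <owner>
    <name>you</name>
    <plate>CO</plate>
  </owner>
</car>"""

# The "Incorrect" record: <year> is never closed and </car> precedes </owner>.
INCORRECT = """<car>
  <make>Dodge</make>
  <model>Spirit</model>
  <year>1994
  <owner>
    <plate>CO</plate>
    <name>you</name>
  </car>
</owner>"""


def is_well_formed(text):
    """Return True if text parses as XML, False if the parser reports a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(CORRECT))
print(is_well_formed(INCORRECT))
```

Any conforming XML parser must reject the second record, so no downstream tool (including an XSLT processor) can work with it until the nesting is fixed.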
DTD, Schemas & Namespaces
DTD (Document Type Definition)
• Describes the elements of XML instance documents
• Is not itself written as well-formed XML
• Offers some data-typing
• Makes namespaces harder to deal with
Schemas
• Describe the elements of XML instance documents
• Are themselves well-formed XML
• Offer strong data-typing
• Make namespaces easier to deal with
Namespace: a collection of related element names identified by a name label (e.g. dc:title, where dc stands for Dublin Core)
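As a rough illustration of the dc:title example, the sketch below parses a small record with Python's `xml.etree.ElementTree` and looks up a namespaced and a non-namespaced `title` separately. The `record` wrapper element and its text content are hypothetical; the namespace URI is the one published for the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Dublin Core element set; "dc" is only a local label for it.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record holding two different "title" elements.
RECORD = f"""<record xmlns:dc="{DC_NS}">
  <dc:title>Example resource</dc:title>
  <title>Local title, no namespace</title>
</record>"""

root = ET.fromstring(RECORD)

# The prefix-to-URI mapping tells find() which "title" is meant.
dc_title = root.find("dc:title", {"dc": DC_NS})
plain_title = root.find("title")

# Internally the parser stores the full URI, not the prefix.
print(dc_title.tag)
print(dc_title.text)
print(plain_title.text)
```

The two elements share a local name but never collide, because the parser identifies each one by its namespace URI plus local name.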
XML Schema Design Philosophy
• Decide where to apply XML; it’s good for:
    • Data archiving
    • Message passing
    • Presentation documents
• Decide between global & local XML declarations:
    • Russian Doll Design
    • Salami Slice Design
    • Venetian Blind Model
Russian Doll Design
• Schema mirrors the structure of the instance doc
• Elements are declared inside their parent elements
• All elements are local in scope
• Elements may appear multiple times but can’t be referenced elsewhere
• Changes to an element affect only the content model (parent element) of the element being changed
• Hides namespace complexities
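A sketch of what a Russian Doll schema could look like for the car record from the well-formed XML slide. The schema itself is an assumption written for illustration (the `xs:gYear` type for `year` is one possible choice), and the Python code only parses it to show that a single element is declared globally while all the others are nested inside it.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical Russian Doll schema: the nesting mirrors the instance document.
RUSSIAN_DOLL = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="car">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="make" type="xs:string"/>
        <xs:element name="model" type="xs:string"/>
        <xs:element name="year" type="xs:gYear"/>
        <xs:element name="owner">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="plate" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(RUSSIAN_DOLL)

# Global declarations are direct children of <xs:schema>.
global_elements = [e.get("name") for e in schema.findall(f"{{{XS}}}element")]
# iter() walks the whole tree, so it also finds the locally declared elements.
all_elements = [e.get("name") for e in schema.iter(f"{{{XS}}}element")]

print(global_elements)
print(all_elements)
```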
Salami Slice Design
• Each element/attribute is declared globally
• Content models are then pieced together through references
• Elements can be used anywhere and in multiple schemas
• Changes to an element affect it everywhere the element is used
• Does not hide namespaces
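A sketch of a Salami Slice schema for the car record from the well-formed XML slide. The schema is an assumption written for illustration; the Python code parses it to show that every element is declared globally and that the content models are assembled from `ref` attributes pointing at those declarations.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical Salami Slice schema: every element is a global declaration.
SALAMI_SLICE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="make" type="xs:string"/>
  <xs:element name="model" type="xs:string"/>
  <xs:element name="year" type="xs:gYear"/>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="plate" type="xs:string"/>
  <xs:element name="owner">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="plate"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="car">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="make"/>
        <xs:element ref="model"/>
        <xs:element ref="year"/>
        <xs:element ref="owner"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(SALAMI_SLICE)

# Direct children of <xs:schema> are the global declarations.
global_elements = [e.get("name") for e in schema.findall(f"{{{XS}}}element")]
# Anything with a ref attribute reuses one of those global declarations.
references = [e.get("ref") for e in schema.iter(f"{{{XS}}}element") if e.get("ref")]

print(global_elements)
print(references)
```

Because `owner` is global here, another schema could reference it directly, which is the reuse the slide describes; the cost is that changing `owner` changes it for every referrer.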
Venetian Blind Model
• Elements/attributes are defined as types (simple or complex)
• Content models are made of types
• Types are reusable and extendable
• Changes to an element affect its type and every place the type is used
• Namespaces can be hidden or not hidden
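A sketch of a Venetian Blind schema for the car record from the well-formed XML slide. The schema and the type names `CarType` and `OwnerType` are assumptions made for illustration; the Python code parses the schema to show that the reusable pieces are named types, with only the root element declared globally.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical Venetian Blind schema: named types carry the content models.
VENETIAN_BLIND = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="OwnerType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="plate" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="CarType">
    <xs:sequence>
      <xs:element name="make" type="xs:string"/>
      <xs:element name="model" type="xs:string"/>
      <xs:element name="year" type="xs:gYear"/>
      <xs:element name="owner" type="OwnerType"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="car" type="CarType"/>
</xs:schema>"""

schema = ET.fromstring(VENETIAN_BLIND)

# Global named types and global elements are direct children of <xs:schema>.
global_types = [t.get("name") for t in schema.findall(f"{{{XS}}}complexType")]
global_elements = [e.get("name") for e in schema.findall(f"{{{XS}}}element")]
# Find every element whose content model comes from the reusable OwnerType.
owner_type_users = [
    e.get("name")
    for e in schema.iter(f"{{{XS}}}element")
    if e.get("type") == "OwnerType"
]

print(global_types)
print(global_elements)
print(owner_type_users)
```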
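How to control tag content
• Use the restriction and enumeration elements of XML Schema to limit content to a fixed set of values
• Use regular expressions (the pattern element) to constrain the form of the content
See the DLESE Annotation Metadata Record example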
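A sketch of both techniques, not taken from the DLESE schemas: the type names `PlateStateType` and `YearType`, the allowed values, and the year pattern are all assumptions. Since the Python standard library has no schema validator, the code reads the facets out of the schema and applies them by hand. This works here because the pattern is simple enough to mean the same thing in XML Schema and in Python's `re` module; that is not true of every pattern.

```python
import re
import xml.etree.ElementTree as ET

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

# Hypothetical simple types: one enumeration, one regular-expression pattern.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="PlateStateType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="CO"/>
      <xs:enumeration value="WY"/>
      <xs:enumeration value="NM"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="YearType">
    <xs:restriction base="xs:string">
      <xs:pattern value="(19|20)[0-9]{2}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""

schema = ET.fromstring(SCHEMA)


def facet_values(type_name, facet):
    """Return the value attributes of one kind of facet on a named simple type."""
    path = f"xs:simpleType[@name='{type_name}']/xs:restriction/xs:{facet}"
    return [f.get("value") for f in schema.findall(path, NS)]


allowed_states = facet_values("PlateStateType", "enumeration")
year_pattern = facet_values("YearType", "pattern")[0]


def state_ok(value):
    """Enumeration check: the value must be one of the listed values."""
    return value in allowed_states


def year_ok(value):
    """Pattern check: schema patterns must match the whole value, hence fullmatch."""
    return re.fullmatch(year_pattern, value) is not None


print(state_ok("CO"), state_ok("TX"))
print(year_ok("1994"), year_ok("94"))
```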
How DLESE uses XML
• DLESE systems act on metadata
• All DLESE metadata is stored as flat XML files
• DLESE harvests metadata XML files from other digital libraries
• DLESE provides metadata XML files to other digital libraries
• DLESE transitioned to schemas in Spring 2003
• DLESE schemas use the Venetian Blind Model
XSLT (Extensible Stylesheet Language Transformations)
• Acts on valid and well-formed XML documents such as instance docs and schemas
• Used to create new text, HTML or XML docs
• DLESE uses it to crosswalk the DLESE metadata format to the Dublin Core metadata format
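To give a feel for what a crosswalk does, here is a sketch in Python. It is not XSLT: the Python standard library has no XSLT processor, so the mapping is done with `xml.etree.ElementTree` as a stand-in. The source record, its element names, the `CROSSWALK` table, and the `metadata` wrapper element are all hypothetical and do not reflect the actual DLESE format or the DLESE stylesheet.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialize with the usual dc: prefix

# Hypothetical mapping: source element name -> Dublin Core element name.
CROSSWALK = {"title": "title", "summary": "description", "author": "creator"}

# Hypothetical source record; not the real DLESE metadata format.
SOURCE = """<record>
  <title>Plate tectonics animation</title>
  <summary>An animation of sea-floor spreading.</summary>
  <author>Example Author</author>
  <internalId>abc-123</internalId>
</record>"""


def to_dublin_core(source_xml):
    """Build a new element tree holding the Dublin Core equivalents of mapped fields."""
    src = ET.fromstring(source_xml)
    out = ET.Element("metadata")
    for child in src:
        dc_name = CROSSWALK.get(child.tag)
        if dc_name is None:
            continue  # field has no equivalent in this mapping, so it is dropped
        ET.SubElement(out, f"{{{DC_NS}}}{dc_name}").text = child.text
    return out


dc_record = to_dublin_core(SOURCE)
serialized = ET.tostring(dc_record, encoding="unicode")
print(serialized)
```

An XSLT stylesheet expresses the same idea declaratively, with one template per source element, and runs in any XSLT processor rather than in application code.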