1 / 24

CSE 636 Data Integration

Learn about XML Schema, DTDs, XML namespaces, elements vs. types, local vs. global definitions, and using regular expressions in XML schema. Explore W3C recommendations and alternative proposals.

helengold
Download Presentation

CSE 636 Data Integration

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSE 636Data Integration XML Schema

  2. XML Schemas • W3C Recommendation: http://www.w3.org/XML/Schema • Generalizes DTDs • Uses XML syntax • Two documents: structure and datatypes • http://www.w3.org/TR/xmlschema-1 • http://www.w3.org/TR/xmlschema-2 • XML-Schema is very complex • often criticized • some alternative proposals

  3. XML Schemas <?xml version="1.0"?> <xsd:schemaxmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:elementname=“paper” type=“papertype”/> <xsd:complexTypename=“papertype”> <xsd:sequence> <xsd:elementname=“title” type=“xsd:string”/> <xsd:elementname=“author” minOccurs=“0”/> <xsd:choice> <xsd:elementname=“journal”/> <xsd:elementname=“conference”/> </xsd:choice> </xsd:sequence> </xsd:complexType> </xsd:schema> DTD: <!ELEMENT paper (title, author?, (journal|conference))>

  4. XML Namespaces • http://www.w3.org/TR/REC-xml-names (1/99) • Solve the problem of tag name conflicts • name ::= [prefix:]localpart <bookxmlns:isbn=“www.isbn-org.org/def”> <title> … </title> <number> 15 </number> <isbn:number> …. </isbn:number> </book>

  5. XML Namespaces defined here • Syntactic: <number> , <isbn:number> • Semantic: provide URL for schema <tagxmlns:mystyle = “http://…”> … <mystyle:title> … </mystyle:title> <mystyle:number> … </tag>

  6. Elements vs. Types in XML Schema <xsd:elementname=“person” type=“pType”/> <xsd:complexType name=“pType”> <xsd:sequence> <xsd:elementname=“name” type=“xsd:string”/> <xsd:elementname=“address”type=“xsd:string”/> </xsd:sequence></xsd:complexType> <xsd:elementname=“person”> <xsd:complexType> <xsd:sequence> <xsd:elementname=“name” type=“xsd:string”/> <xsd:elementname=“address”type=“xsd:string”/> </xsd:sequence> </xsd:complexType></xsd:element> DTD: <!ELEMENT person (name,address)>

  7. Elements vs. Types in XML Schema • Types: • Simple types (integers, strings, ...) • Complex types (regular expressions, like in DTDs) • Element-type-element alternation: • Root element has a complex type • That type is a regular expression of elements • Those elements have their complex types... • ... • On the leaves we have simple types

  8. Local vs. Global Types in XML Schema • Local type: • <xsd:elementname=“person”> [define locally the person’s type]</xsd:element> • Global type:<xsd:elementname=“person” type=“pType”/><xsd:complexType name=“pType”> [define here the type pType]</xsd:complexType> Global types: can be reused in other elements

  9. Local vs. Global Elementsin XML Schema • Local element: • <xsd:complexType name=“pType”> <xsd:sequence> <xsd:elementname=“address” type=“...”/>... </xsd:sequence></xsd:complexType> • Global element:<xsd:elementname=“address” type=“...”/><xsd:complexType name=“pType”> <xsd:sequence><xsd:elementref=“address”/> ... </xsd:sequence></xsd:complexType> Global elements: like in DTDs

  10. Regular Expressions in XML Schema Recall the element-type-element alternation: <xsd:complexType name=“....”> [regular expression on elements]</xsd:complexType> Regular expressions: <xsd:sequence> A B C </...> = A B C <xsd:choice> A B C </...> = A | B | C <xsd:group> A B C </...> = (A B C) <xsd:… minOccurs=“0” maxOccurs=“unbounded”>…</…>= (...)* <xsd:… minOccurs=“0” maxOccurs=“1”>…</…> = (...)?

  11. Local Names in XML Schema <xsd:elementname=“person”> <xsd:complexType> … <xsd:elementname=“name”> <xsd:complexType> <xsd:sequence> <xsd:elementname=“firstname” type=“xsd:string”/> <xsd:elementname=“lastname” type=“xsd:string”/> </xsd:sequence> </xsd:complexType> </xsd:element> … </xsd:complexType></xsd:element> <xsd:elementname=“product”> <xsd:complexType> … <xsd:elementname=“name” type=“xsd:string”/> </xsd:complexType></xsd:element> name has different meanings in person and in product

  12. Subtle Use of Local Names <xsd:elementname=“A” type=“oneB”/> <xsd:complexType name=“onlyAs”> <xsd:choice> <xsd:sequence> <xsd:elementname=“A” type=“onlyAs”/> <xsd:elementname=“A” type=“onlyAs”/> </xsd:sequence> <xsd:elementname=“A” type=“xsd:string”/> </xsd:choice></xsd:complexType> <xsd:complexType name=“oneB”> <xsd:choice> <xsd:elementname=“B” type=“xsd:string”/> <xsd:sequence> <xsd:elementname=“A” type=“onlyAs”/> <xsd:elementname=“A” type=“oneB”/> </xsd:sequence> <xsd:sequence> <xsd:elementname=“A” type=“oneB”/> <xsd:elementname=“A” type=“onlyAs”/> </xsd:sequence> </xsd:choice></xsd:complexType> Arbitrary deep binary tree with A elements, and a single B element

  13. Attributes in XML Schema <xsd:elementname=“paper” type=“papertype”/> <xsd:complexTypename=“papertype”> <xsd:sequence> <xsd:elementname=“title” type=“xsd:string”/> … </xsd:sequence> <xsd:attribute name=“language" type="xsd:NMTOKEN" fixed=“English"/> </xsd:complexType> • Attributes are associated to the type,not to the element • Only to complex types • More trouble if we want to add attributes to simple types

  14. Adding Attributes to Simple Types <xsd:elementname="B"> <xsd:complexType> <xsd:simpleContent> <xsd:extension base="xsd:string"> <xsd:attribute name="testAttr“type="xsd:string"/> </xsd:extension> </xsd:simpleContent> </xsd:complexType> </xsd:element>

  15. “Mixed” Content, “Any” Type • Better than in DTDs: can still enforce the type, but now may have text between any elements • Means anything is permitted there <xsd:complexTypemixed="true">… <xsd:elementname="anything" type="xsd:anyType"/>…

  16. “All” Group <xsd:complexTypename="PurchaseOrderType"> <xsd:all> <xsd:elementname="shipTo" type="USAddress"/> <xsd:elementname="billTo" type="USAddress"/> <xsd:elementref="comment" minOccurs="0"/> <xsd:elementname="items" type="Items"/> </xsd:all> <xsd:attributename="orderDate" type="xsd:date"/> </xsd:complexType> • Restrictions: • Only at top level • Has only elements • Each element occurs at most once • E.g. “comment” occurs 0 or 1 times

  17. Derived Types by Extensions <complexTypename="Address"> <sequence> <elementname="street" type="string"/> <elementname="city" type="string"/> </sequence> </complexType> <complexTypename="USAddress"> <complexContent> <extensionbase="ipo:Address"> <sequence> <elementname="state" type="ipo:USState"/> <elementname="zip" type="positiveInteger"/> </sequence> </extension> </complexContent> </complexType> • Corresponds to inheritance

  18. Derived Types by Restrictions • may restrict cardinalities, e.g. (0,infty) to (1,1) • may restrict choices • other restrictions… • Corresponds to set inclusion <complexContent> <restrictionbase="ipo:Items“> … [rewrite the entire content, with restrictions]… </restriction> </complexContent>

  19. Simple Types string token byte unsignedByte integer positiveInteger int larger than integer unsignedInt long short ... time dateTime duration date ID IDREF IDREFS

  20. Facets of Simple Types Examples length minLength maxLength pattern enumeration whiteSpace maxInclusive maxExclusive minInclusive minExclusive totalDigits fractionDigits • Facets = additional properties restricting a simple type • 15 facets defined by XML Schema

  21. Facets of Simple Types • Can further restrict a simple type by changing some facets • Restriction = subset

  22. Not so Simple Types • List types: • Union types • Restriction types <xsd:simpleTypename="listOfMyIntType"> <xsd:listitemType="myInteger"/> </xsd:simpleType> <listOfMyInt>20003 15037 95977 95945</listOfMyInt>

  23. Summary of XML Schema • Formal Expressive Power: • Can express precisely the regular tree languages (over unranked trees) • Lots of other stuff • Some form of inheritance • A “null” value • Large collection of data types

  24. References • Lecture Slides • Dan Suciu • http://www.cs.washington.edu/homes/suciu/COURSES/590DS/12xmlschema.htm • BRICS XML Tutorial • A. Moeller, M. Schwartzbach • http://www.brics.dk/~amoeller/XML/index.html • W3C's XML Schema homepage • http://www.w3.org/XML/Schema • XML School • http://www.w3schools.com

More Related