XML Validation II: Schemas • Robin Burke • ECT 360
Outline • Namespaces • Documents • Data types • XML Schemas • Elements • Attributes • Derived data types • RELAX NG
XML so far • Languages defined by DTDs • names assigned by designers • OK for standalone systems • Doesn't have • The ability to handle naming conflicts • The ability to partition work among different developers
Namespaces • A way to identify a set of labels • element / attribute names • attribute values • As belonging to a particular application
Example • recordings: title, artist (group | artist-name+), date, label • artworks: title, artist, date, exhibit • books: title, author, date, publisher
Problem • Want to create a list of items related to 50s Beat-era culture • includes music, art, literature • Could create a new DTD • better to reuse existing ones
Namespace idea • Associate a short prefix with an application • Schema or DTD • Use the prefix with a colon to "qualify" names • music:artist • art:artist • book:author
Namespace idea, cont'd • A namespace is an association between • a set of names • a unique identifier (URI) • a prefix used to identify them
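Namespace declaration • Part of element <html xmlns="http://www.w3.org/1999/xhtml"> • in this case, no prefix (default namespace) • With a prefix <book xmlns:book="http://bookpeople.com/book"> • Standalone <?xml:namespace ns="http://bookpeople.com/book" prefix="book"?> • from an early working draft only • not part of the Namespaces in XML recommendation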
Namespace URI • Not necessarily a URL you can fetch • there need not be any resource at the given location • just a unique identifier • URL-like identifiers are good • associated with an organization • must be unique on the Internet
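To make the prefix idea concrete, here is a minimal sketch of the Beat-era list drawing on all three vocabularies. The root element name, the wrapper elements (recording, artwork, book) and the music and art URIs are assumptions made for illustration; only the book URI appears in the slides.

```xml
<?xml version="1.0"?>
<!-- Hypothetical combined list; the example.org URIs are placeholders -->
<beat-culture
    xmlns:music="http://example.org/ns/music"
    xmlns:art="http://example.org/ns/art"
    xmlns:book="http://bookpeople.com/book">
  <music:recording>
    <music:title>Kind of Blue</music:title>
    <music:artist>
      <music:artist-name>Miles Davis</music:artist-name>
    </music:artist>
    <music:date>1959</music:date>
  </music:recording>
  <art:artwork>
    <art:title>Autumn Rhythm (Number 30)</art:title>
    <!-- same local name as music:artist, but a different namespace -->
    <art:artist>Jackson Pollock</art:artist>
    <art:date>1950</art:date>
  </art:artwork>
  <book:book>
    <book:title>On the Road</book:title>
    <book:author>Jack Kerouac</book:author>
    <book:date>1957</book:date>
  </book:book>
</beat-culture>
```

The two artist elements no longer conflict: a namespace-aware processor compares the URI plus the local name, not the prefix.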
Example • DTDs • Document • Problem • how to import the namespaces?
Solution • Fully-qualified names everywhere • yuk! • DTDs & namespaces don't work well together
XML so far • Languages defined by DTDs • contain text elements • string attributes • OK for text documents • Not enough for • Databases • Business process integration
Other DTD problems • Not XML • different syntax • different processor • No support for namespaces
Solution • Write language definition in XML • Allow more control over document contents • XML document becomes • a complex data type • XML language definition becomes • complex data type specification
XML Schema • Always a separate document • no internal option • Written in XML • very verbose • Can be complex
Schemas and namespaces • A schema • uses elements from one application • the XML Schema language • to define another • Namespaces are necessary • Namespaces apply to elements • not values • An unprefixed attribute is interpreted in the context of its element • the default namespace does not apply to it • can have attributes from different namespaces <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
Example 1, XML <grades assignment="Homework 1"> <grade> <student id="1234-12345">Jane Doe</student> <assigned-grade>A</assigned-grade> </grade> <grade> <student id="5432-54321">John Doe</student> <assigned-grade>B</assigned-grade> </grade> </grades>
Example 1, DTD <!ELEMENT grades (grade*)> <!ATTLIST grades assignment CDATA #IMPLIED> <!ELEMENT grade (student, assigned-grade)> <!ELEMENT student (#PCDATA)> <!ATTLIST student id CDATA #REQUIRED> <!ELEMENT assigned-grade (#PCDATA)>
Data types • grades • a collection of items of type grade • can never have more than 40 students • grade • a structure containing a student and an assigned grade • student • a structure containing an id and some text • probably should constrain the student id • assigned-grade is text • probably should constrain to A-D,F,I
Built-in types • Part of the schema language • Base types • 19 fundamental types • Examples: string, decimal • Derived types • 25 more types that use the base types • Examples: ID, positiveInteger
To declare an element <xs:element name="assigned-grade" type="xs:string" /> • Equivalent to <!ELEMENT assigned-grade (#PCDATA)>
Simple data type • A renaming of an existing data type <xs:element name="assigned-grade" type="xs:string" /> • Or a restriction of an existing type • strings beginning with "D" • more on this later
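An element declaration only works inside a schema document that binds the xs prefix. A minimal sketch of the complete file, assuming no target namespace:

```xml
<?xml version="1.0"?>
<!-- The xs prefix is bound to the XML Schema namespace on the root -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Text-only element, the counterpart of (#PCDATA) in the DTD -->
  <xs:element name="assigned-grade" type="xs:string" />
</xs:schema>
```

A document consisting of <assigned-grade>A</assigned-grade> would be valid against this schema.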
Complex datatype <xs:element name="name"> <xs:complexType> compositor element declarations attribute declarations </xs:complexType> </xs:element>
Compositor • sequence • choice • all
Sequence compositor • like "," in DTD • DTD <!ELEMENT foo (bar, baz)> • Schema <xs:element name="foo"> <xs:complexType> <xs:sequence> <xs:element ref="bar" /> <xs:element ref="baz" /> </xs:sequence> </xs:complexType> </xs:element>
Elements in sequences • Can specify optional / # of occurrences • ? <xs:element ref="bar" minOccurs="0" /> • * <xs:element ref="bar" minOccurs="0" maxOccurs="unbounded" /> • + <xs:element ref="bar" minOccurs="1" maxOccurs="unbounded" /> • What about... <xs:element ref="bar" minOccurs="2" maxOccurs="4" /> • Note: ref and type cannot be used on the same element declaration
Choice compositor • like "|" in DTD • DTD <!ELEMENT foo (bar | baz)> • Schema <xs:element name="foo"> <xs:complexType> <xs:choice> <xs:element ref="bar" /> <xs:element ref="baz" /> </xs:choice> </xs:complexType> </xs:element>
All compositor • no simple DTD equivalent • each child appears once, in any order • DTD must list every ordering <!ELEMENT foo ( (bar, baz) | (baz, bar) )> • Schema <xs:element name="foo"> <xs:complexType> <xs:all> <xs:element ref="bar" /> <xs:element ref="baz" /> </xs:all> </xs:complexType> </xs:element>
Nesting • Compositors can be combined • DTD <!ELEMENT foo ( (bar, baz) | (thud, grunt) )> • Schema <xs:element name="foo"> <xs:complexType> <xs:choice> <xs:sequence> <xs:element ref="bar" /> <xs:element ref="baz" /> </xs:sequence> <xs:sequence> <xs:element ref="thud" /> <xs:element ref="grunt" /> </xs:sequence> </xs:choice> </xs:complexType> </xs:element>
Example <!ELEMENT grades (grade*)> <!ELEMENT grade (student, assigned-grade)> <!ELEMENT student (#PCDATA)> <!ELEMENT assigned-grade (#PCDATA)>
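One way the four DTD declarations above might translate into a schema is sketched below. It assumes no target namespace and uses global declarations with ref throughout; attributes are left out because this version of the DTD has none.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- grades: zero or more grade elements, i.e. (grade*) -->
  <xs:element name="grades">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="grade" minOccurs="0" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- grade: a student followed by an assigned-grade -->
  <xs:element name="grade">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="student" />
        <xs:element ref="assigned-grade" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- both leaves are plain text, i.e. (#PCDATA) -->
  <xs:element name="student" type="xs:string" />
  <xs:element name="assigned-grade" type="xs:string" />
</xs:schema>
```

Replacing maxOccurs="unbounded" with maxOccurs="40" would express the 40-student limit, which the DTD cannot state.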
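Local naming • Suppose we want to reuse an element name • different place in the structure, different content • Example <!ELEMENT url-catalog (link*)> <!ELEMENT link (link, description?)> • outer link is a structure, inner link should be just the URL text • does not work as a DTD • only one declaration per element name • schema?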
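A schema can answer this with local element declarations: an element declared inside a complex type is scoped to that type, so the same name can carry different content at different depths. A minimal sketch, assuming the inner link holds a URL (xs:anyURI) and description holds plain text:

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="url-catalog">
    <xs:complexType>
      <xs:sequence>
        <!-- outer link: a structure, declared locally -->
        <xs:element name="link" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- inner link: same name, but simple content -->
              <xs:element name="link" type="xs:anyURI" />
              <xs:element name="description" type="xs:string" minOccurs="0" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The two link declarations live in different content models, so they do not clash.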
Using namespaces • Schema must say • to use schema namespace • what namespace it is defining • targetNamespace • Document must say • that it is using the Schema Instance namespace • what namespace(s) it is using • what prefix(es) are used • where to find the relevant schemas
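The checklist above can be sketched as a schema and instance pair. The namespace URI http://example.org/ns/grades and the file name grades.xsd are placeholders, and the grade content is simplified to a string to keep the example short.

```xml
<?xml version="1.0"?>
<!-- grades.xsd (hypothetical file name) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org/ns/grades"
           xmlns="http://example.org/ns/grades"
           elementFormDefault="qualified">
  <xs:element name="grades">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="grade" type="xs:string"
                    minOccurs="0" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

```xml
<?xml version="1.0"?>
<!-- Instance: declares its own namespace, the Schema Instance namespace,
     and a namespace / schema-location pair -->
<grades xmlns="http://example.org/ns/grades"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://example.org/ns/grades grades.xsd">
  <grade>A</grade>
</grades>
```

elementFormDefault="qualified" is a choice made here so that the locally declared grade element is also in the target namespace; without it the instance would need an unqualified grade.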
Multi-schema documents • Possible to validate multi-schema documents • Must use the xs:any element (a wildcard) to import a namespace • can't restrict to certain elements
Attributes • DTD attribute types • CDATA, enumeration, token • Schema • can be any of the basic or derived types • can also be user-defined types • Declaration <xs:attribute name="x" type="xs:string" use="optional" default="abc" /> • default is only allowed when use is optional • a required attribute is written use="required" with no default
Attribute declaration • Part of complex type • follows compositor • (one exception) • Declaration <xs:attribute name="foo" type="xs:positiveInteger" /> • What if the attribute is a more complex type itself? • we'll get to that
Example • grades element? • add homework attribute
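A possible answer, as a sketch: the attribute declaration goes inside the complex type, after the compositor. The attribute is named assignment here to match the earlier Example 1 document, which is an assumption about what the slide means by the homework attribute; grade is stubbed as a string so the fragment stands alone.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="grades">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="grade" minOccurs="0" maxOccurs="unbounded" />
      </xs:sequence>
      <!-- attribute declarations follow the compositor;
           use defaults to optional, like #IMPLIED in the DTD -->
      <xs:attribute name="assignment" type="xs:string" />
    </xs:complexType>
  </xs:element>
  <!-- stub so that ref="grade" resolves -->
  <xs:element name="grade" type="xs:string" />
</xs:schema>
```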
Exception: simple content • If an element has "simple content" • no compositor used • instead simpleContent element • and extension to declare type of the content
Example <!ELEMENT student (#PCDATA)> <!ATTLIST student id CDATA #REQUIRED > <xs:element name="student"> <xs:complexType> <xs:simpleContent> <xs:extension base="xs:string"> <xs:attribute name="id" type="xs:string" use="required"/> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element>
How to read this • student is a complex type • it is not simply a renaming of an existing type • its content is simple • being of only one type • string • but with an attribute • id of type string which is required
Standalone types • A type can stand outside of an element definition • must have a name <xs:complexType name="bar-n-baz"> <xs:sequence> <xs:element ref="bar" /> <xs:element ref="baz" /> </xs:sequence> </xs:complexType> • Used in element definition <xs:element name="foo" type="bar-n-baz" />
Deriving types • DTDs do not allow type restrictions • beyond enumeration, CDATA, token • for attributes • PCDATA • for content • Schemas have built-in types • also capability to create your own
Derivation operations • list • sequence of values • union • combine two types • allowing either • restriction • placing limits on the legal values
List <xs:element name="partList"> <xs:simpleType> <xs:list itemType="partNo" /> </xs:simpleType> </xs:element> <partList>PN334-04 PN223-89 PQ1112-03</partList> • Must be separated by spaces • probably more useful to do this with document structure • partList -> partNo*
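The list example refers to a partNo type that is not shown. A minimal sketch of a full schema follows, where the pattern for partNo is an assumption inferred from the three sample values (two capital letters, three or four digits, a hyphen, two digits):

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical item type; the pattern is guessed from the samples -->
  <xs:simpleType name="partNo">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{2}[0-9]{3,4}-[0-9]{2}" />
    </xs:restriction>
  </xs:simpleType>
  <!-- partList: whitespace-separated partNo values -->
  <xs:element name="partList">
    <xs:simpleType>
      <xs:list itemType="partNo" />
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

Each whitespace-separated token is validated against partNo on its own, so the item type must not itself allow whitespace.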
Union • Allows data of either type to be used • Example <xs:simpleType name="integer"> <xs:union memberTypes="xs:negativeInteger xs:nonNegativeInteger" /> </xs:simpleType> • the name must be unprefixed • Bogus!
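A union that does some real work might look like the sketch below: a date field that accepts either a full date or just a year. The type name dateOrYear is made up for illustration; xs:date and xs:gYear are built-in types.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type: a value is valid if either member type accepts it -->
  <xs:simpleType name="dateOrYear">
    <xs:union memberTypes="xs:date xs:gYear" />
  </xs:simpleType>
  <xs:element name="date" type="dateOrYear" />
</xs:schema>
```

With this declaration both <date>1957</date> and <date>1957-06-01</date> would validate.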
Restriction • Most useful • Allows the designer to state exactly what values are legal • prices must be non-negative • SSN must follow a certain pattern • in-stock must be yes or no • etc.
Restriction, cont'd • Restrict a base type • according to "facets" • Different facets available for different data types
Example: enumeration <xs:simpleType name="grade"> <xs:restriction base="xs:string"> <xs:enumeration value="A"/> <xs:enumeration value="B"/> <xs:enumeration value="C"/> <xs:enumeration value="D"/> <xs:enumeration value="F"/> <xs:enumeration value="I"/> </xs:restriction> </xs:simpleType>
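Other facets work the same way. As a sketch of the pattern facet, the student id from the earlier example could be constrained as follows; the type name studentId and the four-digits, hyphen, five-digits format are assumptions inferred from the sample ids 1234-12345 and 5432-54321.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type: four digits, a hyphen, five digits -->
  <xs:simpleType name="studentId">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{4}-[0-9]{5}" />
    </xs:restriction>
  </xs:simpleType>
  <!-- student: text content plus a required, constrained id attribute -->
  <xs:element name="student">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="id" type="studentId" use="required" />
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A named simple type is used the same way for element content, for example <xs:element name="assigned-grade" type="grade" /> with the enumeration above.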