XML Validation II: Advanced DTDs + Schemas • Robin Burke • ECT 360
Outline • Parameter entities • Parameterized DTDs • Modularized DTDs • Break • Namespaces • XML Schemas • Elements • Attributes
Entities • Internal general entities • macros • External general entities • "include" mechanism • Parameter entities • for DTD organization
Internal general entities • Declaration <!ENTITY disclaimer "This is a work of fiction. Any resemblance to persons living or dead is unintentional."> • Use &disclaimer;
External general entities • Declaration <!ENTITY standardContract SYSTEM "stdContract.xml"> • Use <comment>... transaction subject to the following contract terms: &standardContract;</comment>
General entities • Process model • The entity string is replaced by the contents • And then parsed by the XML parser • Resulting document tree • same as if the text were part of the original document
Parameter entities • Similar macro function • Inside the DTD • Uses • Organizing common element / attribute models • Documenting data types • Simulating namespaces • Making DTDs modular
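A minimal sketch of the first two uses listed above (shared attribute models and documented data types). It assumes the declarations live in an external DTD file, since parameter entity references inside declarations are not permitted in the internal subset. The entity and element names (date, common.atts, invoice, receipt) are made up for illustration.

```xml
<!-- Hypothetical external DTD fragment; all names are illustrative. -->

<!-- Documenting a data type: to the parser this is just CDATA,
     but the entity name tells the reader what is expected. -->
<!ENTITY % date "CDATA">

<!-- A common attribute model shared by several elements. -->
<!ENTITY % common.atts
  "id      ID      #IMPLIED
   created %date;  #IMPLIED">

<!ELEMENT invoice (#PCDATA)>
<!ATTLIST invoice %common.atts;>

<!ELEMENT receipt (#PCDATA)>
<!ATTLIST receipt %common.atts;>
```

Changing the common.atts entity changes the attribute list of every element that references it.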
Conditional sections • To control the inclusion of DTD sections • This will be processed • <![INCLUDE[ ... ]]> • This will be omitted • <![IGNORE[ ... ]]> • Doesn't seem that useful BUT • combined with entities • we can modularize our DTD
Modular DTD • Same DTD can be used for multiple document types • A few parameters are redefined and the DTD is different
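One way the combination of conditional sections and parameter entities might look. This is a sketch under assumptions: the file name memo.dtd, the entity names draft and final, and the memo vocabulary are all hypothetical.

```xml
<!-- memo.dtd (hypothetical external DTD) -->

<!-- Default settings. A document can override them in its internal
     subset, because the first declaration of an entity wins. -->
<!ENTITY % draft "IGNORE">
<!ENTITY % final "INCLUDE">

<![%draft;[
  <!-- Draft memos may carry reviewer notes. -->
  <!ELEMENT memo (title, body, note*)>
  <!ELEMENT note (#PCDATA)>
]]>

<![%final;[
  <!ELEMENT memo (title, body)>
]]>

<!ELEMENT title (#PCDATA)>
<!ELEMENT body  (#PCDATA)>
```

A document that wants the draft variant redefines both entities before the external DTD is read:

```xml
<?xml version="1.0"?>
<!DOCTYPE memo SYSTEM "memo.dtd" [
  <!ENTITY % draft "INCLUDE">
  <!ENTITY % final "IGNORE">
]>
<memo>
  <title>Budget</title>
  <body>Figures attached.</body>
  <note>Check the totals.</note>
</memo>
```

Both entities must be flipped together; with both set to INCLUDE, memo would be declared twice, which is a validity error.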
Parameter entities • Powerful facility for DTD organization and application • Real-world applications • parameterize DTDs • use modular DTDs
XML so far • Languages defined by DTDs • names assigned by designers • OK for standalone systems • Doesn't have • The ability to handle naming conflicts • The ability to partition work among different developers
Namespaces • A way to identify a set of labels • element / attribute names • attribute values • As belonging to a particular application • Example • course "title" • html "title"
Namespace idea • Associate a short prefix with an application • Schema or DTD • Use the prefix with a colon to "qualify" names • html:title • syll:title
Namespace idea, cont'd • A namespace is an association between • a set of names • a unique identifier (URI) • a prefix used to identify them
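Namespace declaration • Standalone processing instruction (syntax from an early namespaces draft, not part of the final Namespaces in XML recommendation) <?xml:namespace ns="http://bookpeople.com/book" prefix="book"?> • Part of element <html xmlns="http://www.w3.org/1999/xhtml"> • in this case, no prefix (default namespace) • with a prefix <book xmlns:book="http://bookpeople.com/book">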
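A small sketch of qualified names in a document, using the html:title and syll:title example from the earlier slide. The XHTML namespace URI is the real one; the syllabus URI and the syllabus and description element names are assumptions made up for illustration.

```xml
<?xml version="1.0"?>
<!-- The syll namespace URI below is a made-up identifier. -->
<syll:syllabus xmlns:syll="http://example.org/ns/syllabus"
               xmlns:html="http://www.w3.org/1999/xhtml">
  <!-- The title of the course. -->
  <syll:title>XML Validation II</syll:title>
  <syll:description>
    <!-- A different "title": the attribute-free XHTML element. -->
    <html:title>Course home page</html:title>
    <html:p>Advanced DTDs and schemas.</html:p>
  </syll:description>
</syll:syllabus>
```

The two title elements no longer collide, because each is identified by its namespace URI plus its local name, not by the prefix text itself.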
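Namespace URI • Not necessarily a URL • there need not be any resource at the given location • just a unique identifier • URL-like identifiers are good • associated with an organization • must be unique on the Internet • RDDL (Resource Directory Description Language) • an optional descriptive document to serve at the namespace URI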
Example • DTDs • Document • Problem • how to import the namespaces?
Solution • Fully-qualified names everywhere • yuk! • Parameterized DTD • with namespace defined in an entity
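A sketch of the parameterized approach, assuming an external DTD and the book namespace URI used earlier. The entity names (book.prefix, book.xmlns, and the per-element name entities) and the title/author content model are hypothetical. The qualified names are built inside entity values because a parameter entity reference used directly in a declaration is expanded with surrounding spaces, which would break a name such as book:title.

```xml
<!-- Hypothetical external DTD; entity and element names are illustrative. -->

<!-- The prefix is defined once. A document may override it
     in its internal subset to use a different prefix. -->
<!ENTITY % book.prefix "book">

<!-- Qualified names are assembled inside entity values. -->
<!ENTITY % book.xmlns  "xmlns:%book.prefix;">
<!ENTITY % book.book   "%book.prefix;:book">
<!ENTITY % book.title  "%book.prefix;:title">
<!ENTITY % book.author "%book.prefix;:author">

<!-- Occurrence indicators go outside a parenthesized reference. -->
<!ELEMENT %book.book; (%book.title;, (%book.author;)+)>
<!ATTLIST %book.book;
  %book.xmlns; CDATA #FIXED "http://bookpeople.com/book">

<!ELEMENT %book.title;  (#PCDATA)>
<!ELEMENT %book.author; (#PCDATA)>
```

The DTD still validates literal prefixed names; the entity only makes the chosen prefix easy to change in one place.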
XML so far • Languages defined by DTDs • contain text elements • string attributes • OK for text documents • Not enough for • Databases • Business process integration • Need data types
Solution • Write language definition in XML • Allow more control over document contents • XML document becomes • a complex data type • XML language definition becomes • complex data type specification
XML Schema • Always a separate document • no internal option • Written in XML • very verbose • Can be large and complex
Schemas and namespaces • A schema • uses elements from one application • the XML Schema language • to define another • Namespaces are necessary • Namespaces apply to elements • not values • Unprefixed attributes are interpreted by the element they sit on • a default namespace declaration does not apply to them • prefixed attributes can come from different namespaces <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
Example 1, XML <grades assignment="Homework 1"> <grade> <student id="1234-12345">Jane Doe</student> <assigned-grade>A</assigned-grade> </grade> <grade> <student id="5432-54321">John Doe</student> <assigned-grade>B</assigned-grade> </grade> </grades>
Example 1, DTD <!ELEMENT grades (grade*)> <!ATTLIST grades assignment CDATA #IMPLIED> <!ELEMENT grade (student, assigned-grade)> <!ELEMENT student (#PCDATA)> <!ATTLIST student id CDATA #REQUIRED> <!ELEMENT assigned-grade (#PCDATA)>
Data types • grades • a collection of items of type grade • can never have more than 40 students • grade • a structure containing a student and an assigned grade • student • a structure containing an id and some text • probably should constrain the student id • assigned-grade is text • probably should constrain to A-D,F,I
Built-in types • Part of the schema language • Base types • 19 fundamental types • Examples: string, decimal • Derived types • 25 more types that use the base types • Examples: ID, positiveInteger
To declare an element <xs:element name="assigned-grade" type="xs:string" /> • Equivalent to <!ELEMENT assigned-grade (#PCDATA)>
Simple data type • A renaming of an existing data type <xs:element name="assigned-grade" type="xs:string" /> • Or a restriction of an existing type • strings beginning with "D" • more on this next week
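As a preview of restriction, here is one possible way to constrain assigned-grade to the values suggested on the data types slide (A-D, F, I). This is a sketch, not necessarily the form used in the course; an enumeration is assumed, although a pattern facet would also work.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="assigned-grade">
    <xs:simpleType>
      <!-- Start from xs:string and allow only the listed values. -->
      <xs:restriction base="xs:string">
        <xs:enumeration value="A"/>
        <xs:enumeration value="B"/>
        <xs:enumeration value="C"/>
        <xs:enumeration value="D"/>
        <xs:enumeration value="F"/>
        <xs:enumeration value="I"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```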
Complex datatype <xs:element name="name"> <xs:complexType> compositor element declarations attribute declarations </xs:complexType> </xs:element>
Compositor • sequence • choice • all
Sequence compositor • like "," in DTD • DTD <!ELEMENT foo (bar, baz)> • Schema <xs:element name="foo"> <xs:complexType> <xs:sequence> <xs:element ref="bar" /> <xs:element ref="baz" /> </xs:sequence> </xs:complexType> </xs:element>
Elements in sequences • Can specify optional / # of occurrences • minOccurs and maxOccurs both default to 1 • ? <xs:element ref="bar" minOccurs="0" /> • * <xs:element ref="bar" minOccurs="0" maxOccurs="unbounded" /> • + <xs:element ref="bar" minOccurs="1" maxOccurs="unbounded" /> • What about... <xs:element ref="bar" minOccurs="2" maxOccurs="4" />
Choice compositor • like "|" in DTD • DTD <!ELEMENT foo (bar | baz)> • Schema <xs:element name="foo"> <xs:complexType> <xs:choice> <xs:element ref="bar" /> <xs:element ref="baz" /> </xs:choice> </xs:complexType> </xs:element>
All compositor • no simple DTD equivalent • both children required, in either order • DTD <!ELEMENT foo ( (bar, baz) | (baz, bar) )> • Schema <xs:element name="foo"> <xs:complexType> <xs:all> <xs:element ref="bar" /> <xs:element ref="baz" /> </xs:all> </xs:complexType> </xs:element>
Nesting • Compositors can be combined • DTD <!ELEMENT foo ( (bar | baz), (thud | grunt) )> • Schema <xs:element name="foo"> <xs:complexType> <xs:sequence> <xs:choice> <xs:element ref="bar" /> <xs:element ref="baz" /> </xs:choice> <xs:choice> <xs:element ref="thud" /> <xs:element ref="grunt" /> </xs:choice> </xs:sequence> </xs:complexType> </xs:element>
Example <!ELEMENT grades (grade*)> <!ELEMENT grade (student, assigned-grade)> <!ELEMENT student (#PCDATA)> <!ELEMENT assigned-grade (#PCDATA)>
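A possible schema counterpart of this DTD, built only from the sequence compositor and occurrence attributes covered so far. It is a sketch that assumes no target namespace and leaves attributes out, as the DTD above does.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- <!ELEMENT grades (grade*)> -->
  <xs:element name="grades">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="grade" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- <!ELEMENT grade (student, assigned-grade)> -->
  <xs:element name="grade">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="student"/>
        <xs:element ref="assigned-grade"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- <!ELEMENT student (#PCDATA)> and
       <!ELEMENT assigned-grade (#PCDATA)> -->
  <xs:element name="student" type="xs:string"/>
  <xs:element name="assigned-grade" type="xs:string"/>

</xs:schema>
```

Setting maxOccurs="40" on the grade reference would express the "never more than 40 students" constraint that the DTD cannot state.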
Local naming • Suppose we want to reuse an element name • different place in the structure • Example • <!ELEMENT url-catalog (link*)> • <!ELEMENT link (link, description?)> • not a legal DTD • schema?
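One reading of this example, offered as an assumption: the outer link is a catalog entry, while the inner link holds just the URL. A DTD allows only one declaration per element name, but a schema can declare elements locally, so the same name can have a different type in a different place. The choice of xs:anyURI and xs:string for the leaf types is illustrative.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="url-catalog">
    <xs:complexType>
      <xs:sequence>
        <!-- Local declaration 1: "link" as a catalog entry. -->
        <xs:element name="link" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- Local declaration 2: inside an entry,
                   "link" is simply a URI. -->
              <xs:element name="link" type="xs:anyURI"/>
              <xs:element name="description" type="xs:string"
                          minOccurs="0"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The two declarations do not conflict because they sit in different content models.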
Using namespaces • Schema must say • to use schema namespace • what namespace it is defining • targetNamespace • Document must say • that it is using the Schema Instance namespace • what namespace(s) it is using • what prefix(es) are used • where to find the relevant schemas
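The checklist above might look like this in practice. The namespace URI http://example.org/ns/grades and the file name grades.xsd are assumptions for illustration; the two W3C namespace URIs are the standard ones.

```xml
<!-- grades.xsd (hypothetical file name) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org/ns/grades"
           xmlns="http://example.org/ns/grades"
           elementFormDefault="qualified">
  <!-- xmlns:xs      : use the schema namespace
       targetNamespace: the namespace this schema defines -->
  <xs:element name="assigned-grade" type="xs:string"/>
</xs:schema>
```

A matching document declares the namespace it uses, the Schema Instance namespace, and a hint for where to find the schema:

```xml
<?xml version="1.0"?>
<assigned-grade
    xmlns="http://example.org/ns/grades"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://example.org/ns/grades grades.xsd">A</assigned-grade>
```

The xsi:schemaLocation value is a list of pairs: a namespace URI followed by the location of a schema for it.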
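Attributes • DTD attribute types • CDATA, enumerated, tokenized (ID, NMTOKEN, ...) • Schema • can be any of the basic or derived types • can also be user-defined types • Declaration <xs:attribute name="x" type="xs:string" use="required" /> • or with a default (not allowed together with use="required") <xs:attribute name="x" type="xs:string" default="abc" />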
Attribute declaration • Part of complex type • follows compositor • (one exception) • Declaration <xs:attribute name="foo" type="xs:positiveInteger" /> • What if the attribute is a more complex type itself? • we'll get to that
Example • grades element? • add homework attribute
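A sketch of how the grades element could pick up its attribute. The name assignment is taken from the Example 1 document and DTD (an assumption, since this slide calls it the homework attribute), and the grade declaration is a simplified stand-in so that the fragment is complete.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="grades">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="grade" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- Attribute declarations follow the compositor.
           use defaults to "optional", matching #IMPLIED in the DTD. -->
      <xs:attribute name="assignment" type="xs:string"/>
    </xs:complexType>
  </xs:element>

  <!-- Simplified stand-in so the ref above resolves; the real grade
       element has student and assigned-grade children. -->
  <xs:element name="grade" type="xs:string"/>
</xs:schema>
```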
Exception: simple content • If an element has "simple content" • no compositor used • instead simpleContent element • and extension to declare type of the content
Example <!ELEMENT student (#PCDATA)> <!ATTLIST student id CDATA #REQUIRED > <xs:element name="student"> <xs:complexType> <xs:simpleContent> <xs:extension base="xs:string"> <xs:attribute name="id" type="xs:string" use="required"/> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element>
How to read this • student is a complex type • it is not simply a renaming of an existing type • its content is simple • being of only one type • string • but with an attribute • id of type string which is required
Homework #3 • Write your own schema
Next week • More schemas • RELAX NG