Schemas

Schemas: Deitel, XML, chapter 7; Peltzer, XML Language Mechanics and Applications (Addison Wesley), chapter 4 (has much more on W3C schemas)

Schemas vs DTD
• DTDs are inherited from SGML (Standard Generalized Markup Language).
• Because DTDs are not XML, they can't be manipulated (searched, or transformed into another format such as HTML) the way XML documents can.
• Schemas are XML. As with DTDs, checking a document against a schema requires a validating parser.
• Schemas themselves conform to DTDs, which are bundled with the parser.
• Repositories of existing DTDs and schemas are available for free download.

Schemas
• DTDs define document structure, not content, so although <value>5</value> contains legal PCDATA, a DTD can't check that the content is numeric.
• Markup like <value>Hello Bob</value> would also be valid PCDATA. The application using this XML document would itself have to test whether value is numeric and take appropriate action if the test fails.
• Schemas are XML documents conforming to DTDs and must be validated to be processed. Schemas do not use EBNF; they use XML syntax.
• Schemas can be manipulated (e.g., searched, or have elements added or removed) like any XML document.
• W3C XML Schema is not covered (much) in Deitel's book, only MS Schema. Many W3C schema examples are in Peltzer, XML Language Mechanics and Applications (Addison Wesley).
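
Because a schema is itself XML, an ordinary XML library can search or edit it. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the two-element schema and the helper names declared_elements and add_element are made up for illustration and come from neither book.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
# Keep the conventional "xs" prefix when the schema is written back out.
ET.register_namespace("xs", XS)

# Hypothetical schema used only for this illustration.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="value" type="xs:decimal"/>
  <xs:element name="label" type="xs:string"/>
</xs:schema>"""


def declared_elements(schema_text):
    """Search the schema: return (name, type) for every xs:element."""
    root = ET.fromstring(schema_text)
    return [(e.get("name"), e.get("type")) for e in root.iter(f"{{{XS}}}element")]


def add_element(schema_text, name, type_name):
    """Edit the schema: append a new global xs:element and return the new text."""
    root = ET.fromstring(schema_text)
    ET.SubElement(root, f"{{{XS}}}element", {"name": name, "type": type_name})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(declared_elements(SCHEMA))
    print(add_element(SCHEMA, "unit", "xs:string"))
```

Nothing comparable is possible with a DTD using the same tools, since a DTD is not an XML document.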

Schemas
• Schemas view XML documents as a collection of datatypes.
• DTDs view XML documents as a single entity.
• The W3C 2001 schema specification lists 44 built-in datatypes: 19 primitive types and 25 derived types.
• User-derived and built-in derived types are both defined using simpleType definitions, which restrict the data that can appear as an attribute value or as the content of a text-only element.
• A schema datatype has 3 components: a value space, a lexical space, and a set of facets.

Examples
• In <book>learning XML</book>, typed as string, the value is the string "learning XML" and its lexical representation is the same sequence of characters, so the value space and lexical space coincide.
• In <number>123</number>, typed as integer, the value is the number 123; the lexical space is the set of digit strings that can represent it (for example "123" or "0123"), and a facet might restrict it to a specified number of digit chars.
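
The difference between lexical space and value space can be sketched in Python: several literals (lexical forms) may denote one value. Python's int and Decimal parsers stand in here for a schema processor; they only approximate the xs:integer and xs:decimal lexical rules, and the helper name same_value is made up for illustration.

```python
from decimal import Decimal


def same_value(lexical_forms, parse):
    """Return True if every literal parses to one and the same value."""
    values = {parse(text.strip()) for text in lexical_forms}
    return len(values) == 1


if __name__ == "__main__":
    # Three different literals, one integer value.
    print(same_value(["123", "+123", "0123"], int))
    # Two different literals, one decimal value.
    print(same_value(["1.0", "1.00"], Decimal))
    # Different values.
    print(same_value(["123", "124"], int))
```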

A simple schema for Author (saved as .xsd)
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Author">
    <xs:annotation>
      <xs:documentation> </xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Name" type="xs:string"/>
        <xs:element name="Address" type="xs:string"/>
        <xs:element name="City" type="xs:string"/>
        <xs:element name="State" type="xs:string"/>
        <xs:element name="Zip" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

In validator

And an instance of the Author schema
<?xml version="1.0" encoding="UTF-8"?>
<Author>
  <Name>Dwight Peltzer</Name>
  <Address>Po Box 555</Address>
  <City>Oyster Bay</City>
  <State>NY</State>
  <Zip>11771</Zip>
</Author>

Elements make up XML documents
• In MS Schema, ElementType defines an element. It contains attributes describing the content, data type, name and so on for this element.
• MSXML (Microsoft's validating parser) is part of IE5 and is needed to use MS Schema.
• Element Schema is the root element of every MS Schema document.

facets
• Facets define the value space and properties of a specified data type. There are two kinds: fundamental and non-fundamental (constraining) facets.
• Fundamental facets define a type.
• Non-fundamental facets impose restrictions on the type, for example by limiting its range.

fundamental facets
• Equal: allows values to be compared.
• Ordered: allows values to be placed in a predefined order.
• Bounded: allows a lower and upper limit to be provided.
• Cardinality: says whether the value space is finite or countably infinite. (This is separate from occurrence constraints such as minOccurs="0" maxOccurs="unbounded", for which we used "*" in DTDs.)
• Numeric: a value can be classified as numeric or non-numeric, as in numeric(value="true") or numeric(value="false").
• Example: using the pattern facet (a non-fundamental facet), we might define isbn to consist of exactly 10 digit chars:
  <xs:simpleType name="isbnType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{10}"/>
    </xs:restriction>
  </xs:simpleType>
• Note: this is not precisely the way ISBNs are defined, since the last character is a check character and may be the letter X.
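
The effect of the pattern facet, and the reason for the closing note, can be sketched in Python. XSD patterns must match the whole value, so re.fullmatch is used. The function names are made up for illustration, and the checksum rule is the standard ISBN-10 one (weights 10 down to 1, sum divisible by 11, a final X counting as 10) rather than anything stated on the slide.

```python
import re

# Same expression as the xs:pattern facet in isbnType.
ISBN_DIGITS = re.compile(r"[0-9]{10}")


def matches_isbn_pattern(text):
    """True if text satisfies the slide's pattern: exactly 10 digit chars."""
    return ISBN_DIGITS.fullmatch(text) is not None


def isbn10_check_ok(text):
    """True if text is 9 digits plus a valid ISBN-10 check character."""
    if re.fullmatch(r"[0-9]{9}[0-9X]", text) is None:
        return False
    total = sum((10 - i) * (10 if ch == "X" else int(ch))
                for i, ch in enumerate(text))
    return total % 11 == 0


if __name__ == "__main__":
    for candidate in ("0306406152", "080442957X", "12345"):
        print(candidate, matches_isbn_pattern(candidate), isbn10_check_ok(candidate))
```

An ISBN ending in X passes the checksum but is rejected by the slide's pattern, which is the gap the note points out.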

Euros: up to 10 decimal digits in total, at most 2 of them decimal places
<xs:simpleType name="EuroDollarType">
  <xs:restriction base="xs:decimal">
    <xs:totalDigits value="10"/>
    <xs:fractionDigits value="2"/>
  </xs:restriction>
</xs:simpleType>
<xs:element name="Euros" type="EuroDollarType"/>
A document instance
<Euros>55.63</Euros>
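
What totalDigits and fractionDigits allow can be approximated in Python. This is a sketch under stated assumptions: the regular expression is a simplified stand-in for the xs:decimal lexical form, the facets are checked against the value (so trailing zeros do not count), and the name fits_euro_type is made up for illustration.

```python
import re
from decimal import Decimal

# Simplified xs:decimal literal: optional sign, digits, optional fraction, no exponent.
DECIMAL_LEXICAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def fits_euro_type(text, total_digits=10, fraction_digits=2):
    """Approximate check of totalDigits / fractionDigits on a decimal literal."""
    if DECIMAL_LEXICAL.fullmatch(text) is None:
        return False
    # normalize() drops trailing zeros, so 55.630 is treated like 55.63.
    _sign, digits, exponent = Decimal(text).normalize().as_tuple()
    fraction = max(0, -exponent)             # digits after the decimal point
    total = len(digits) + max(0, exponent)   # digits in the whole number
    return total <= total_digits and fraction <= min(fraction_digits, total_digits)


if __name__ == "__main__":
    for candidate in ("55.63", "55.635", "1234567890", "12345678901", "abc"):
        print(candidate, fits_euro_type(candidate))
```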

Derived user types: use simpleType definitions and one of 3 methods: restriction, list and union
• Restriction uses one or more constraining facets to restrict the value space or lexical space of the base type. A postal code might use:
  <xsd:simpleType name="zipType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[0-9]{9}"/>
    </xsd:restriction>
  </xsd:simpleType>

Derived user types: list
• List derives a new type whose values are whitespace-delimited sequences of values of a given itemType. A whitespace-delimited list of decimal values for some lottery might be
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="MyWinningNumbers">
    <xs:list itemType="xs:decimal"/>
  </xs:simpleType>
</xs:schema>
With a document instance of
<numbers xsi:type="MyWinningNumbers">94 33 12 76</numbers>
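
How a processor reads a list value can be sketched in Python: split the content on whitespace, then check each item against the item type. Decimal stands in for xs:decimal (an approximation), and the name parse_decimal_list is made up for illustration.

```python
from decimal import Decimal


def parse_decimal_list(text):
    """Split a whitespace-delimited list and parse every item as a decimal.

    Raises decimal.InvalidOperation if any item is not a decimal literal.
    """
    return [Decimal(item) for item in text.split()]


if __name__ == "__main__":
    print(parse_decimal_list("94 33 12 76"))
```

One consequence of this design is that an item can never contain whitespace itself, which is why lists of arbitrary strings are awkward.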

Text version is not validated

Derived user types: union
• Union creates a datatype derived from more than one base type. Each member type contributes its values to the union.
<xsd:simpleType name="UnionDemo">
  <xsd:union memberTypes="AType BType"/>
</xsd:simpleType>
Here the two member types could be any base types. This would enable using, e.g., string or int to define a month, as in:
<Month>Jan</Month>
<Month>1</Month>
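
Union validation can be pictured as trying each member type in turn and accepting the first one that fits. The sketch below assumes two hypothetical member types for the month example, a month number from 1 to 12 and a three-letter English abbreviation; neither is defined on the slide, and parse_month is an illustrative name.

```python
# Hypothetical second member type: three-letter month abbreviations.
MONTH_NAMES = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def parse_month(text):
    """Return (member_type, value) for the first member type that accepts text."""
    text = text.strip()
    # Member type 1: an integer in the range 1..12.
    if text.isascii() and text.isdigit() and 1 <= int(text) <= 12:
        return ("int", int(text))
    # Member type 2: one of the enumerated names.
    if text in MONTH_NAMES:
        return ("name", text)
    raise ValueError(f"not a valid month: {text!r}")


if __name__ == "__main__":
    print(parse_month("1"))
    print(parse_month("Jan"))
```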

A complex type xsd (see next slide for discussion of sequence)
<xs:complexType>
  <xs:sequence>
    <xs:element name="Author" type="xs:string"/>
    <xs:element name="Name" type="xs:string"/>
    <xs:element name="Address" type="xs:string"/>
    <xs:element name="City" type="xs:string"/>
    <xs:element name="State" type="xs:string"/>
    <xs:element name="Zip" type="xs:string"/>
  </xs:sequence>
</xs:complexType>

Compositors
• Allow us to specify:
  • the sequential order of elements (sequence)
  • a choice among elements (choice)
  • the all compositor, which lets the listed elements appear in any order, each at most once
• The previous slide used sequence.

Choice
• We might use, as part of a schema:
  <xs:choice>
    <xs:element name="creditcard" type="xs:string"/>
    <xs:element name="cash" type="xs:decimal"/>
    <xs:element name="trade" type="xs:string"/>
  </xs:choice>

ALL
• all is loosely similar to ANY in that order is unrestricted, but only the listed elements may appear, each at most once.
<xs:element name="FamilyName">
  <xs:complexType>
    <xs:all>
      <xs:element name="firstName" type="xs:string"/>
      <xs:element name="middleName" type="xs:string"/>
      <xs:element name="lastName" type="xs:string" minOccurs="0"/>
    </xs:all>
  </xs:complexType>
</xs:element>
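
The three compositors can be compared with a small Python sketch that checks a list of child element names. It is deliberately simplified: default occurrence counts only, except that all takes a set of optional names (minOccurs="0"). The function names are made up for illustration.

```python
def valid_sequence(children, expected):
    """sequence: the children must appear exactly in the declared order."""
    return list(children) == list(expected)


def valid_choice(children, options):
    """choice: exactly one of the declared elements appears."""
    return len(children) == 1 and children[0] in options


def valid_all(children, required, optional=()):
    """all: any order, no repeats, every required element present."""
    allowed = set(required) | set(optional)
    no_repeats = len(children) == len(set(children))
    return (no_repeats
            and set(children) <= allowed
            and set(required) <= set(children))


if __name__ == "__main__":
    # FamilyName as declared above: lastName has minOccurs="0".
    required, optional = ["firstName", "middleName"], ["lastName"]
    print(valid_all(["middleName", "firstName"], required, optional))
    print(valid_choice(["cash"], ["creditcard", "cash", "trade"]))
    print(valid_sequence(["City", "State"], ["City", "State"]))
```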

namespaces
• The prefixes xsd and xs are used interchangeably.
• xs is the conventional prefix for XSD schemas.
• There are 3 distinct namespaces:
  • the XML schema namespace
  • the XML schema data type namespace
  • the XML schema instance namespace
• An example of the first appeared above as:
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

Global elements: declarations that are direct children of xs:schema are global; those nested inside a complexType are local (see below)
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Name" type="xs:string"/>
  <xs:element name="Address" type="xs:string"/>
  <xs:element name="Author">
    <!-- this is where global declarations stop -->
    <xs:complexType>
      <xs:sequence>
        <xs:element name="City" type="xs:string"/>
        <xs:element name="State" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
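
The global/local split can be read straight off the schema tree: global declarations are direct children of xs:schema, and everything deeper is local. A minimal Python sketch over the schema above follows (the XML declaration is left out of the string for simplicity, and split_declarations is an illustrative name).

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

GLOBAL_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Name" type="xs:string"/>
  <xs:element name="Address" type="xs:string"/>
  <xs:element name="Author">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="City" type="xs:string"/>
        <xs:element name="State" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def split_declarations(schema_text):
    """Return (global_names, local_names) of the xs:element declarations."""
    root = ET.fromstring(schema_text)
    # Direct children of xs:schema are the global declarations.
    global_elems = root.findall(XS + "element")
    global_ids = {id(e) for e in global_elems}
    # Every other xs:element in the tree is nested, hence local.
    local_elems = [e for e in root.iter(XS + "element") if id(e) not in global_ids]
    return ([e.get("name") for e in global_elems],
            [e.get("name") for e in local_elems])


if __name__ == "__main__":
    print(split_declarations(GLOBAL_XSD))
```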

Global elements: document instance
<Author xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="global.xsd">
  <City>String</City>
  <State>String</State>
</Author>

Using a target namespace for your document: this binds the document to the schema
<?xml version="1.0" encoding="UTF-8"?>
<Author xmlns:xs="http://employees.oneonta.edu/higgindm/Authors">
  <Name>Dwight Peltzer</Name>
  <Address>PO Box 555</Address>
  <City>Oyster Bay</City>
  <State>NY</State>
  <Zip>11771</Zip>
  <Publisher>
    <Name>Addison Wesley</Name>
    <City>Boston</City>
    <State>Massachusetts</State>
  </Publisher>
  <BookTitle>XML Language Mechanics</BookTitle>
  <ISBN>0-1-23458-0</ISBN>
</Author>

The target namespace may be omitted. This means you are using built-in types and mapping all declared elements/attributes to no namespace. You are then prevented from reusing locally declared elements.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Author">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Name" type="xs:string"/>
        <xs:element name="Address" type="xs:string"/>
        <xs:element name="City" type="xs:string"/>
        <xs:element name="State" type="xs:string"/>
        <xs:element name="Zip" type="xs:short"/>
        <xs:element name="Publisher" type="xs:string"/>
        <xs:element name="BookTitle" type="xs:string"/>
        <xs:element name="ISBN" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

In validator

The document instance would not then have a namespace
<?xml version="1.0" encoding="UTF-8"?>
<Author xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="Author.xsd">
  <Name>Dwight Peltzer</Name>
  <Address>PO Box 555</Address>
  <City>Oyster Bay</City>
  <State>NY</State>
  <Zip>11771</Zip>
  <Publisher>
    <Name>Addison Wesley</Name>
    <City>Boston</City>
    <State>Massachusetts</State>
  </Publisher>
  <BookTitle>XML Language Mechanics</BookTitle>
  <ISBN>0-1-23458-0</ISBN>
</Author>

Adding the target namespace to the schema root defines a namespace for your user-defined declarations
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.dpsoftware.com/namespaces/Author"
           xmlns="http://www.dpsoftware.com/namespaces/Author">
  <xs:element name="Author">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Name" type="xs:string"/>
        <xs:element name="Address" type="xs:string"/>
        <xs:element name="City" type="xs:string"/>
        <xs:element name="State" type="xs:string"/>
        <xs:element name="Zip" type="xs:string"/>
        <xs:element name="Publisher" type="xs:string"/>
        <xs:element name="BookTitle" type="xs:string"/>
        <xs:element name="ISBN" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

An example document
<?xml version="1.0" encoding="UTF-8"?>
<dp:Author xmlns:dp="http://www.dpsoftware.com/namespaces/Author"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.dpsoftware.com/namespaces/Author AuthorV1.xsd">
  <Name>Dwight Peltzer</Name>
  <Address>Po Box 555</Address>
  <City>Oyster Bay</City>
  <State>NY</State>
  <Zip>11771</Zip>
  <Publisher>Addison Wesley</Publisher>
  <BookTitle>XML Language Mechanics</BookTitle>
  <ISBN>0-1-23458-0</ISBN>
</dp:Author>


Namespace prefix can be used to qualify each element in a doc
<?xml version="1.0" encoding="UTF-8"?>
<dp:Author xmlns:dp="http://www.dpsoftware.com/namespaces/author"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.dpsoftware.com/namespaces/author Author.xsd">
  <dp:Name>Dwight Peltzer</dp:Name>
  <dp:Address>PO Box 555</dp:Address>
  <dp:City>Oyster Bay</dp:City>
  <dp:State>NY</dp:State>
  <dp:Zip>11771</dp:Zip>
  <dp:BookTitle>XML Language Mechanics</dp:BookTitle>
  <dp:ISBN>0-1-23458-0</dp:ISBN>
</dp:Author>
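
Qualifying every element changes how an application must look the elements up: the element's name now includes the namespace URI, and the prefix is only shorthand. A minimal Python sketch with xml.etree.ElementTree follows, using a shortened copy of the document above; child_text is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the prefixed document, for illustration only.
DOC = """<dp:Author xmlns:dp="http://www.dpsoftware.com/namespaces/author">
  <dp:Name>Dwight Peltzer</dp:Name>
  <dp:Zip>11771</dp:Zip>
</dp:Author>"""

# The prefix used here is local to this code; only the URI has to match.
NS = {"dp": "http://www.dpsoftware.com/namespaces/author"}


def child_text(doc_text, local_name):
    """Return the text of a namespace-qualified child, or None if absent."""
    root = ET.fromstring(doc_text)
    node = root.find(f"dp:{local_name}", NS)
    return None if node is None else node.text


if __name__ == "__main__":
    root = ET.fromstring(DOC)
    print(root.tag)                 # expanded name: {namespace-uri}Author
    print(child_text(DOC, "Name"))  # qualified lookup succeeds
    print(root.find("Name"))        # unqualified lookup finds nothing
```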

Russian doll model
<Book>
  <Title>XML</Title>
  <Author>Dwight Peltzer</Author>
</Book>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Title" type="xs:string"/>
        <xs:element name="Author" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Salami slice model
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="Title" type="xs:string"/>
  <xs:element name="Author" type="xs:string"/>
  <xs:element name="Book">
    <xs:complexType>
      <xs:sequence>
        <!-- reassemble Title and Author -->
        <xs:element ref="Title"/>
        <xs:element ref="Author"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Venetian blind model
• The Venetian blind model uses elementFormDefault and attributeFormDefault to switch back and forth (hiding/exposing namespaces) in the document instance.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xs:element name="Employer">
    <xs:annotation>
      <xs:documentation>Comment describing your root element</xs:documentation>
    </xs:annotation>
  </xs:element>
  <xs:complexType name="employeeType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="contact" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="employeeTypeExt">
    <xs:complexContent>
      <xs:extension base="employeeType">
        <xs:sequence>
          <xs:element name="empName" type="employeeType"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:element name="employee" type="employeeTypeExt"/>
</xs:schema>

All NamedType components in this xsd are reusable
<xs:simpleType name="Title">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Sci_Fi"/>
    <xs:enumeration value="Information Systems"/>
  </xs:restriction>
</xs:simpleType>
<xs:simpleType name="Name">
  <xs:restriction base="xs:string">
    <xs:minLength value="1"/>
  </xs:restriction>
</xs:simpleType>
<xs:complexType name="Editor">
  <xs:sequence>
    <xs:element name="Title" type="Title"/>
    <xs:element name="Author" type="Name"/>
  </xs:sequence>
</xs:complexType>
<xs:element name="Book" type="Editor"/>

ContentModel template: use the type attribute to reference the named complex type definition
<xs:complexType name="nameType">
  <xs:sequence>
    <xs:element ref="firstName"/>
    <xs:element ref="middleName"/>
    <xs:element ref="lastName"/>
  </xs:sequence>
</xs:complexType>

An MS schema
<?xml version = "1.0"?>
<Schema xmlns = "urn:schemas-microsoft-com:xml-data">
  <ElementType name = "message" content = "textOnly" model = "closed">
    <description>Text messages</description>
  </ElementType>
  <ElementType name = "greeting" model = "closed" content = "mixed" order = "many">
    <element type = "message"/>
  </ElementType>
  <ElementType name = "myMessage" model = "closed" content = "eltOnly" order = "seq">
    <element type = "greeting" minOccurs = "0" maxOccurs = "1"/>
    <element type = "message" minOccurs = "1" maxOccurs = "*"/>
  </ElementType>
</Schema>

schema elements
• xmlns specifies the default namespace for the Schema element and the elements it contains.
• The attribute value urn:... is the URI for this namespace.
• Microsoft's XML parser recognizes element Schema and this namespace and validates the schema.
• Element Schema can contain only ElementType elements for defining elements, AttributeType elements for defining their attributes, and description elements for describing them.
• This example specifies that element message may contain text only (content = "textOnly").
• The model attribute value closed specifies that only elements declared in this schema may appear in conforming XML documents; anything else would invalidate the document.
• Element greeting has mixed content, indicating that both elements and character data may appear in it. order = "many" indicates that any number of message elements and text may be contained in the greeting.

a conforming xml document
<?xml version = "1.0"?>
<myMessage xmlns = "x-schema:intro-schema.xml">
  <greeting>Welcome to XML Schema!
    <message>This is the first message.</message>
  </greeting>
  <message>This is the second message.</message>
</myMessage>

msxml validator

a well-formed but non-conforming xml document
<?xml version = "1.0"?>
<myMessage xmlns = "x-schema:intro-schema.xml">
  <greeting>Welcome to XML Schema!</greeting>
  <message>This is a message that contains another message.
    <message>This is the inner message.</message>
  </message>
</myMessage>
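
The reason this document fails can be checked by hand: message is declared content = "textOnly", so it may not contain child elements. The Python sketch below tests only that one rule; it is not a substitute for MSXML validation, and the names TEXT_ONLY, local_name and text_only_violations are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Elements declared with content = "textOnly" in the intro schema.
TEXT_ONLY = {"message"}

CONFORMING = """<myMessage xmlns="x-schema:intro-schema.xml">
  <greeting>Welcome to XML Schema!
    <message>This is the first message.</message>
  </greeting>
  <message>This is the second message.</message>
</myMessage>"""

NON_CONFORMING = """<myMessage xmlns="x-schema:intro-schema.xml">
  <greeting>Welcome to XML Schema!</greeting>
  <message>This is a message that contains another message.
    <message>This is the inner message.</message>
  </message>
</myMessage>"""


def local_name(tag):
    """Strip the {namespace} part that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def text_only_violations(doc_text, text_only=TEXT_ONLY):
    """List text-only elements that nevertheless contain child elements."""
    root = ET.fromstring(doc_text)
    return [local_name(e.tag) for e in root.iter()
            if local_name(e.tag) in text_only and len(e) > 0]


if __name__ == "__main__":
    print(text_only_violations(CONFORMING))
    print(text_only_violations(NON_CONFORMING))
```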

using validator

Namespaces and declaring schema
<myMessage xmlns = "x-schema:intro-schema.xml">
• The namespace declaration xmlns = "…" references the schema being used.
• For MS Schema, the URI must begin with x-schema, followed by a colon and the name of the schema document.
• Element greeting may have mixed content; in this example greeting marks up text and has a child message element.

Element attributes
• ElementType has attributes: content, dt:type, name, model and order.
• ElementType's child elements are: description, datatype, element, group, AttributeType and attribute.
• Element element has attributes: type, minOccurs, maxOccurs.
• Element group has attributes: order, minOccurs, maxOccurs.
• Element AttributeType has attributes: default, dt:type, dt:values, name and required.
• Element attribute has attributes: default, type, required.

An example of AttributeType and attribute
<?xml version = "1.0"?>
<Schema xmlns = "urn:schemas-microsoft-com:xml-data">
  <ElementType name = "contact" content = "eltOnly" order = "seq" model = "closed">
    <AttributeType name = "owner" required = "yes"/>
    <attribute type = "owner"/>
    <element type = "name"/>
    <element type = "address1"/>
    <element type = "address2" minOccurs = "0" maxOccurs = "1"/>
    <element type = "city"/>
    <element type = "state"/>
    <element type = "zip"/>
    <element type = "phone" minOccurs = "0" maxOccurs = "*"/>
  </ElementType>

An example of AttributeType and attribute (part 2)
  <ElementType name = "name" content = "textOnly" model = "closed"/>
  <ElementType name = "address1" content = "textOnly" model = "closed"/>
  <ElementType name = "address2" content = "textOnly" model = "closed"/>
  <ElementType name = "city" content = "textOnly" model = "closed"/>
  <ElementType name = "state" content = "textOnly" model = "closed"/>
  <ElementType name = "zip" content = "textOnly" model = "closed"/>
  <ElementType name = "phone" content = "textOnly" model = "closed">
    <AttributeType name = "location" default = "home"/>
    <attribute type = "location"/>
  </ElementType>
</Schema>

A conforming xml document
<?xml version = "1.0"?>
<contact owner = "Bob Smith" xmlns = "x-schema:contact-schema.xml">
  <name>Jane Doe</name>
  <address1>123 Main St.</address1>
  <city>Sometown</city>
  <state>Somestate</state>
  <zip>12345</zip>
  <phone>617-555-1234</phone>
  <phone location = "work">978-555-4321</phone>
</contact>

Contact.xml in validator

MS Schema datatypes
• DTDs did not permit specifying the datatypes (content) an element or attribute might contain.
• The namespace prefix dt is chosen by the document author and bound to urn:schemas-microsoft-com:datatypes
• msdn.microsoft.com/xml/reference/schema/datatypes.asp has a complete list of the types supported.

MS Schema datatypes
• boolean: 0 or 1
• char: a character, "X"
• string: a sequence of chars, as in "XYZ"
• float and int: as in C or Java
• date: YYYY-MM-DD
• time: HH:MM:SS
• id: text which uniquely identifies an element or its attribute
• idref: a reference to an id
• enumeration: a series of values from which one is chosen
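
A few of the formats listed above can be sketched as Python parsers. This only illustrates the lexical forms shown on the slide (0 or 1, YYYY-MM-DD, HH:MM:SS); it does not reproduce how MSXML handles dt:type, and the function names are made up for illustration.

```python
import re
from datetime import datetime


def parse_boolean(text):
    """boolean: the literal 0 or 1."""
    if text not in ("0", "1"):
        raise ValueError(f"not a boolean: {text!r}")
    return text == "1"


def parse_date(text):
    """date: YYYY-MM-DD, zero-padded, and a real calendar date."""
    if re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", text) is None:
        raise ValueError(f"not a date: {text!r}")
    return datetime.strptime(text, "%Y-%m-%d").date()


def parse_time(text):
    """time: HH:MM:SS, zero-padded, 24-hour clock."""
    if re.fullmatch(r"[0-9]{2}:[0-9]{2}:[0-9]{2}", text) is None:
        raise ValueError(f"not a time: {text!r}")
    return datetime.strptime(text, "%H:%M:%S").time()


if __name__ == "__main__":
    print(parse_boolean("1"), parse_date("2001-05-02"), parse_time("13:20:00"))
```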
