Tutorial 13: Validating Documents with Schemas
Objectives
• Compare schemas and DTDs
• Explore different schema vocabularies
• Declare simple type elements and attributes
• Declare complex type elements
• Apply a schema to an instance document
• Work with XML Schema data types
• Derive new data types for text strings, numeric values, and dates
• Create data types for patterned data using regular expressions
New Perspectives on HTML, CSS, and XML 4th edition
The Limits of DTDs
• DTDs are commonly used for validation largely because of XML's origins as an offshoot of SGML.
• One complaint about DTDs is their lack of data types.
• DTDs also do not recognize namespaces, so they are not well suited to compound documents in which content from several vocabularies needs to be validated.
• DTDs employ a syntax called Extended Backus–Naur Form (EBNF), which is different from the syntax used for XML.
Schemas and DTDs
• A schema is an XML document that contains validation rules for an XML vocabulary.
• When a schema is applied to a specific XML file, the file being validated is called the instance document.
Schema Vocabularies
• A single standard doesn't exist for schemas.
• A schema vocabulary is simply an XML vocabulary created for the purpose of describing schema content.
• Support for a particular schema vocabulary depends solely on the XML parser being used for validation.
Starting a Schema File
• A schema is always placed in an external XML file.
• XML Schema filenames end with the .xsd file extension.
• The root element in any XML Schema document is the schema element.
• The general structure of an XML Schema file is:
    <?xml version="1.0" ?>
    <schema xmlns="http://www.w3.org/2001/XMLSchema">
      content
    </schema>
Starting a Schema File
• By convention, the namespace prefix xsd or xs is assigned to the XML Schema namespace to identify elements and attributes that belong to the XML Schema vocabulary.
• The usual form of an XML Schema document is:
    <?xml version="1.0" ?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      content
    </xs:schema>
Understanding Simple and Complex Types
• XML Schema supports two types of content: simple and complex.
• A simple type contains only text and no nested elements.
• A complex type contains two or more values or elements placed within a defined structure.
Defining a Simple Type Element
• An element in the instance document containing only text and no attributes or child elements is defined in XML Schema using the <xs:element> tag:
    <xs:element name="name" type="type" />
• Here name is the name of the element in the instance document and type is the type of data stored in the element.
• If you use a different namespace prefix, the xs prefix shown here is replaced by your prefix; if you declare XML Schema as the default namespace for the document, the prefix is omitted.
Data Types
• The data type can be:
    • one of XML Schema's built-in data types, or
    • a user-defined data type created by the schema author.
• The most commonly used data type in XML Schema is string, which allows an element to contain any text string.
• Example:
    <xs:element name="lastName" type="xs:string" />
• Another popular data type in XML Schema is decimal, which allows an element to contain a decimal number.
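As a minimal sketch of the two built-in types named above, the schema below declares one string element and one decimal element. The gpa element name is only an illustrative assumption; lastName comes from the example above.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Any text string is valid content for lastName -->
  <xs:element name="lastName" type="xs:string" />
  <!-- Only a decimal number such as 3.75 is valid content for gpa (hypothetical element) -->
  <xs:element name="gpa" type="xs:decimal" />
</xs:schema>
```

With these declarations, <gpa>3.75</gpa> would validate, while <gpa>high</gpa> would be rejected because the text is not a decimal number.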
Defining an Attribute
• To define an attribute in XML Schema, you use the <xs:attribute> tag:
    <xs:attribute name="name" type="type" default="default" fixed="fixed" />
• Here name is the name of the attribute, type is the data type, default is the attribute's default value, and fixed is a fixed value for the attribute.
• The default and fixed attributes are optional, and only one of the two may appear in a given declaration.
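The two optional attributes can be illustrated with a short sketch. The attribute names (degree, version) and their values are assumptions chosen for illustration, not taken from the tutorial's data files.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- If degree is omitted in the instance document, it is treated as "BS" -->
  <xs:attribute name="degree" type="xs:string" default="BS" />
  <!-- If version appears in the instance document, its value must be "1.0" -->
  <xs:attribute name="version" type="xs:string" fixed="1.0" />
</xs:schema>
```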
Defining a Complex Type Element
• The basic structure for defining a complex type element with XML Schema is:
    <xs:element name="name">
      <xs:complexType>
        declarations
      </xs:complexType>
    </xs:element>
• Here name is the name of the element and declarations represents declarations of the type of content within the element.
Defining a Complex Type Element
• This content could include nested child elements, basic text, attributes, or any combination of the three:
    • An empty element containing only attributes
    • An element containing text content and attributes but no child elements
    • An element containing child elements but no attributes
    • An element containing both child elements and attributes
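The first case in the list, an empty element containing only attributes, can be sketched as follows. The student element and stuID attribute are hypothetical names used only for illustration.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- A complex type with attribute declarations but no content model
       describes an element that must be empty -->
  <xs:element name="student">
    <xs:complexType>
      <xs:attribute name="stuID" type="xs:string" />
    </xs:complexType>
  </xs:element>
</xs:schema>
```

An instance such as <student stuID="A100" /> would satisfy this declaration; adding text or child elements inside student would not.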
Defining an Element Containing Attributes and Basic Text
• The definition needs to indicate that the element contains simple content and a collection of one or more attributes. The structure of the element definition is:
    <xs:element name="name">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="type">
            attributes
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
Defining an Element Containing Attributes and Basic Text
• Example:
    <xs:element name="gpa">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:string">
            <xs:attribute name="degree" type="xs:string" />
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
• The base attribute in the <xs:extension> element sets the data type for the text content of the gpa element. The type attribute in the <xs:attribute> element sets the data type of the degree attribute to xs:string.
Referencing an Element or Attribute Definition
• XML Schema allows for a great deal of flexibility in writing complex types.
• Rather than repeating an earlier element or attribute declaration, you can create a reference to it. The referenced declaration must be global, that is, a direct child of the schema element.
• A reference to an element definition is
    <xs:element ref="elemName" />
  where elemName is the name used in the element definition.
• A reference to an attribute definition is
    <xs:attribute ref="attName" />
  where attName is the name used in the attribute definition.
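A minimal sketch of an attribute reference, assuming a schema with no target namespace: the degree attribute is declared once at the top level and then referenced from inside the gpa element. An element reference with ref works the same way inside a content model.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Global attribute declaration, written once -->
  <xs:attribute name="degree" type="xs:string" />

  <xs:element name="gpa">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <!-- Reference to the global declaration instead of repeating it -->
          <xs:attribute ref="degree" />
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```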
Defining an Element with Nested Children
• A complex element that contains nested child elements but no attributes or text is defined as:
    <xs:element name="name">
      <xs:complexType>
        <xs:compositor>
          elements
        </xs:compositor>
      </xs:complexType>
    </xs:element>
  where name is the name of the element, compositor is a value that defines how the child elements appear in the document, and elements is a list of the nested child elements.
Defining an Element with Nested Children
• The following compositors are supported:
    • sequence - requires the child elements to appear in the order listed in the schema
    • choice - allows any one of the child elements listed to appear in the instance document
    • all - allows the child elements to appear in any order in the instance document; however, each may appear only once, or not at all if it is declared with minOccurs="0"
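The three compositors can be compared side by side in a short sketch. All element names here (address, contact, fullName, and their children) are hypothetical and chosen only to show the structure.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- sequence: street must come first, then city -->
  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="street" type="xs:string" />
        <xs:element name="city" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- choice: exactly one of phone or email appears -->
  <xs:element name="contact">
    <xs:complexType>
      <xs:choice>
        <xs:element name="phone" type="xs:string" />
        <xs:element name="email" type="xs:string" />
      </xs:choice>
    </xs:complexType>
  </xs:element>

  <!-- all: firstName and lastName may appear in either order -->
  <xs:element name="fullName">
    <xs:complexType>
      <xs:all>
        <xs:element name="firstName" type="xs:string" />
        <xs:element name="lastName" type="xs:string" />
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
```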
Defining an Element Containing Nested Elements and Attributes
• The code for a complex type element that contains both child elements and attributes is:
    <xs:element name="name">
      <xs:complexType>
        <xs:compositor>
          elements
        </xs:compositor>
        attributes
      </xs:complexType>
    </xs:element>
  where name is the name of the element; compositor is either sequence, choice, or all; elements is a list of nested child elements; and attributes is a list of attribute definitions.
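A sketch of this structure, using hypothetical names (student, firstName, lastName, stuID): the attribute declaration follows the closing tag of the compositor, as in the general form above.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="student">
    <xs:complexType>
      <!-- Child elements are listed inside the compositor -->
      <xs:sequence>
        <xs:element name="firstName" type="xs:string" />
        <xs:element name="lastName" type="xs:string" />
      </xs:sequence>
      <!-- Attribute declarations come after the compositor -->
      <xs:attribute name="stuID" type="xs:string" />
    </xs:complexType>
  </xs:element>
</xs:schema>
```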
Specifying Mixed Content
• An element is said to have mixed content when it contains both a text string and child elements.
• To declare mixed content, add the attribute mixed="true" to the complexType element. XML Schema then assumes that the element contains both text and child elements. The structure of the child elements can then be defined with the conventional method.
Specifying Mixed Content
• Example of mixed content in an instance document:
    <summary>
      student <firstName>Cynthia</firstName> <lastName>Berstein</lastName>
      is enrolled in an IT degree program and has completed
      <credits>12</credits> credits since 01/01/2012.
    </summary>
• The summary element for this document can be declared in a schema file using the following complex type (written with XML Schema as the default namespace, so no prefix appears):
    <element name="summary">
      <complexType mixed="true">
        <sequence>
          <element name="firstName" type="string" />
          <element name="lastName" type="string" />
          <element name="credits" type="string" />
        </sequence>
      </complexType>
    </element>
Indicating Required Attributes
• To indicate whether an attribute is required, the use attribute can be added to the statement that assigns the attribute to an element:
    <xs:element name="name">
      <xs:complexType>
        element content
        <xs:attribute properties use="use" />
      </xs:complexType>
    </xs:element>
Indicating Required Attributes
• use is one of the following three values:
    • required - The attribute must always appear with the element.
    • optional - The use of the attribute is optional with the element.
    • prohibited - The attribute cannot be used with the element.
• Example:
    <xs:attribute name="degree" type="xs:string" use="required" />
Specifying the Number of Child Elements
• To specify the number of times an element appears in the instance document, you can apply the minOccurs and maxOccurs attributes to the element definition:
    <xs:element name="name" type="type" minOccurs="value" maxOccurs="value" />
• The value of the minOccurs attribute defines the minimum number of times the element can occur, and the value of the maxOccurs attribute defines the maximum number of times the element can occur.
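A short sketch of occurrence constraints follows; the element names (students, student, advisor) are hypothetical. Both attributes default to 1 when omitted, and maxOccurs also accepts the value unbounded to remove the upper limit. These attributes apply to elements declared inside a compositor, not to top-level declarations.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="students">
    <xs:complexType>
      <xs:sequence>
        <!-- At least one student, with no upper limit -->
        <xs:element name="student" type="xs:string"
                    minOccurs="1" maxOccurs="unbounded" />
        <!-- advisor may be left out; maxOccurs defaults to 1 -->
        <xs:element name="advisor" type="xs:string" minOccurs="0" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```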
Validating a Schema Document
Applying a Schema to an Instance Document
• To attach a schema to an instance document, you:
    • Declare the XML Schema instance namespace in the instance document.
    • Specify the location of the schema file.
• To declare the XML Schema instance namespace, you add the following attribute to the root element of the instance document:
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
Applying a Schema to an Instance Document
• You add a second attribute to the root element to specify the location of the schema file.
• The attribute you use depends on whether the instance document is associated with a namespace.
• If the document is not associated with a namespace, you add the attribute:
    xsi:noNamespaceSchemaLocation="schema"
  to the root element, where schema is the location and name of the schema file.
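Putting the two attributes together, an instance document without a namespace might begin as in the sketch below. The root element name, its content, and the file name students.xsd are assumptions for illustration.

```xml
<?xml version="1.0" ?>
<!-- The xsi namespace is declared first, then used to point at the schema file -->
<students xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="students.xsd">
  <student>Cynthia Berstein</student>
</students>
```

A validating parser that supports XML Schema can use the location hint to find students.xsd and check the document against it.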
Validating with Built-In Data Types
• XML Schema divides its built-in data types into two classes: primitive and derived.
• A primitive data type, also called a base type, is one of 19 fundamental data types that are not defined in terms of other types.
• A derived data type is one of 25 data types that are developed from one of the base types.
String Data Types
Numeric Data Types
Date and Time Data Types
Deriving Customized Data Types
• The code to derive a new data type is:
    <xs:simpleType name="name">
      rules
    </xs:simpleType>
• Here name is the name of the user-defined data type and rules is the list of statements that define the properties of that data type.
• This structure is also known as a named simple type.
• You can also create a simple type without a name, which is known as an anonymous simple type.
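The difference between a named and an anonymous simple type can be sketched as follows. The type name creditsType and the element names are hypothetical; the rules use a restriction with the minInclusive constraining facet, one of the derivation methods XML Schema provides.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Named simple type: declared once and reusable through its name -->
  <xs:simpleType name="creditsType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0" />
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="credits" type="creditsType" />

  <!-- Anonymous simple type: nested inside the one element that uses it -->
  <xs:element name="year">
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:minInclusive value="1" />
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

A named type suits rules shared by several elements or attributes; an anonymous type keeps a one-off rule next to the declaration it applies to.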
Deriving Customized Data Types
• The following three components are involved in deriving any new data type:
    • value space - The set of values that correspond to the data type.
    • lexical space - The set of textual representations of the value space.
    • facets - The properties that distinguish one data type from another.
Deriving Customized Data Types
• New data types are created by manipulating the properties of value space, lexical space, and facets.
• This can be done by:
    1. Creating a list based on preexisting data types.
    2. Creating a union of two or more of the preexisting data types.
    3. Restricting the values of a preexisting data type.
Deriving a List Data Type
• A list data type is a list of values separated by white space, in which each item in the list is derived from an established data type.
• The syntax for deriving a customized list data type is:
    <xs:simpleType name="name">
      <xs:list itemType="type" />
    </xs:simpleType>
• Here name is the name assigned to the list data type and type is the data type from which each item in the list is derived.
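As a brief sketch, the list type below accepts a whitespace-separated series of decimal numbers. The names scoreList and scores are hypothetical.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Each item in the list must be a valid xs:decimal -->
  <xs:simpleType name="scoreList">
    <xs:list itemType="xs:decimal" />
  </xs:simpleType>

  <!-- Valid instance content: <scores>88.5 92 79.25</scores> -->
  <xs:element name="scores" type="scoreList" />
</xs:schema>
```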
Deriving a Union Data Type
• A union data type is based on the value and/or lexical spaces from two or more preexisting data types.
• Each base data type is known as a member data type. The syntax is:
    <xs:simpleType name="name">
      <xs:union memberTypes="type1 type2 type3 ..." />
    </xs:simpleType>
  where type1, type2, type3, etc., are the member types that constitute the union.
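A minimal sketch of a union, assuming a hypothetical completed element that should accept either a full date or just a year: the value is valid if it matches any one of the member types.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Accepts values such as 2012-01-01 (xs:date) or 2012 (xs:gYear) -->
  <xs:simpleType name="dateOrYear">
    <xs:union memberTypes="xs:date xs:gYear" />
  </xs:simpleType>

  <xs:element name="completed" type="dateOrYear" />
</xs:schema>
```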
Deriving a Union Data Type
• XML Schema also allows unions to be created from nested simple types. The syntax is:
    <xs:simpleType name="name">
      <xs:union>
        <xs:simpleType>rules1</xs:simpleType>
        <xs:simpleType>rules2</xs:simpleType>
        ...
      </xs:union>
    </xs:simpleType>
  where rules1, rules2, etc., are rules for creating different user-derived data types.
Deriving a Restricted Data Type
Constraining Facets
• Constraining facets are applied to a base type using the structure:
    <xs:simpleType name="name">
      <xs:restriction base="type">
        <xs:facet1 value="value1" />
        <xs:facet2 value="value2" />
        ...
      </xs:restriction>
    </xs:simpleType>
  where type is the data type on which the restricted data type is based; facet1, facet2, etc., are constraining facets; and value1, value2, etc., are values for the constraining facets.
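As one possible illustration, the sketch below restricts xs:decimal to a range using the minInclusive and maxInclusive facets. The type name gpaType and the 0 to 4 range are assumptions for the example.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Decimal values from 0 through 4, endpoints included -->
  <xs:simpleType name="gpaType">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0" />
      <xs:maxInclusive value="4" />
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="gpa" type="gpaType" />
</xs:schema>
```

Under this type, <gpa>3.75</gpa> would validate and <gpa>4.5</gpa> would not.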
Deriving Data Types Using Regular Expressions
• A regular expression is a text string that defines a character pattern.
• Regular expressions can be created to define patterns for many types of data, including phone numbers, postal address codes, and e-mail addresses.
Deriving Data Types Using Regular Expressions
• To apply a regular expression in a data type, you create the simple type:
    <xs:simpleType name="name">
      <xs:restriction base="type">
        <xs:pattern value="regex" />
      </xs:restriction>
    </xs:simpleType>
  where regex is a regular expression pattern.
• Example:
    <xs:pattern value="ABC" />
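A slightly richer pattern can be sketched as follows. The ID format (the letter S, three digits, a hyphen, two digits) and the names stuIDType and stuID are invented for illustration. In XML Schema, a pattern must match the entire value, so no start or end anchors are written.

```xml
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- \d matches one digit; {3} and {2} give the repeat counts -->
  <xs:simpleType name="stuIDType">
    <xs:restriction base="xs:string">
      <xs:pattern value="S\d{3}-\d{2}" />
    </xs:restriction>
  </xs:simpleType>

  <!-- S123-45 is valid; S12-345 and XS123-45 are not -->
  <xs:element name="stuID" type="stuIDType" />
</xs:schema>
```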