
Understanding XML Schema for Data Validation

Learn how XML Schema simplifies data validation, defines data types, and goes beyond well-formedness to ensure valid XML documents. Discover its advantages over DTDs and its extensibility.


Presentation Transcript


  1. XML Schema Languages 2/2
     Dongwon Lee, Ph.D.
     The Pennsylvania State University
     IST 516 / Fall 2011
     http://www.practicingsafetechs.com/TechsV1/XMLSchemas/

  2. XML Schema
     • New XML schema language from W3C
     • Successor of DTD
     • Unlike DTD, XML Schema is in XML syntax
     • http://www.w3.org/XML/Schema
       <xsd:complexType name="PurchaseOrderType">
         <xsd:sequence>
           <xsd:element name="shipTo" type="USAddress"/>
           <xsd:element name="billTo" type="USAddress"/>
           <xsd:element ref="comment" minOccurs="0"/>
           <xsd:element name="items" type="Items"/>
         </xsd:sequence>
         <xsd:attribute name="orderDate" type="xsd:date"/>
       </xsd:complexType>

  3. XML Schema vs. DTD: What’s New
     • XML Schemas are extensible to future additions
       • XML Schema V 1.0 → 1.1 → …
     • XML Schemas are richer and more powerful than DTDs
     • XML Schemas are written in XML
       • No <!ELEMENT …> or <!ATTLIST …> notation
     • XML Schemas support data types
     • XML Schemas support namespaces

  4. New: Data Types
     • XML Schema supports data types, which makes it easier to:
       • Describe allowable document content
       • Validate the correctness of data
       • Work with data from a database
       • Define data facets (restrictions on data)
       • Define data patterns (data formats)
       • Convert data between different data types
     • Eg, <date type="date">2001-09-11</date>
       • Ensures a mutual understanding of the content
       • The XML data type "date" requires the format "YYYY-MM-DD"

  5. New: in XML Notation
     • XML Schema uses XML notation
       • <> and </>
     • An XML Schema file is itself an XML file, too
     • No need to learn a new language
     • No need to use new tools
       • Use an XML editor to edit XML Schema files
       • Use an XML parser to parse XML Schema files
       • Manipulate an XML Schema using DOM
       • Transform an XML Schema with XSLT

  6. New: Extensibility
     • XML Schema is extensible because XML is extensible
     • XML Schema lets you:
       • Reuse your schema in other schemas
       • Create your own data types derived from the standard types → Inheritance
       • Reference multiple schemas in the same document
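
As a rough sketch of the reuse and inheritance points on this slide, a schema can pull in declarations from another file with xs:include and derive a new type with xs:extension. The file name person.xsd and the names personInfo, employeeInfo, and salary are hypothetical, chosen only for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Reuse: pull in declarations from another schema file.
       "person.xsd" is a hypothetical file assumed to define
       the complex type "personInfo". -->
  <xs:include schemaLocation="person.xsd"/>

  <!-- Inheritance: derive a new type by extending personInfo -->
  <xs:complexType name="employeeInfo">
    <xs:complexContent>
      <xs:extension base="personInfo">
        <xs:sequence>
          <xs:element name="salary" type="xs:decimal"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="employee" type="employeeInfo"/>
</xs:schema>
```

xs:include is meant for files that share the including schema's target namespace (or have none); xs:import is the counterpart for schemas in a different namespace.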

  7. Well-Formed: Not Enough
     • Well-formed: a document conforms to XML syntax rules such as:
       • Begin with an XML declaration (optional in XML 1.0, but recommended)
       • Exactly one root element
       • Case-sensitive names
       • Matching start / end tags
       • Properly nested elements
     • Well-formed documents can still contain semantic errors or inconsistencies
     • → Need VALID documents according to a schema

  8. Main Features
     • XML Schema defines elements
     • Simple elements:
       • Contain only "text"
       • No sub-elements or attributes
     • "Text" can be of different types
       • Types built into XML Schema
         • Eg, boolean, string, date
       • User-defined types
     • Can add restrictions (facets) to a data type to limit its content

  9. Simple Element
     • <xs:element name="xxx" type="yyy"/>
       • "xxx": the name of the element
       • "yyy": the data type of the element
     • Common built-in types in XML Schema:
       • xs:string
       • xs:decimal
       • xs:integer
       • xs:boolean
       • xs:date
       • xs:time
     • Namespace as in: xmlns:xs="http://www.w3.org/2001/XMLSchema"

  10. Simple Element
      • Some simple XML elements:
        <lastname>Lee</lastname>
        <age>2</age>
        <dateborn>2009-03-27</dateborn>
      • Corresponding simple element definitions:
        <xs:element name="lastname" type="xs:string"/>
        <xs:element name="age" type="xs:integer"/>
        <xs:element name="dateborn" type="xs:date"/>

  11. Simple Element
      • Simple elements may have a default value OR a fixed value specified
      • Default value is automatically assigned to the element when no other value is specified
        <xs:element name="color" type="xs:string" default="red"/>
      • Fixed value is also automatically assigned to the element, and one cannot specify another value
        <xs:element name="nationality" type="xs:string" fixed="USA"/>
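
To make the difference between default and fixed concrete, the fragments below pair the two declarations from this slide with a few instance elements. The instance values (blue, Canada) are made up for illustration, and the lines are separate fragments rather than one document.

```xml
<!-- Declarations from the slide -->
<xs:element name="color" type="xs:string" default="red"/>
<xs:element name="nationality" type="xs:string" fixed="USA"/>

<!-- Instance fragments -->
<color/>                           <!-- empty: the validator supplies "red" -->
<color>blue</color>                <!-- explicit value replaces the default -->
<nationality/>                     <!-- empty: treated as "USA" -->
<nationality>USA</nationality>     <!-- valid: matches the fixed value -->
<nationality>Canada</nationality>  <!-- invalid: differs from the fixed value -->
```

An element default is applied only when the element is present but empty; if the element is left out of the document altogether, the validator does not create it.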

  12. <xs:attribute>
      • The syntax for defining an attribute is:
        <xs:attribute name="xxx" type="yyy"/>
      • Where xxx is the name of the attribute and yyy specifies the data type of the attribute
      • Simple elements cannot have attributes since they are SIMPLE

  13. <xs:attribute>
      • An XML element with an attribute:
        <lastname lang="EN">Smith</lastname>
      • Corresponding attribute definition:
        <xs:attribute name="lang" type="xs:string"/>
      • Attributes can have default or fixed values. If the attribute is required, add use="required"
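
The three options mentioned on this slide could be written as follows. These are alternative declarations of the same attribute, not lines meant to appear together, and the value EN is simply borrowed from the slide's example.

```xml
<!-- Optional attribute; a missing lang is treated as "EN" -->
<xs:attribute name="lang" type="xs:string" default="EN"/>

<!-- Optional attribute; if present, its value must be "EN" -->
<xs:attribute name="lang" type="xs:string" fixed="EN"/>

<!-- Attribute that must appear on the element -->
<xs:attribute name="lang" type="xs:string" use="required"/>
```

Attributes are optional unless use="required" is given, and a default value cannot be combined with use="required".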

  14. Conforming to Types
      • When an XML element or attribute has a data type defined, it puts restrictions on the element's or attribute's content
      • If an XML element is of type "xs:date" and contains a string like "Hello World", the element will not validate
      • With XML Schemas, you can also add your own restrictions to your XML elements and attributes
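
A small illustration of this slide, reusing the dateborn element from slide 10: all three instance fragments are well-formed XML, but only the first is valid against the declaration. The second instance value is a made-up example of a common mistake.

```xml
<!-- Declaration -->
<xs:element name="dateborn" type="xs:date"/>

<!-- Instance fragments -->
<dateborn>2009-03-27</dateborn>   <!-- valid: YYYY-MM-DD -->
<dateborn>03/27/2009</dateborn>   <!-- invalid: wrong lexical form -->
<dateborn>Hello World</dateborn>  <!-- invalid: not a date at all -->
```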

  15. Constraining User-Defined Types
      • Defines an element called "age" with a restriction
      • The value of age cannot be lower than 0 or greater than 120
        <xs:element name="age">
          <xs:simpleType>
            <xs:restriction base="xs:integer">
              <xs:minInclusive value="0"/>
              <xs:maxInclusive value="120"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>

  16. Constraining User-Defined Types
      • Defines an element called "car" with a restriction
      • The only acceptable values are Audi, Golf, and BMW:
        <xs:element name="car" type="carType"/>
        <xs:simpleType name="carType">
          <xs:restriction base="xs:string">
            <xs:enumeration value="Audi"/>
            <xs:enumeration value="Golf"/>
            <xs:enumeration value="BMW"/>
          </xs:restriction>
        </xs:simpleType>
      • Note: In this case the type "carType" can be used by other elements because it is not a part of the "car" element.
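
Range and enumeration facets are only two of the available restrictions. The data patterns mentioned on slide 4 can be expressed with the pattern facet, which takes a regular expression. The zip element below is a hypothetical example, not part of the original deck.

```xml
<xs:element name="zip">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <!-- exactly five digits, e.g. 16802 -->
      <xs:pattern value="[0-9]{5}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```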

  17. Complex Element
      • What is a Complex Element?
        • A complex element is an XML element that contains other elements and/or attributes
      • There are four kinds of complex elements:
        • Empty elements
        • Elements that contain only other elements
        • Elements that contain only text
        • Elements that contain both other elements and text
      • Note: Each of these elements may contain attributes as well!

  18. Complex Element: Type 1
      • A complex XML element, "product", which has an empty content model:
        <product pid="1345"/>
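
One way the empty "product" element might be declared is a complex type that has attributes but no content model. The choice of xs:positiveInteger for pid is an assumption; the slide does not say which type is intended.

```xml
<xs:element name="product">
  <xs:complexType>
    <!-- no content model, so the element must be empty -->
    <xs:attribute name="pid" type="xs:positiveInteger"/>
  </xs:complexType>
</xs:element>
```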

  19. Complex Element: Type 2
      • A complex XML element, "employee", which contains only other elements:
        <employee>
          <firstname>John</firstname>
          <lastname>Smith</lastname>
        </employee>

  20. Complex Element: Type 3
      • A complex XML element, "food", which contains only text:
        <food type="dessert">Ice cream</food>
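
A text-only element that also carries an attribute is typically declared with xs:simpleContent, extending a simple type with the attribute. The sketch below assumes both the text and the type attribute are plain strings.

```xml
<xs:element name="food">
  <xs:complexType>
    <xs:simpleContent>
      <!-- the text content is a string; the extension adds the attribute -->
      <xs:extension base="xs:string">
        <xs:attribute name="type" type="xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
```

The same construction would cover the <lastname lang="EN"> example on slide 13, since an element with an attribute can no longer be a simple element.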

  21. Complex Element: Type 4
      • A complex XML element, "description", which contains both elements and text:
        <description>
          It happened on <date lang="norwegian">03.03.99</date> ....
        </description>
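
Mixed content is switched on with mixed="true" on the complex type. The declaration below is one possible reading of the slide's example; the occurrence bounds on date and the string types are assumptions.

```xml
<xs:element name="description">
  <!-- mixed="true" allows text between the child elements -->
  <xs:complexType mixed="true">
    <xs:sequence>
      <xs:element name="date" minOccurs="0" maxOccurs="unbounded">
        <xs:complexType>
          <xs:simpleContent>
            <!-- xs:string, because 03.03.99 is not in xs:date format -->
            <xs:extension base="xs:string">
              <xs:attribute name="lang" type="xs:string"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```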

  22. Eg, Define a Complex Element
      • Type 2: element with only sub-elements
        <employee>
          <firstname>John</firstname>
          <lastname>Smith</lastname>
        </employee>

  23. Eg, Define a Complex Element
      • Method 1: no re-use foreseen
        <xs:element name="employee">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="firstname" type="xs:string"/>
              <xs:element name="lastname" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>

  24. Eg, Define a Complex Element
      • Method 2: can reuse "myInfo" type
        <xs:element name="employee" type="myInfo"/>
        <xs:complexType name="myInfo">
          <xs:sequence>
            <xs:element name="firstname" type="xs:string"/>
            <xs:element name="lastname" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>

  25. Eg, Define a Complex Element
      • Method 2: 3 elements can reuse "myInfo" type
        <xs:element name="employee" type="myInfo"/>
        <xs:element name="student" type="myInfo"/>
        <xs:element name="member" type="myInfo"/>
        <xs:complexType name="myInfo">
          <xs:sequence>
            <xs:element name="firstname" type="xs:string"/>
            <xs:element name="lastname" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>

  26. Indicators
      • Order
        • <xs:all>: children in any order, each occurring zero times or once
        • <xs:choice>: exactly one of the alternatives occurs (either A or B)
        • <xs:sequence>: children appear in the specified order
      • Occurrence
        • maxOccurs
        • minOccurs
      • Group
        • <xs:group name="...">
        • <xs:attributeGroup name="...">
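
The fragment below sketches how several of these indicators can combine in one declaration. All names (nameGroup, person, employee, member, phone) are hypothetical, and the fragment is meant to sit inside an xs:schema element.

```xml
<!-- Group: a named, reusable block of element declarations -->
<xs:group name="nameGroup">
  <xs:sequence>
    <xs:element name="firstname" type="xs:string"/>
    <xs:element name="lastname" type="xs:string"/>
  </xs:sequence>
</xs:group>

<xs:element name="person">
  <xs:complexType>
    <!-- Sequence: children must appear in this order -->
    <xs:sequence>
      <xs:group ref="nameGroup"/>
      <!-- Choice: exactly one of the two elements -->
      <xs:choice>
        <xs:element name="employee" type="xs:string"/>
        <xs:element name="member" type="xs:string"/>
      </xs:choice>
      <!-- Occurrence: zero to three phone elements -->
      <xs:element name="phone" type="xs:string"
                  minOccurs="0" maxOccurs="3"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

When minOccurs and maxOccurs are omitted, both default to 1, so firstname and lastname here must each appear exactly once.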

  27. Eg, <xs:all>
      • <firstname> and <lastname> can appear in ANY order but each MUST appear ONCE
      • <firstname> and <lastname> can appear in ANY order and each can appear ZERO times or ONCE
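
The schema code for this slide did not survive in the transcript. The two cases it describes could be written roughly as follows; the enclosing element name person is an assumption, and the two variants are alternatives rather than declarations for the same schema.

```xml
<!-- Variant 1: any order, each child exactly once -->
<xs:element name="person">
  <xs:complexType>
    <xs:all>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:all>
  </xs:complexType>
</xs:element>

<!-- Variant 2: any order, each child zero times or once -->
<xs:element name="person">
  <xs:complexType>
    <xs:all>
      <xs:element name="firstname" type="xs:string" minOccurs="0"/>
      <xs:element name="lastname" type="xs:string" minOccurs="0"/>
    </xs:all>
  </xs:complexType>
</xs:element>
```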

  28. <xs:sequence>: family.xml
      <?xml version="1.0" encoding="ISO-8859-1"?>
      <persons xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:noNamespaceSchemaLocation="family.xsd">
        <person>
          <full_name>Hege Refsnes</full_name>
          <child_name>Cecilie</child_name>
        </person>
        <person>
          <full_name>Tove Refsnes</full_name>
          <child_name>Hege</child_name>
          <child_name>Stale</child_name>
          <child_name>Jim</child_name>
          <child_name>Borge</child_name>
        </person>
        <person>
          <full_name>Stale Refsnes</full_name>
        </person>
      </persons>

  29. <xs:sequence>: family.xsd
      <?xml version="1.0" encoding="ISO-8859-1"?>
      <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                 elementFormDefault="qualified">
        <xs:element name="persons">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="person" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="full_name" type="xs:string"/>
                    <xs:element name="child_name" type="xs:string"
                                minOccurs="0" maxOccurs="5"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:schema>

  30. DTD vs. XML Schema
      • DTD:
        <!ELEMENT e1 ((e2,e3?)+|e4)>
      • XML Schema:
        <element name="e1">
          <complexType>
            <choice>
              <sequence maxOccurs="unbounded">
                <element ref="e2"/>
                <element ref="e3" minOccurs="0"/>
              </sequence>
              <element ref="e4"/>
            </choice>
          </complexType>
        </element>

  31. note.dtd
      <!ELEMENT note (to, from, heading, body)>
      <!ELEMENT to (#PCDATA)>
      <!ELEMENT from (#PCDATA)>
      <!ELEMENT heading (#PCDATA)>
      <!ELEMENT body (#PCDATA)>

  32. note.xsd
      <?xml version="1.0"?>
      <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                 targetNamespace="http://pike.psu.edu"
                 xmlns="http://pike.psu.edu"
                 elementFormDefault="qualified">
        <xs:element name="note">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="to" type="xs:string"/>
              <xs:element name="from" type="xs:string"/>
              <xs:element name="heading" type="xs:string"/>
              <xs:element name="body" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:schema>

  33. <schema> element
      <?xml version="1.0"?>
      <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                 targetNamespace="http://pike.psu.edu"
                 xmlns="http://pike.psu.edu"
                 elementFormDefault="qualified">
        . . .
      </xs:schema>
      • <schema> element is the root element of every XML Schema

  34. <schema> element
      <?xml version="1.0"?>
      <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                 targetNamespace="http://pike.psu.edu"
                 xmlns="http://pike.psu.edu"
                 elementFormDefault="qualified">
        . . .
      </xs:schema>
      • xmlns:xs: the built-in elements & data types used in this schema file come from the http://www.w3.org/2001/XMLSchema namespace, defined by W3C folks
      • They are to be prefixed with "xs:"
        • Eg, <xs:schema>

  35. <schema> element
      <?xml version="1.0"?>
      <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                 targetNamespace="http://pike.psu.edu"
                 xmlns="http://pike.psu.edu"
                 elementFormDefault="qualified">
        . . .
      </xs:schema>
      • targetNamespace: indicates that the elements being defined by this schema (eg, note, to, from, heading, body) are BOUND to this SYMBOLIC namespace
        • http://pike.psu.edu
      • Such a URL need not correspond to an actual, reachable URL

  36. <schema> element
      <?xml version="1.0"?>
      <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                 targetNamespace="http://pike.psu.edu"
                 xmlns="http://pike.psu.edu"
                 elementFormDefault="qualified">
        . . .
      </xs:schema>
      • xmlns: the default namespace
      • Unprefixed names (ie, w/o prefix) are assumed to come from this default namespace: http://pike.psu.edu

  37. <schema> element
      <?xml version="1.0"?>
      <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                 targetNamespace="http://pike.psu.edu"
                 xmlns="http://pike.psu.edu"
                 elementFormDefault="qualified">
        . . .
      </xs:schema>
      • By default (elementFormDefault="unqualified"), locally declared elements are not namespace-qualified in instance documents
      • To change this: elementFormDefault="qualified"
      • Now, even locally declared elements must be namespace-qualified, either with a prefix or through a default namespace
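
The effect of elementFormDefault is easiest to see in instance documents. The two documents below are a sketch for a note schema with targetNamespace http://pike.psu.edu; the prefix n is a hypothetical choice.

```xml
<!-- With elementFormDefault="unqualified" (the default):
     only the global element note is in the target namespace;
     the local elements to, from, heading, body are in no namespace -->
<n:note xmlns:n="http://pike.psu.edu">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</n:note>

<!-- With elementFormDefault="qualified":
     local elements are in the target namespace as well,
     here through the default namespace declaration -->
<note xmlns="http://pike.psu.edu">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
```

Each document is valid only against the schema variant named in its comment; swapping them should make validation fail.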

  38. note.xml with Reference to XML Schema
      <?xml version="1.0"?>
      <note xmlns="http://pike.psu.edu"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://pike.psu.edu note.xsd">
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
        <body>Don't forget me this weekend!</body>
      </note>

  39. note.xml with Reference to XML Schema
      <?xml version="1.0"?>
      <note xmlns="http://pike.psu.edu"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://pike.psu.edu note.xsd">
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
        <body>Don't forget me this weekend!</body>
      </note>
      • xmlns="http://pike.psu.edu": default namespace for the "note.xml" file
      • Tells the schema validator that all the elements used in the "note.xml" file are declared in this namespace

  40. note.xml with Reference to XML Schema
      <?xml version="1.0"?>
      <note xmlns="http://pike.psu.edu"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://pike.psu.edu note.xsd">
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
        <body>Don't forget me this weekend!</body>
      </note>
      • xmlns:xsi: once the XML Schema Instance namespace is available → then one can use the schemaLocation attribute in the next line

  41. note.xml with Reference to XML Schema
      <?xml version="1.0"?>
      <note xmlns="http://pike.psu.edu"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://pike.psu.edu note.xsd">
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
        <body>Don't forget me this weekend!</body>
      </note>
      • NOTE: the space between the two values in xsi:schemaLocation is the delimiter
      • schemaLocation needs two inputs
        • First value: the namespace to use
        • Second value: the location of the XML schema to use for that namespace. Eg,
          • Relative: note.xsd
          • Absolute: http://pike.psu.edu/foo/bar/note.xsd

  42. note.xml with Reference to XML Schema (alternative)
      <?xml version="1.0"?>
      <note xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:noNamespaceSchemaLocation="note.xsd">
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
        <body>Don't forget me this weekend!</body>
      </note>
      • noNamespaceSchemaLocation requires just one input: the location of the XML schema to use for elements that are in no namespace. Eg,
        • note.xsd
        • http://pike.psu.edu/foo/bar/note.xsd
      • It applies when the schema declares no targetNamespace, so the instance carries no default xmlns declaration

  43. Eg: Multiple References in xsi:schemaLocation
      • schemaLocation can take multiple PAIRS of inputs (namespace, location)
        <?xml version="1.0"?>
        <webster xmlns:A="http://www.webster.com/author"
                 xmlns:B="http://www.webster.com/book"
                 xmlns="http://pike.psu.edu"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation="http://www.webster.com/author author.xsd
                                     http://www.webster.com/book book.xsd">
          <A:author><A:title>Associate Prof</A:title></A:author>
          <B:book><B:title>Gone with the wind</B:title></B:book>
        </webster>

  44. Common Errors
      • schemaLocation in an XML file requires TWO inputs and a whitespace delimiter in between
        • xsi:schemaLocation="http://pike.psu.edu note.xsd" …
      • The namespace used in the XML file and the targetNamespace in the XSD file must be EXACTLY identical
        • The following minor discrepancy of the "/" at the end could trigger an error
        • In XML file: xmlns="http://pike.psu.edu/"
        • In XSD file: targetNamespace="http://pike.psu.edu"
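
The second error is easy to miss because namespace names are compared character by character, not resolved as URLs. The start tags below are a sketch of the mismatch and its fix, based on the note.xsd and note.xml files used earlier in the deck.

```xml
<!-- note.xsd -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://pike.psu.edu"
           xmlns="http://pike.psu.edu"
           elementFormDefault="qualified">

<!-- note.xml, WRONG: the trailing "/" names a different namespace -->
<note xmlns="http://pike.psu.edu/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://pike.psu.edu/ note.xsd">

<!-- note.xml, RIGHT: identical to targetNamespace in note.xsd -->
<note xmlns="http://pike.psu.edu"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://pike.psu.edu note.xsd">
```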

  45. XMLPad Example

  46. XMLPad Example

  47. Schema Validation http://www.w3.org/2001/03/webdata/xsv

  48. Schema Validation: DTD

  49. Schema Validation: XML Schema

  50. Lab #1 (DUE: Sep. 18 11:55PM)
      • https://online.ist.psu.edu/ist516/labs
      • Tasks
        • Given XML files, infer DTD and XML Schema
        • Validate them using W3C’s schema validator
          • Accessible from the Web
      • Turn-In
        • DTD and XML Schema files
        • Screenshots showing validation succeeded
