
XML



Presentation Transcript


  1. XML DTD & XML Schema Monica Farrow G30 email : monica@macs.hw.ac.uk

  2. A Complete XML Document <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE addresses SYSTEM "http://www.addbook.com/addresses.dtd"> <addresses> <person ssno="123 4589"> <name>Lisa Simpson</name> <tel> 0131-828 1234 </tel> <tel> 078-4701 7775 </tel> <email> lisa@macs.hw.ac.uk </email> </person> </addresses> Slide callouts: Required; Optional; Link to document defining the XML elements (the DOCTYPE line)

  3. Defining the structure of an XML file • We can check if an XML file is well-formed • by looking at it, maybe • By loading it into a browser • If well-formed, it will be displayed • However, how can we check that the well-formed file contains the correct elements in the correct quantities? • We need to write a specification for the XML file
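
The well-formedness check on slide 3 can also be done programmatically rather than by eye or in a browser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` parser; the function name `is_well_formed` and the two sample strings are illustrative and not part of the lecture. It only checks well-formedness, not whether the right elements appear in the right quantities.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Illustrative inputs (not from the slides).
good = "<addresses><person><name>Lisa Simpson</name></person></addresses>"
bad = "<addresses><person><name>Lisa Simpson</person></addresses>"  # <name> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

A document can pass this check and still be missing required elements, which is why a DTD or schema is needed.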

  4. Defining the structure of an XML file • There are 2 main alternatives • Document Type Definitions • Original and simple • XML Schema • More versatile and complex • We will look at both • Concentrating on XML Schema

  5. Example: An Address Book <person ssn="4444"> <name> Homer Simpson </name> <tel> 2543 </tel> <tel> 2544 </tel> <email> homer@math.springfield.edu </email> </person> Slide callouts: One or more persons; An attribute; Exactly one name; Up to 4 tel nos; Optionally one email

  6. DTD - Specifying the Structure • In a DTD, we can specify the permitted content for each element, using regular expressions • A regular expression describes the pattern of child elements • For a person element, the regular expression is • name, title?, tel*, email+

  7. What’s in a person Element? • This means • name = there must be a name element • title? = there is an optional title element (i.e., 0 or 1 title elements) • name, title? = the name element is followed by an optional title element • tel* = there are 0 or more tel elements • email+ = there are 1 or more email elements
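
To make the regular-expression reading of `name, title?, tel*, email+` concrete, here is a rough sketch that applies the same pattern to the sequence of child tag names of a person element. It is not how a DTD validator is implemented; the names `PERSON_MODEL` and `matches_person_model` are hypothetical, and the trick of writing each tag followed by a comma is just an assumption made so that an ordinary Python regex can match whole tag names.

```python
import re
import xml.etree.ElementTree as ET

# Content model from the slides: name, title?, tel*, email+
# Each child tag is written as "tag," so the regex matches complete names.
PERSON_MODEL = re.compile(r"name,(title,)?(tel,)*(email,)+")


def matches_person_model(person_xml: str) -> bool:
    """Check the order and count of a person's children against the model."""
    person = ET.fromstring(person_xml)
    child_tags = "".join(child.tag + "," for child in person)
    return PERSON_MODEL.fullmatch(child_tags) is not None


ok = (
    '<person ssn="4444"><name>Homer Simpson</name><tel>2543</tel>'
    "<tel>2544</tel><email>homer@math.springfield.edu</email></person>"
)
no_email = '<person ssn="4444"><name>Homer Simpson</name><tel>2543</tel></person>'

print(matches_person_model(ok))
print(matches_person_model(no_email))
```

The second document is rejected because `email+` demands at least one email element.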

  8. DTD for the Address Book <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE addressbook [ <!ELEMENT addressbook (person*)> <!ELEMENT person (name, title?, tel*, email+)> <!ELEMENT name (#PCDATA)> <!ELEMENT title (#PCDATA)> <!ELEMENT tel (#PCDATA)> <!ELEMENT email (#PCDATA)> <!ATTLIST person ssn CDATA #REQUIRED> ]> Slide callouts: the content models in parentheses are regular expressions; PCDATA means parsed character data

  9. Attributes in a DTD • XML elements can have attributes. • General syntax for a DTD: <!ATTLIST element-name attribute-name1 type1 default-value1 … attribute-nameN typeN default-valueN> • Example: <!ATTLIST person ssn CDATA #REQUIRED> • CDATA means character data • The default value can be #REQUIRED or #IMPLIED (meaning optional)

  10. Connecting a Document with its DTD • A DTD can be internal (part of the document file) <?xml version="1.0"?> <!DOCTYPE db [<!ELEMENT ...> … ]> <db> ... </db> • Or external (the DTD and the document are in different files) • A DTD from the local file system: <!DOCTYPE db SYSTEM "schema.dtd"> • A DTD from a remote file system: <!DOCTYPE db SYSTEM "http://www.schemaauthority.com/schema.dtd">
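
As a small illustration of the external form, the DOCTYPE declaration can be read back from a parsed document. This sketch uses `xml.dom.minidom` from the Python standard library; the helper name `doctype_info` and the sample document are assumptions for illustration. Note that this only reports which DTD the document points at; minidom does not validate the document against it.

```python
from xml.dom import minidom


def doctype_info(xml_text: str):
    """Return (root element name, system identifier) from the DOCTYPE."""
    doc = minidom.parseString(xml_text)
    return doc.doctype.name, doc.doctype.systemId


# Illustrative document using the local-file form from the slide.
sample = '<?xml version="1.0"?><!DOCTYPE db SYSTEM "schema.dtd"><db><item/></db>'

print(doctype_info(sample))
```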

  11. Valid Documents • A document with a DTD is valid if it conforms to the DTD, i.e., • the document conforms to the regular-expression grammar, • types of attributes are correct, and • constraints on references are satisfied

  12. Problems with DTDs • DTDs are rather weak specifications by database and programming-language standards • Some limitations: • Only one base type for element content: #PCDATA • No value constraints (e.g. a range of values) and no exact occurrence counts beyond ?, * and + • Not easily parsed with XML tools (since they are not written in XML) • Not easy to express that element a has exactly the children c, d, e in any order

  13. XML Schema • DTDs are now being superseded by XML Schemas. • Schemas provide the following features • XML syntax • So they can be parsed and validated with standard XML tools • Data types other than #PCDATA • There are built-in types such as integer, float, boolean, string and many others • Greater control over permitted constructs • Can specify maximum and minimum occurrences • Can use regular expressions to set patterns to be matched • Support for modularity and inheritance

  14. XML Schema continued • XML Schemas are more precise and therefore more complicated than DTDs • They were designed to replace DTDs, but DTDs are very well established and simpler • http://www.w3schools.com/schema

  15. Schema types • There are some basic built-in types such as xs:string, xs:decimal, xs:integer, xs:ID • Each element is composed of either simple types or complex types. A complex type is often a sequence of elements • The content of the type can be declared as shown in the following example. A type can also be declared, named and referred to. • Notice the use of minOccurs and maxOccurs. Their default is 1.

  16. Simple Schema Example <?xml version="1.0" ?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="people"> <xs:complexType> <xs:sequence> <xs:element name="person" maxOccurs="unbounded"> (details of the person element are on slide 19) </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema> Slide callouts: standard stuff; Top-level element; Namespace

  17. Namespaces • You’ll see namespaces when using XML schemas and stylesheets. • There is a namespace associated with the tags used in each that lets them be used unambiguously. • e.g. a schema element, a chemical element • A namespace is identified by • a short prefix, e.g. xs • a unique URI, usually written as a URL

  18. Namespace declaration • So at the start of a document we must specify what namespaces we are using. • In the schema example, we are using the XML Schema namespace with the xs prefix • We declare this namespace in an attribute of the top-level element: <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> • We then use the xs prefix in all the XML Schema elements, e.g. complexType, sequence, element, etc.
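
The point that the prefix is only shorthand for the namespace can be seen by parsing the same schema fragment with two different prefixes. This is a minimal sketch with Python's standard `xml.etree.ElementTree`, which reports each tag as `{namespace}localname`; the `xsd` prefix and the variable names are illustrative choices, not something used in the lecture.

```python
import xml.etree.ElementTree as ET

XS_NAMESPACE = "http://www.w3.org/2001/XMLSchema"

# The same fragment written with the xs prefix and with a different prefix.
with_xs = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
    '<xs:element name="people"/></xs:schema>'
)
with_xsd = (
    '<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">'
    '<xsd:element name="people"/></xsd:schema>'
)

root_xs = ET.fromstring(with_xs)
root_xsd = ET.fromstring(with_xsd)

# Both roots carry the same fully qualified name; the prefix has disappeared.
print(root_xs.tag)
print(root_xs.tag == root_xsd.tag)
```

What identifies the schema vocabulary is the namespace string, so any prefix works as long as it is bound to that string.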

  19. Schema Example Continued • Details of the person element: <xs:element name="person" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <xs:element name="name" type="xs:string"/> <xs:element name="tel" type="xs:string"/> <xs:element name="email" type="xs:string" minOccurs="0" maxOccurs="1"/> </xs:sequence> <xs:attribute name="sssNo" type="xs:integer" use="required"/> </xs:complexType> </xs:element> Slide callouts: Empty element (the self-closing <xs:element .../> declarations); A person is a complex type which is a sequence of elements and an attribute
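
Because a schema is itself XML, ordinary XML tools can read it. As a sketch of that idea, the following uses Python's standard `xml.etree.ElementTree` to list the occurrence bounds of the children declared for person, applying the default of 1 when minOccurs or maxOccurs is absent. The function name `occurrence_bounds` is hypothetical, and the `xmlns:xs` declaration has been moved onto the person element only so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# The person declaration from the slide, made standalone by declaring xs here.
person_decl = """
<xs:element name="person" maxOccurs="unbounded"
            xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="tel" type="xs:string"/>
      <xs:element name="email" type="xs:string" minOccurs="0" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="sssNo" type="xs:integer" use="required"/>
  </xs:complexType>
</xs:element>
"""


def occurrence_bounds(element_decl_xml: str) -> dict:
    """Map each child element name to its (minOccurs, maxOccurs) strings."""
    root = ET.fromstring(element_decl_xml)
    sequence = root.find(f"{XS}complexType/{XS}sequence")
    bounds = {}
    for child in sequence.findall(f"{XS}element"):
        # Both attributes default to 1 when they are not written out.
        bounds[child.get("name")] = (
            child.get("minOccurs", "1"),
            child.get("maxOccurs", "1"),
        )
    return bounds


print(occurrence_bounds(person_decl))
```

The bounds are kept as strings because maxOccurs may be the word `unbounded` rather than a number.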

  20. Exercise 1 • Create a schema for the holiday house example. Each home has an id, a name and a location • Additionally, each home has between one and three sets of contact details. Contact details consist of a name and a phone number, and optionally an email address and website.

  21. Restrictions on elements • You can also restrict the values of the data in • a range • <xs:minInclusive value="0"/> <xs:maxInclusive value="120"/> • an enumerated list • <xs:enumeration value="Audi"/> <xs:enumeration value="Golf"/> <xs:enumeration value="BMW"/> • a pattern • <xs:pattern value="([a-z])*"/> • Means 0 or more lowercase alphabetic chars
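
For readers who find the three kinds of restriction easier to follow as ordinary code, here is a rough Python equivalent of what each facet accepts. It is only an analogy, not a schema validator; the function and constant names are made up for illustration. `re.fullmatch` is used because a schema pattern must match the whole value.

```python
import re

ALLOWED_CARS = {"Audi", "Golf", "BMW"}   # xs:enumeration values from the slide
LOWERCASE = re.compile(r"([a-z])*")      # xs:pattern value from the slide


def in_range(value: int) -> bool:
    """minInclusive 0 and maxInclusive 120: both end points are allowed."""
    return 0 <= value <= 120


def is_allowed_car(value: str) -> bool:
    """The value must be exactly one of the enumerated strings."""
    return value in ALLOWED_CARS


def matches_pattern(value: str) -> bool:
    """Zero or more lowercase letters, matched against the entire value."""
    return LOWERCASE.fullmatch(value) is not None


print(in_range(120), in_range(121))
print(is_allowed_car("Golf"), is_allowed_car("Volvo"))
print(matches_pattern("abc"), matches_pattern("Abc"), matches_pattern(""))
```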

  22. Declaring your own types • Named types can be used for elements or attributes. Here’s an example which specifies restrictions on the attribute • A named type is declared <xs:simpleType name = "ssstype"> <xs:restriction base="xs:integer"> <xs:minInclusive value="0"/> </xs:restriction> </xs:simpleType> • And used as the attribute type • <xs:attribute name= "sssNo" type="ssstype" use="required"/>

  23. More complex Schemas • The previous example shows a simple schema. • It is also possible to make the schema easier to maintain • by declaring all the simple elements first and then referring to them in the body of the document • By naming the declaration of simple and complex types, which could then be used later in the document, and more than once if necessary • See http://www.w3schools.com/Schema/schema_example.asp if you are interested

  24. Referring to a schema • Save your schema in a file with the extension xsd. • Linking a schema definition with a document is done using a special attribute of the root element of the document: <people xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="people.xsd">
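
A quick way to confirm that a document names its schema correctly is to read the attribute back. This sketch uses Python's standard `xml.etree.ElementTree`; the helper name `schema_location` and the one-person sample document are assumptions for illustration. It only reports the file name given in the document and does not load or apply the schema.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def schema_location(xml_text: str):
    """Return the xsi:noNamespaceSchemaLocation value of the root, or None."""
    root = ET.fromstring(xml_text)
    return root.get(f"{{{XSI}}}noNamespaceSchemaLocation")


# Illustrative document using the root start tag from the slide.
people_doc = (
    '<people xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:noNamespaceSchemaLocation="people.xsd"><person/></people>'
)

print(schema_location(people_doc))
```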

  25. Validating • Validators • http://www.w3.org/2001/03/webdata/xsv • I don’t seem to be able to revalidate with the same filenames • http://tools.decisionsoft.com/schemaValidate/ • No problems, nicer layout • Others also on the web

  26. XML: Summary • XML lets you choose application specific element names and define special purpose document types. • Need document type definition or schema to define allowed markup. • What can we do with our valid document? – next 2 lectures

  27. Exercises 2 • Alter the schema given in the lecture notes so that there must be between 1 and 4 tel numbers which must be in the range 1000 – 9999 • Create a simple type for tel
