Document Type Definitions Kanda Runapongsa (krunapon@kku.ac.th) Dept. of Computer Engineering Khon Kaen University
DTD: Document Type Definition • Define the structure of XML documents with a list of legal elements • Why use a DTD? • Different groups of people can agree to use a common set of tags for interchanging data • To verify that the data received from others is valid 168493: XML and Web Services (II/2546)
A DTD of the Sample Document <!ELEMENT Note (To, From, Course*)> <!ELEMENT To (#PCDATA)> <!ELEMENT From (#PCDATA)> <!ELEMENT Course (Name, Time+, Place?)> <!ATTLIST Course id ID #REQUIRED>
Building Blocks of XML • Elements • Attributes • Entities: define common text • PCDATA: parsed character data • Text that will be parsed by a parser • Tags inside the text are treated as markup • CDATA: character data • Text that is not parsed by a parser; it is kept as it is
Element Type Declarations • Each element type declaration starts with <!ELEMENT • Followed by the name and a content specification • Its structure • <!ELEMENT elementName (contentSpec)> • Example • <!ELEMENT Note (To, From, Course*)>
Empty Elements • Empty elements are declared with keyword EMPTY • DTD example: <!ELEMENT br EMPTY> • XML example: <br></br> or <br/>
Elements with Only Character Data • Elements with only character data are declared with #PCDATA inside parentheses • DTD example: <!ELEMENT To (#PCDATA)> • XML example: <To>Students</To>
Elements with Any Content • Declared with keyword ANY • Can contain any combination of parsable data • DTD example: <!ELEMENT Misc ANY> • XML example: <Misc>Hello there</Misc>
Elements with Sequence Children • Defined with the name of the children inside parentheses • Children are declared in a sequence separated by commas • Children must appear in the same sequence in the document • DTD example: <!ELEMENT Note (To, From, Course*)>
Occurrence Types • Element with exactly one occurrence • No symbol after the element • Ex: <!ELEMENT Note (To, From, Course*)> (To and From) • Element with one or more occurrences • Symbol + after the element • Ex: <!ELEMENT Course (Name, Time+, Place?)> (Time)
Occurrence Types (Cont.) • Element with zero or more occurrences • Symbol * after the element • Ex: <!ELEMENT Note (To, From, Course*)> (Course) • Element with zero or one occurrence • Symbol ? after the element • Ex: <!ELEMENT Course (Name, Time+, Place?)> (Place)
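A content model built from commas, |, ?, * and + behaves like a regular expression over child element names. The following is a minimal sketch of that idea, not a DTD validator: the helper names compile_content_model and children_match are hypothetical, only ASCII element names are handled, and #PCDATA, EMPTY and ANY are not supported.

```python
import re

# Simplified element name: ASCII letters, digits, '_', '.', '-' (an assumption;
# real XML names allow many more characters).
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def compile_content_model(model):
    """Turn a content model such as '(To, From, Course*)' into a regex."""
    # White space is insignificant inside a content model.
    compact = re.sub(r"\s+", "", model)
    # Each child name becomes one token ending in ';' so that ?, * and +
    # apply to the whole name, not to its last letter.
    pattern = NAME.sub(lambda m: "(?:" + re.escape(m.group(0)) + ";)", compact)
    # A comma only means "followed by", which is plain concatenation in a regex.
    pattern = pattern.replace(",", "")
    return re.compile(pattern)


def children_match(model, children):
    """Return True if the list of child element names satisfies the model."""
    encoded = "".join(name + ";" for name in children)
    return compile_content_model(model).fullmatch(encoded) is not None


if __name__ == "__main__":
    print(children_match("(To, From, Course*)", ["To", "From"]))
    print(children_match("(Name, Time+, Place?)", ["Name"]))
```

The first call checks a Note with no Course children, which the * allows; the second checks a Course with no Time child, which the + forbids.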
Elements with Content Options • Provide an alternative with symbol | • DTD example: <!ELEMENT n (fn|nn)> • XML example: <n><fn>Duangporn</fn></n> <n><nn>Nok</nn></n>
Elements with Mixed Content • Mixed with children and parsed character data • Declared with symbol |, with #PCDATA listed first and * after the group • DTD example <!ELEMENT MixedE (#PCDATA|n)*> • XML example <MixedE><n><nn>joy</nn></n> is a person</MixedE>
Attribute Declarations • Attribute declarations • Start with the string "<!ATTLIST" • Followed by an element name • Then a list of attribute definitions • The general structure • <!ATTLIST elementName (attName attType default)+>
Attribute Declarations • ID: Identifier (the value must be an XML name, so it cannot start with a digit) • DTD: <!ATTLIST Course id ID #REQUIRED> • XML: <Course id="c168493"> • CDATA: "character data": not parsed • DTD: <!ATTLIST ARTICLE DATE CDATA #REQUIRED> • XML: <ARTICLE DATE="January 1, 2000">
Attribute Defaults • Attributes can have default values • For some attributes, the XML author does not need to specify an attribute value • The processor can supply the default value if it exists • For other attributes, the value must always be specified
Attribute Defaults (Cont.) • Default values • The DTD author specifies the default value • Implied attributes • No default is declared; a missing value is left to the processor • Required attributes • The XML author must specify the attribute value • Fixed attributes • The attribute value is fixed and specified by the DTD author
Default Values • Include the default value, in quotes, after the type or list of allowed values in the attribute list declaration • Examples: • DTD: <!ATTLIST SHIRT SIZE (SMALL|MEDIUM|LARGE) "MEDIUM"> • XML: <SHIRT><color>red</color></SHIRT> (SIZE defaults to MEDIUM)
Impliable Attributes • Allow the user to omit a value for a particular attribute without forcing a particular default • Examples: some shirts are "one size fits all" • DTD: <!ATTLIST SHIRT SIZE NMTOKEN #IMPLIED> • XML: <SHIRT><color>red</color></SHIRT> • Leave the values to be assigned by a processor or to be ignored
Required Attributes • The XML author is required to specify the attribute values • Used when a value for an attribute is important and cannot reliably be defaulted • Examples • DTD: <!ATTLIST Course id ID #REQUIRED> • XML: <Course id="c168493"> (an ID value must start with a letter, underscore, or colon)
Fixed Attributes • Attribute values cannot be overridden at all • For the purpose of easy integration between documents • <!ATTLIST CHAPTER TITLE-LEVEL CDATA #FIXED "FIRST"> • <!ATTLIST SECTION TITLE-LEVEL CDATA #FIXED "SECOND"> • <!ATTLIST SUBSECTION TITLE-LEVEL CDATA #FIXED "THIRD">
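The kinds of attribute defaults described in the preceding slides can be observed with Python's standard-library parser. It does not validate, so a missing #REQUIRED attribute would go unnoticed, but it reads an internal DTD subset and should report default and fixed values as if the author had typed them. This is a sketch under assumptions: the CLOSET element and the FIT and MAKER attributes are invented for illustration and are not part of the slides.

```python
import xml.etree.ElementTree as ET

# SIZE has a default, FIT is implied, MAKER is fixed (hypothetical attributes).
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE CLOSET [
  <!ELEMENT CLOSET (SHIRT*)>
  <!ELEMENT SHIRT (#PCDATA)>
  <!ATTLIST SHIRT
      SIZE   (SMALL|MEDIUM|LARGE) "MEDIUM"
      FIT    NMTOKEN              #IMPLIED
      MAKER  CDATA                #FIXED "KKU">
]>
<CLOSET>
  <SHIRT>red</SHIRT>
  <SHIRT SIZE="LARGE" FIT="slim">blue</SHIRT>
</CLOSET>"""

# The first SHIRT specifies no attributes; the second overrides SIZE and sets FIT.
plain, tailored = ET.fromstring(DOCUMENT).findall("SHIRT")

if __name__ == "__main__":
    print("no attributes written:", plain.attrib)
    print("SIZE and FIT written: ", tailored.attrib)
```

The implied attribute FIT is simply absent from the first element, which matches the idea that the processor neither requires nor supplies a value for it.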
Attribute Types • Attribute value normalization • CDATA and name token attributes • Enumerated and notation attributes • ID and IDREF attributes • ENTITY attributes
Attribute Values and Types • The value of an attribute is not necessarily the exact character string that you enter between the quotation marks • The string first goes through a process called attribute-value normalization • Attribute types apply to the normalized value
Attribute Value Normalization • XML processors normalize values to make authors' lives simpler • Without normalization, XML authors would have to be very careful about where and how they put white space in an attribute value • All XML attribute values are entered as quoted strings
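Normalization Process • 1. Strip off the surrounding quotes • 2. Replace character and entity references • 3. Replace newline characters (and other white space characters) with spaces • 4. For attribute types other than CDATA, trim leading and trailing spaces and collapse runs of spaces into one • Example (NMTOKEN attribute): |" token "| becomes |token|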
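Normalization can be observed with Python's standard-library parser, which reads attribute types from an internal DTD subset. In this sketch the ITEM element and its CODE and NOTE attributes are hypothetical. CODE is declared as NMTOKEN, so its surrounding spaces should be trimmed; NOTE is CDATA, so its line break should become a single space and the &amp; reference should be replaced.

```python
import xml.etree.ElementTree as ET

# CODE is a tokenized type (NMTOKEN); NOTE is plain CDATA.
DOCUMENT = """<!DOCTYPE ITEM [
  <!ELEMENT ITEM EMPTY>
  <!ATTLIST ITEM
      CODE  NMTOKEN #IMPLIED
      NOTE  CDATA   #IMPLIED>
]>
<ITEM CODE="  token  " NOTE="first
second &amp; third"/>"""

item = ET.fromstring(DOCUMENT)

if __name__ == "__main__":
    # repr() makes any remaining leading or trailing spaces visible.
    print(repr(item.get("CODE")))
    print(repr(item.get("NOTE")))
```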
CDATA Attributes • CDATA stands for "character data" • DTD: <!ATTLIST ARTICLE DATE CDATA #REQUIRED> • XML: <ARTICLE DATE="January 1, 2000"> … </ARTICLE>
NMTOKEN Attributes • NMTOKEN: Name Token attributes • Similar to CDATA but restricted to the characters that name tokens allow • Name tokens are strings made up of letters, digits, and a select group of special characters • Period (.), dash (-), underscore (_), and colon (:)
NMTOKEN Examples • DTD: <!ATTLIST TABLE NAME NMTOKEN #REQUIRED FIELDS NMTOKENS #REQUIRED> • XML: <TABLE NAME="SECURITY" FIELDS="USERID PASSWORD DEPARTMENT"> … </TABLE>
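The character restriction on name tokens can be sketched as a simple check. This is an ASCII-only approximation (the real rule also admits many non-ASCII letters), and the function names is_nmtoken and is_nmtokens are hypothetical.

```python
import re

# ASCII approximation of a name token: letters, digits, '.', '-', '_', ':'.
NMTOKEN = re.compile(r"[A-Za-z0-9._:\-]+")


def is_nmtoken(value):
    """True if the (already normalized) value is a single name token."""
    return NMTOKEN.fullmatch(value) is not None


def is_nmtokens(value):
    """True if the value is one or more name tokens separated by spaces."""
    tokens = value.split()
    return bool(tokens) and all(is_nmtoken(token) for token in tokens)


if __name__ == "__main__":
    print(is_nmtoken("SECURITY"))
    print(is_nmtokens("USERID PASSWORD DEPARTMENT"))
    print(is_nmtoken("January 1, 2000"))
```

The last value contains spaces and a comma, so it fits a CDATA attribute such as DATE but not an NMTOKEN attribute.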
Entity Declarations • Allow you to associate a name with a fragment which can be • A piece of common text • A reference to an external file containing either text or binary data • Three kinds of entities • Internal entities • External entities • Parameter entities
Internal Entities • Associate a name with a string of literal text • Examples: • DTD: <!ENTITY KKU "Khon Kaen U."> • XML: &KKU; is in Thailand
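The replacement of an internal entity reference can be seen by parsing a small document with Python's standard-library parser. This is a minimal sketch; the Place element that wraps the text is an assumption added so that the example is a complete document.

```python
import xml.etree.ElementTree as ET

# The entity KKU is declared in the internal DTD subset and referenced in the text.
DOCUMENT = """<!DOCTYPE Place [
  <!ELEMENT Place (#PCDATA)>
  <!ENTITY KKU "Khon Kaen U.">
]>
<Place>&KKU; is in Thailand</Place>"""

place = ET.fromstring(DOCUMENT)

if __name__ == "__main__":
    # The parser hands the application the text with the reference expanded.
    print(place.text)
```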
External Entities • Allow an XML document to refer to the contents of another file • If the other file contains text, • Its content is inserted at the point of reference and parsed as part of the document • If the other file contains binary data, • Its content is not parsed • It may only be referenced in an attribute
External Entities (Cont.) • The other file is a text file <!ENTITY sample SYSTEM "/standard/sample.xml"> • The other file contains binary data, such as figures <!ENTITY logo SYSTEM "/standard/logo.gif" NDATA GIF87A>
Parameter Entities • Can only occur in the DTD • A parameter entity declaration is identified by placing "% " (a percent sign followed by a space) in front of its name in the declaration • The percent sign is also used in references to parameter entities, instead of the ampersand
Parameter Entities (Cont.) • Examples: • Without parameter entities <!ELEMENT mixed (#PCDATA | t)*> <!ELEMENT misc (#PCDATA | t)*> • With parameter entities <!ENTITY % m "#PCDATA | t"> <!ELEMENT mixed (%m;)*> <!ELEMENT misc (%m;)*> • References placed inside a declaration, as here, are allowed only in an external DTD
Notation Declarations • Provide a name for the notation • Which may allow an XML processor to locate an application capable of processing data in the given notation • Example: <!NOTATION GIF87A SYSTEM "GIF.EXE"> <!NOTATION HTML SYSTEM "http://www.w3.org/Markup">
Declaring a DTD • Two ways that a DTD can be declared • Inline in an XML document (Internal DOCTYPE declaration) • As an external reference (External DOCTYPE declaration)
Internal DOCTYPE Declaration <?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE Note [ <!ELEMENT Note (To, From, Course*)> … ]> <Note> … </Note>
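A complete document with an internal DOCTYPE declaration might look like the one embedded in the sketch below, which loads it with Python's standard-library DOM parser. The declarations for Name, Time and Place and all element values are assumptions filled in for illustration. The id value starts with a letter because ID values must be XML names. The parser reads the internal subset but does not validate the document against it.

```python
from xml.dom import minidom

# Bytes are used so that the encoding declaration is honoured by the parser.
DOCUMENT = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE Note [
  <!ELEMENT Note (To, From, Course*)>
  <!ELEMENT To (#PCDATA)>
  <!ELEMENT From (#PCDATA)>
  <!ELEMENT Course (Name, Time+, Place?)>
  <!ELEMENT Name (#PCDATA)>
  <!ELEMENT Time (#PCDATA)>
  <!ELEMENT Place (#PCDATA)>
  <!ATTLIST Course id ID #REQUIRED>
]>
<Note>
  <To>Students</To>
  <From>Instructor</From>
  <Course id="c168493">
    <Name>XML and Web Services</Name>
    <Time>Monday</Time>
  </Course>
</Note>"""

doc = minidom.parseString(DOCUMENT)
root = doc.documentElement
course = root.getElementsByTagName("Course")[0]

if __name__ == "__main__":
    # The name after <!DOCTYPE must match the root element's name.
    print(doc.doctype.name, root.tagName)
    print(course.getAttribute("id"))
```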
External DOCTYPE Declaration <?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE Note SYSTEM "Note.dtd"> <Note> … </Note>