XML Study-Session: Part II

Validating XML Documents

Objectives
By completing this study-session, you should be able to:
• Validate XML documents against a DTD.
• Understand basic DTD syntax.
• Create simple DTDs of your own.

What is a DTD?
Document Type Definition:
• Standard originally developed for SGML.
• Describes the structure of an XML document, serving as a grammar that specifies which tags and attributes are valid and in what context they are valid.
• E.g. the following is an example DTD statement:
    <!ELEMENT person (name, e-mail*)>

Why use a DTD?
A DTD gives applications a specification to work from: they can construct XML that conforms to it and check that the documents they receive are valid against it. Also:
• Self-documentation
• Portability
• Default values for attributes
• Entity declarations

Using a DTD in an XML document
An XML document may do any of the following:
• Refer to a DTD, using its URI.
• Include a DTD inline as part of the XML document.
• Omit a DTD altogether.
Without a DTD, an XML document can be checked for well-formedness, but not for validity. The DTD used by the XML document may be internal or external. An external DTD is stored as a plain text file, conventionally with a .dtd extension.
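
The difference between well-formedness and validity can be seen in a small sketch. It reuses the person declaration shown earlier; the name and e-mail declarations and the sample values are assumptions added only to make the example complete.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE person [
  <!ELEMENT person (name, e-mail*)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT e-mail (#PCDATA)>
]>
<!-- Well-formed: every tag is closed and properly nested.
     Not valid: the DTD requires name to come before any e-mail. -->
<person>
  <e-mail>lee@example.org</e-mail>
  <name>Harper Lee</name>
</person>
```

A non-validating parser should accept this document, while a validating parser should report the out-of-order children.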

Example: Using a DTD inline
    <?xml version='1.0' encoding='UTF-8'?>
    <!DOCTYPE Book [
      <!ELEMENT Book (Title, Author+, Summary*, Note?)>
      <!ATTLIST Book
        ISBN CDATA #REQUIRED
        section (fiction|nonfiction) 'fiction'>
      <!ELEMENT Title (#PCDATA)>
      <!ELEMENT Author (#PCDATA)>
      <!ELEMENT Summary (#PCDATA)>
      <!ENTITY Description 'A great American novel.'>
    ]>
    <Book ISBN='1234'>
      <Title> To Kill a Mockingbird </Title>
      <Author> Harper Lee </Author>
      <Summary> &Description; </Summary>
    </Book>

Doctype declaration
The Document Type (Doctype) declaration indicates the DTD used for the document. The syntax may take any of the following forms, where [DTD] stands for an inline DTD in square brackets and each URL or public identifier is a quoted string:
• <!DOCTYPE rootname [DTD]>
• <!DOCTYPE rootname SYSTEM "URL">
• <!DOCTYPE rootname SYSTEM "URL" [DTD]>
• <!DOCTYPE rootname PUBLIC "identifier" "URL">
• <!DOCTYPE rootname PUBLIC "identifier" "URL" [DTD]>
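
As an illustration of the PUBLIC form, the sketch below uses the XHTML 1.0 Transitional public identifier and URL that appear later in this session. The page content itself is a made-up minimal example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- rootname = html, followed by the public identifier and the URL -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>Hello.</p></body>
</html>
```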

Example: External DTD
The following is an example of an XML document that uses an external DTD:
    <?xml version='1.0' standalone='no'?>
    <!DOCTYPE Book SYSTEM 'booklist.dtd'>
    <Book ISBN='4576'>
      <Title> Moby Dick </Title>
      <Author> Herman Melville </Author>
    </Book>
Because 'booklist.dtd' is a relative reference, it is resolved against the location of the XML document, so here the external DTD must be located in the same directory as the XML document.

Example: Using DTDs with URLs
The following is an example of an XML document that references an external DTD with a URL:
    <?xml version='1.0' standalone='no'?>
    <!DOCTYPE Book SYSTEM 'http://www.somewebsite.com/booklist.dtd'>
    <Book ISBN='4576'>
      <Title> Moby Dick </Title>
      <Author> Herman Melville </Author>
    </Book>

Specifying Elements
• In the DTD, elements are declared with the notation:
    <!ELEMENT elemName elemDefinitionOrType>
  where elemName is the actual element name, and elemDefinitionOrType indicates whether the content of the element is pure data or a compound of data and other elements.

Some Element Types
• The element type keyword ANY allows the element to contain textual data, nested elements, or any legal XML combination of the two.
• The element type #PCDATA (parsed character data), written in parentheses as (#PCDATA), indicates ordinary textual data that the XML parser processes as usual, for example expanding entity references.
• The element type keyword EMPTY indicates that the element is always empty.
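
The three element types above can be sketched side by side. The element names Note, Para, and Break are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Note [
  <!-- ANY: text, declared child elements, or a mix of both -->
  <!ELEMENT Note ANY>
  <!-- (#PCDATA): character data only, no child elements -->
  <!ELEMENT Para (#PCDATA)>
  <!-- EMPTY: no content at all -->
  <!ELEMENT Break EMPTY>
]>
<Note>Some text <Para>a paragraph</Para> more text <Break/></Note>
```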

Nesting elements
• To define the allowed nestings within a DTD, the following notation is used:
    <!ELEMENT elemName (nestedElem, nestedElem, …)>
  where the order of elements is enforced as a validity constraint within an XML document.
• By default, an element listed without any modifier in the DTD must appear exactly once.
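
A minimal sketch of ordered nesting, assuming a cut-down Book element with only Title and Author children:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Book [
  <!ELEMENT Book (Title, Author)>
  <!ELEMENT Title (#PCDATA)>
  <!ELEMENT Author (#PCDATA)>
]>
<!-- Valid: exactly one Title followed by exactly one Author. -->
<Book>
  <Title>Moby Dick</Title>
  <Author>Herman Melville</Author>
</Book>
<!-- Swapping the two children, omitting one of them, or repeating
     Author would each make the document invalid against this DTD. -->
```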

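Recurrence Operators
Recurrence operators indicate how many times an element may appear at a given position in an XML document:
• ? : zero or one time (optional)
• * : zero or more times
• + : one or more times
• no operator : exactly once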
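
The operators ?, *, and + can be seen at work in the Book content model used earlier in this session. In the sketch below, the Note declaration and the sample book are assumptions added for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Book [
  <!-- one Title, one or more Author, zero or more Summary, optional Note -->
  <!ELEMENT Book (Title, Author+, Summary*, Note?)>
  <!ELEMENT Title (#PCDATA)>
  <!ELEMENT Author (#PCDATA)>
  <!ELEMENT Summary (#PCDATA)>
  <!ELEMENT Note (#PCDATA)>
]>
<!-- Valid: Author+ is satisfied twice over; Summary and Note are omitted. -->
<Book>
  <Title>Good Omens</Title>
  <Author>Terry Pratchett</Author>
  <Author>Neil Gaiman</Author>
</Book>
```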

Grouping elements
• Often, recurrence applies to a block or group of elements rather than to a single element.
• To signify a group, enclose a set of elements within parentheses. Nested parentheses are acceptable.
• A recurrence operator can then be applied to the group.
• E.g.
    <!ELEMENT groupingExample ((group1Elem1, group1Elem2)+, (group2Elem1, group2Elem2)?)+>
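
To make the grouping example concrete, here is one document that should validate against it. Declaring the four child elements as EMPTY is an assumption made only to keep the sketch short.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE groupingExample [
  <!ELEMENT groupingExample ((group1Elem1, group1Elem2)+, (group2Elem1, group2Elem2)?)+>
  <!ELEMENT group1Elem1 EMPTY>
  <!ELEMENT group1Elem2 EMPTY>
  <!ELEMENT group2Elem1 EMPTY>
  <!ELEMENT group2Elem2 EMPTY>
]>
<groupingExample>
  <!-- first pass of the outer group: group 1 twice, then group 2 once -->
  <group1Elem1/><group1Elem2/>
  <group1Elem1/><group1Elem2/>
  <group2Elem1/><group2Elem2/>
  <!-- second pass of the outer group: group 1 once, group 2 omitted -->
  <group1Elem1/><group1Elem2/>
</groupingExample>
```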

Either Or
• In the DTD, an "OR" operator is signified by using |. This allows one thing or the other to occur, and can be used in conjunction with groupings.
• E.g.
    <!ELEMENT aggregateElement (#PCDATA|Element1|Element2)*>
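
A small sketch of | combined with a grouping, using hypothetical Contact, Name, Phone, and Email elements:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Contact [
  <!-- a Name, followed by exactly one of Phone or Email -->
  <!ELEMENT Contact (Name, (Phone | Email))>
  <!ELEMENT Name (#PCDATA)>
  <!ELEMENT Phone (#PCDATA)>
  <!ELEMENT Email (#PCDATA)>
]>
<!-- Valid: Email was chosen. Supplying both Phone and Email,
     or neither, would be invalid. -->
<Contact>
  <Name>Ishmael</Name>
  <Email>ishmael@example.org</Email>
</Contact>
```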

Defining Attributes
• Attribute definitions are in the following form:
    <!ATTLIST enclosingElement attributeName attributeType attributeModifier …>
• The attributeType keyword CDATA allows an attribute to take on any string value, and may represent a comment or additional information about an element.
• Another attribute type is an enumeration, where any of the specified values may be used, but any other value for the attribute results in an invalid document.
• E.g.
    <!ATTLIST elementName attributeName (value1|value2) attributeModifier …>
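
The CDATA and enumeration types can be sketched with the Book attributes from the inline DTD example. Giving Book plain text content here is a simplification assumed for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Book [
  <!ELEMENT Book (#PCDATA)>
  <!ATTLIST Book
    ISBN    CDATA                #REQUIRED
    section (fiction|nonfiction) "fiction">
]>
<!-- Valid: ISBN may be any string; section must be one of the two
     listed values. section="poetry" would make the document invalid. -->
<Book ISBN="1234" section="nonfiction">Silent Spring</Book>
```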

Attribute Modifiers
• We can indicate in the attribute definition whether the attribute is required within an element.
• The three modifier keywords are: #IMPLIED, #REQUIRED, and #FIXED. In place of a keyword, a quoted default value may be given.
• An implied attribute may be given a value, or left unspecified.
• A required attribute must be given a value.
• A fixed attribute has a specified value that can never change. The notation for this is:
    <!ATTLIST elementName attributeName attributeType #FIXED "fixedValue">
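
The three modifiers can be compared in one declaration. The Order element and its attributes below are hypothetical names used only for this sketch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Order [
  <!ELEMENT Order (#PCDATA)>
  <!ATTLIST Order
    id       CDATA #REQUIRED
    comment  CDATA #IMPLIED
    currency CDATA #FIXED "USD">
]>
<!-- Valid: id is supplied, comment is left out, and currency is left
     to take its fixed value. Writing currency="EUR" would be invalid. -->
<Order id="A-17">Two copies of Moby Dick</Order>
```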

Parameter Entities in DTDs
• Parameter entities are entities that can only be used in the DTD.
• A simple internal parameter entity has the format:
    <!ENTITY % name "definition">
• E.g.
    <?xml version='1.0' standalone='yes'?>
    <!DOCTYPE Book [
      <!ENTITY % sum "<!ELEMENT Summary (#PCDATA)>">
      <!ELEMENT Book (Title, Author+, Summary*, Note?)>
      <!ELEMENT Title (#PCDATA)>
      <!ELEMENT Author (#PCDATA)>
      %sum;
    ]>
    …

Parameter Entities in DTDs (contd.)
• External parameter entities can be declared using either of the following:
    <!ENTITY % name SYSTEM "URI">
    <!ENTITY % name PUBLIC "identifier" "URI">
• E.g. the following 'orders.dtd' file could be created:
    <!ENTITY % record "(Name, Date, Orders)">
    <!ELEMENT Store (Customer|Buyer|Supplier)*>
    <!ELEMENT Customer %record;>
    <!ELEMENT Buyer %record;>
    <!ELEMENT Supplier %record;>
    <!ELEMENT Name (#PCDATA)>
    <!ELEMENT Date (#PCDATA)>
    <!ELEMENT Orders (Product|Price)>
    <!ELEMENT Product (#PCDATA)>
    <!ELEMENT Price (#PCDATA)>
    <!ENTITY % XHTML1-t.dtd PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    %XHTML1-t.dtd;
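
A document using this DTD might look like the sketch below. It assumes 'orders.dtd' sits next to the document, and all names, dates, and products are made-up sample data.

```xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE Store SYSTEM "orders.dtd">
<Store>
  <!-- Customer and Supplier both expand %record; to (Name, Date, Orders) -->
  <Customer>
    <Name>A. Reader</Name>
    <Date>2001-05-14</Date>
    <Orders><Product>Moby Dick</Product></Orders>
  </Customer>
  <Supplier>
    <Name>Example Books Ltd.</Name>
    <Date>2001-05-15</Date>
    <Orders><Price>12.50</Price></Orders>
  </Supplier>
</Store>
```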

Using INCLUDE and IGNORE
• We can customize our DTDs using INCLUDE and IGNORE conditional sections, which have the following syntax:
    <![INCLUDE[ DTD sections ]]>
    <![IGNORE[ DTD sections ]]>
• Conditional sections may appear only in an external DTD (or external parameter entity), not in an inline DTD.
• E.g. in the 'orders.dtd' file, add the following lines:
    <!ENTITY % includer "INCLUDE">
    …(same as before)…
    <![%includer;[
      <!ELEMENT Product_ID (#PCDATA)>
      <!ELEMENT Ship_Date (#PCDATA)>
      <!ELEMENT Tax (#PCDATA)>
    ]]>
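
One common way to use these switches is to keep two alternative declarations in the external DTD and let each document pick one. The sketch below is an assumption-laden illustration: the file name 'orders-options.dtd', the entity names draft and final, and the Comment element are all hypothetical.

```xml
<!-- orders-options.dtd (hypothetical external DTD file) -->
<!ENTITY % draft "IGNORE">
<!ENTITY % final "INCLUDE">

<![%draft;[
  <!-- draft documents may also carry a Comment -->
  <!ELEMENT Orders (Product | Price | Comment)>
  <!ELEMENT Comment (#PCDATA)>
]]>
<![%final;[
  <!ELEMENT Orders (Product | Price)>
]]>
<!ELEMENT Product (#PCDATA)>
<!ELEMENT Price (#PCDATA)>

<!-- A document can flip the switches from its inline DTD, because the
     inline declarations are read first and the first declaration of an
     entity wins:
       <!DOCTYPE Orders SYSTEM "orders-options.dtd" [
         <!ENTITY % draft "INCLUDE">
         <!ENTITY % final "IGNORE">
       ]>
-->
```

Only one of the two switches should be set to INCLUDE at a time, since including both sections would declare Orders twice.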

Example: Using the XHTML 1.1 DTD
• The XHTML 1.1 DTD is a DTD driver which includes various XHTML 1.1 modules (i.e. DTD sections) using parameter entities.
• E.g.
    <!ENTITY % xhtml-table.module "INCLUDE">
    <![%xhtml-table.module;[
    <!ENTITY % xhtml-table.mod PUBLIC "-//W3C//ELEMENTS XHTML 1.1 Tables 1.0//EN" "xhtml11-table-1.mod">
    %xhtml-table.mod;]]>
• Setting xhtml-table.module to "INCLUDE" or "IGNORE" lets us customize the XHTML 1.1 DTD to include or exclude support for tables.

Next session: Parsing XML Documents
• Parsing techniques
• Writing your own XML applications
