Learn the basics of XML including DTD, XPath, and XSLT. Understand how to create and manipulate XML documents. Explore the Extensible Markup Language and its limitless potential.
CS 433: Xml, DTD, XPath, & Xslt. Extensible Markup and Beyond. September 26, 2001. Jeff Derstadt
Administration • Due: Friday Sept. 28th • Relational table creation and summary • See course web site for more details • Logging into Egret • Questions?
Overview • Xml • A self-describing, hierarchical data model • DTD • Standardizing schemas for Xml • XPath • How to navigate and query Xml documents • Xslt • How to transform one Xml document into another Xml document
Xml – An Example <class name='CS 433'> <location building='Olin' room='255'/> <professor>Johannes Gehrke</professor> <ta>Jeff</ta> <student_list> <student id='999-991'>John Smith</student> <student id='999-992'>Jane Doe</student> </student_list> </class>
Xml – Extensible Markup Language • Language • A way of communicating information • Markup • Notes or meta-data that describe your data or language • Extensible • Limitless ability to define new languages or data sets
Xml – What’s The Point? • You can include your data and a description of what the data represents • This is useful for defining your own language or protocol • Example: Chemical Markup Language <molecule> <weight>234.5</weight> <Spectra>…</Spectra> <Figures>…</Figures> </molecule>
Xml – Structure • Xml looks like HTML • Xml is a hierarchy of user-defined tags called elements with attributes and data • Data is described by elements, elements are described by attributes <student id='999-991'>John Smith</student> (Diagram labels on this example: open tag, element name, attribute, attribute value, data, closing tag)
Xml – Elements <student id='999-991'>John Smith</student> • Xml is case and space sensitive • Element opening and closing tag names must be identical • Opening tags: "<" + element name + ">" • Closing tags: "</" + element name + ">" • Empty elements have no data and no closing tag: • They begin with a "<" and end with a "/>" <location/>
Xml – Attributes <student id='999-991'>John Smith</student> • Attributes provide additional information for element tags. • There can be zero or more attributes in every element; each one has the form: attribute_name='attribute_value' • By convention there is no space between the name and the "=" (the Xml specification permits white space around "=", but it is rarely written) • Attribute values must be surrounded by " or ' characters • Multiple attributes are separated by white space (one or more spaces, tabs, or line breaks).
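The attribute rules above can be tried out with a short Python sketch. It assumes the standard-library xml.etree.ElementTree parser; the "email" attribute is a hypothetical name used only to show a missing attribute.

```python
import xml.etree.ElementTree as ET

# Both quote styles are accepted; attributes are separated by white space.
location = ET.fromstring("<location building='Olin' room=\"255\"/>")
student = ET.fromstring("<student id='999-991'>John Smith</student>")

print(location.attrib)       # all attributes as a name -> value mapping
print(student.get("id"))     # look up one attribute by name
print(student.get("email"))  # hypothetical attribute that is absent -> None
```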
Xml - Data <student id='999-991'>John Smith</student> • Xml data is any information between an opening and closing tag • Xml data must not contain a literal '<' or '&' character; write them as &lt; and &amp; • A literal '>' is permitted (except in the sequence "]]>") but is usually written as &gt;
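A minimal sketch of escaping data before placing it between tags, assuming Python's standard library. The <note> element and the sample text are made up for illustration.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

raw = "if a < b && b > c"          # sample text containing reserved characters
escaped = escape(raw)              # replaces &, <, and > with entity references
doc = "<note>" + escaped + "</note>"

parsed = ET.fromstring(doc)        # the parser turns the entities back into text
print(escaped)
print(parsed.text)
```

Without the call to escape, the parser would reject the document because of the bare "<" and "&".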
Xml – Nesting & Hierarchy • Xml tags can be nested in a tree hierarchy • Xml documents can have only one root tag • Between an opening and closing tag you can insert: 1. Data 2. More Elements 3. A combination of data and elements <root> <tag1> Some Text <tag2>More</tag2> </tag1> </root>
Xml – Storage • Storage is done just like an n-ary tree (DOM) <root> <tag1> Some Text <tag2>More</tag2> </tag1> </root>
  Node (Type: Element_Node, Name: Element, Value: root)
    Node (Type: Element_Node, Name: Element, Value: tag1)
      Node (Type: Text_Node, Name: Text, Value: Some Text)
      Node (Type: Element_Node, Name: Element, Value: tag2)
        Node (Type: Text_Node, Name: Text, Value: More)
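The tree view can be reproduced with a small Python sketch using the standard-library xml.dom.minidom. The white space between tags is removed here (an assumption made for brevity) so that no whitespace-only text nodes appear; the describe helper is a hypothetical name.

```python
from xml.dom import minidom, Node

doc = minidom.parseString("<root><tag1>Some Text<tag2>More</tag2></tag1></root>")

def describe(node, depth=0, out=None):
    """Collect (depth, node type, value) for every element and text node."""
    out = [] if out is None else out
    if node.nodeType == Node.ELEMENT_NODE:
        out.append((depth, "Element_Node", node.tagName))
    elif node.nodeType == Node.TEXT_NODE:
        out.append((depth, "Text_Node", node.data))
    for child in node.childNodes:
        describe(child, depth + 1, out)
    return out

nodes = describe(doc.documentElement)
for depth, kind, value in nodes:
    print("  " * depth + kind + ": " + value)
```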
Xml vs. Relational Model <Table> <Computer Id='101'> <Speed>800Mhz</Speed> <RAM>256MB</RAM> <HD>40GB</HD> </Computer> <Computer Id='102'> <Speed>933Mhz</Speed> <RAM>512MB</RAM> <HD>40GB</HD> </Computer> </Table> Computer Table
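One way to see the correspondence is to flatten each Computer element into a row. This is only a sketch in Python with the standard library; the column order (Id, Speed, RAM, HD) is an assumption read off the Xml, not taken from the original table.

```python
import xml.etree.ElementTree as ET

xml_text = """<Table>
  <Computer Id='101'><Speed>800Mhz</Speed><RAM>256MB</RAM><HD>40GB</HD></Computer>
  <Computer Id='102'><Speed>933Mhz</Speed><RAM>512MB</RAM><HD>40GB</HD></Computer>
</Table>"""

columns = ("Id", "Speed", "RAM", "HD")  # assumed column order
rows = [
    # attribute -> key column, child elements -> remaining columns
    (c.get("Id"), c.findtext("Speed"), c.findtext("RAM"), c.findtext("HD"))
    for c in ET.fromstring(xml_text).findall("Computer")
]

print(columns)
for row in rows:
    print(row)
```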
DTD – Document Type Definition • A DTD is a schema for Xml data • Xml protocols and languages can be standardized with DTD files • A DTD says what elements and attributes are required or optional • Defines the formal structure of the language
DTD – An Example
<?xml version='1.0'?>
<!ELEMENT Basket (Cherry+, (Apple | Orange)*) >
<!ELEMENT Cherry EMPTY>
<!ATTLIST Cherry flavor CDATA #REQUIRED>
<!ELEMENT Apple EMPTY>
<!ATTLIST Apple color CDATA #REQUIRED>
<!ELEMENT Orange EMPTY>
<!ATTLIST Orange location CDATA 'Florida'>
--------------------------------------------------------------------------------
Valid against this DTD:
<Basket> <Cherry flavor='good'/> <Apple color='red'/> <Apple color='green'/> </Basket>
Invalid against this DTD (Apple comes before Cherry and lacks the required color attribute):
<Basket> <Apple/> <Cherry flavor='good'/> <Orange/> </Basket>
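DTD - !ELEMENT <!ELEMENT Basket (Cherry+, (Apple | Orange)*) > • !ELEMENT declares an element name (here Basket) and what child elements it should have (here the parenthesized list) • Wildcards: • * Zero or more • + One or more • ? Zero or one • Separators: • , Sequence (in this order) • | Choice (one or the other)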
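A content model behaves much like a regular expression over the names of an element's children. The Python sketch below hand-translates the Basket model into a regular expression; it is an illustration of the idea only, not a DTD validator, and it ignores attributes. The names BASKET_MODEL and children_match are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# (Cherry+, (Apple | Orange)*) rewritten over a space-terminated list of child names
BASKET_MODEL = re.compile(r"(Cherry )+((Apple|Orange) )*")

def children_match(basket_xml):
    """Return True if the children of the root follow the Basket content model."""
    basket = ET.fromstring(basket_xml)
    names = "".join(child.tag + " " for child in basket)
    return BASKET_MODEL.fullmatch(names) is not None

good = "<Basket><Cherry flavor='good'/><Apple color='red'/><Apple color='green'/></Basket>"
bad = "<Basket><Apple/><Cherry flavor='good'/><Orange/></Basket>"
print(children_match(good), children_match(bad))
```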
DTD - !ATTLIST <!ATTLIST Cherry flavor CDATA #REQUIRED> <!ATTLIST Orange location CDATA #REQUIRED color CDATA 'orange'> • !ATTLIST defines a list of attributes for an element • Attributes can be of different types, can be required or not required, and they can have default values. • In the first declaration: element = Cherry, attribute = flavor, type = CDATA, flag = #REQUIRED
DTD – Well-Formed and Valid
<?xml version='1.0'?>
<!ELEMENT Basket (Cherry+)>
<!ELEMENT Cherry EMPTY>
<!ATTLIST Cherry flavor CDATA #REQUIRED>
--------------------------------------------------------------------------------
Not Well-Formed:
<basket> <Cherry flavor=good> </Basket>
Well-Formed but Invalid:
<Job> <Location>Home</Location> </Job>
Well-Formed and Valid:
<Basket> <Cherry flavor='good'/> </Basket>
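Well-formedness can be checked with any Xml parser. A minimal Python sketch, assuming the standard-library xml.etree.ElementTree: it reports only whether the text parses. It does not check validity against the DTD, because ElementTree is a non-validating parser; a validating parser would be needed for that step. The helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the text parses as Xml, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "not well-formed": "<basket> <Cherry flavor=good> </Basket>",
    "well-formed but invalid": "<Job> <Location>Home</Location> </Job>",
    "well-formed and valid": "<Basket> <Cherry flavor='good'/> </Basket>",
}
results = {label: is_well_formed(text) for label, text in samples.items()}
print(results)
```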
XPath – Navigating Xml • When Xml is stored in a tree, XPath allows you to navigate to different nodes: <Class> <Student>Jeff</Student> <Student>Pat</Student> </Class> Tree: Class has children Student (Text: Jeff) and Student (Text: Pat)
XPath – Navigating Xml • An XPath expression is similar to a file path, but it can select more than one node: //Class/Student <Class> <Student>Jeff</Student> <Student>Pat</Student> </Class> Tree: Class has children Student (Text: Jeff) and Student (Text: Pat); both Student nodes are selected
XPath – Navigating Xml • An XPath expression looks just like a file path • Elements are accessed as /<element>/ • Attributes are accessed as @attribute • Everything that satisfies the path is selected • You can add constraints in brackets [ ] to further refine your selection
XPath – Navigating Xml
<class name='CS 433'>
  <location building='Olin' room='255'/>
  <professor>Johannes Gehrke</professor>
  <ta>Dan Kifer </ta>
  <student_list>
    <student id='999-991'>John Smith</student>
    <student id='999-992'>Jane Doe</student>
  </student_list>
</class>
//class[@name='CS 433']/student_list/student/@id
• Starting element: //class • Attribute constraint: [@name='CS 433'] • Element path: /student_list/student • Selection: /@id
Selection result: the attribute nodes containing 999-991 and 999-992
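The same selection can be approximated in Python. This sketch assumes the standard-library xml.etree.ElementTree, which supports only a subset of XPath and cannot return attribute nodes, so the constraint on the root element and the final /@id step are written as ordinary Python.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""<class name='CS 433'>
  <location building='Olin' room='255'/>
  <professor>Johannes Gehrke</professor>
  <ta>Dan Kifer</ta>
  <student_list>
    <student id='999-991'>John Smith</student>
    <student id='999-992'>Jane Doe</student>
  </student_list>
</class>""")

ids = []
if root.tag == "class" and root.get("name") == "CS 433":  # //class[@name='CS 433']
    for student in root.findall("student_list/student"):  # /student_list/student
        ids.append(student.get("id"))                      # /@id

print(ids)
```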
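XPath - Context • Context – your current focus in an Xml document • Use: //<root>/… when you want a path that does not depend on the current context: a leading "/" starts at the top of the Xml document, and "//" matches the named element at any depth below it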
XPath - Context XPath: List/Student Tree: Class has children Prof (Text: Gehrke), Location (Attr: Olin), and List; List has children Student (Text: Jeff) and Student (Text: Pat)
XPath - Context XPath: Student Tree: Class has children Prof (Text: Gehrke), Location (Attr: Olin), and List; List has children Student (Text: Jeff) and Student (Text: Pat)
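The effect of context on a relative path can be sketched in Python with the standard-library xml.etree.ElementTree. The Xml below is reconstructed from the tree on these slides; the attribute name "building" is an assumption, since the tree shows only the value Olin.

```python
import xml.etree.ElementTree as ET

cls = ET.fromstring(
    "<Class><Prof>Gehrke</Prof><Location building='Olin'/>"
    "<List><Student>Jeff</Student><Student>Pat</Student></List></Class>"
)
lst = cls.find("List")

# Same students, reached by different relative paths from different contexts.
from_class = [s.text for s in cls.findall("List/Student")]  # context: Class
from_list = [s.text for s in lst.findall("Student")]        # context: List

# The short path selects nothing when the context is Class,
# because Class has no direct Student children.
wrong_context = cls.findall("Student")

print(from_class, from_list, wrong_context)
```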
XPath – Examples <Basket> <Cherry flavor='sweet'/> <Cherry flavor='bitter'/> <Cherry/> <Apple color='red'/> <Apple color='red'/> <Apple color='green'/> … </Basket> Select all of the red apples: //Basket/Apple[@color='red']
XPath – Examples <Basket> <Cherry flavor='sweet'/> <Cherry flavor='bitter'/> <Cherry/> <Apple color='red'/> <Apple color='red'/> <Apple color='green'/> … </Basket> Select the cherries that have some flavor: //Basket/Cherry[@flavor]
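Both constraints can be tried in Python. This is a sketch assuming the standard-library xml.etree.ElementTree, whose XPath subset supports the [@attr] and [@attr='value'] predicates. The elided part of the basket ("…") is simply left out, and the paths are written relative to the Basket root.

```python
import xml.etree.ElementTree as ET

basket = ET.fromstring(
    "<Basket>"
    "<Cherry flavor='sweet'/><Cherry flavor='bitter'/><Cherry/>"
    "<Apple color='red'/><Apple color='red'/><Apple color='green'/>"
    "</Basket>"
)

red_apples = basket.findall("Apple[@color='red']")  # attribute equals a value
flavored = basket.findall("Cherry[@flavor]")        # attribute merely exists

print(len(red_apples))
print([c.get("flavor") for c in flavored])
```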
XPath – Examples <orchard> <tree> <apple color='red'/> <apple color='red'/> </tree> <basket> <apple color='green'/> <orange/> </basket> </orchard> Select all the apples in the orchard: //orchard/descendant::apple (equivalently, //orchard//apple)
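A short Python sketch of the descendant selection, assuming the standard-library xml.etree.ElementTree, where ".//apple" plays the role of descendant::apple relative to the orchard element.

```python
import xml.etree.ElementTree as ET

orchard = ET.fromstring(
    "<orchard>"
    "<tree><apple color='red'/><apple color='red'/></tree>"
    "<basket><apple color='green'/><orange/></basket>"
    "</orchard>"
)

all_apples = orchard.findall(".//apple")  # apples at any depth
direct_apples = orchard.findall("apple")  # only direct children of orchard

colors = [a.get("color") for a in all_apples]
print(colors, len(direct_apples))
```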
Xslt – Transforming Xml Amazon.com order form: <single_book_order> <title>Databases</title> <qty>1</qty> </single_book_order> Supplier's order form: <form7957> <purchase item='book' property='title' value='Databases' quantity='1'/> </form7957>
Xslt - Extensible Stylesheet Language Transformations • Xslt is a language for transforming or converting one Xml format into another Xml format. • Benefits: • No need to parse or interpret many different Xml formats – they can all be transformed to a single format to facilitate interpretation • Language looks like Xml! (remember, Xml defines languages!)
Xslt – A First Look
Input:
<single_book_order> <title>Databases</title> <qty>1</qty> </single_book_order>
Output:
<form7957> <purchase item='book' property='title' value='Databases' quantity='1'/> </form7957>
Stylesheet:
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
  <xsl:template match='single_book_order'>
    <form7957><purchase item='book' property='title' value='{title}' quantity='{qty}'/></form7957>
  </xsl:template>
</xsl:stylesheet>
Xslt – Header • Xslt stylesheets MUST include this body: <?xml version='1.0'?> <xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'> … </xsl:stylesheet>
Xslt – Templates • Xslt stylesheets are a collection of templates • Templates are like functions • The body of a template is the output of a transformation
Xslt - Templates • You define a template with the <xsl:template match=''> instruction • You call a template with the <xsl:apply-templates select=''> instruction 1. All elements or attributes that satisfy the select attribute expression are selected. 2. For each element or attribute that is selected: i. A matching template is found in the stylesheet. ii. The body of the template is executed.
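The select-then-match steps can be imitated in plain Python. This is not Xslt and does not use an Xslt engine; it is a sketch, assuming the standard-library xml.etree.ElementTree, that re-expresses the single_book_order template from the first-look slide as a function. TEMPLATES and apply_templates are hypothetical names, and nodes without a matching template are simply skipped, which is a simplification.

```python
import xml.etree.ElementTree as ET

def single_book_order(node):
    """Body of the template with match='single_book_order'."""
    form = ET.Element("form7957")
    ET.SubElement(form, "purchase", {
        "item": "book",
        "property": "title",
        "value": node.findtext("title"),   # {title}
        "quantity": node.findtext("qty"),  # {qty}
    })
    return form

TEMPLATES = {"single_book_order": single_book_order}  # match pattern -> body

def apply_templates(selected):
    """For each selected node, find the matching template and run its body."""
    return [TEMPLATES[n.tag](n) for n in selected if n.tag in TEMPLATES]

order = ET.fromstring(
    "<single_book_order><title>Databases</title><qty>1</qty></single_book_order>"
)
result = apply_templates([order])[0]
print(ET.tostring(result, encoding="unicode"))
```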
Xslt – choose Instruction • The <xsl:choose> instruction is similar to a C++ or Java switch statement • The <xsl:when test=''> instruction is similar to the case statement • The <xsl:otherwise> instruction is similar to the default statement
Xslt – choose Example
Original Xml:
<customer> <order id='5'> <item><title>Database Management Systems</title></item> </order> </customer>
Xslt Stylesheet:
<xsl:template match='customer'>          <!-- FUNCTION -->
  <xsl:choose>                           <!-- SWITCH -->
    <xsl:when test='order/@id'>          <!-- CASE -->
      <single_book_order> <title><xsl:value-of select='order/item/title'/></title> </single_book_order>
    </xsl:when>
    <xsl:otherwise>                      <!-- DEFAULT -->
      <single_book_order><fail/></single_book_order>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
Output Xml:
<single_book_order><title>Database Management Systems</title></single_book_order>
Xslt – choose Example 2
Original Xml:
<customer> <order> <item><title>Database Management Systems</title></item> </order> </customer>
Xslt Stylesheet:
<xsl:template match='customer'>          <!-- FUNCTION -->
  <xsl:choose>                           <!-- SWITCH -->
    <xsl:when test='order/@id'>          <!-- CASE -->
      <single_book_order> <title><xsl:value-of select='order/item/title'/></title> </single_book_order>
    </xsl:when>
    <xsl:otherwise>                      <!-- DEFAULT -->
      <single_book_order><fail/></single_book_order>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
Output Xml:
<single_book_order><fail/></single_book_order>
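The branching in the two choose examples can be mirrored in plain Python. This is a re-expression of the template's logic with the standard-library xml.etree.ElementTree, not an Xslt run; transform_customer is a hypothetical name.

```python
import xml.etree.ElementTree as ET

def transform_customer(customer):
    """Mimic the 'customer' template: when / otherwise become if / else."""
    out = ET.Element("single_book_order")
    if customer.find("order[@id]") is not None:  # xsl:when test='order/@id'
        ET.SubElement(out, "title").text = customer.findtext("order/item/title")
    else:                                        # xsl:otherwise
        ET.SubElement(out, "fail")
    return out

with_id = ET.fromstring(
    "<customer><order id='5'><item><title>Database Management Systems</title>"
    "</item></order></customer>"
)
without_id = ET.fromstring(
    "<customer><order><item><title>Database Management Systems</title>"
    "</item></order></customer>"
)

first = transform_customer(with_id)
second = transform_customer(without_id)
print(ET.tostring(first, encoding="unicode"))
print(ET.tostring(second, encoding="unicode"))
```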
Xslt – for-each Instruction • The <xsl:for-each select='item'> instruction is similar to a foreach iterator or a for loop • The select attribute selects a set of elements from an Xml document
Xslt – if Instruction • The <xsl:if test=''> instruction is similar to an if statement in Java or C++ • The test attribute is the if condition: • True • statement is true • test returns an element or attribute. • False • statement is false • test returns nothing • There is no 'else', so use the <xsl:choose> instruction in this situation.
Xslt – for-each and if Example
Original Xml:
<basket> <apple color='red' condition='yummy'/> <apple color='green' condition='wormy'/> <apple color='red' condition='crisp'/> </basket>
Xslt Stylesheet:
<xsl:template match='basket'>                      <!-- FUNCTION -->
  <condition_report>
    <xsl:for-each select='apple'>                  <!-- FOR LOOP -->
      <xsl:if test="contains(@color, 'red')">      <!-- IF -->
        <condition><xsl:value-of select='@condition'/></condition>
      </xsl:if>
    </xsl:for-each>
  </condition_report>
</xsl:template>
Output Xml:
<condition_report> <condition>yummy</condition> <condition>crisp</condition> </condition_report>
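The loop and the test map directly onto a Python for and if. As with the other sketches, this is a plain-Python imitation using the standard-library xml.etree.ElementTree rather than an Xslt engine; the function name condition_report is borrowed from the output element.

```python
import xml.etree.ElementTree as ET

def condition_report(basket):
    """Mimic the 'basket' template: for-each over apples, if on the color."""
    report = ET.Element("condition_report")
    for apple in basket.findall("apple"):    # xsl:for-each select='apple'
        if "red" in apple.get("color", ""):  # xsl:if test="contains(@color, 'red')"
            ET.SubElement(report, "condition").text = apple.get("condition")
    return report

basket = ET.fromstring(
    "<basket>"
    "<apple color='red' condition='yummy'/>"
    "<apple color='green' condition='wormy'/>"
    "<apple color='red' condition='crisp'/>"
    "</basket>"
)
report = condition_report(basket)
conditions = [c.text for c in report.findall("condition")]
print(conditions)
```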
Xslt – Other Information • W3C is standardizing XPath and Xslt: http://www.w3.org/TR/xslt.html http://www.w3.org/TR/xpath.html • Lots of books. Here's a suggestion: D. Martin et al. Professional Xml. Wrox Press, 2000.
What’s Next? • XSchema • DTDs, but written in XML • Will replace DTDs • XQuery • Fully declarative XML query language • Will be able to do anything you can do with XPath and XSLT, plus a LOT more
Xml in Commercial Databases • Many Xml parsers and XSLT engines are available • Microsoft, IBM, and Oracle (among others) are adding native Xml support • Native Xml databases
URL Tutorials http://msdn.microsoft.com/xml/tutorial/default.asp http://www.ils.unc.edu/~kempa/inls259/xml/ http://www.geocities.com/SiliconValley/Peaks/5957/10minxml.html