Universal Database Systems Part 4: Databases and XML
Overview
• Introduction to XML
• DTDs and Schemas for XML Documents
• Languages for XML, in particular XSL
• Querying and Storing XML
• Summary and Outlook
UDBS Part 4 - Winter 2001/2
In Detail
• Extensible Stylesheet Language (XSL)
• XSL components XSLT, XPath, XSL-FO: templates, rules, and patterns for transforming XML documents
• Sample Transformations
Processing XML Documents
• Programs don't know trees!
• However, objects (e.g., in Java) and elements (in XML) are "similar":
  • potentially complex structure
  • stylesheet or XSLT defines "behavior"
• Desirable: alternative representations for XML documents
XSL (Extensible Stylesheet Language)
• XSL is composed of
  • XPath (XML Path Language): language for addressing parts of an XML document
  • XSLT (XSL Transformations): language for the specification of tree transformations
  • XSL-FO (XSL Formatting Objects): language for formatting document contents via areas
XSL Does
• Formatting, or XML-to-HTML transformations
• Also: XML-to-XML transformations
• Hence useful for "queries" or the specification of tree transformations:
  • Input: document and XSL transformation
  • From this, some elements are taken (e.g., selected)
  • Output is another XML document
Transformation Principle
[Figure: a transformation maps a source tree to a result tree]
Application of XSL
[Figure: an XML document, XSL stylesheets 1, 2, and 3, and an XSL processor]
Procedure
[Flowchart with the inputs "source tree" and "XSL program", the steps "read template", "find source node", and "transform source node into result node", the decision "more templates?" (yes/no), and the final steps "format result tree" and "present result tree"]
XSL(T) "Computation"
• Given: a document t
• Start from the root of t in start mode
• If a node u is reached in mode q, the program tries to find a template (rule) with mode q and a pattern fitting u
• If found, the template is executed (output is generated, and nodes are selected for further processing)
• The selected nodes are processed independently
• The documents so derived are collected as final output
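The computation described above can be mimicked in a few lines of Python. This is only a rough sketch under simplifying assumptions, not how an XSLT processor is actually implemented: rules are hypothetical (mode, pattern, template) triples, a template returns its output together with the nodes it selects, and unmatched nodes produce nothing (built-in rules are ignored).

```python
import xml.etree.ElementTree as ET

def process(node, mode, rules):
    """Find the first rule whose mode and pattern fit `node` and execute it."""
    for rule_mode, matches, template in rules:
        if rule_mode == mode and matches(node):
            # Executing the template yields output and selects further nodes.
            output, selected = template(node)
            for next_node, next_mode in selected:
                # Each selected node is processed independently; results are collected.
                output = output + process(next_node, next_mode, rules)
            return output
    return []  # simplification: no built-in rules in this sketch

# Two hypothetical rules: descend from the catalog, then emit each book title.
rules = [
    ("start", lambda n: n.tag == "catalog",
     lambda n: ([], [(child, "start") for child in n])),
    ("start", lambda n: n.tag == "book",
     lambda n: (["<result>" + n.findtext("title") + "</result>"], [])),
]

doc = ET.fromstring(
    "<catalog><book><title>A</title></book><book><title>B</title></book></catalog>"
)
print(process(doc, "start", rules))  # ['<result>A</result>', '<result>B</result>']
```

Unlike real XSLT, the sketch appends the output of selected nodes after the template's own output instead of splicing it in at the position of xsl:apply-templates.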
Sample Template Rules (1/4)
Show all book titles:
Rule 1:
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>
Rule 2:
  <xsl:template match="book/title">
    <result>
      <xsl:value-of select="."/>
    </result>
  </xsl:template>
In each rule, the match attribute is the pattern and the element content is the template.
Sample Template Rules (2/4)
• If the above two rules are part of a stylesheet, then:
  • The first rule matches the root node and triggers matching of further rules.
  • The second rule is executed for every title element under a book element and outputs a result element with the title text.
• Output:
  • title texts inside result elements, and
  • texts of all other elements (authors, ...)
Sample Template Rules (3/4)
• Reason: built-in template rules
• XSLT has built-in rules
  • to continue processing recursively even if no matching rules are explicitly given (thus, the first of the above two rules can be omitted)
  • to copy the text of nodes without (explicit) matching rules through to the output (therefore, additional text in the output)
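To see why the built-in rules let extra text leak into the output, the following Python sketch imitates them by hand. It is an illustration only: the function `transform` and the tiny sample document are assumptions made for this example, and only the explicit book/title rule plus the two built-in behaviors (recurse into children, copy text through) are modeled.

```python
import xml.etree.ElementTree as ET

def transform(node, parent_tag=None):
    """Mimic the explicit book/title rule plus simplified built-in rules."""
    if node.tag == "title" and parent_tag == "book":
        # Explicit rule: wrap the title text in a result element.
        return "<result>" + "".join(node.itertext()) + "</result>"
    # Built-in behavior: copy text through and process children recursively.
    out = node.text or ""
    for child in node:
        out += transform(child, node.tag) + (child.tail or "")
    return out

doc = ET.fromstring(
    "<catalog><book><title>XML</title><author>Smith</author></book></catalog>"
)
print(transform(doc))  # <result>XML</result>Smith
```

The author name appears in the output although no rule mentions author elements, which is exactly the effect described on the slide.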
Sample Template Rules (4/4)
  <xsl:template match="/">
    <xsl:apply-templates select="//book/title"/>
  </xsl:template>
  <xsl:template match="book/title">
    <result>
      <xsl:value-of select="."/>
    </result>
  </xsl:template>
• Outputs only result elements with the text of book titles.
• (Hint concerning xLx: this won't work as the output is not well-formed XML - root missing...)
XSL Fundamentals
• Program = set of template rules
• Template rule = matching pattern [+ mode] + template
• Note: syntax is pure XML (with special tags like xsl:template, xsl:apply-templates, xsl:value-of)
Sample Linear Patterns
book                          book element
book/title                    title element within book
book//first-name              first-name within book at arbitrary depth
//first-name                  first-name at arbitrary depth
*                             any element
/                             the root
/book                         the book element following the root
book/@language                the language attribute of book
book/author[position()=2]     the second author within book
Sample Non-Linear Patterns
book[author/first-name]/title
    title within book, provided there is an author with sub-element first-name
//book[author and date/year]//title[subtitle]
    title of book at arbitrary depth, provided book has an author and a date with sub-element year, and title has a subtitle
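Several of these patterns can be tried out with Python's xml.etree.ElementTree, which supports a limited XPath subset. The sample catalog below is made up for illustration; attribute values are read with `get` because ElementTree paths cannot end in an attribute step, and path-valued predicates such as `[author/first-name]` are outside its subset, so a simpler non-linear pattern is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, not taken from the slides.
catalog = ET.fromstring("""
<catalog>
  <book language="en">
    <title>First</title>
    <author><first-name>Ann</first-name></author>
    <author><first-name>Bob</first-name></author>
  </book>
  <book language="de">
    <title>Second</title>
  </book>
</catalog>
""")

# Linear patterns, evaluated relative to the catalog element.
titles = [t.text for t in catalog.findall("book/title")]           # book/title
first_names = [f.text for f in catalog.findall(".//first-name")]   # //first-name
languages = [b.get("language") for b in catalog.findall("book")]   # book/@language
second_author = catalog.find("book/author[2]/first-name").text     # author[position()=2]

# Non-linear pattern: title within book, provided the book has an author.
titles_with_author = [t.text for t in catalog.findall("book[author]/title")]

print(titles, first_names, languages, second_author, titles_with_author)
```

A full XPath 1.0 engine would be needed for the remaining patterns on the slides.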
Example 1
(a) Find all technical books at e-shopper's ("selection" on the set of books)
(b) Find books and videos from the same year at e-shopper's ("join" of books and videos)
Example 1 (a)
  <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
      <xsl:for-each select="//BOOK[@category]">
        <xsl:if test="@category='technical'">
          <TechnicalBook>
            <xsl:value-of select="TITLE"/>
          </TechnicalBook>
        </xsl:if>
      </xsl:for-each>
    </xsl:template>
  </xsl:stylesheet>
Execution
Sample command:
  saxon -o tb.xml catalog.xml technical-books.xsl
Result:
  <?xml version="1.0" encoding="utf-8"?>
  <TechnicalBook> Database System Concepts</TechnicalBook>
  <TechnicalBook> Data on the Web: From Relations to Semistructured Data and Xml</TechnicalBook>
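For comparison, the same "selection" can be written imperatively. The Python sketch below is an assumption-laden illustration: the catalog fragment is hypothetical (only the BOOK, category, and TITLE names come from the stylesheet), and `technical_books` is a name chosen for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog fragment; the real catalog.xml is not shown on the slides.
catalog = ET.fromstring("""
<BOOKCATALOG>
  <BOOK category="technical"><TITLE>Database System Concepts</TITLE></BOOK>
  <BOOK category="fiction"><TITLE>Some Novel</TITLE></BOOK>
  <BOOK><TITLE>Uncategorized</TITLE></BOOK>
</BOOKCATALOG>
""")

def technical_books(root):
    """Return one TechnicalBook element per BOOK whose category is 'technical'."""
    results = []
    for book in root.iter("BOOK"):                   # like select="//BOOK..."
        if book.get("category") == "technical":      # like test="@category='technical'"
            out = ET.Element("TechnicalBook")
            out.text = book.findtext("TITLE")        # like xsl:value-of select="TITLE"
            results.append(out)
    return results

for element in technical_books(catalog):
    print(ET.tostring(element, encoding="unicode"))
```

The XSLT version states the same filter declaratively; the loop and the condition correspond to xsl:for-each and xsl:if.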
Input Tree
[Figure: input tree with root bookcatalog and book children; node labels include ISBN, author, title, year, edition, person, and publisher]
Result Tree
[Figure: result tree with root <TechnicalBooks> and two TechnicalBook children containing "Database System Concepts" and "Data on the Web"]
XSL Language Elements
• xsl:stylesheet
• xsl:template
• xsl:call-template
• xsl:apply-templates
• xsl:value-of
• xsl:for-each
• xsl:if, xsl:choose, xsl:when, xsl:otherwise
• Many other control structures
Also . . .
• Variables
• Modes
• Named templates
Example 1 (b)
  <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
      <xsl:for-each select="//BOOK[YEAR]">
        <xsl:variable name="book" select="."/>
        <xsl:for-each select="//VIDEO[YEAR=$book/YEAR]">
          <xsl:variable name="video" select="."/>
          <BookAndVideoInYear>
            <Year><xsl:value-of select="$book/YEAR"/></Year>
            <Book><xsl:value-of select="$book/TITLE"/></Book>
            <Video><xsl:value-of select="$video/TITLE"/></Video>
          </BookAndVideoInYear>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:template>
  </xsl:stylesheet>
Execution
Sample command:
  saxon -o sy.xml catalog.xml same-year.xsl
Result:
  <?xml version="1.0" encoding="utf-8"?>
  <BookAndVideoInYear>
    <Year>2000</Year>
    <Book>Fermat's Last Theorem</Book>
    <Video>The Sixth Sense</Video>
  </BookAndVideoInYear>
  <BookAndVideoInYear>
    <Year>1999</Year>
    <Book> Data on the Web: From Relations to Semistructured Data and Xml</Book>
    <Video>Pippi Langstrumpf</Video>
  </BookAndVideoInYear>
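The "join" is a nested loop over books and videos, which the following Python sketch spells out. The data is invented for illustration, `same_year_pairs` is a hypothetical helper, and the sketch assumes at most one YEAR child per item (XPath's `YEAR=$book/YEAR` would compare all YEAR children of both sides).

```python
import xml.etree.ElementTree as ET

# Hypothetical data; element names follow the stylesheet above.
catalog = ET.fromstring("""
<BOOKCATALOG>
  <BOOK><TITLE>Book A</TITLE><YEAR>1999</YEAR></BOOK>
  <BOOK><TITLE>Book B</TITLE><YEAR>2000</YEAR></BOOK>
  <VIDEO><TITLE>Video X</TITLE><YEAR>2000</YEAR></VIDEO>
  <VIDEO><TITLE>Video Y</TITLE><YEAR>1998</YEAR></VIDEO>
</BOOKCATALOG>
""")

def same_year_pairs(root):
    """Nested-loop join of books and videos on equal YEAR values."""
    pairs = []
    for book in root.iter("BOOK"):
        year = book.findtext("YEAR")
        if year is None:            # corresponds to the filter //BOOK[YEAR]
            continue
        for video in root.iter("VIDEO"):
            if video.findtext("YEAR") == year:
                pairs.append((year, book.findtext("TITLE"), video.findtext("TITLE")))
    return pairs

print(same_year_pairs(catalog))  # [('2000', 'Book B', 'Video X')]
```

The outer loop variable plays the role of the `$book` variable, which is what lets the inner selection refer back to the current book.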
e-shopper's heaven
[Figure labels: e-shopper's_heaven.com]
• How can e-shopper's make its product information available to the Web user?
Example 2: XML to HTML
Generate an HTML table (to be viewed in a browser) of books in the following form:
[Figure: browser view of the table, titled "Book Entries"]
Example 2 (cont'd)
  <xsl:template match="/">
    <HTML><HEAD><TITLE>Book Entries</TITLE></HEAD>
    <BODY><xsl:apply-templates/></BODY></HTML>
  </xsl:template>
creates the result document
  <xsl:template match="bookcatalog">
    <TABLE><TBODY><xsl:apply-templates/></TBODY></TABLE>
  </xsl:template>
creates a table skeleton and calls the program recursively with sub-elements
Example 2 (cont'd)
  <xsl:template match="book">
    <TR><xsl:apply-templates select="title"/>
        <xsl:apply-templates select="author"/></TR>
  </xsl:template>
creates a table row
  <xsl:template match="title">
    <TD><xsl:value-of select="."/></TD>
  </xsl:template>
  <xsl:template match="author">
    <TD><xsl:value-of select="."/></TD>
  </xsl:template>
creates a table entry
Complete Program
  <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
      <HTML><HEAD><TITLE>Book Entries</TITLE></HEAD>
      <BODY><xsl:apply-templates/></BODY>
      </HTML>
    </xsl:template>
    <xsl:template match="BOOKCATALOG">
      <TABLE><TBODY><xsl:apply-templates/></TBODY></TABLE>
    </xsl:template>
    <xsl:template match="BOOK">
      <TR><xsl:apply-templates select="TITLE"/>
          <xsl:apply-templates select="AUTHOR"/></TR>
    </xsl:template>
    <xsl:template match="TITLE">
      <TD><xsl:value-of select="."/></TD>
    </xsl:template>
    <xsl:template match="AUTHOR">
      <TD><xsl:value-of select="."/></TD>
    </xsl:template>
  </xsl:stylesheet>
Result
  <HTML><HEAD>
    <TITLE>Book Entries</TITLE></HEAD>
  <BODY>
    <TABLE><TBODY>
      <TR><TD> title1 </TD><TD> author1 </TD> ... </TR>
      <TR><TD> title2 </TD><TD> author2 </TD> ... </TR>
      . . . . .
    </TBODY></TABLE></BODY>
  </HTML>
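The structure of the generated page mirrors the template rules: one function per level of nesting would do the same job. The Python sketch below is a hedged illustration of that correspondence, assuming a BOOKCATALOG root with BOOK children that carry TITLE and AUTHOR elements; `catalog_to_html` and the one-book sample are made up for this example.

```python
import xml.etree.ElementTree as ET

def catalog_to_html(root):
    """Build the HTML skeleton, then one table row per BOOK and one cell per TITLE/AUTHOR."""
    html = ET.Element("HTML")
    head = ET.SubElement(html, "HEAD")
    ET.SubElement(head, "TITLE").text = "Book Entries"
    body = ET.SubElement(html, "BODY")
    tbody = ET.SubElement(ET.SubElement(body, "TABLE"), "TBODY")
    for book in root.findall("BOOK"):                 # rule for BOOK: one TR
        row = ET.SubElement(tbody, "TR")
        for cell in book.findall("TITLE") + book.findall("AUTHOR"):
            # rules for TITLE and AUTHOR: one TD holding the element's text
            ET.SubElement(row, "TD").text = "".join(cell.itertext())
    return ET.tostring(html, encoding="unicode")

sample = ET.fromstring(
    "<BOOKCATALOG><BOOK><TITLE>title1</TITLE><AUTHOR>author1</AUTHOR></BOOK></BOOKCATALOG>"
)
print(catalog_to_html(sample))
```

As in the stylesheet, titles are emitted before authors regardless of their order in the input.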
Example 3
  <!DOCTYPE organization [
    <!ELEMENT organization (group+, topmgr)>
    <!ELEMENT topmgr (employee+)>
    <!ELEMENT group ((mgr, group+) | employee+)>
    <!ELEMENT mgr (employee)>
    <!ATTLIST employee id ID #REQUIRED>
    <!ATTLIST group id ID #REQUIRED>
  ]>
Desired XSLT Program
• Determine pairs (e1, e2) of employees such that e1 is a top manager other than Bill (e1 <> Bill) and a direct or indirect manager of e2
• Pairs are represented as elements <pair> with attributes topmgrID and employeeID
• Idea: compute a join between the list of top managers and the group managers
Example 3: Sample Document
  <organization>
    <group id="HR">
      <mgr><employee id="Bill"/></mgr>
      <group id="HR-prod">
        <mgr><employee id="Edna"/></mgr>
        <group id="HR-prod-empl">
          <employee id="Kate"/><employee id="Ronald"/>
        </group>
      </group>
      <group id="HR-QA">
        <mgr><employee id="John"/></mgr>
        <group id="HR-QA-empl">
          <employee id="Jane"/><employee id="Jake"/>
        </group>
      </group>
    </group>
    <topmgr><employee id="Bill"/><employee id="John"/></topmgr>
  </organization>
Sample Document as Tree
[Figure: the sample document as a tree: topmanager (Bill, John); group HR with manager Bill and the subgroups HR-prod (manager Edna; Kate, Ronald) and HR-QA (manager John; Jane, Jake)]
XSLT Program (1/3)
  <!-- Rule 1a -->
  <xsl:template match="/">
    <xsl:apply-templates mode="start"/>
  </xsl:template>
  <!-- Rule 1b -->
  <xsl:template match="organization" mode="start">
    <result>
      <xsl:apply-templates select="/organization/topmgr/employee" mode="selecttopmgr"/>
    </result>
  </xsl:template>
Idea (1)
• Go into the initial state (Rule 1a), then start with applying Rule 1b to the root (in start mode)
• Select each top manager, i.e., compare pattern /organization/topmgr/employee to the current organization node and select the employee children of topmgr
Idea (2)
• Apply Rule 2 to each selected employee and store its ID e1 in varID
• Verify that e1 <> Bill
• If true, select all employees that are descendants of a group whose manager has the ID e1 (e1 is passed as a parameter)
XSLT Program (2/3)
  <!-- Rule 2 -->
  <xsl:template match="employee" mode="selecttopmgr">
    <xsl:variable name="varID">
      <xsl:value-of select="@id"/>
    </xsl:variable>
    <xsl:if test="$varID != 'Bill'">
      <xsl:apply-templates mode="display"
          select="//group[mgr/employee[@id=$varID]]//employee">
        <xsl:with-param name="varID" select="$varID"/>
      </xsl:apply-templates>
    </xsl:if>
  </xsl:template>
The select expression picks all employees who are descendants of a group whose manager has the same ID as the one stored in varID.
Idea (3)
• For each selected employee e2, create as output a pair of values e1 and e2 for the attributes topmgrID and employeeID
• Note: this program can be improved
XSLT Program (3/3)
  <xsl:template match="employee" mode="display">
    <xsl:param name="varID"/>
    <pair>
      <xsl:attribute name="topmgrID">
        <xsl:value-of select="$varID"/>
      </xsl:attribute>
      <xsl:attribute name="employeeID">
        <xsl:value-of select="@id"/>
      </xsl:attribute>
    </pair>
  </xsl:template>
Discussion
• Modes enable a different behaviour of an XSLT program when the same element is found repeatedly (e.g., to select or to show an employee)
• Variables enable a computation of joins between data values
• Value passing between rules is useful for joins as well as for groupings
Example 3: Result
  <result>
    <pair topmgrID="John" employeeID="Jane"/>
    <pair topmgrID="John" employeeID="Jake"/>
  </result>
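The join behind this result can be traced in Python on the sample document. This is a sketch under one stated assumption: read literally, the step `//employee` in the XSLT program also reaches the manager's own employee element inside mgr, which would add a (John, John) pair; the sketch descends only into subgroups so that its output matches the result shown here. The name `manager_pairs` is hypothetical.

```python
import xml.etree.ElementTree as ET

ORG = """<organization>
<group id="HR">
  <mgr><employee id="Bill"/></mgr>
  <group id="HR-prod">
    <mgr><employee id="Edna"/></mgr>
    <group id="HR-prod-empl"><employee id="Kate"/><employee id="Ronald"/></group>
  </group>
  <group id="HR-QA">
    <mgr><employee id="John"/></mgr>
    <group id="HR-QA-empl"><employee id="Jane"/><employee id="Jake"/></group>
  </group>
</group>
<topmgr><employee id="Bill"/><employee id="John"/></topmgr>
</organization>"""

def manager_pairs(root, excluded="Bill"):
    """Pair each top manager (except `excluded`) with the employees below their group."""
    pairs = []
    for top in root.findall("topmgr/employee"):       # select the top managers
        top_id = top.get("id")
        if top_id == excluded:                        # test: e1 <> Bill
            continue
        for group in root.iter("group"):
            managers = [m.get("id") for m in group.findall("mgr/employee")]
            if top_id in managers:                    # group[mgr/employee[@id=$varID]]
                for sub in group.findall("group"):    # assumption: skip the mgr branch
                    for emp in sub.iter("employee"):
                        pairs.append((top_id, emp.get("id")))
    return pairs

print(manager_pairs(ET.fromstring(ORG)))  # [('John', 'Jane'), ('John', 'Jake')]
```

Bill is filtered out by the inequality test, and Edna never appears as e1 because she is not listed under topmgr.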
Example 4: gcd (Euclid)
  gcd(a, b)
    repeat
      r := a mod b;
      a := b;
      b := r
    until r = 0;
    write(a)
Sample input document:
  <gcd>
    <input>
      <no1>36</no1>
      <no2>21</no2>
    </input>
  </gcd>
Read input and start algorithm:
  <xsl:template match="input">
    <xsl:call-template name="gcd">
      <xsl:with-param name="a" select="no1"/>
      <xsl:with-param name="b" select="no2"/>
    </xsl:call-template>
  </xsl:template>
gcd (Euclid)
  <xsl:template name="gcd">
    <xsl:param name="a"/>
    <xsl:param name="b"/>
    <xsl:variable name="r" select="$a mod $b"/>
    <xsl:choose>
      <xsl:when test="$r = 0">
        <result><xsl:value-of select="$b"/></result>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="gcd">
          <xsl:with-param name="a" select="$b"/>
          <xsl:with-param name="b" select="$r"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
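Because XSLT 1.0 variables cannot be reassigned, the repeat loop of the pseudocode turns into a recursive call of the named template. The Python function below mirrors that recursion step by step; it is a sketch that assumes positive integer inputs.

```python
def gcd(a, b):
    """Recursive Euclid, shaped like the named template: params a and b, variable r."""
    r = a % b                # <xsl:variable name="r" select="$a mod $b"/>
    if r == 0:               # <xsl:when test="$r = 0">
        return b             # output the result
    return gcd(b, r)         # <xsl:otherwise>: call the template again with (b, r)

print(gcd(36, 21))  # 3
```

For the sample input, the calls are gcd(36, 21), gcd(21, 15), gcd(15, 6), and gcd(6, 3), which returns 3.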
Some Operators and Functions
• +, -
• div, *
• mod
• count
• sum
XSL Summary
• XML syntax
• XML output
• Language is "relationally complete"
• The semantics is complex
• Language can be used for querying XML documents
Stylesheets
• For formatting documents
• Use XSLT templates with elements like
  • xsl:import
  • xsl:include
  • xsl:output
  • xsl:preserve-space
XSL-FO
• Formatting is done according to the area model (as in CSS)
• A presentation area is decomposed into areas which can follow each other (block areas) or be nested into each other (inline areas)
• Areas can have attributes, e.g., for determining point size and font, block margins