Explore the essentials of XML transformation, including XSL components, querying/storing XML, and creating diverse output formats for databases. Learn the principles and patterns of XSL through detailed examples and sample transformations. Understand the powerful capabilities of XSLT and XSL-FO to format XML documents efficiently and effectively.
Universal Database Systems Part 4: Databases and XML
Overview
• Introduction to XML
• DTDs and Schemas for XML Documents
• Languages for XML, in particular XSL
• Querying and Storing XML
• Summary and Outlook
UDBS Part 4 - Winter 2001/2
In Detail
• Extensible Stylesheet Language (XSL)
• XSL components XSLT, XPath, XSL-FO: templates, rules, and patterns for transforming XML documents
• Sample transformations
Processing XML Documents
• Programs don't know trees!
• However, objects (e.g., in Java) and elements (in XML) are "similar":
  • potentially complex structure
  • a stylesheet or XSLT program defines the "behavior"
• Desirable: alternative representations for XML documents
XSL (Extensible Stylesheet Language)
• XSL is composed of
  • XPath (XML Path Language): language for addressing parts of an XML document
  • XSLT (XSL Transformations): language for the specification of tree transformations
  • XSL-FO (XSL Formatting Objects): language for formatting document contents via areas
XSL Does
• Formatting, or XML-to-HTML transformations
• Also: XML-to-XML transformations
• Hence useful for "queries" or the specification of tree transformations:
  • Input: document and XSL transformation
  • From this, some elements are taken (e.g., selected)
  • Output is another XML document
Principle
[Figure: a transformation maps the source tree to the result tree.]
Application of XSL
[Figure: an XML document and XSL stylesheets 1, 2, and 3 as inputs to an XSL processor.]
Procedure
[Flowchart with the boxes: source tree and XSL program; read template; find source node; transform source node into result node; more templates? (yes / no); format result tree; present result tree.]
XSL(T) "Computation"
• Given: a document t
• Start from the root of t in start mode
• If a node u is reached in mode q, the program tries to find a template (rule) with mode q and a pattern matching u
• If found, the template is executed (output is generated, and nodes are selected for further processing)
• The selected nodes are processed independently
• The documents so derived are collected as final output
Sample Template Rules (1/4)
Show all book titles:
Rule 1:
<xsl:template match="/">
  <xsl:apply-templates/>
</xsl:template>
Rule 2:
<xsl:template match="book/title">
  <result>
    <xsl:value-of select="."/>
  </result>
</xsl:template>
In each rule, the value of the match attribute is the pattern and the content of xsl:template is the template.
Sample Template Rules (2/4)
• If the above two rules are part of a stylesheet, then:
  • The first rule matches the root node and triggers matching of further rules.
  • The second rule is executed for every title element under a book element and outputs a result element with the title text.
• Output:
  • Title texts inside result elements, and
  • texts of all other elements (authors, ...)
Sample Template Rules (3/4)
• Reason: built-in template rules
• XSLT has built-in rules
  • to continue processing recursively even if no matching rules are explicitly given (thus, the first of the above two rules can be omitted)
  • to copy the text of nodes without (explicit) matching rules through to the output (therefore, additional text in the output)
Sample Template Rules (4/4)
<xsl:template match="/">
  <xsl:apply-templates select="//book/title"/>
</xsl:template>
<xsl:template match="book/title">
  <result>
    <xsl:value-of select="."/>
  </result>
</xsl:template>
• Outputs only result elements with the text of book titles.
• (Hint concerning xLx: this won't work as the output is not well-formed XML - root missing...)
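The effect of the built-in rules can be mimicked outside XSLT. The following is a minimal Python sketch, not an XSLT processor: it hard-codes the book/title rule and emulates the two built-in rules (recurse into elements, copy text through). The input document DOC and the function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document; element names follow the book/title example.
DOC = "<catalog><book><title>T1</title><author>A1</author></book></catalog>"


def apply_templates(node, parent_tag=None):
    """Emulate the book/title rule plus the built-in rules for one element."""
    # Explicit rule (match="book/title"): wrap the title text in <result>.
    if node.tag == "title" and parent_tag == "book":
        return "<result>" + "".join(node.itertext()) + "</result>"
    # Built-in rules: copy text through and recurse into child elements.
    out = node.text or ""
    for child in node:
        out += apply_templates(child, node.tag)
        out += child.tail or ""
    return out


def titles_only(root):
    """Emulate select="//book/title": only title elements are visited."""
    return "".join(apply_templates(t, "book") for t in root.findall(".//book/title"))


root = ET.fromstring(DOC)
print(apply_templates(root))  # <result>T1</result>A1
print(titles_only(root))      # <result>T1</result>
```

The first call shows the extra author text that the built-in text rule lets through; the second shows that selecting the title nodes explicitly avoids it.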
XSL Fundamentals
• Program = set of template rules
• Template rule = matching pattern [+ mode] + template
• Note: syntax is pure XML (with special tags like xsl:template, xsl:apply-templates, xsl:value-of)
Sample Linear Patterns
book                         book element
book/title                   title element within book
book//first-name             first-name within book at arbitrary depth
//first-name                 first-name at arbitrary depth
*                            any element
/                            the root
/book                        the book element following the root
book/@language               the language attribute of book
book/author[position()=2]    the second author within book
Sample Non-Linear Patterns
book[author/first-name]/title
  title within book, provided there is an author with sub-element first-name
//book[author and date/year]//title[subtitle]
  title of book at arbitrary depth, provided book has an author and a date with sub-element year, and title has a subtitle
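Several of these patterns can be tried out with the limited XPath subset of Python's standard xml.etree.ElementTree module. This is only a sketch: the document DOC is invented for illustration, paths are evaluated relative to the enclosing catalog element, and the attribute pattern and the non-linear pattern are written as plain Python because ElementTree does not support them directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented to exercise the patterns.
DOC = """
<catalog>
  <book language="en">
    <title>T1</title>
    <author><first-name>Ann</first-name></author>
    <author><first-name>Bob</first-name></author>
  </book>
  <book language="de">
    <title>T2</title>
    <author><last-name>Cole</last-name></author>
  </book>
</catalog>
"""
root = ET.fromstring(DOC)  # context node: the element that contains the books

# book/title: title elements within book
titles = [t.text for t in root.findall("book/title")]
# book//first-name: first-name within book at arbitrary depth
first_names = [f.text for f in root.findall("book//first-name")]
# book/author[position()=2]: ElementTree writes the position predicate as [2]
second_authors = [a.findtext("first-name") for a in root.findall("book/author[2]")]
# book/@language: attributes are read with get(), not selected by a path
languages = [b.get("language") for b in root.findall("book")]
# book[author/first-name]/title: the non-linear condition as an explicit filter
filtered_titles = [
    b.findtext("title")
    for b in root.findall("book")
    if b.find("author/first-name") is not None
]

print(titles, first_names, second_authors, languages, filtered_titles)
```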
Example 1
(a) Find all technical books at e-shopper's ("selection" on the set of books)
(b) Find books and videos from the same year at e-shopper's ("join" of books and videos)
Example 1 (a)
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:for-each select="//BOOK[@category]">
      <xsl:if test="@category='technical'">
        <TechnicalBook>
          <xsl:value-of select="TITLE"/>
        </TechnicalBook>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
Execution
Sample command:
saxon -o tb.xml catalog.xml technical-books.xsl
Result:
<?xml version="1.0" encoding="utf-8"?>
<TechnicalBook>Database System Concepts</TechnicalBook>
<TechnicalBook>Data on the Web: From Relations to Semistructured Data and Xml</TechnicalBook>
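For comparison, the same "selection" can be written procedurally. The sketch below uses Python's standard xml.etree.ElementTree; the catalog content is a small stand-in (the real catalog.xml is not shown in these slides, and the category of the third book is invented), and the wrapper element TechnicalBooks is taken from the result tree slide.

```python
import xml.etree.ElementTree as ET

# Stand-in for catalog.xml; structure and the third entry are assumptions.
CATALOG = """
<BOOKCATALOG>
  <BOOK category="technical"><TITLE>Database System Concepts</TITLE></BOOK>
  <BOOK category="technical"><TITLE>Data on the Web</TITLE></BOOK>
  <BOOK category="other"><TITLE>Fermat's Last Theorem</TITLE></BOOK>
</BOOKCATALOG>
"""


def technical_books(catalog_xml):
    """Return a TechnicalBooks element with one child per technical book."""
    root = ET.fromstring(catalog_xml)
    result = ET.Element("TechnicalBooks")
    for book in root.iter("BOOK"):                 # like //BOOK
        if book.get("category") == "technical":    # like @category='technical'
            ET.SubElement(result, "TechnicalBook").text = book.findtext("TITLE")
    return result


selected = [b.text for b in technical_books(CATALOG)]
print(selected)
```

Wrapping the hits in a single root element keeps the output well-formed, which the raw saxon output above is not.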
Input Tree
[Figure: tree rooted at bookcatalog with book children; the node labels shown include ISBN, author, title, year, edition, person, and publisher.]
Result Tree
[Figure: root <TechnicalBooks> with two TechnicalBook children containing "Database System Concepts" and "Data on the Web".]
XSL Language Elements
• xsl:stylesheet
• xsl:template
• xsl:call-template
• xsl:apply-templates
• xsl:value-of
• xsl:for-each
• xsl:if, xsl:choose, xsl:when, xsl:otherwise
• Many other control structures
Also . . .
• Variables
• Modes
• Named templates
Example 1 (b)
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:for-each select="//BOOK[YEAR]">
      <xsl:variable name="book" select="."/>
      <xsl:for-each select="//VIDEO[YEAR=$book/YEAR]">
        <xsl:variable name="video" select="."/>
        <BookAndVideoInYear>
          <Year><xsl:value-of select="$book/YEAR"/></Year>
          <Book><xsl:value-of select="$book/TITLE"/></Book>
          <Video><xsl:value-of select="$video/TITLE"/></Video>
        </BookAndVideoInYear>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
Execution
Sample command:
saxon -o sy.xml catalog.xml same-year.xsl
Result:
<?xml version="1.0" encoding="utf-8"?>
<BookAndVideoInYear>
  <Year>2000</Year>
  <Book>Fermat's Last Theorem</Book>
  <Video>The Sixth Sense</Video>
</BookAndVideoInYear>
<BookAndVideoInYear>
  <Year>1999</Year>
  <Book>Data on the Web: From Relations to Semistructured Data and Xml</Book>
  <Video>Pippi Langstrumpf</Video>
</BookAndVideoInYear>
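The nested xsl:for-each is a nested-loop join on YEAR. A rough Python equivalent, assuming a stand-in catalog whose layout and "Unmatched Book" entry are invented for illustration, could look like this:

```python
import xml.etree.ElementTree as ET

# Stand-in catalog; titles and years of the matched entries follow the result
# shown on the slide, the unmatched book is a made-up control entry.
CATALOG = """
<CATALOG>
  <BOOK><TITLE>Fermat's Last Theorem</TITLE><YEAR>2000</YEAR></BOOK>
  <BOOK><TITLE>Data on the Web</TITLE><YEAR>1999</YEAR></BOOK>
  <BOOK><TITLE>Unmatched Book</TITLE><YEAR>1990</YEAR></BOOK>
  <VIDEO><TITLE>The Sixth Sense</TITLE><YEAR>2000</YEAR></VIDEO>
  <VIDEO><TITLE>Pippi Langstrumpf</TITLE><YEAR>1999</YEAR></VIDEO>
</CATALOG>
"""


def same_year_pairs(catalog_xml):
    """Nested-loop join of books and videos on their YEAR text."""
    root = ET.fromstring(catalog_xml)
    pairs = []
    for book in root.iter("BOOK"):
        year = book.findtext("YEAR")
        if year is None:                          # like the filter //BOOK[YEAR]
            continue
        for video in root.iter("VIDEO"):
            if video.findtext("YEAR") == year:    # like YEAR=$book/YEAR
                pairs.append((year, book.findtext("TITLE"), video.findtext("TITLE")))
    return pairs


for pair in same_year_pairs(CATALOG):
    print(pair)
```

The variable year plays the role of $book: it keeps the outer value available inside the inner loop, where the context has moved on to the video.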
e-shopper's heaven
[Figure: e-shopper's_heaven.com]
• How can e-shopper's make its product information available to the Web user?
Example 2: XML to HTML
Generate an HTML table (to be viewed in a browser) of books in the following form:
Book Entries
Example 2 (cont'd)
<xsl:template match="/">
  <HTML><HEAD><TITLE>Book Entries</TITLE></HEAD>
  <BODY><xsl:apply-templates/></BODY></HTML>
</xsl:template>
creates the result document
<xsl:template match="BOOKCATALOG">
  <TABLE><TBODY><xsl:apply-templates/></TBODY></TABLE>
</xsl:template>
creates a table skeleton and calls the program recursively with sub-elements
Example 2 (cont'd)
<xsl:template match="BOOK">
  <TR><xsl:apply-templates select="TITLE"/>
      <xsl:apply-templates select="AUTHOR"/></TR>
</xsl:template>
creates a table row
<xsl:template match="TITLE">
  <TD><xsl:value-of select="."/></TD>
</xsl:template>
<xsl:template match="AUTHOR">
  <TD><xsl:value-of select="."/></TD>
</xsl:template>
each creates a table entry
Complete Program
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <HTML><HEAD><TITLE>Book Entries</TITLE></HEAD>
    <BODY><xsl:apply-templates/></BODY>
    </HTML>
  </xsl:template>
  <xsl:template match="BOOKCATALOG">
    <TABLE><TBODY><xsl:apply-templates/></TBODY></TABLE>
  </xsl:template>
  <xsl:template match="BOOK">
    <TR><xsl:apply-templates select="TITLE"/>
        <xsl:apply-templates select="AUTHOR"/></TR>
  </xsl:template>
  <xsl:template match="TITLE"><TD><xsl:value-of select="."/></TD></xsl:template>
  <xsl:template match="AUTHOR"><TD><xsl:value-of select="."/></TD></xsl:template>
</xsl:stylesheet>
Result
<HTML><HEAD>
<TITLE>Book Entries</TITLE></HEAD><BODY>
<TABLE><TBODY>
<TR><TD> title1 </TD><TD> author1 </TD> ... </TR>
<TR><TD> title2 </TD><TD> author2 </TD> ... </TR>
. . . . .
</TBODY></TABLE></BODY>
</HTML>
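The structure of the program (one rule per nesting level: document, table, row, cell) carries over directly to a procedural version. The following Python sketch builds the same kind of table with the standard xml.etree.ElementTree module; the catalog with its placeholder titles and authors is an assumption, as is the function name book_table.

```python
import xml.etree.ElementTree as ET

# Placeholder catalog; element names follow the uppercase names of the program.
CATALOG = """
<BOOKCATALOG>
  <BOOK><TITLE>title1</TITLE><AUTHOR>author1</AUTHOR></BOOK>
  <BOOK><TITLE>title2</TITLE><AUTHOR>author2</AUTHOR></BOOK>
</BOOKCATALOG>
"""


def book_table(catalog_xml):
    """Build the HTML page: one TR per BOOK, one TD per TITLE and AUTHOR."""
    catalog = ET.fromstring(catalog_xml)
    html = ET.Element("HTML")                      # rule for "/"
    head = ET.SubElement(html, "HEAD")
    ET.SubElement(head, "TITLE").text = "Book Entries"
    body = ET.SubElement(html, "BODY")
    table = ET.SubElement(body, "TABLE")           # rule for BOOKCATALOG
    tbody = ET.SubElement(table, "TBODY")
    for book in catalog.findall("BOOK"):           # rule for BOOK
        row = ET.SubElement(tbody, "TR")
        for cell in book.findall("TITLE") + book.findall("AUTHOR"):
            ET.SubElement(row, "TD").text = cell.text   # rules for TITLE, AUTHOR
    return ET.tostring(html, encoding="unicode")


page = book_table(CATALOG)
print(page)
```

Unlike the stylesheet, this version visits only BOOK children of the catalog, so text of other elements cannot leak into the table through built-in rules.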
Example 3
<!DOCTYPE organization [
  <!ELEMENT organization (group+, topmgr)>
  <!ELEMENT topmgr (employee+)>
  <!ELEMENT group ((mgr, group+) | employee+)>
  <!ELEMENT mgr (employee)>
  <!ATTLIST employee id ID #REQUIRED>
  <!ATTLIST group id ID #REQUIRED>
]>
Desired XSLT Program
• Determine pairs (e1, e2) of employees such that e1 is a top manager other than Bill and a direct or indirect manager of e2
• Pairs are represented as an element <pair> with attributes topmgrID and employeeID
• Idea: compute a join between the list of top managers and the group managers
Example 3: Sample Document
<organization>
  <group id="HR">
    <mgr><employee id="Bill"/></mgr>
    <group id="HR-prod">
      <mgr><employee id="Edna"/></mgr>
      <group id="HR-prod-empl">
        <employee id="Kate"/><employee id="Ronald"/>
      </group>
    </group>
    <group id="HR-QA">
      <mgr><employee id="John"/></mgr>
      <group id="HR-QA-empl">
        <employee id="Jane"/><employee id="Jake"/>
      </group>
    </group>
  </group>
  <topmgr><employee id="Bill"/><employee id="John"/></topmgr>
</organization>
Sample Document as Tree
[Figure: the sample document as a tree. The top managers are Bill and John; group HR has manager Bill and the subgroups HR-prod (manager Edna; employees Kate, Ronald) and HR-QA (manager John; employees Jane, Jake).]
XSLT Program (1/3)
<xsl:template match="/">
  <xsl:apply-templates mode="start"/>
</xsl:template>
<xsl:template match="organization" mode="start">
  <result>
    <xsl:apply-templates select="/organization/topmgr/employee" mode="selecttopmgr"/>
  </result>
</xsl:template>
Idea (1)
• Go into the initial state (Rule 1a), then start with applying Rule 1b to the root (in start mode)
• Select each top manager, i.e., compare pattern /organization/topmgr/employee to the current organization node and select the employee children of topmgr
Idea (2)
• Apply Rule 2 to each selected employee and store its ID e1 in varID
• Verify that e1 is not Bill
• If true, select all employees below a group whose manager has the ID e1 (e1 is passed as parameter)
XSLT Program (2/3)
<xsl:template match="employee" mode="selecttopmgr">
  <xsl:variable name="varID">
    <xsl:value-of select="@id"/>
  </xsl:variable>
  <xsl:if test="$varID != 'Bill'">
    <xsl:apply-templates mode="display" select="//group[mgr/employee[@id=$varID]]//employee">
      <xsl:with-param name="varID" select="$varID"/>
    </xsl:apply-templates>
  </xsl:if>
</xsl:template>
The select expression picks all employees that are descendants of a group whose manager has the same ID as the one stored in varID.
Idea (3)
• For each selected employee e2 create as output a pair of values e1 and e2 for attributes topmgrID and employeeID
• Note: this program can be improved
XSLT Program (3/3)
<xsl:template match="employee" mode="display">
  <xsl:param name="varID"/>
  <pair>
    <xsl:attribute name="topmgrID">
      <xsl:value-of select="$varID"/>
    </xsl:attribute>
    <xsl:attribute name="employeeID">
      <xsl:value-of select="@id"/>
    </xsl:attribute>
  </pair>
</xsl:template>
Discussion
• Modes enable a different behaviour of an XSLT program when the same element is found repeatedly (e.g., to select or to show an employee)
• Variables enable a computation of joins between data values
• Value passing between rules is useful for joins as well as for groupings
Example 3: Result
<result>
  <pair topmgrID="John" employeeID="Jane"/>
  <pair topmgrID="John" employeeID="Jake"/>
</result>
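The join of Example 3 can also be sketched in Python over the sample document. This is an illustration of the stated intent (e1 manages e2), not a translation of the stylesheet: it looks only at employees in the subgroups of a managed group and therefore skips the manager's own mgr/employee entry, which the expression //group[...]//employee would, it appears, select as well. The function name manager_pairs and its parameter excluded are hypothetical.

```python
import xml.etree.ElementTree as ET

# The sample document of Example 3.
ORGANIZATION = """
<organization>
  <group id="HR">
    <mgr><employee id="Bill"/></mgr>
    <group id="HR-prod">
      <mgr><employee id="Edna"/></mgr>
      <group id="HR-prod-empl">
        <employee id="Kate"/><employee id="Ronald"/>
      </group>
    </group>
    <group id="HR-QA">
      <mgr><employee id="John"/></mgr>
      <group id="HR-QA-empl">
        <employee id="Jane"/><employee id="Jake"/>
      </group>
    </group>
  </group>
  <topmgr><employee id="Bill"/><employee id="John"/></topmgr>
</organization>
"""


def manager_pairs(org_xml, excluded="Bill"):
    """Pairs (e1, e2): e1 is a top manager other than `excluded` who manages e2."""
    root = ET.fromstring(org_xml)
    pairs = []
    for top in root.findall("topmgr/employee"):    # mode "selecttopmgr"
        e1 = top.get("id")
        if e1 == excluded:
            continue
        for group in root.iter("group"):           # join with the group managers
            mgr = group.find("mgr/employee")
            if mgr is None or mgr.get("id") != e1:
                continue
            for sub in group.findall("group"):     # subgroups only, not mgr itself
                for emp in sub.iter("employee"):   # mode "display"
                    pairs.append((e1, emp.get("id")))
    return pairs


print(manager_pairs(ORGANIZATION))
```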
Example 4: gcd (Euclid)
gcd(a,b)
  repeat
    r := a mod b;
    a := b;
    b := r
  until r = 0;
  write(a)
Sample input document:
<gcd>
  <input>
    <no1>36</no1>
    <no2>21</no2>
  </input>
</gcd>
Read input and start algorithm:
<xsl:template match="input">
  <xsl:call-template name="gcd">
    <xsl:with-param name="a" select="no1"></xsl:with-param>
    <xsl:with-param name="b" select="no2"></xsl:with-param>
  </xsl:call-template>
</xsl:template>
gcd (Euclid)
<xsl:template name="gcd">
  <xsl:param name="a"/>
  <xsl:param name="b"/>
  <xsl:variable name="r" select="$a mod $b"/>
  <xsl:choose>
    <xsl:when test="$r = 0">
      <result><xsl:value-of select="$b"/></result>
    </xsl:when>
    <xsl:otherwise>
      <xsl:call-template name="gcd">
        <xsl:with-param name="a" select="$b"></xsl:with-param>
        <xsl:with-param name="b" select="$r"></xsl:with-param>
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
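Because XSLT variables cannot be reassigned, the repeat loop of the pseudocode becomes a recursive call of the named template. The same recursion, sketched in Python with the sample input document (the function names gcd and gcd_from_document are chosen for illustration):

```python
import xml.etree.ElementTree as ET

INPUT = "<gcd><input><no1>36</no1><no2>21</no2></input></gcd>"


def gcd(a, b):
    """Recursive Euclid, mirroring the named template: no variable is updated."""
    r = a % b
    if r == 0:
        return b            # the xsl:when branch
    return gcd(b, r)        # the xsl:otherwise branch


def gcd_from_document(xml_text):
    """Read no1 and no2 from the input element and start the algorithm."""
    root = ET.fromstring(xml_text)
    return gcd(int(root.findtext("input/no1")), int(root.findtext("input/no2")))


print(gcd_from_document(INPUT))  # 3
```

For the sample input the calls are gcd(36, 21), gcd(21, 15), gcd(15, 6), and gcd(6, 3), which returns 3.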
Some Functions
• +, -
• div, *
• mod
• count
• sum
XSL Summary
• XML syntax
• XML output
• Language is "relationally complete"
• The semantics is complex
• Language can be used for querying XML documents
Stylesheets
• For formatting documents
• Use XSLT templates with elements like
  • xsl:import
  • xsl:include
  • xsl:output
  • xsl:preserve-space
XSL-FO
• Formatting is done according to the area model (as in CSS)
• A presentation area is decomposed into areas which can follow each other (block areas) or be nested into each other (inline areas)
• Areas can have attributes, e.g., for determining point size and font, block margins