1.06k likes | 1.07k Views
Learn how to use XSLT and XPath to enhance HTML documents by extracting data from XML documents and customizing the HTML content. Includes examples and step-by-step instructions.
E N D
Using XSLT and XPath to Enhance HTML Documents Roger L. Costello XML Technologies
Acknowledgement • I wish to thank David Jacobs for showing me a new way of looking at HTML and XSLT/XPath • Many of the examples that I use in this tutorial come straight from David's excellent paper, Rescuing XSLT from Niche Status (see http://www.xfront.com)
History XSL (low-precision graphics, e.g.,HTML, text, XML) XQuery (high-precision graphics, e.g., PDF) XLink/ XPointer XSL XSLT XMLSchemas XSLT XPath
Note • For brevity, instead of using the term XSLT/XPath, I will simply call it XSL.
Multiple Output Formats • XSL may be used to generate either HTML, XML, or text XSL XSL Processor XML HTML (or XML or text)
xalan/xt/saxon • xalan: A free XSL processor, implemented in Java, from Apache (http://www.apache.org/) • xt: A free XSL processor, implemented in Java, from James Clark (http://www.jclark.com/) • saxon: A free XSL processor, implemented in Java, from Michael Kay (http://users.iclway.co.uk/mhkay/saxon XML XSL xalan/xt/saxon HTML (or XML or text) Invoking from a DOS command line: run-xalan FitnessCenter.xml FitnessCenter.xsl FitnessCenter.html run-xt FitnessCenter.xml FitnessCenter.xsl FitnessCenter.html run-saxon FitnessCenter.xml FitnessCenter.xsl FitnessCenter.html
Styling XML Documents using IE6 or Netscape7 • Put a stylesheet PI at the top of your XML document. • Now you can simply drop the XML document into the browser and the XML will be automatically styled using the stylesheet referenced in the stylesheet PI. <?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="FitnessCenter.xsl"?> <FitnessCenter> <Member level="platinum"> <Name>Jeff</Name> <Phone type="home">555-1234</Phone> <Phone type="work">555-4321</Phone> <FavoriteColor>lightgrey</FavoriteColor> </Member> </FitnessCenter> Add this stylesheet PI to the top of your XML document
HTML Generation • We will first use XSL to generate HTML documents • When generating HTML, XSL should be viewed as a tool to enhance HTML documents. • That is, the HTML documents may be enhanced by extracting data out of XML documents • XSL provides elements (tags) for extracting the XML data, thus allowing us to enhance HTML documents with data from an XML document
Enhancing HTML Documents with XML Data HTML Document (with embedded XSL elements) XML Document XSL element XSL Processor XML data XML data
Enhancing HTML Documents with the Following XML Data <?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="FitnessCenter.xsl"?> <FitnessCenter> <Member level="platinum"> <Name>Jeff</Name> <Phone type="home">555-1234</Phone> <Phone type="work">555-4321</Phone> <FavoriteColor>lightgrey</FavoriteColor> </Member> </FitnessCenter> FitnessCenter.xml
Embed HTML Document in an XSL Template <?xml version="1.0"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> <xsl:output method="html"/> <xsl:template match="/"> <HTML> <HEAD> <TITLE>Welcome</TITLE> </HEAD> <BODY> Welcome! </BODY> </HTML> </xsl:template> </xsl:stylesheet> Note how we have the HTML document embedded within an XSL template FitnessCenter.xsl (see html-example01)
Note • The HTML is embedded within an XSL template, which is an XML document • Consequently, the HTML must be well formed, i.e., every start tag must have an end tag • Because the HTML is embedded within an XSL template, we are able to add XSL elements to the HTML, allowing us to extract data out of XML documents • Let's customize the HTML welcome page by putting in the member's name. This is achieved by extracting the name from the XML document. We use an XSL element to do this.
Extracting the Member Name <?xml version="1.0"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> <xsl:output method="html"/> <xsl:template match="/"> <HTML> <HEAD> <TITLE>Welcome</TITLE> </HEAD> <BODY> Welcome <xsl:value-of select="/FitnessCenter/Member/Name"/>! </BODY> </HTML> </xsl:template> </xsl:stylesheet> (see html-example02)
Note • Notice how we have enhanced the HTML document by using data from the XML document!
Extracting a Value from an XML Document,Navigating the XML Document • Extracting values: • use the <xsl:value-of select="…"/> XSL element • Navigating: • The slash ("/") indicates parent/child relationship • A slash at the beginning of the path indicates that it is an absolute path, starting from the top of the XML document /FitnessCenter/Member/Name "Start from the top of the XML document, go to the FitnessCenter element, from there go to the Member element, and from there go to the Name element."
Document / PI <?xml version=“1.0”?> Element FitnessCenter Element Member Element Phone Element Name Element FavoriteColor Element Phone Text 555-4321 Text Jeff Text lightgrey Text 555-1234
Extract the FavoriteColor and use it as the bgcolor <?xml version="1.0"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> <xsl:output method="html"/> <xsl:template match="/"> <HTML> <HEAD> <TITLE>Welcome</TITLE> </HEAD> <BODY bgcolor="{/FitnessCenter/Member/FavoriteColor}"> Welcome <xsl:value-of select="/FitnessCenter/Member/Name"/>! </BODY> </HTML> </xsl:template> </xsl:stylesheet> (see html-example03)
Note Attribute values cannot contain "<" nor ">" - Consequently, the following is NOT valid: <Body bgcolor="<xsl:value-of select='/FitnessCenter/Member/FavoriteColor'/>"> To extract the value of an XML element and use it as an attribute value you must use curly braces: <Body bgcolor="{/FitnessCenter/Member/FavoriteColor}"> Evaluate the expression within the curly braces. Assign the value to the attribute.
Extract the Home Phone Number <?xml version="1.0"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> <xsl:output method="html"/> <xsl:template match="/"> <HTML> <HEAD> <TITLE>Welcome</TITLE> </HEAD> <BODY bgcolor="{/FitnessCenter/Member/FavoriteColor}"> Welcome <xsl:value-of select="/FitnessCenter/Member/Name"/>! <BR/> Your home phone number is: <xsl:value-of select="/FitnessCenter/Member/Phone[@type='home']"/> </BODY> </HTML> </xsl:template> </xsl:stylesheet> (see html-example04)
Note In this example we want "the Phone element where the value of its type attribute equals 'home' ": <xsl:value-of select="/FitnessCenter/Member/Phone[@type='home']"/> The expression within […] is called a "predicate". Its purpose is to filter. Note the use of the single quotes within the double quotes. select=" … ' …' …"
Review - HTML Table <table border=“1” width=“100%”> <th> </th> <th> </th> <th> </th> </tr> <tr> <td> </td> <td> </td> <td> </td> </tr> <tr> <td> </td> <td> </td> <td> </td> <tr> </tr> </table> This will create a table with 3 rows - the first row contains a header for each column. The next two rows contains the table data.
<table border=“1” width=“75%”> <tr> <th>Fruit</th> <th>Color</th> </tr> <tr> <td>Papaya</td> <td>Red</td> </tr> <tr> <td>Banana</td> <td>Yellow</td> </tr> </table> Fruit Color Red Papaya Yellow Banana
Create a Table of Phone Numbers • Suppose that a Member has an arbitrary number of phone numbers (home, work, cell, etc). • Create an HTML table comprised of the phone numbers. On each row of the table put the type (home, work, cell, etc) in one column and the actual phone number in the next column.
<?xml version="1.0"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> <xsl:output method="html"/> <xsl:template match="/"> <HTML> <HEAD> <TITLE>Welcome</TITLE> </HEAD> <BODY bgcolor="{/FitnessCenter/Member/FavoriteColor}"> Welcome <xsl:value-of select="/FitnessCenter/Member/Name"/>! <BR/> Your phone numbers are: <TABLE border="1" width="25%"> <TR><TH>Type</TH><TH>Number</TH></TR> <xsl:for-each select="/FitnessCenter/Member/Phone"> <TR> <TD><xsl:value-of select="@type"/></TD> <TD><xsl:value-of select="."/></TD> </TR> </xsl:for-each> </TABLE> </BODY> </HTML> </xsl:template> </xsl:stylesheet> (see html-example05)
Iterating through XML Elements <xsl:for-each select="/FitnessCenter/Member/Phone"> <!- - Within here we are at one of the Phone elements. Thus, in <xsl:value-of select="path", the value for path is relative to where we are in the XML document. The "." refers to the Phone element that we are currently positioned at. - -> </xsl:for-each>
Absolute Path versus Relative Path <xsl:value-of select="/FitnessCenter/Member/Phone[@type='home']"/> This is an absolute xPath expression (we start from the top of the XML tree and navigate down the tree) <xsl:value-of select="@type"/> This is a relative xPath expression (relative to where we currently are located, give me the value of the type attribute) Do Lab1, Parts 1-3
Special Offer to Platinum Members • Let's further enhance our example to provide a special offer to "platinum" members. • We need to check to see if the "level" attribute on the Member element equals "platinum".
<HTML> <HEAD> <TITLE>Welcome</TITLE> </HEAD> <BODY bgcolor="{/FitnessCenter/Member/FavoriteColor}"> Welcome <xsl:value-of select="/FitnessCenter/Member/Name"/>! <BR/> <xsl:if test="/FitnessCenter/Member/@level='platinum'"> Our special offer to platinum members today is ... <BR/> </xsl:if> Your phone numbers are: <TABLE border="1" width="25%"> <TR><TH>Type</TH><TH>Number</TH></TR> <xsl:for-each select="/FitnessCenter/Member/Phone"> <TR> <TD><xsl:value-of select="@type"/></TD> <TD><xsl:value-of select="."/></TD> </TR> </xsl:for-each> </TABLE> </BODY> </HTML> (see html-example06)
Conditional Processing • Use the <xsl:if test="…"/> element to perform conditional processing. Do Lab1, Part 4
Accessing Multiple Parts of the XML Document • Let's enhance the table to contain three columns - the name of the Member, the type of the phone (home, work, cell, etc), and the actual phone number.
<HTML> <HEAD> <TITLE>Welcome</TITLE> </HEAD> <BODY bgcolor="{/FitnessCenter/Member/FavoriteColor}"> Welcome <xsl:value-of select="/FitnessCenter/Member/Name"/>! <BR/> <xsl:if test="/FitnessCenter/Member/@level='platinum'"> Our special offer to platinum members today is ... <BR/> </xsl:if> Your phone numbers are: <TABLE border="1" width="25%"> <TR><TH>Name</TH><TH>Type</TH><TH>Number</TH></TR> <xsl:for-each select="/FitnessCenter/Member/Phone"> <TR> <TD><xsl:value-of select="../Name"/></TD> <TD><xsl:value-of select="@type"/></TD> <TD><xsl:value-of select="."/></TD> </TR> </xsl:for-each> </TABLE> </BODY> </HTML> (see html-example07)
Phone Name Phone 555-4321 555-1234 Jeff Getting the Name when accessing the Phone Member Notice how when in the for-each loop we need to access the Name which is "up and over" with respect to the Phone element Bottom line: we can access elements in other parts of the XML tree via the “../” operator.
Other ways to Access the XML Data Note: Assume that there are multiple Members <xsl:value-of select="/FitnessCenter/Member[1]/Name"/> "Select the Name of the first Member" <xsl:value-of select="/FitnessCenter/Member[position()=1]/Name"/> "Select the Name of the first Member" <xsl:value-of select="/FitnessCenter/Member[last()]/Name"/> "Select the Name of the last Member" <xsl:for-each select="/FitnessCenter/Member[not(position()=last())]"> <!- - Process all Members but the last - -> </xsl:for-each>
Other ways to Access the XML Data (cont.) <xsl:for-each select="/FitnessCenter/Member[position() != last())]"> <!- - Process all Members but the last - -> </xsl:for-each> <xsl:for-each select="/FitnessCenter/Member[position() >1]"> <!- - Process all Members but the first - -> </xsl:for-each> <xsl:for-each select="/FitnessCenter//Name"> <!- - Process all Name elements which have FitnessCenter as an ancestor - -> </xsl:for-each>
Other ways to Access the XML Data (cont.) <!- - Iterate through a list of Member nodes - -> <xsl:for-each select="/FitnessCenter/Member"> <!- - Output the Name of the "current" Member - -> <xsl:value-of select="./Name"/> </xsl:for-each> <!- - Since a specific Member is not specified, all Member nodes - -> <!- - all selected. That is, the Name node within each Member - -> <!-- node is selected. --> <xsl:for-each select="/FitnessCenter/Member/Name"> <xsl:value-of select="."/> </xsl:for-each>
Nodelist This xPath expression: /FitnessCenter/Member selects a list of nodes (a list of Member nodes). This list of nodes is called a "nodelist".
Enhanced XML Document <?xml version="1.0"?> <FitnessCenter> <Member id="1" level="platinum"> <Name>Jeff</Name> <Phone type="home">555-1234</Phone> <Phone type="work">555-4321</Phone> <FavoriteColor>lightgrey</FavoriteColor> </Member> <Member id="2" level="gold"> <Name>David</Name> <Phone type="home">383-1234</Phone> <Phone type="work">383-4321</Phone> <FavoriteColor>lightblue</FavoriteColor> </Member> <Member id="3" level="platinum"> <Name>Roger</Name> <Phone type="home">888-1234</Phone> <Phone type="work">888-4321</Phone> <FavoriteColor>lightyellow</FavoriteColor> </Member> </FitnessCenter> Note that each Member now has a unique id (the id attribute)
Review - HTML Hyperlinking <A name="AnnaAndTheKing"></A> … <A href="#AnnaAndTheKing">Click Here</A> ... This creates an internal hyperlink (the source "anchor" links to the target anchor).
Hyperlink Name to Home Phone • Problem: create an HTML document that has two tables - a Member Name table, and a Member home Phone number table. • Hyperlink the Member's Name to his/her Phone.
<TABLE border="1" width="25%"> <TR><TH>Name</TH></TR> <xsl:for-each select="/FitnessCenter/Member"> <TR> <TD> <A href="#{@id}"> <xsl:value-of select="Name"/> </A> </TD> </TR> </xsl:for-each> </TABLE> <BR/><BR/><BR/><BR/><BR/> <TABLE border="1" width="25%"> <TR><TH>Home Phone Number</TH></TR> <xsl:for-each select="/FitnessCenter/Member"> <TR> <TD> <A name="{@id}"> <xsl:value-of select="Phone[@type='home']"/> </A> </TD> </TR> </xsl:for-each> </TABLE> (see html-example08) Do Lab1, Parts 5-6
Numbering • There is an XSL element that returns a number corresponding to the element's position in the set of selected nodes <xsl:for-each select="/FitnessCenter/Member"> <xsl:number value="position()" format="1"/> <xsl:text>. </xsl:text> <xsl:value-of select="Name"/> <BR/> </xsl:for-each> Output: 1. Jeff 2. David 3. Roger (see html-example09)
Start Numbering from 0 • How would you start the numbering from zero, rather than one? <xsl:number value="position() - 1" format="1">
format attribute of xsl:number • In the previous example we saw how to generate numbers, and we saw that the generated numbers were 1, 2, 3, etc. • With the format attribute we can specify the format of the generated number, i.e., 1, 2, 3 or I, II, III, or A, B, C, or … • format=“1” generates the sequence: 1, 2, 3, … • format=“01” generates: 01, 02, 03, … • format=“A” generates: A, B, C, … • format=“a” generates: a, b, c, … • format=“I” generates: I, II, III, … • format=“i” generates: i, ii, iii, ...
format attribute of xsl:number <xsl:for-each select="/FitnessCenter/Member"> <xsl:number value="position()" format="A"/> <xsl:text>. </xsl:text> <xsl:value-of select="Name"/> <BR/> </xsl:for-each> Output: A. Jeff B. David C. Roger
Sorting • There is an XSL element that sorts the elements that you extract from the XML document <xsl:for-each select="/FitnessCenter/Member"> <xsl:sort select="Name" order="ascending"/> <xsl:value-of select="Name"/> <BR/> </xsl:for-each> Output: David Jeff Roger (see html-example10) "For each Member, sort the Name elements"
Sorting <xsl:for-each select="/FitnessCenter/Member"> <xsl:sort select="Name" order="ascending"/> <xsl:value-of select="Name"/> <BR/> </xsl:for-each> The set of Member elements selected by xsl:for-each is sorted using the Name child element. This occurs prior to the first iteration of the loop. After the set of Member elements are sorted then the looping begins.
concat() function • concat(destination string, string to add) • Note: if you want to concatenate more than one string to the destination string then simply add more arguments
<xsl:for-each select="/FitnessCenter/Member"> <xsl:value-of select="concat('Welcome ', Name, '!')"/> <BR/> </xsl:for-each> Output: Welcome Jeff! Welcome David! Welcome Roger!
xsl:variable • This XSL element allows you to create a variable to hold a value (which could be a string or a subtree of the XML document). • The variable is referenced by $variable-name <xsl:variable name=“hello” select=“'Hello World'”/> This creates a variable called hello, that has a value which is the literal string, ‘Hello World’. We could use this variable as follows: Value = <xsl:value-of select=“$hello”/> This will output: Value = Hello World hello Hello World
Member's Phone Numbers: <TABLE border="1" width="25%"> <TR><TH>Name</TH><TH>Type</TH><TH>Number</TH></TR> <xsl:for-each select="/FitnessCenter/Member"> <xsl:variable name="name" select="Name"/> <xsl:for-each select="Phone"> <TR> <TD><xsl:value-of select="$name"/></TD> <TD><xsl:value-of select="@type"/></TD> <TD><xsl:value-of select="."/></TD> </TR> </xsl:for-each> </xsl:for-each> </TABLE> (see html-example12)