CSE 190: Internet E-Commerce
Lecture 17: XML, XSL
Problem: Communicating data
• Consider: A merchant such as Amazon wants to publish its catalog of books on sale to a third party (e.g. a book club), so that the third party can create a newsletter from the data.
• Question: What format should Amazon put that data in? How is that format communicated to the third party?
Some Approaches
1. Define a binary format based on some structure, and distribute a Word document describing the layout of the fields.
    struct Book {
        char isbn[40];
        char authorFirstName[60];
        char authorLastName[60];
        float salePrice;
        enum SHIPPING_AVAILABILITY { IN_STOCK, SOLD_OUT } availability;
        …
    };
Limitation: Adding a new field means the parsing code must change; programmers need weeks to read the specification and write the parser; and every future application of this kind requires just as much work.
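To make the limitation concrete, here is a minimal Python sketch of a reader for such a fixed binary layout. The exact byte layout (little-endian, 4-byte integer for the availability enum) and the extra "page count" field are assumptions chosen for illustration; the slide does not specify them.

```python
import struct

# Assumed layout mirroring the slide's struct Book: 40-byte ISBN,
# 60-byte first name, 60-byte last name, 4-byte float price,
# 4-byte int for availability (0 = IN_STOCK, 1 = SOLD_OUT).
BOOK_V1 = struct.Struct("<40s60s60sfi")


def _text(raw):
    """Strip the NUL padding of a fixed-width field and decode it."""
    return raw.rstrip(b"\0").decode()


def pack_book(isbn, first, last, price, availability):
    return BOOK_V1.pack(isbn.encode(), first.encode(), last.encode(),
                        price, availability)


def unpack_book(record):
    isbn, first, last, price, availability = BOOK_V1.unpack(record)
    return _text(isbn), _text(first), _text(last), price, availability


record = pack_book("0395977894", "Eric", "Schlosser", 17.5, 0)
print(unpack_book(record))

# Hypothetical version 2 of the format: the producer appends a 4-byte
# page count. The record size changes, so the old reader breaks.
BOOK_V2 = struct.Struct("<40s60s60sfii")
new_record = BOOK_V2.pack(b"0395977894", b"Eric", b"Schlosser", 17.5, 0, 288)

old_parser_failed = False
try:
    BOOK_V1.unpack(new_record)
except struct.error as exc:
    old_parser_failed = True
    print("old parser fails:", exc)
```

Every consumer has to hard-code the same layout string, which is why a single added field forces all of them to change.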
Some Approaches
2. Output the data in HTML, and send a Word document describing the layout of the data. Ex:
    <html><head></head><body>
    <table>
    <thead><tr><th>Title</th><th>ISBN</th><th>Author</th><th>Price</th></tr></thead>
    <tr><td>Fast Food Nation</td><td>234234234</td><td>Eric Schlosser</td><td>17.50</td></tr>
    …
Evaluation:
+ At least we can use an HTML parser to get the data
- Still have to walk the resulting HTML tree to get the data
- Adding new data fields still means writing more parsing code
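The following sketch shows roughly what "walking the HTML tree" looks like in practice, using Python's standard html.parser module. The sample page is the slide's fragment closed off by hand so that it is complete; the class name TableRows is made up for this example.

```python
from html.parser import HTMLParser

# The slide's fragment, completed by hand so the sample is self-contained.
PAGE = """<html><head></head><body>
<table>
<thead><tr><th>Title</th><th>ISBN</th><th>Author</th><th>Price</th></tr></thead>
<tr><td>Fast Food Nation</td><td>234234234</td><td>Eric Schlosser</td><td>17.50</td></tr>
</table>
</body></html>"""


class TableRows(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # text chunks of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


parser = TableRows()
parser.feed(PAGE)
header, *data = parser.rows
# The meaning of each value comes only from its column position.
html_books = [dict(zip(header, row)) for row in data]
print(html_books)
```

The markup says "table cell", not "price" or "author", so the extraction logic depends on presentation details that the publisher is free to change.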
Approach: Common syntax with user-defined tags
• Use a strict syntax for tags, but allow any tag name to be used:
    <booklist>
      <book isbn="0395977894">
        <title>Fast Food Nation</title>
        <author>Eric Schlosser</author>
        <price>17.50</price>
      </book>
      <book isbn="1406066985">
        ….
Evaluation:
+ Directly expresses the data we wanted to communicate
+ May re-use the same parser for many different applications
+ Easy to ignore unfamiliar tags
• Call it XML: Extensible Markup Language
What is XML?
• XML: Used to communicate data to machines.
• HTML: Used to communicate documents to people.
• XML: A simplified subset of SGML (a meta-language for defining markup languages)
• Unlike HTML, you may invent your own tags
• Uses a strict syntax so that XML parsers may be reused for many applications
• Applications can simply ignore tags and attributes they do not recognize
• May be used to separate content from presentation by applying a transformation to the data to create a readable document.
• Written in plain-text files; human-readable, not binary
XML Example
• Show me the list of books on sale.
    <?xml version="1.0" ?>
    <booklist>
      <book isbn="234234234">
        <title>Fast Food Nation</title>
        <author>Eric Schlosser</author>
        <price>17.50</price>
        <in_stock />
      </book>
      <book isbn="456465465">
        <title>Religion Explained</title>
        <author>Pascal Boyer</author>
        <price>24.99</price>
        <out_of_stock />
      </book>
    </booklist>
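As a rough illustration of parser reuse, the book list can be read with Python's general-purpose xml.etree.ElementTree module. The function name read_books and the extra <pages> tag are made up for this sketch; only the document itself comes from the slide.

```python
import xml.etree.ElementTree as ET

BOOKLIST = """<?xml version="1.0" ?>
<booklist>
  <book isbn="234234234">
    <title>Fast Food Nation</title>
    <author>Eric Schlosser</author>
    <price>17.50</price>
    <in_stock />
  </book>
  <book isbn="456465465">
    <title>Religion Explained</title>
    <author>Pascal Boyer</author>
    <price>24.99</price>
    <out_of_stock />
  </book>
</booklist>"""


def read_books(xml_text):
    """Return one dict per <book>, looking fields up by tag name."""
    root = ET.fromstring(xml_text)
    books = []
    for book in root.findall("book"):
        books.append({
            "isbn": book.get("isbn"),
            "title": book.findtext("title"),
            "author": book.findtext("author"),
            "price": float(book.findtext("price")),
            "in_stock": book.find("in_stock") is not None,
        })
    return books


books = read_books(BOOKLIST)
for entry in books:
    print(entry)

# A publisher adds a tag this reader has never seen (hypothetical <pages>).
# Because fields are looked up by name, the reader is unaffected.
EXTENDED = BOOKLIST.replace("<in_stock />",
                            "<in_stock />\n    <pages>288</pages>")
assert read_books(EXTENDED) == books
```

Unlike the binary and HTML approaches, the reader contains no knowledge of field order or byte layout, only of the tag names it cares about.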
XML Example
• Show me the grades of all students.
    <?xml version="1.0"?>
    <studentgrades>
      <student id="A02111111">
        <exam1>15</exam1>
        <exam2>15</exam2>
        <project>16</project>
        <final makeup="yes">14</final>
      </student>
      <student id="A02222222">
        <exam1>15</exam1>
        <exam2>15</exam2>
        <project>16</project>
        <final makeup='no'>15</final>
      </student>
    </studentgrades>
XML Syntax
• Written in plain text (e.g. using Notepad)
• All elements must have a closing tag
  • Ex: <p>Text text text</p>
  • Forbidden: <p>Line One<p>Line Two
• Empty elements may close themselves
  • Ex: <br />
  • Not: <br>text text or <br />text text</br>
• Should start with an XML declaration
  • Ex: <?xml version="1.0"?>
• Has exactly one root element, which contains all the other elements
• All attribute values must be single or double quoted
  • Ex: <boat_ride price="15.00" />
  • Not: <boat_ride price=15.00 />
  • Not: <image_size width=100% />
• Tags are case sensitive
  • Ex: <author>Eric Schlosser</author>
  • Not: <author>Eric Schlosser</Author>
• MIME type: text/xml
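These rules can be checked mechanically. The sketch below feeds the slide's good and bad examples to Python's standard XML parser and reports which ones it accepts; the helper name is_well_formed and the sample labels are made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = {
    "closed element": "<p>Text text text</p>",
    "unclosed element": "<p>Line One<p>Line Two",
    "self-closing empty element": "<note><br /></note>",
    "quoted attribute": '<boat_ride price="15.00" />',
    "unquoted attribute": "<boat_ride price=15.00 />",
    "matching case": "<author>Eric Schlosser</author>",
    "mismatched case": "<author>Eric Schlosser</Author>",
    "two root elements": "<a></a><b></b>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

A conforming parser stops at the first violation rather than guessing what the author meant, which is the main behavioral difference from a typical HTML parser.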
Parsing
• A generic XML parser can parse any well-formed XML document
• Well-formed: XML document follows the XML syntax rules
• Valid: XML document is well-formed and also conforms to its specified DTD (Document Type Definition)
    <?xml version="1.0" ?>
    <!DOCTYPE note SYSTEM "note.dtd">
    <note>
      <to>Manny</to>
      <from>Janni</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>
DTDs
• DTDs: Describe the allowable tags and attributes used in an XML document
    <!DOCTYPE TVSCHEDULE [
    <!ELEMENT TVSCHEDULE (CHANNEL+)>
    <!ELEMENT CHANNEL (BANNER,DAY+)>
    <!ELEMENT BANNER (#PCDATA)>
    <!ELEMENT DAY (DATE,(HOLIDAY|PROGRAMSLOT+))+>
    <!ELEMENT HOLIDAY (#PCDATA)>
    <!ELEMENT DATE (#PCDATA)>
    <!ELEMENT PROGRAMSLOT (TIME,TITLE,DESCRIPTION?)>
    <!ELEMENT TIME (#PCDATA)>
    <!ELEMENT TITLE (#PCDATA)>
    <!ELEMENT DESCRIPTION (#PCDATA)>
    <!ATTLIST TVSCHEDULE NAME CDATA #REQUIRED>
    <!ATTLIST CHANNEL CHAN CDATA #REQUIRED>
    <!ATTLIST PROGRAMSLOT VTR CDATA #IMPLIED>
    <!ATTLIST TITLE RATING CDATA #IMPLIED>
    <!ATTLIST TITLE LANGUAGE CDATA #IMPLIED>
    ]>
• DTD syntax is inherited from SGML; a DTD is not itself written in XML element syntax
Namespaces
• Question: How do we create a document using tags from two different vocabularies (e.g. two different DTDs)?
• We can just combine the two, using the tags from both
• Problem: What if both are using the same tag name to refer to different items?
• Solution: We can specify what namespace a tag is coming from.
• Namespaces: Qualify a tag name with a URI so that names from different vocabularies cannot collide. The URI is only a unique identifier; it does not have to point to a DTD.
• Example:
    <?xml version="1.0"?>
    <amzn:booklist xmlns:amzn="http://www.amazon.com/deals/BL">
      <amzn:book isbn="23423422">
        <amzn:author>Eric Schlosser</amzn:author>
        <loc:category xmlns:loc="http://www.libraryofcongress.org/classify">Food, muckraking</loc:category>
        <amzn:price>17.50</amzn:price>
      </amzn:book>
    </amzn:booklist>
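A short sketch of how a namespace-aware parser sees the example: Python's xml.etree.ElementTree expands each prefixed tag into {namespace-URI}localname, and lookups take a prefix-to-URI mapping. The mapping name NS is made up here; the prefixes used in it are local to the code and need not match those in the document.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<amzn:booklist xmlns:amzn="http://www.amazon.com/deals/BL">
  <amzn:book isbn="23423422">
    <amzn:author>Eric Schlosser</amzn:author>
    <loc:category xmlns:loc="http://www.libraryofcongress.org/classify">Food, muckraking</loc:category>
    <amzn:price>17.50</amzn:price>
  </amzn:book>
</amzn:booklist>"""

# Prefix-to-URI mapping used only for lookups in this script.
NS = {
    "amzn": "http://www.amazon.com/deals/BL",
    "loc": "http://www.libraryofcongress.org/classify",
}

root = ET.fromstring(DOC)
# The parser stores the expanded name, not the prefix.
print(root.tag)

book = root.find("amzn:book", NS)
author = book.findtext("amzn:author", namespaces=NS)
category = book.findtext("loc:category", namespaces=NS)
price = float(book.findtext("amzn:price", namespaces=NS))
# An unprefixed attribute is not in any namespace.
isbn = book.get("isbn")
print(isbn, author, category, price)
```

What identifies a tag is the URI plus the local name, so two documents may use different prefixes for the same namespace and still be read by the same code.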
Viewing XML
• XML documents may be viewed natively in IE, Netscape
  • Ex: testdoc.xml
• What if we'd like to present the data to people in a more readable way?
• Useful for keeping the data document cleanly separated from the way the data is viewed
• XML transformation into a view
  • CSS
  • XSLT
CSS
• CSS: Cascading Style Sheets
• Applied to XML in much the same way as to HTML
• Example (note the tags in the selector):
    CATALOG { background-color: #ffffff; width: 100%; }
    CD { display: block; margin-bottom: 30pt; margin-left: 0; }
    TITLE { color: #FF0000; font-size: 20pt; }
    ARTIST { color: #0000FF; font-size: 20pt; }
    COUNTRY,PRICE,YEAR,COMPANY { display: block; color: #000000; margin-left: 20pt; }
• Limitation: Not all browser versions support XML + CSS display
XSL
• XSL: Extensible Stylesheet Language
• Provides a declarative method for transforming an XML document into a viewable format (usually HTML)
XSL Example
    <?xml version="1.0" ?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/">
        <html>
          <body>
            <table border="1">
              <tr> <th>Title</th> <th>Artist</th> </tr>
              <xsl:for-each select="catalog/cd">
                <tr>
                  <td><xsl:value-of select="title" /></td>
                  <td><xsl:value-of select="artist" /></td>
                </tr>
              </xsl:for-each>
            </table>
          </body>
        </html>
      </xsl:template>
    </xsl:stylesheet>
XSL Elements
• xsl:template
  • Defines the output to produce for nodes matching the XPath pattern in match (e.g. "/" matches the document root)
• xsl:for-each
  • The enclosed content is instantiated once for each node selected by the select XPath expression
• xsl:value-of
  • Is replaced with the text value of the node selected by the select XPath expression
• XPath: A path language for addressing parts of an XML document
  • / -> the document root
  • a/b -> all b elements that are children of an a element
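To show what the stylesheet in the XSL example computes, here is an imperative Python equivalent. It is not XSLT: it only mimics the for-each / value-of steps by hand using the limited XPath support in xml.etree.ElementTree. The two-CD catalog is made-up sample data, and the function name catalog_to_html is hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Made-up sample input matching the shape the stylesheet expects.
CATALOG = """<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist></cd>
  <cd><title>Hide Your Heart</title><artist>Bonnie Tyler</artist></cd>
</catalog>"""


def catalog_to_html(xml_text):
    root = ET.fromstring(xml_text)  # root is the <catalog> element itself
    rows = ["<tr><th>Title</th><th>Artist</th></tr>"]
    # The stylesheet selects "catalog/cd" starting from the document root.
    # ElementTree paths start at the root element, so the path here is "cd".
    for cd in root.findall("cd"):
        # Counterpart of <xsl:value-of select="title" /> and "artist".
        title = escape(cd.findtext("title", default=""))
        artist = escape(cd.findtext("artist", default=""))
        rows.append(f"<tr><td>{title}</td><td>{artist}</td></tr>")
    return ('<html><body><table border="1">'
            + "".join(rows)
            + "</table></body></html>")


page = catalog_to_html(CATALOG)
print(page)
```

The XSL version states the same mapping declaratively, so the view can be changed by swapping the stylesheet without touching any program code.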
XSL, Server-Side
• An XML file may be made human-friendly by naming a style sheet from within the XML document (via an xml-stylesheet processing instruction)
  • Relies on client-side support to transform the XML into HTML
• What if the browser doesn't support this?
  • The web server may do the transformation on the server side, giving the browser the resulting HTML.
  • The server may detect the browser version before deciding which method to use
References • Book: Step by Step XML by Michael Young • URL: http://www.w3schools.com/xml/ • URL: http://www.w3schools.com/xsl/