XML - Extensible Markup Language
HTML - Hypertext Markup Language • HTML has a fixed tag set. • Use these tags to describe how information is to be presented. • e.g. <H1>My Header</H1> • Example - next slide
<HTML>
  <HEAD>
    <TITLE>My HTML Table</TITLE>
  </HEAD>
  <BODY>
    <H1><CENTER>My HTML Table</CENTER></H1>
    <Table border>
      <TR> <TH>name</TH> <TH>E-mail address</TH> </TR>
      <TR> <TD>Jessica</TD> <TD>jessica@blue.weeg.uiowa.edu</TD> </TR>
      <TR> <TD>Peter</TD> <TD>peter@icaen.uiowa.edu</TD> </TR>
      <TR> <TD>Alen</TD> <TD>alen@yahoo.com</TD> </TR>
    </Table>
  </BODY>
</HTML>
HTML Example - Advantages • It's fairly readable. • The HTML can be displayed by just about any HTML browser. • However, the meaning of the various pieces of data in the document is lost.
HTML: Not Suitable for Powerful Information Systems • HTML isn't extensible • HTML is very display-centric • HTML isn't usually directly reusable • HTML only provides one 'view' of data • HTML has little or no semantic structure How do we get a language that's roughly as easy to use as HTML but has none of these disadvantages?
XML - Extensible Markup Language • XML, unlike HTML, is really a meta-language for describing markup languages. That is, XML provides a facility to define tags and the structural relationships between these tags. • User-defined tags • Defines "what" is in the data instead of how the data is to be presented. • Why is XML useful? • Neutral source of data (text format, structured) • Easy data format transformation
XML and E-Commerce • XML has the potential to transfer data between enterprises with different computer and database systems. • Fundamental to e-commerce. [Diagram: systems A and B exchange data as XML, each through its own conversion step]
XML Notation • Meaning of data is clearly stated • No display information • Data in tags:
XML Example
<?xml version="1.0"?>
<XML ID="XMLID">
  <class>
    <student ID="0001">
      <name>Jessica</name>
      <email>jessica@blue.weeg.uiowa.edu</email>
    </student>
    <student ID="0002">
      <name>Peter</name>
      <email>peter@icaen.uiowa.edu</email>
    </student>
  </class>
</XML>
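Because the tags state what each piece of data means, a program can pull the values out by name. The following is a minimal sketch in Python, assuming the standard library module xml.etree.ElementTree; the function name read_students is made up for this illustration and is not part of the slides.

```python
import xml.etree.ElementTree as ET

# The class list from the XML Example slide, held in a string.
XML_TEXT = """<?xml version="1.0"?>
<XML ID="XMLID">
  <class>
    <student ID="0001">
      <name>Jessica</name>
      <email>jessica@blue.weeg.uiowa.edu</email>
    </student>
    <student ID="0002">
      <name>Peter</name>
      <email>peter@icaen.uiowa.edu</email>
    </student>
  </class>
</XML>"""


def read_students(xml_text):
    """Return one (ID, name, email) tuple per <student> element."""
    root = ET.fromstring(xml_text)
    students = []
    # iter() visits every <student> element in document order.
    for student in root.iter("student"):
        students.append(
            (student.get("ID"), student.findtext("name"), student.findtext("email"))
        )
    return students


if __name__ == "__main__":
    for student_id, name, email in read_students(XML_TEXT):
        print(student_id, name, email)
```

The same lookup against the HTML table would have to rely on column positions, since the TD tags carry no meaning of their own.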
Creating XML • Can use text editor • Save file as *.xml • Can view in browser or use the data
XML Document must be Well-Formed • No unclosed tags: every start tag must have a corresponding end tag. • No overlapping tags. This, for example, is not well-formed: <Tomato> Let's call <Potato>the whole thing off</Tomato> </Potato> • Attribute values must be enclosed in quotes. • Parsers can check for this. Validating parsers can also check that the structure and number of tags make sense.
XML • Straightforward concept. • Can be used to: • View data in a browser • Alter its appearance with Cascading Style Sheets (CSS) or with Extensible Stylesheet Language (XSL) • Manipulate data • Place data in an application, e.g. a database or spreadsheet
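The well-formedness rules can be tried out with any XML parser. This is a small sketch using Python's standard xml.etree.ElementTree, which raises ParseError on input that is not well-formed; the helper name is_well_formed and the three sample strings are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it reports an error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Overlapping tags, as on the slide: rejected.
overlapping = "<Tomato>Let's call <Potato>the whole thing off</Tomato></Potato>"
# The same content with properly nested tags: accepted.
nested = "<Tomato>Let's call <Potato>the whole thing off</Potato></Tomato>"
# Attribute value without quotes: rejected.
unquoted = "<student ID=0001><name>Jessica</name></student>"

if __name__ == "__main__":
    for sample in (overlapping, nested, unquoted):
        print(is_well_formed(sample), sample)
```

This only checks well-formedness. Checking that the structure matches a DTD or schema needs a validating parser.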
Cascading Style Sheets • Put the formatting information in a file (it can also be included in an HTML file). • Example (see next page for details):
Mylabel_1 {display: block;}
Mylabel_2 {display: block;}
Mylabel_3 {display: block;}
Mylabel_4 {display: block;}
• Save the rules in a file named "mycssadv.css" in the same directory as the XML file. • Add this line to the XML file, below the <?xml version="1.0"?> statement:
<?xml-stylesheet type="text/css" href="mycssadv.css" ?>
class {
  display: block;
  border: 2px solid black;
  padding: 1em;
  background-color: #888833;
  color: #FFFFDD;
  font-family: Times, serif;
  font-style: italic;
  text-align: center;
}
student {
  display: block;
  border: 2px solid black;
  padding: 1em;
  background-color: #008833;
  color: #FFFFFF;
  font-family: Times, serif;
  font-style: italic;
  text-align: center;
}
name {
  display: block;
  border: 0px solid black;
  padding: 1em;
  background-color: #008833;
  color: #FFDDAA;
  font-family: Times, serif;
  font-style: italic;
  text-align: center;
}
email {display: block;}
CSS - Detailed Information http://www.w3.org/TR/REC-CSS1
CSS • Cascading style sheets • styles (like fonts, colors, and so on) for one markup element "cascade" down, and apply to all of the element's contents. • Can be a separate file. • Limited to “styling” a file.
XSL (Extensible Stylesheet Language) XSL can: • Specify display characteristics • Convert XML through querying, sorting and filtering
XML and XSL [Diagram: an XML file and an XSL file feeding the browser]
Example XSL file (using an HTML template)
<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="/">
    <HTML>
      <BODY>
        <TABLE BORDER="2">
          <TR> <TD>Name</TD> <TD>E-mail Address</TD> </TR>
          <xsl:for-each select="XML/class/student">
            <TR>
              <TD><xsl:value-of select="name" /></TD>
              <TD><xsl:value-of select="email" /></TD>
            </TR>
          </xsl:for-each>
        </TABLE>
      </BODY>
    </HTML>
  </xsl:template>
</xsl:stylesheet>
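Advanced Features: Sorting with an XSL file • Modify the for-each line in the XSL file: <xsl:for-each select="XML/class/student" order-by="email"> • Note: the order-by attribute belongs to the early WD-xsl working draft used in this example. XSLT 1.0 uses a nested <xsl:sort select="email"/> element instead.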
XSL: Extensible Stylesheet Language • Can style a file • Can also restructure a file • See example • Builds an HTML page from XML code • XSL's design also includes embedded scripting (JavaScript) - limited implementation of this
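To make the "restructure" idea concrete, here is a plain Python analogue of the example stylesheet. It is not XSL: the Python standard library has no XSLT processor, so this sketch uses xml.etree.ElementTree to build the same table and a sort key to stand in for order-by. The function name students_to_html and its sort_key parameter are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# The class list from the XML Example slide.
XML_TEXT = """<?xml version="1.0"?>
<XML ID="XMLID">
  <class>
    <student ID="0001">
      <name>Jessica</name>
      <email>jessica@blue.weeg.uiowa.edu</email>
    </student>
    <student ID="0002">
      <name>Peter</name>
      <email>peter@icaen.uiowa.edu</email>
    </student>
  </class>
</XML>"""


def students_to_html(xml_text, sort_key="email"):
    """Build an HTML table with one row per student, sorted by a child element."""
    root = ET.fromstring(xml_text)
    # Counterpart of select="XML/class/student" plus order-by:
    # root is the <XML> element, so the path starts at <class>.
    students = sorted(
        root.findall("class/student"),
        key=lambda student: student.findtext(sort_key, default=""),
    )

    html = ET.Element("HTML")
    body = ET.SubElement(html, "BODY")
    table = ET.SubElement(body, "TABLE", BORDER="2")
    header = ET.SubElement(table, "TR")
    ET.SubElement(header, "TD").text = "Name"
    ET.SubElement(header, "TD").text = "E-mail Address"

    # Counterpart of the xsl:for-each body: one table row per student.
    for student in students:
        row = ET.SubElement(table, "TR")
        ET.SubElement(row, "TD").text = student.findtext("name", default="")
        ET.SubElement(row, "TD").text = student.findtext("email", default="")

    return ET.tostring(html, encoding="unicode", method="html")


if __name__ == "__main__":
    print(students_to_html(XML_TEXT))
```

The XML source is untouched. Only the code that produces the view changes, which is the separation of content and presentation the slides argue for.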
Modeling information structure in XML • XML forms a structure. • The Document Object Model (DOM) Level 1 Recommendation describes a set of language-neutral interfaces capable of representing any well-formed XML or HTML document.
DOM • The DOM lets a program or script read and change an XML or HTML document. • Dynamic HTML is an example. • The DOM opens the door to using XML as the lingua franca of data interchange on the Internet, and even within applications.
SAX • One of the easiest ways to process an XML file in Java is the Simple API for XML, or SAX. • SAX is a simple Java interface that many Java parsers implement. • In SAX 1.0, a SAX parser is a class that implements the interface org.xml.sax.Parser; SAX 2.0 deprecates it in favor of org.xml.sax.XMLReader. • The parser reads the document sequentially and calls the methods of user-defined handler classes as it meets each element. Unlike the DOM, it does not build a tree in memory.
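As an illustration of a program using the DOM, the sketch below reads and then extends the class list with Python's xml.dom.minidom, a standard library implementation of the DOM interfaces. The function names student_emails and add_student are assumptions made for this example; the third student is taken from the HTML table slide.

```python
from xml.dom import minidom

# The class list from the XML Example slide.
XML_TEXT = """<?xml version="1.0"?>
<XML ID="XMLID">
  <class>
    <student ID="0001">
      <name>Jessica</name>
      <email>jessica@blue.weeg.uiowa.edu</email>
    </student>
    <student ID="0002">
      <name>Peter</name>
      <email>peter@icaen.uiowa.edu</email>
    </student>
  </class>
</XML>"""


def student_emails(xml_text):
    """Map each student's name to their e-mail address by walking the DOM tree."""
    doc = minidom.parseString(xml_text)
    emails = {}
    for student in doc.getElementsByTagName("student"):
        name_node = student.getElementsByTagName("name")[0]
        email_node = student.getElementsByTagName("email")[0]
        # The text of an element is stored in a child text node.
        emails[name_node.firstChild.data] = email_node.firstChild.data
    return emails


def add_student(xml_text, student_id, name, email):
    """Return new XML text with one more <student> appended to <class>."""
    doc = minidom.parseString(xml_text)
    class_node = doc.getElementsByTagName("class")[0]
    student = doc.createElement("student")
    student.setAttribute("ID", student_id)
    for tag, value in (("name", name), ("email", email)):
        child = doc.createElement(tag)
        child.appendChild(doc.createTextNode(value))
        student.appendChild(child)
    class_node.appendChild(student)
    return doc.toxml()


if __name__ == "__main__":
    updated = add_student(XML_TEXT, "0003", "Alen", "alen@yahoo.com")
    print(student_emails(updated))
```

The whole document is held in memory as a tree, which makes edits like add_student easy at the cost of memory on large files.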
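The handler idea can be sketched without Java. Python's standard xml.sax module follows the same event-driven design as the Java API, so the example below uses it as a stand-in. The class name StudentHandler and the function parse_students are invented for the illustration.

```python
import xml.sax

# The class list from the XML Example slide.
XML_TEXT = """<?xml version="1.0"?>
<XML ID="XMLID">
  <class>
    <student ID="0001">
      <name>Jessica</name>
      <email>jessica@blue.weeg.uiowa.edu</email>
    </student>
    <student ID="0002">
      <name>Peter</name>
      <email>peter@icaen.uiowa.edu</email>
    </student>
  </class>
</XML>"""


class StudentHandler(xml.sax.ContentHandler):
    """User-defined handler: the parser calls these methods as it reads."""

    def __init__(self):
        super().__init__()
        self.students = []
        self._current = None   # dict for the <student> being read
        self._field = None     # "name" or "email" while inside that element
        self._buffer = []      # text pieces collected for the current field

    def startElement(self, name, attrs):
        if name == "student":
            self._current = {"ID": attrs.get("ID")}
        elif name in ("name", "email") and self._current is not None:
            self._field = name
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so collect it until the end tag.
        if self._field is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == self._field:
            self._current[name] = "".join(self._buffer)
            self._field = None
        elif name == "student":
            self.students.append(self._current)
            self._current = None


def parse_students(xml_text):
    """Run the SAX parser over the text and return the collected students."""
    handler = StudentHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.students


if __name__ == "__main__":
    print(parse_students(XML_TEXT))
```

The handler keeps only the state it needs, so this style suits documents too large to load as a DOM tree.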
Using XML [Diagram: systems A and B exchange data as XML, each through its own conversion step or program] • Can present data differently • Can add to database • Can have program manipulate data
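One possible shape for the "add to database" case is sketched below: the class list is parsed and loaded into a SQLite table. The table name student, its columns, and the function load_students are assumptions made for this example, and the in-memory database stands in for whatever system the receiving side runs.

```python
import sqlite3
import xml.etree.ElementTree as ET

# The class list from the XML Example slide.
XML_TEXT = """<?xml version="1.0"?>
<XML ID="XMLID">
  <class>
    <student ID="0001">
      <name>Jessica</name>
      <email>jessica@blue.weeg.uiowa.edu</email>
    </student>
    <student ID="0002">
      <name>Peter</name>
      <email>peter@icaen.uiowa.edu</email>
    </student>
  </class>
</XML>"""


def load_students(xml_text, connection):
    """Insert every <student> into the student table; return the row count."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS student "
        "(id TEXT PRIMARY KEY, name TEXT, email TEXT)"
    )
    root = ET.fromstring(xml_text)
    rows = [
        (s.get("ID"), s.findtext("name"), s.findtext("email"))
        for s in root.iter("student")
    ]
    # Parameterized inserts keep the XML content out of the SQL text.
    connection.executemany("INSERT INTO student VALUES (?, ?, ?)", rows)
    connection.commit()
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_students(XML_TEXT, conn)
    for row in conn.execute("SELECT id, name, email FROM student ORDER BY id"):
        print(row)
```

The sending side needs no knowledge of the table layout. It only has to produce the agreed XML.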
Further XML Information http://msdn.microsoft.com/xml/default.asp
Document Type Definition • While a well-formed document is well-formed because it follows rules defined by the XML spec, a valid document is valid because it matches its document type definition (DTD). • The DTD is the grammar for a markup language, defined by the designer of the markup language. • The DTD specifies: • what elements may exist, • what attributes the elements may have, • what elements may or must be found inside other elements, and in what order.
DTD • The DTD defines the document type. • DTDs currently are being written for an enormous number of different problem domains, and each DTD defines a new markup language. • Determining a DTD is an essential step in using XML for e-commerce
See readings for explanation
<!ELEMENT Recipe (Name, Description?, Ingredients?, Instructions?)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Description (#PCDATA)>
<!ELEMENT Ingredients (Ingredient)*>
<!ELEMENT Ingredient (Qty, Item)>
<!ELEMENT Qty (#PCDATA)>
<!ATTLIST Qty unit CDATA #REQUIRED>
<!ELEMENT Item (#PCDATA)>
<!ATTLIST Item optional CDATA "0"
               isVegetarian CDATA "true">
<!ELEMENT Instructions (Step)+>
DTD • A validating parser can check whether an XML document follows the DTD. • A DTD is associated with an XML document by way of a document type declaration, which appears at the top of the XML file (after the <?xml...?> line). • For example: <!DOCTYPE Recipe SYSTEM "example.dtd">
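To show what a few of the Recipe DTD rules mean in practice, the sketch below checks them by hand in Python. This is not DTD validation: Python's standard library parsers do not validate against a DTD, so the code hard-codes three rules (child order in Recipe, the content of Ingredient, and the required unit attribute on Qty). The sample recipe and the function name check_recipe are made up for the illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical document written to fit the Recipe DTD.
RECIPE_XML = """<Recipe>
  <Name>Pancakes</Name>
  <Description>A simple breakfast.</Description>
  <Ingredients>
    <Ingredient>
      <Qty unit="cup">1</Qty>
      <Item>Flour</Item>
    </Ingredient>
    <Ingredient>
      <Qty unit="cup">1</Qty>
      <Item optional="1">Blueberries</Item>
    </Ingredient>
  </Ingredients>
</Recipe>"""


def check_recipe(xml_text):
    """Return a list of rule violations; an empty list means none were found."""
    root = ET.fromstring(xml_text)
    if root.tag != "Recipe":
        return ["root element must be Recipe"]

    problems = []

    # <!ELEMENT Recipe (Name, Description?, Ingredients?, Instructions?)>
    order = " ".join(child.tag for child in root)
    pattern = r"Name( Description)?( Ingredients)?( Instructions)?"
    if not re.fullmatch(pattern, order):
        problems.append("Recipe children are missing or out of order: " + order)

    # <!ELEMENT Ingredients (Ingredient)*> and <!ELEMENT Ingredient (Qty, Item)>
    for ingredients in root.findall("Ingredients"):
        for child in ingredients:
            if child.tag != "Ingredient":
                problems.append("Ingredients may only contain Ingredient")
            elif [part.tag for part in child] != ["Qty", "Item"]:
                problems.append("Ingredient must contain Qty then Item")

    # <!ATTLIST Qty unit CDATA #REQUIRED>
    for qty in root.iter("Qty"):
        if "unit" not in qty.attrib:
            problems.append("Qty is missing the required unit attribute")

    return problems


if __name__ == "__main__":
    print(check_recipe(RECIPE_XML))
```

A validating parser derives all of these checks from the DTD itself, which is why the slides recommend writing a DTD instead of code like this.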
XML Schema • defines elements that can appear in a document • defines attributes that can appear in a document • defines which elements are child elements • defines the order of child elements • defines the number of child elements • defines whether an element is empty or can include text • defines data types for elements and attributes • defines default and fixed values for elements and attributes From http://www.w3schools.com/schema/schema_intro.asp
XML Schemas are the Successors of DTDs It is thought that very soon XML Schemas will be used in most Web applications as a replacement for DTDs. Here are some reasons: • XML Schemas are extensible to future additions • XML Schemas are richer and more useful than DTDs • XML Schemas are written in XML • XML Schemas support data types • XML Schemas support namespaces From http://www.w3schools.com/schema/schema_intro.asp
XML Schema has Support for Data Types One of the greatest strengths of XML Schemas is the support for data types. With the support for data types: • It is easier to describe permissible document content • It is easier to validate the correctness of data • It is easier to work with data from a database • It is easier to define data facets (restrictions on data) • It is easier to define data patterns (data formats) • It is easier to convert data between different data types From http://www.w3schools.com/schema/schema_intro.asp
XML Schemas Secure Data Communication • When data is sent from a sender to a receiver, it is essential that both parties have the same "expectations" about the content. • With XML Schemas, the sender can describe the data in a way that the receiver will understand. • A date like "03-11-2004" will be interpreted as 3 November in some countries and as 11 March in others, but an XML element with a data type like this: <date type="date">2004-03-11</date> ensures a mutual understanding of the content, because the XML data type date requires the format YYYY-MM-DD. From http://www.w3schools.com/schema/schema_intro.asp
Schemas and DTD • In the short term, DTDs have advantages: • Widespread tool support. All SGML tools and many XML tools can process DTDs. • Widespread deployment. A large number of document types are already defined using DTDs: HTML, XHTML, DocBook, TEI, J2008, CALS, etc. • Widespread expertise and many years of practical application.
Schemas and DTDs • DTDs also have limitations: • They are written in a different (non-XML) syntax. • They have no support for namespaces. • They offer only extremely limited datatyping. There is no facility for describing numbers, dates, currency values, and so forth. Furthermore, DTDs have no ability to express the datatype of character data in elements. • They have a complex and fragile extension mechanism based on little more than string substitution.
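The date ambiguity is easy to reproduce. The short Python sketch below reads the same string under both conventions and then reads the YYYY-MM-DD form, which has a single reading. The variable names are chosen for this illustration only.

```python
from datetime import date, datetime

ambiguous = "03-11-2004"

# Day first, as read in some countries: 3 November 2004.
day_first = datetime.strptime(ambiguous, "%d-%m-%Y").date()

# Month first, as read in others: 11 March 2004.
month_first = datetime.strptime(ambiguous, "%m-%d-%Y").date()

# The YYYY-MM-DD form required by the date type has only one reading.
unambiguous = date.fromisoformat("2004-03-11")

if __name__ == "__main__":
    print(day_first.isoformat(), month_first.isoformat(), unambiguous.isoformat())
```

Both readings of the ambiguous string are valid dates, so neither side would see an error. Fixing the format in the schema removes the guesswork.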
Benefits of Representing Information in XML • XML is at least as readable as HTML and probably more so. • The tags don't have anything to do with how the document is displayed. • Separation of content and presentation is a key concept inherited from SGML. • XML is more versatile than HTML • A lot of the programming is already done for you • If you write a DTD and use a validating parser, much of the error checking for the validity of your input is done by the parser. There's no need to write the parser yourself, since there are so many high-quality parsers available for free.