XML (I) COSC 643 Sungchul Hong
XML • eXtensible Markup Language • XML offers a combination of flexibility, simplicity, and readability by both humans and machines. • Used both as a data format and for converting between formats • Descended from the Standard Generalized Markup Language (SGML)
XML and HTML • HTML • A presentation-based markup language • Has a fixed set of tags • XML • A domain-based markup language • Has no fixed set of tags • Extensible
XML and HTML • SGML descendants • XML • HTML • HTML’s syntax has always been looser and more forgiving • Cascading Style Sheets Level 1 specification
MSXML3 • http://www.microsoft.com/msdownload • msxml3sp2Setup.exe • Install • Testing
test.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="test.xsl"?>
<document>
  <message>It worked!</message>
</document>
test.xsl
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <h1><xsl:value-of select="//message"/></h1>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
CSS • <HTML> • <HEAD><TITLE>Formatting with CSS, modifying standard HTML</TITLE> • <STYLE TYPE="text/css"> • H1 {font-family: Arial, Helvetica; font-weight: bold; font-size: 24pt} • EM {font-weight: bold; font-style: normal} • CITE {font-style: italic} • VAR {font-family: Courier; font-weight: bold} • CODE {font-family: Courier} • LI {font-family: Arial, Helvetica} • </STYLE> • </HEAD>
CSS • <BODY> • <H1>Introduction to HTML</H1> • <P>This page has been created purely with logical tags. No additional formatting has been specified by the designers.</P> • <P>While it might be nice to specify text like we could in QuarkXPress, we'll settle for applying <EM>emphasis</EM> where appropriate, <CITE>citations</CITE> when necessary, and maybe highlight a <VAR>variable</VAR> along the way. We can also indicate code listings:</P>
CSS • <CODE> 10 PRINT "HELLO WORLD"<BR> 20 END<BR> </CODE> • <P>Bulleted lists are easy too:</P> • <UL> <LI>HTML Structures</LI> <LI>CSS Structures</LI> <LI>XML Structures</LI> </UL> • <P>Numbered and lettered lists are also fun:</P> • <OL> <LI>Item #1</LI> <LI>Item #2</LI> </OL> • </BODY> • </HTML>
XSL • eXtensible Stylesheet Language • XSL goes beyond CSS by creating formatting structures for documents as well as elements. • XSL allows developers to create styles that take into account (or even modify) an element’s position in a document, its ancestry (the elements that contain it), and its uniqueness.
XML Parsers • Used to extract or analyze the data in XML documents • Analyzes syntax of input XML documents • Passes results of analysis to applications using event callbacks • Reports errors and warnings discovered
XML Parsers • Simple API for XML (SAX) • No modification of the document • Fastest and least memory intensive • Sequential access • Document Object Model (DOM) • Memory intensive • Allows modification • Tree structure
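The contrast between the two parser models can be sketched with Python's standard library. This is a minimal illustration, not part of the original slides: the handler class and the helper names sax_read and dom_edit are hypothetical, and the sample document is borrowed from test.xml.

```python
import xml.sax
from xml.dom import minidom

XML_TEXT = "<document><message>It worked!</message></document>"


class MessageHandler(xml.sax.ContentHandler):
    """Records element names and text as the SAX parser streams through the input."""

    def __init__(self):
        super().__init__()
        self.names = []
        self.text = []

    def startElement(self, name, attrs):
        # Called once per start-tag, in document order (sequential access).
        self.names.append(name)

    def characters(self, content):
        # Called for character data; may arrive in several chunks.
        self.text.append(content)


def sax_read(xml_text):
    """SAX: event callbacks, no in-memory tree, read-only."""
    handler = MessageHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.names, "".join(handler.text)


def dom_edit(xml_text):
    """DOM: the whole document is loaded as a tree that can be modified."""
    doc = minidom.parseString(xml_text)
    message = doc.getElementsByTagName("message")[0]
    message.firstChild.data = "Edited in memory"
    return doc.documentElement.toxml()


print(sax_read(XML_TEXT))
print(dom_edit(XML_TEXT))
```

The SAX version never holds more than the current event, which is why it suits large inputs; the DOM version pays for the full tree but can change it before writing it back out.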
What Can XML Be Used For? • Exchanging information between applications • Sharing data between distributed components • B2B communications (with XSL, XSLT) • Creating separation of presentation from content • Defining configuration information
XML Characteristics • XML documents contain a hierarchy of tags • The tag structure is like HTML • XML is case-sensitive • XML is not a superset of HTML; both descend from SGML • An HTML file is an XML document only when it is well-formed (as in XHTML).
XML Elements • Basic components of XML documents • Element names must start with a letter, underscore, or colon • Encapsulate element content, usually composed of: • Other elements • Character data • Entity references • Delimited using tags
XML Elements • All elements must have a start-tag and an end-tag. • Elements can optionally have attributes • Empty elements can use an abbreviated element form.
<ITEM><PRODNAME>Jimbo's Super Clock</PRODNAME>: <PART>SC45-A</PART> <PRICE>$199.95</PRICE> (<AIRF>$19.95</AIRF> freight/air, <GROUNDF>$7.95</GROUNDF> ground) <WARRANTY>Twenty-five year</WARRANTY> Warranty. Made in <ORIGIN>Canada</ORIGIN></ITEM>
<ITEM><PRODNAME>Lamp Controller</PRODNAME>: <PART>LC45-X</PART> <PRICE>$25.95</PRICE> (<AIRF>$9.95</AIRF> freight/air, <GROUNDF>$4.95</GROUNDF> ground) <WARRANTY>Ten-year</WARRANTY> Warranty. Made in <ORIGIN>Canada</ORIGIN></ITEM>
<ITEM><PRODNAME>Electroshock Clips</PRODNAME>: <PART>ES45-L</PART> <PRICE>$59.95</PRICE> (<AIRF>$9.95</AIRF> freight/air, <GROUNDF>$4.95</GROUNDF> ground) <WARRANTY>One-year</WARRANTY> Warranty. Made in <ORIGIN>USA</ORIGIN></ITEM>
The Document Entity And Document Element • An XML document has one and only one document entity. • The document entity consists of • Processing instructions • Comments • The document element • The document element is the parent of all other elements in the XML document • The document element cannot be contained in any other element.
Element Nesting • All elements must be nested properly • No overlapping of elements • HTML browsers tolerate overlapping tags; XML parsers do not • <ITEM><PRODNAME>Jimbo's Super Clock</PRODNAME></ITEM>
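The element rules (matching start- and end-tags, the abbreviated empty-element form, a single document element, and proper nesting) can be checked with any conforming parser. The sketch below uses Python's xml.etree.ElementTree; the helper well_formedness_error and the sample strings are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if xml_text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


SAMPLES = {
    "properly nested": "<ITEM><PRODNAME>Clock</PRODNAME></ITEM>",
    "empty-element form": "<ITEM><PRODNAME/></ITEM>",
    "overlapping tags": "<ITEM><PRODNAME>Clock</ITEM></PRODNAME>",
    "missing end-tag": "<ITEM><PRODNAME>Clock</ITEM>",
    "two document elements": "<ITEM/><ITEM/>",
}

for label, text in SAMPLES.items():
    # The first two samples parse; the remaining three are rejected.
    print(label, "->", well_formedness_error(text) or "well-formed")
```

The exact error wording depends on the underlying parser, so the sketch reports it rather than matching on it.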
XML Namespaces • XML Namespaces allow a prefix to be associated with an element to avoid name collisions • XML Namespaces are a W3C specification • A unique URI must be bound to each prefix to distinguish elements in that namespace from those in other namespaces. • The URI serves only as a unique identifier. It is not actually resolved • Namespaces are declared with the reserved attribute name xmlns
Example • <?xml version="1.0"?> • <JU:LunchMenu xmlns:JU="http://catering.com/JU"> • <JU:MainCourse>Hamburger</JU:MainCourse> • <JU:Sidedish>French Fries</JU:Sidedish> • <JU:togo/> • </JU:LunchMenu>
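A short sketch of how a namespace-aware parser sees the lunch menu, using Python's xml.etree.ElementTree. The variable names and the alternative prefix X are assumptions made for illustration; the point is that the URI, not the prefix, identifies the namespace.

```python
import xml.etree.ElementTree as ET

MENU = """<JU:LunchMenu xmlns:JU="http://catering.com/JU">
  <JU:MainCourse>Hamburger</JU:MainCourse>
  <JU:Sidedish>French Fries</JU:Sidedish>
  <JU:togo/>
</JU:LunchMenu>"""

root = ET.fromstring(MENU)

# ElementTree replaces the prefix with the URI in braces.
print(root.tag)

# When searching, any local prefix may be mapped to the same URI.
ns = {"lunch": "http://catering.com/JU"}
main_course = root.find("lunch:MainCourse", ns).text
print(main_course)

# Renaming the prefix in the document changes nothing for the parser.
renamed = MENU.replace("JU:", "X:").replace("xmlns:JU", "xmlns:X")
other_root = ET.fromstring(renamed)
print(other_root.tag == root.tag)
```

The URI is never fetched during parsing, which matches the slide's point that it is not resolved.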
Character Data • Character data is any text that is not markup. • Character data includes • The textual content inside elements • The value of an attribute • A string literal • "&" and "<" cannot appear literally inside character data; use entity references instead. • <math> 1 &lt; 2 &amp; 2 &lt; 3</math> • is read as "1 < 2 & 2 < 3"
Attributes • Elements can contain attributes to provide information about the element • Attributes are often used to convey information to an XML application • Attributes are not considered part of an element’s content • Attribute values must be quoted string literals • Attributes are not part of the presentation to an end user, though they may be used to affect the presentation.
Example • <Dessert type="Lowfat">Cheesecake</Dessert> • <Motor cylinders="6"/>
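As a rough sketch of how an application reads attributes separately from element content, the two example elements can be parsed with Python's xml.etree.ElementTree. The second element is written here as a well-formed empty element named Motor, and the variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

dessert = ET.fromstring('<Dessert type="Lowfat">Cheesecake</Dessert>')
motor = ET.fromstring('<Motor cylinders="6"/>')

# Attributes live in a separate mapping; they are not part of the content.
print(dessert.attrib)
print(dessert.text)

# Attribute values always arrive as strings; the application converts them.
cylinders = int(motor.get("cylinders"))
print(cylinders)


def parses(xml_text):
    """Return True if xml_text is well-formed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# An unquoted attribute value is not well-formed XML.
print(parses("<Motor cylinders=6/>"))
```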
White Space • XML defines white space as • Horizontal tab • Line feed • Carriage return • Space • Parsers convert all end-of-line sequences to a single line feed character.
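The end-of-line rule can be observed directly. This is a small sketch using Python's xml.etree.ElementTree (which relies on the expat parser); the sample string is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# One Windows-style line ending (CR LF) and one lone carriage return (CR).
raw = "<note>line one\r\nline two\rline three</note>"

text = ET.fromstring(raw).text

# Both end-of-line forms reach the application as a single line feed.
print(repr(text))

# Spaces and tabs inside content are passed through unchanged.
spaced = ET.fromstring("<note>a \t b</note>").text
print(repr(spaced))
```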
Entity References • &lt; (<) • &amp; (&) • &gt; (>) • &apos; (') • &quot; (")
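A minimal sketch of producing and consuming these entity references with Python's standard library. The escape and unescape helpers come from xml.sax.saxutils; the variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = "1 < 2 & 2 < 3"

# escape() replaces &, < and > with their entity references.
escaped = escape(raw)
print(escaped)

# A parser turns the references back into the original characters.
element = ET.fromstring("<math>" + escaped + "</math>")
print(element.text)

# &quot; and &apos; matter mostly inside attribute values.
quoted = ET.fromstring('<q a="&quot;hi&quot; &apos;there&apos;"/>').get("a")
print(quoted)
```

Escaping on output and letting the parser unescape on input keeps the application code free of hand-written entity handling.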
Processing Instructions • Processing instructions allow non-content information to be passed through the parser to an application • Processing instructions use the following syntax: <?target instructions ?> • Any PI whose target starts with xml- is designed to communicate with an XML-specific technology • PIs can be used to communicate information to an XSL processor, for example • <?xml-stylesheet href="MyXSL.xsl" type="text/xml"?>
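How a processing instruction reaches an application can be sketched with Python's xml.dom.minidom, which exposes each PI as a node with a target and a data string. The helper name processing_instructions and the sample document are assumptions for illustration.

```python
from xml.dom import Node, minidom

DOC = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet href="MyXSL.xsl" type="text/xml"?>'
    "<document/>"
)


def processing_instructions(xml_text):
    """Return (target, data) pairs for the PIs that precede or follow the root."""
    doc = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in doc.childNodes
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
    ]


# The XML declaration is not reported as a processing instruction.
print(processing_instructions(DOC))
```

The parser does not interpret the data string; splitting it into href and type is left to the application, here an XSL processor.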
Comments • <!-- comment text -->
CDATA Section • CDATA sections are useful when there is content that would require a lot of escape characters • CDATA sections can be used anywhere regular character data can be used • An XML parser will not interpret any markup inside a CDATA section • CDATA syntax: • <![CDATA[ 1 < 2 & 2 < 3 ]]>
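A brief sketch showing that a CDATA section and the equivalent entity references give the application the same text, using Python's xml.etree.ElementTree. The element name math follows the earlier character data slide; the other names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

with_cdata = ET.fromstring("<math><![CDATA[1 < 2 & 2 < 3]]></math>")
with_entities = ET.fromstring("<math>1 &lt; 2 &amp; 2 &lt; 3</math>")

# Both forms yield identical character data.
print(with_cdata.text)
print(with_cdata.text == with_entities.text)

# Markup inside CDATA is kept as plain text, not parsed into child elements.
literal = ET.fromstring("<a><![CDATA[<b>bold</b>]]></a>")
print(literal.text, len(literal))
```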