XML: The Promise and the Reality
K. Scott Morrison
IBM Pacific Development Centre
Vancouver

XML Hype
• XML will replace HTML
• XML is for documents
• XML is for data
• XML will replace all message formats
• And clear the standards log jam!

Profile Edit Screen
NAME: K. Scott Morrison
ADDRESS: 8999 Nelson Way
CITY: Burnaby
STATE/PROV: B.C.
COUNTRY: Canada
CODE: V5A 4B5
TEL: (604) 293-5753
FAX: (604) 473-5807
CREDIT CARD1 TYPE: VISA NUM: 123456789 EXP: 12/00
CREDIT CARD2 TYPE: AMEX NUM: 987654321 EXP: 04/01

Screen Scrape
[Diagram labels: Remote System, Screen Scrape Server, Persistent Store]

Binary Representation
• Issues
  • Can’t determine structure from data
  • Portability
  • Fixed field length
  • Brittle interfaces
    • Must modify all clients and servers simultaneously
  • Mapping code typically buried in applications
    • Significant maintenance problem
  • Distribution of message map
  • Not human readable

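A minimal sketch of the brittleness described on this slide, assuming a hypothetical fixed-length record layout (the field widths and the names LAYOUT_V1 and LAYOUT_V2 are illustrative, not taken from any real system). The bytes carry no structure of their own, so a client holding an outdated message map cannot read a record whose field widths have changed.

```python
import struct

# Hypothetical version 1 message map: 20-byte name, 10-byte city, 7-byte postal code.
LAYOUT_V1 = struct.Struct("20s10s7s")
# Hypothetical version 2 message map: the server widened the city field to 12 bytes.
LAYOUT_V2 = struct.Struct("20s12s7s")


def pack_record(layout: struct.Struct, name: str, city: str, code: str) -> bytes:
    """Encode the fields as ASCII; struct pads short values with NUL bytes."""
    return layout.pack(name.encode("ascii"), city.encode("ascii"), code.encode("ascii"))


def unpack_record(layout: struct.Struct, record: bytes) -> tuple:
    """Decode a record; only the shared layout says where each field starts and ends."""
    return tuple(field.rstrip(b"\x00").decode("ascii") for field in layout.unpack(record))


record_v1 = pack_record(LAYOUT_V1, "K. Scott Morrison", "Burnaby", "V5A 4B5")
print(unpack_record(LAYOUT_V1, record_v1))

# A client still using the version 1 map cannot read a version 2 record.
record_v2 = pack_record(LAYOUT_V2, "K. Scott Morrison", "Burnaby", "V5A 4B5")
try:
    unpack_record(LAYOUT_V1, record_v2)
except struct.error as exc:
    print("layout mismatch:", exc)
```
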
US-ASCII Text Representation
• Standards-based, reasonably portable
• Human readable
• Can make conjectures about semantics
• Issues:
  • Limited character set
    • NLS (national language support) problems
  • No real structure: hierarchy, lists, etc.
  • Still very brittle
  • Distribution of message maps

Proprietary Tagging
• Human understandable
• Less brittle interface
• Delimited text using tag and <CR>
• Issues
  • Distribution of tag semantics
  • Non-standard
  • Character escaping issues
    • E.g. <CR>, :, etc.
  • No sense of hierarchy
  • Programmer intensive: parsers, handlers, etc.

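A rough sketch of what such a hand-written parser might look like, assuming a hypothetical "TAG: value<CR>" wire format (the function name parse_tagged and the "Suite 100" sample value are made up for illustration). It shows two of the listed issues: a flat tag list cannot hold two cards, and a carriage return inside a value corrupts the message.

```python
CR = "\r"


def parse_tagged(message: str) -> dict:
    """Split a message into records on <CR>, then split each record on the first colon."""
    fields = {}
    for line in message.split(CR):
        if not line.strip():
            continue
        tag, _, value = line.partition(":")
        fields[tag.strip()] = value.strip()
    return fields


# Works for simple, flat data.
simple = parse_tagged("NAME: K. Scott Morrison\rCITY: Burnaby\r")

# No hierarchy: the second card overwrites the first because the tags collide.
repeated = parse_tagged("TYPE: VISA\rTYPE: AMEX\r")

# No escaping: a <CR> inside the address is read as the start of a new record.
broken = parse_tagged("ADDRESS: 8999 Nelson Way\rSuite 100\rCITY: Burnaby\r")

print(simple)
print(repeated)
print(broken)
```
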
Formalized Tagging: Markup
• Markup is meta-data
  • Adds information about text
  • What it means, how to interpret, how to render, etc.
• Markup delimits
  • START-----------END
  • Interface is less brittle
• Markup works as a container
• Markup adds structure
  • Hierarchy, etc.

Markup Can Be Stylistic
Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat.
<FONT FACE="Times New Roman"> Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad <B> minim </B> veniam, quis nostrud exerci tation <I> ullamcorper </I> suscipit lobortis nisl ut aliquip ex ea <U> commodo </U> consequat. </FONT>

Markup Can Be Structural
Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat.
<P> Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. </P>
<P> Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat. </P>

Markup Can Be Semantic
Lorem ipsum dolor sit amet.
consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat.
<TITLE> Lorem ipsum dolor sit amet. </TITLE>
<BODY> consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat. </BODY>

Simple Markup Example #1
<Message> Hello, World! </Message>

Simple Markup Example #2
<MessageContainer>
  <Message> Hello, World! </Message>
  <Message> Goodbye, World! </Message>
</MessageContainer>

Profile Using Markup
<Profile>
  <Name> K. Scott Morrison </Name>
  <Address> 8999 Nelson Way </Address>
  <City> Burnaby </City>
  <StateProvince> BC </StateProvince>
  <Country> Canada </Country>
  <ZipPostalCode> V5A 1B5 </ZipPostalCode>
  <Telephone> (604) 293-5753 </Telephone>
  <FAX> (604) 473-5807 </FAX>

Profile Using Markup (cont.)
  <Card>
    <Type> VISA </Type>
    <Number> 123456789 </Number>
    <Expiry> 1200 </Expiry>
  </Card>
  <Card>
    <Type> AMEX </Type>
    <Number> 987654321 </Number>
    <Expiry> 0401 </Expiry>
  </Card>
</Profile>

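As a small illustration of why the markup version is easier to consume, the sketch below reads an abbreviated copy of the Profile document with Python's standard xml.etree.ElementTree parser. The helper name read_profile is an assumption made for this example; the point is that the nested, repeated Card elements come back as a list without any custom parsing code.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the Profile document from the two slides above.
PROFILE_XML = """
<Profile>
  <Name> K. Scott Morrison </Name>
  <City> Burnaby </City>
  <Card> <Type> VISA </Type> <Number> 123456789 </Number> <Expiry> 1200 </Expiry> </Card>
  <Card> <Type> AMEX </Type> <Number> 987654321 </Number> <Expiry> 0401 </Expiry> </Card>
</Profile>
"""


def read_profile(xml_text: str) -> dict:
    """Return the name, city, and one dictionary per Card element."""
    profile = ET.fromstring(xml_text)
    return {
        "name": profile.findtext("Name").strip(),
        "city": profile.findtext("City").strip(),
        # Each Card is a container, so its children map naturally to a dictionary.
        "cards": [
            {child.tag: child.text.strip() for child in card}
            for card in profile.findall("Card")
        ],
    }


profile = read_profile(PROFILE_XML)
print(profile["name"], "has", len(profile["cards"]), "cards")
```
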
eXtensible Markup Language
• Extensible
  • Tag set is not fixed
• Structural
  • Deep, hierarchical nesting of structures
  • Ordered lists (unordered with Schema)
• Well-formed document requirement
  • Can infer meaning from structure
• Valid (DTDs and Schema)
  • Can check structure against a schema

eXtensible Markup Language
• Portable
  • Text-based Unicode
  • All parsers must support UTF-8 and UTF-16 (US-ASCII is a subset of UTF-8)
  • Parsers may support: EBCDIC, UCS-4, ISO 646, ISO 8859, Shift-JIS, EUC, etc.
• Human readable
• Machine understandable
• W3C standard
• Rich set of emerging tools

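A short sketch of the portability point, assuming Python's standard xml.etree.ElementTree parser (the helper names and the sample greeting are illustrative). The same document is serialized in two Unicode encodings, each announced in the XML declaration, and the parser recovers identical text from both byte streams.

```python
import xml.etree.ElementTree as ET


def encode_document(text: str, encoding: str) -> bytes:
    """Build a one-element document whose declaration names its own encoding."""
    declaration = f'<?xml version="1.0" encoding="{encoding}"?>'
    return (declaration + f"<Message>{text}</Message>").encode(encoding)


def decode_message(data: bytes) -> str:
    """Let the parser work out the encoding from the byte stream itself."""
    return ET.fromstring(data).text


# Sample text with non-ASCII characters that US-ASCII could not carry.
greeting = "Hello, World! Grüße, 世界"

utf8_bytes = encode_document(greeting, "UTF-8")
utf16_bytes = encode_document(greeting, "UTF-16")

print(len(utf8_bytes), len(utf16_bytes))
print(decode_message(utf8_bytes) == decode_message(utf16_bytes))
```
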
Isn’t This Just Like HTML?
• HTML is a markup language based on SGML
• Key differences
  • HTML has a fixed set of tags
  • HTML mixes stylistic, structural, and semantic tags
  • HTML does not support deep nesting and hierarchy
  • HTML documents are often not well formed (browsers tolerate unclosed tags)

XML Issues
• XML documents must be well formed
  • Means that structure can be inferred
Not well formed:
<Message> Hello, World!
Hello, World! </Message>
<Message> Hello, World! </MESSAGE>
Well formed:
<Message> Hello, World! </Message>

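The four cases on this slide can be checked mechanically. In XML terminology the property being tested is well-formedness (matched, properly nested, case-sensitive tags), which any conforming parser enforces. A minimal sketch using Python's standard xml.etree.ElementTree, with is_well_formed as an illustrative helper name:

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document, False if it reports a syntax error."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


examples = [
    "<Message> Hello, World! </Message>",  # matched start and end tags
    "<Message> Hello, World!",             # end tag missing
    "Hello, World! </Message>",            # start tag missing
    "<Message> Hello, World! </MESSAGE>",  # tag names are case sensitive
]

for document in examples:
    print(is_well_formed(document), document)
```
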
XML Issues
• Knowledge of document organization
• Distribution of document organization
• Character set issues
• Document focus
• Parser speed and complexity

Message Organization: DTDs
• Document Type Definition
• Used to validate documents
• Issues:
  • Doesn’t use XML syntax, not extensible
  • Writing good DTDs is hard
    • Very awkward and limited language constructs, uses an EBNF-style grammar
  • No namespaces
  • No inheritance or ranges; defaults and enumerations for attributes only

Profile DTD Example
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE Profile [
  <!ELEMENT Profile (Name, Address, City, StateProvince, Country, ZipPostalCode, Telephone, FAX, Card*)>
  <!ELEMENT Name (#PCDATA)>
  <!ELEMENT Address (#PCDATA)>
  <!-- Lines removed for clarity -->
  <!ELEMENT Card (Type, Number, Expiry)>
  <!ELEMENT Type (#PCDATA)>
  <!ELEMENT Number (#PCDATA)>
  <!ELEMENT Expiry (#PCDATA)>
]>
<Profile>
  <Name> K. Scott Morrison </Name>
  <Address> 8999 Nelson Way </Address>
  <!-- Lines removed for clarity -->
  <Card>
    <Type> VISA </Type>
    <Number> 123456789 </Number>
    <Expiry> 1200 </Expiry>
  </Card>
  <!-- Lines removed for clarity -->
</Profile>

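To make the idea of validation concrete, here is a hand-rolled stand-in, not a real DTD validator: Python's standard library parser does not validate against DTDs, so this sketch rewrites the two content models from the DTD above as regular expressions over child tag names. The names PROFILE_MODEL, CARD_MODEL, and check_profile are assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

# (Name, Address, ..., FAX, Card*) expressed as a regex over space-separated child tags.
PROFILE_MODEL = re.compile(
    r"Name Address City StateProvince Country ZipPostalCode Telephone FAX( Card)*"
)
# (Type, Number, Expiry)
CARD_MODEL = re.compile(r"Type Number Expiry")

GOOD = (
    "<Profile><Name>K. Scott Morrison</Name><Address>8999 Nelson Way</Address>"
    "<City>Burnaby</City><StateProvince>BC</StateProvince><Country>Canada</Country>"
    "<ZipPostalCode>V5A 1B5</ZipPostalCode><Telephone>(604) 293-5753</Telephone>"
    "<FAX>(604) 473-5807</FAX>"
    "<Card><Type>VISA</Type><Number>123456789</Number><Expiry>1200</Expiry></Card>"
    "</Profile>"
)


def matches_model(element: ET.Element, model: re.Pattern) -> bool:
    """Check that the element's children appear in the order the content model allows."""
    child_tags = " ".join(child.tag for child in element)
    return model.fullmatch(child_tags) is not None


def check_profile(xml_text: str) -> bool:
    """Approximate the DTD check for Profile and each of its Card children."""
    profile = ET.fromstring(xml_text)
    return (
        profile.tag == "Profile"
        and matches_model(profile, PROFILE_MODEL)
        and all(matches_model(card, CARD_MODEL) for card in profile.findall("Card"))
    )


print(check_profile(GOOD))
```

A validating parser does this work generically from the DTD itself; the sketch only shows what kind of structural rule a content model encodes.
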
Message Organization: Schema
• Alternative to DTDs
• XML syntax
• Much more expressive
• Has namespace support
• Has inheritance, defaults, ranges (min/max), enumerations, sequences, unordered lists

Transforms: XSL
Source Document
<Profile>
  <Name> K. Scott Morrison </Name>
  <Address> 8999 Nelson Way </Address>
  <City> Burnaby </City>
  <StateProvince> BC </StateProvince>
  <Country> Canada </Country>
  <ZipPostalCode> V5A 1B5 </ZipPostalCode>
  <Telephone> (604) 293-5753 </Telephone>
  <FAX> (604) 473-5807 </FAX>
  <Card>
    <Name> VISA </Name>
    <Number> 123456789 </Number>
    <Expiry> 1200 </Expiry>
  </Card>
  <Card>
    <Name> AMEX </Name>
    <Number> 987654321 </Number>
    <Expiry> 0401 </Expiry>
  </Card>
</Profile>
XSL Engine
Destination Document
<VisaCard>
  <CardNumber> 123456789 </CardNumber>
  <Expiry> 1200 </Expiry>
  <Client>
    <Name> K. Scott Morrison </Name>
    <Telephone> (604) 293-5753 </Telephone>
  </Client>
</VisaCard>

eXtensible Stylesheet Language
• XSL = XSL Transformations (XSLT) + Formatting Objects and Properties
• Some basic XSLT functions:
  • Insertion of static text (like templates)
  • Copy, discard, or rearrange source text
  • Compute new text from source

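The copy, discard, and rearrange operations can be illustrated without an XSL engine. The sketch below is plain Python with xml.etree.ElementTree, not XSLT, and the function name to_visa_card is an assumption. It turns an abbreviated Profile source document into the VisaCard destination document from the Transforms slide.

```python
import xml.etree.ElementTree as ET

# Abbreviated source document (Profile with two cards).
SOURCE = """<Profile>
  <Name> K. Scott Morrison </Name>
  <Telephone> (604) 293-5753 </Telephone>
  <Card> <Name> VISA </Name> <Number> 123456789 </Number> <Expiry> 1200 </Expiry> </Card>
  <Card> <Name> AMEX </Name> <Number> 987654321 </Number> <Expiry> 0401 </Expiry> </Card>
</Profile>"""


def text_of(parent: ET.Element, tag: str) -> str:
    """Text of a direct child element, with surrounding whitespace removed."""
    return parent.findtext(tag, default="").strip()


def to_visa_card(source_xml: str) -> ET.Element:
    """Build the destination document from the source profile."""
    profile = ET.fromstring(source_xml)
    # Discard: every card except the VISA card is dropped.
    visa = next((c for c in profile.findall("Card") if text_of(c, "Name") == "VISA"), None)
    if visa is None:
        raise ValueError("source profile has no VISA card")
    result = ET.Element("VisaCard")
    # Copy and rename: Number becomes CardNumber.
    ET.SubElement(result, "CardNumber").text = text_of(visa, "Number")
    ET.SubElement(result, "Expiry").text = text_of(visa, "Expiry")
    # Rearrange: client details move below the card details, inside a new container.
    client = ET.SubElement(result, "Client")
    ET.SubElement(client, "Name").text = text_of(profile, "Name")
    ET.SubElement(client, "Telephone").text = text_of(profile, "Telephone")
    return result


destination = ET.tostring(to_visa_card(SOURCE), encoding="unicode")
print(destination)
```

An XSLT stylesheet expresses the same mapping declaratively, so it can be changed without touching application code.
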
XSLT Example: XML to HTML
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <P> Address is: <xsl:value-of select="Profile/Address"/> </P>
        <P> Name is: <xsl:value-of select="Profile/Name"/> </P>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

XML as Data
<Profile>
  <ID> 123456789 </ID>
  <Name> K. Scott Morrison </Name>
  <Address> 8999 Nelson Way </Address>
  <Card>
    <Name> VISA </Name>
    <Number> 123456789 </Number>
    <Expiry> 1200 </Expiry>
  </Card>
</Profile>
[Diagram labels: XML Document, XML-Db Extender, Database with Profile Table and Card Table]

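A hand-written sketch of the mapping this slide depicts, assuming an in-memory SQLite database as a stand-in for the real database and extender product. The table and column names, and the function store_profile, are hypothetical. The document is split so that the Profile fields land in one table and each Card in another, linked by the profile ID.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOCUMENT = """<Profile>
  <ID> 123456789 </ID>
  <Name> K. Scott Morrison </Name>
  <Address> 8999 Nelson Way </Address>
  <Card> <Name> VISA </Name> <Number> 123456789 </Number> <Expiry> 1200 </Expiry> </Card>
</Profile>"""


def store_profile(connection: sqlite3.Connection, xml_text: str) -> None:
    """Insert one row into profile and one row into card per Card element."""
    profile = ET.fromstring(xml_text)
    profile_id = profile.findtext("ID").strip()
    connection.execute(
        "INSERT INTO profile (id, name, address) VALUES (?, ?, ?)",
        (profile_id, profile.findtext("Name").strip(), profile.findtext("Address").strip()),
    )
    for card in profile.findall("Card"):
        connection.execute(
            "INSERT INTO card (profile_id, name, number, expiry) VALUES (?, ?, ?, ?)",
            (
                profile_id,
                card.findtext("Name").strip(),
                card.findtext("Number").strip(),
                card.findtext("Expiry").strip(),  # kept as text so leading zeros survive
            ),
        )
    connection.commit()


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE profile (id TEXT PRIMARY KEY, name TEXT, address TEXT)")
connection.execute(
    "CREATE TABLE card (profile_id TEXT REFERENCES profile(id), name TEXT, number TEXT, expiry TEXT)"
)
store_profile(connection, DOCUMENT)

profile_rows = connection.execute("SELECT id, name, address FROM profile").fetchall()
card_rows = connection.execute("SELECT profile_id, name, number, expiry FROM card").fetchall()
print(profile_rows)
print(card_rows)
```
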
XML as Data: Messaging
[Diagram labels: Client System, Request XML Message, Response XML Message, Server System, RDBMS]
• Transport Examples:
  • HTTP over sockets
  • MQSeries
  • etc.
• Message Infrastructure:
  • ebXML
  • SOAP
  • etc.
• Message Formats:
  • OTA
  • TravelFrame
  • etc.

Schema Distribution
[Diagram labels: Source System, Destination System, XML Message, Embedded DTD, Replicated DTD, Centralized Schema Repository]

Applications: Interoperability
[Diagram labels: Thick PC Clients, AS400s, SNA, TCP/IP, Gateway, S390, S390, UNIX Servers]

Applications: Interoperability
[Diagram labels: Thick PC Clients, AS400s, S390, S390, UNIX Servers, MQSeries Integrator routing and transforming XML messages]

Applications: eBusiness
[Diagram labels: AS400s, SNA, TCP/IP, S390, S390, Internet, Web/WML/XML Servers]

Summary
• XML is here to stay
• There will be heavy vendor support for it because:
  1. XML is standardized
  2. XML will be the B2B message format
• It can be leveraged now in most domains