XML (eXtensible Markup Language)
Learning Objectives
• XML Introduction
• XML basics
• XML Document Type Definition
• XML Schema and XML Data Type
• XSL (eXtensible Style Language)
• Access XML via Scripting and XML Document Object Model (DOM)
• Server Side XML Processing
• XML Usage in B2B
Reference
• http://webdev.wrox.co.uk/xml/
• http://webdev.wrox.co.uk/books/1576/
• http://www.nvcc-fastrack.com/xml/
• http://msdn.microsoft.com/xml/
• http://www.xml.com/
• http://www.xmlsoftware.com/
• http://www.w3.org/TR/1998/NOTE-XML-data/
• http://www.w3.org/TR/REC-xml-names/
• http://www.w3.org/TR/1998/WD-xsl-19981216.pdf
• http://www.w3.org/XML/Activity
• http://www.textuality.com/xml/
• http://www.builder.com/Authoring/XmlSpot/
HTML, SGML and XML?
• Relations between HTML, SGML and XML:
- HTML is an implementation (instance) of SGML
- XML is a subset of SGML
- They are all languages for “document” processing
[Figure: layered diagram with the labels "Instances / Domains" (RDF, CDF, CML, …), "XML", "HTML", …, and "SGML"]
XML Activities by W3C
• http://www.w3.org/XML/Activity
[Figure: W3C activities and their languages]
- Document mark-up: HTML
- Mathematical expressions: MathML
- Scalable Vector Graphics: SVG
- Multimedia: SMIL
- Platform for Internet Content Selection: PICS
- Platform for Privacy Preferences: P3P
- Other RDF applications: RDF
- XML
Define Your Own Markup Languages Using XML
• Channel Definition Format (CDF)
• Mathematical Markup Language (MathML)
• Synchronized Multimedia Integration Language (SMIL)
• Open Software Description (OSD)
• Chemical Markup Language (CML)
XML’s Relation with HTML and SGML
• What is HTML? HTML (HyperText Markup Language) is a specific application of SGML used on the World Wide Web.
• SGML is the “mother tongue” used for describing thousands of different DOCUMENT TYPE DEFINITIONS (DTDs) in many fields of human activity. HTML is just one of these document types.
[Figure: SGML, the DTD for HTML, and HTML documents]
XML’s Relation with HTML and SGML
• XML is an abbreviated version of SGML, designed to make it easier for you to define your own document types and easier for programmers to write programs that handle them.
• It omits the more complex and less-used parts of SGML. In return, it is easier to write applications for, easier to understand, and better suited to delivery and interoperability over the Web.
• XML is more “SGML--” than “HTML++”.
XML’s Relation with HTML and SGML
• What is SGML?
• Characteristics of SGML
• XML’s Relation with HTML and SGML
What is SGML?
• SGML is an international standard for the definition of device-independent, system-independent methods of representing texts in electronic form. More exactly, SGML is a metalanguage, that is, a means of formally describing a markup language.
• Characteristics of SGML
- Descriptive Markup: A descriptive markup system uses markup codes which simply provide names to categorize parts of a document.
- Types of Document: Documents are regarded as having types, just as other objects processed by computers do. The type of a document is formally defined by its constituent parts and their structure.
- Data Independence
XML Sample
hello.xml
<?xml version="1.0" standalone="yes"?>
<foo>
  Hello World!
</foo>
hello2.xml
<?xml version="1.0" standalone="yes"?>
<greeting>
  Hello World!
</greeting>
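As a rough illustration of how a program might read the two samples, the sketch below parses them with Python's standard xml.etree.ElementTree module. Python is an assumption here (the slides do not name a scripting language), and the documents are inlined as strings rather than read from hello.xml and hello2.xml.

```python
import xml.etree.ElementTree as ET

# The two sample documents from the slide, inlined as strings
# (an assumption; the slide presents them as separate files).
HELLO = '<?xml version="1.0" standalone="yes"?><foo> Hello World! </foo>'
HELLO2 = '<?xml version="1.0" standalone="yes"?><greeting> Hello World! </greeting>'


def describe(xml_text):
    """Return (root element name, trimmed text content) for a document."""
    root = ET.fromstring(xml_text)
    return root.tag, (root.text or "").strip()


print(describe(HELLO))
print(describe(HELLO2))
```

Both documents carry the same text; only the tag name chosen by the author differs, which is the point of the slide.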
History of XML
• The eXtensible Markup Language (XML) has been developed by the World Wide Web Consortium (W3C) XML Working Group to bring SGML (Standard Generalized Markup Language) to the Web.
• SGML is a language for the specification of markup languages. SGML is the parent of the well-known HyperText Markup Language (HTML).
• XML was designed by weighing the strengths and weaknesses of SGML, keeping its strengths while leaving out the complex and rarely used features.
• The first working draft for XML was published in November 1996.
In HTML you write both structure and display…
…
<p><font size="5"> <strong>Subject:</strong>Hello</font></p>
<p><font size="5"> <strong>Sender:</strong>Hello</font></p>
<p><font size="5"> <strong>Receiver:</strong>Hello</font></p>
<hr>
<p>Dear John,</p>
<p>Oh how I hate to write in HTML….</p>
<p>…….</p>
<p>I want to XML now!!!!</p>
<p></p>
<p>Harry</p>
<p><em>May 12, 1999</em></p>
In an XML Document, You Write Only Structure
…
<EMAIL>
  <SENDER>John</SENDER>
  <RECEIVER>Harry</RECEIVER>
  <SUBJECT>Hello World</SUBJECT>
  <BODY>Dear John……</BODY>
  <FOOTNOTE>
    <SIGNATURE>Harry</SIGNATURE>
    <TIME>May 12, 1999</TIME>
  </FOOTNOTE>
</EMAIL>
BUT… displaying it relies on special browsers and/or XSL.
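Because the tags name the parts of the message, a program can ask for each part by name. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module; the function name email_fields is hypothetical, and the element names are those of the EMAIL sample.

```python
import xml.etree.ElementTree as ET

# The EMAIL sample from the slide, inlined as a string.
EMAIL_XML = """
<EMAIL>
  <SENDER>John</SENDER>
  <RECEIVER>Harry</RECEIVER>
  <SUBJECT>Hello World</SUBJECT>
  <BODY>Dear John……</BODY>
  <FOOTNOTE>
    <SIGNATURE>Harry</SIGNATURE>
    <TIME>May 12, 1999</TIME>
  </FOOTNOTE>
</EMAIL>
"""


def email_fields(xml_text):
    """Look up each part of the message by its element name."""
    root = ET.fromstring(xml_text)
    return {
        "sender": root.findtext("SENDER"),
        "receiver": root.findtext("RECEIVER"),
        "subject": root.findtext("SUBJECT"),
        # Nested elements are addressed with a path relative to the root.
        "signature": root.findtext("FOOTNOTE/SIGNATURE"),
        "time": root.findtext("FOOTNOTE/TIME"),
    }


fields = email_fields(EMAIL_XML)
print(fields)
```

The same lookup on the HTML version would have to guess which <p> holds which field, since the HTML tags describe only appearance.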
So You Need More Than XML…
[Figure: the XML document (content) is checked by an XML parser against a DTD (structure, validation), then processed with XSL (display) into a document with a special format (HTML/CSS)]
HTML vs. XML
In HTML
• Structure and display are all defined by tags.
• Tags are predefined by the W3C standard.
In XML
• Structure, content and display are all separated.
• You can define your own document structure and tags.
Why XML?
• Why Not Extend HTML?
• Why XML?
• Benefits from XML
Why Not Extend HTML?
• HTML is already overburdened with dozens of interesting but often incompatible inventions from different manufacturers, because it provides only one way of describing your information.
• XML will allow groups of people or organizations to create their own customized markup languages for exchanging information in their domain.
• A standard document schema needs to be defined, specialized for each type of application.
• Tools need to be developed for authoring: generic or specialized.
• Tools need to be developed for browsing: generic or specialized.
Why XML?
• It removes two constraints which are holding back Web development:
- Dependence on a single, inflexible document type (HTML);
- The complexity of full SGML, whose syntax allows many powerful but hard-to-program options.
• XML simplifies the levels of optionality in SGML, and allows the development of user-defined document types on the Web.
Benefits from XML
• Extensible Markup Language is a text-based format that lets developers describe, deliver and exchange structured data between a range of applications, and deliver it to clients for local display and manipulation.
• Information will be more accessible and reusable.
• XML brings power and flexibility to Web-based applications for exchanging “structured data”.
In Summary: Rationales and the Evolution of XML
• Unfortunately, there are things that HTML just can’t do for you.
• Fortunately, HTML is growing quickly to meet these needs.
• Unfortunately, no matter how many new tags are added, there will never be enough for all the good ideas people keep having.
• Fortunately, HTML is a form of SGML (Standard Generalized Markup Language), an ISO standard that allows you to invent the tags you need, and declare them so others can use them.
• Unfortunately, the SGML standard is large, takes time to learn, and doesn’t have a “starter kit”.
• Fortunately, XML is here.
Source: http://www.textuality.com/xml/
An Example of XML
<Book>
  <TableOfContents>…</TableOfContents>
  <Chapter>
    <ChapterTitle>XML and why bother</ChapterTitle>
    <Para>…</Para>
    <Section>
      <Title>The syntax</Title>
      <Para>…</Para>
    </Section>
    …
  </Chapter>
</Book>
Hierarchy Data Structure
Book
  TableOfContents
  Chapter
    ChapterTitle
    Paragraph
    Section
      Title
      Paragraph
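The tree can be recovered from the Book example mechanically. The sketch below, assuming Python's standard xml.etree.ElementTree module, walks the parsed document and returns one indented line per element; the function name outline is hypothetical, and the "…" placeholders of the example are kept as literal text.

```python
import xml.etree.ElementTree as ET

# The Book example, with its "…" placeholders kept as literal text.
BOOK_XML = """
<Book>
  <TableOfContents>…</TableOfContents>
  <Chapter>
    <ChapterTitle>XML and why bother</ChapterTitle>
    <Para>…</Para>
    <Section>
      <Title>The syntax</Title>
      <Para>…</Para>
    </Section>
  </Chapter>
</Book>
"""


def outline(element, depth=0):
    """Return the element names as indented lines, one per node."""
    lines = ["  " * depth + element.tag]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(BOOK_XML))
print("\n".join(tree_lines))
```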
Benefits from XML
• XML suits different applications
[Figure: XML shared among Internet software, an editor, a spreadsheet and a database]
Benefits from XML (Cont.)
• More meaningful searches
• Development of flexible Web applications
• Data integration from disparate sources
• Data from multiple applications
• Local computation and manipulation of data
• Multiple views on the data
• Open standards format for Web delivery
• Enhances scalability
• Facilitates compression
XML Basics
• From HTML to XML
• Web after XML
• Concepts of XML
From HTML to XML
• XML for documents
• Using XML to create documents is very similar to creating HTML documents. The difference is that XML provides a richer set of elements and extends more readily to various publishing media.
XML: One Document, Many Different Outputs
A software module called an XML processor is used to read XML documents and provide access to their content and structure. It is assumed that an XML processor is doing its work on behalf of another module, called the application.
[Figure: an XML document passes through the XML processor and is output as a sound document, a database document, a printed document or a display document]
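The split between processor and application can be sketched with Python's standard xml.parsers.expat module, used here as a stand-in for the XML processor. The Application class and the run_processor function are hypothetical names for illustration; a real application would produce sound, print or display output instead of a list of events.

```python
import xml.parsers.expat

DOCUMENT = "<greeting>Hello World!</greeting>"


class Application:
    """Hypothetical application: records what the processor reports."""

    def __init__(self):
        self.events = []

    def start(self, name, attrs):
        self.events.append(("start", name))

    def text(self, data):
        self.events.append(("text", data))

    def end(self, name):
        self.events.append(("end", name))


def run_processor(xml_text, app):
    """Let the XML processor (expat) read the document on behalf of app."""
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver adjacent character data in one piece
    parser.StartElementHandler = app.start
    parser.CharacterDataHandler = app.text
    parser.EndElementHandler = app.end
    parser.Parse(xml_text, True)  # True marks the final chunk of input


app = Application()
run_processor(DOCUMENT, app)
print(app.events)
```

The processor knows nothing about what the application does with the events, so the same document can feed several different outputs.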
XML for Data Interchange
• With XML, the era of import/export filters for thousands of different platforms is over. In most cases, XML can provide a single platform for the interchange of data between applications.
XML for Data Interchange
[Figure: three tiers labelled Desktop, Middle Tier and Storage]
• Display: multiple views created from the XML-based data, e.g. HTML View #1 (purchasing agent) and HTML View #2 (consumer)
• Data Delivery, Manipulation: XML exchanged over HTTP and manipulated via the DOM; XML delivered to other applications or objects for further processing
• Data Integration: XML emitted or generated from multiple sources
• Middle tier components: Web server; DB access, integration, business rules (e.g. purchase order)
• Storage: mainframe, database
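"Manipulated via the DOM" can be illustrated with a short sketch using Python's standard xml.dom.minidom module. The ORDER document, its element names and the set_quantity function are hypothetical; they only stand in for the purchase order mentioned in the figure.

```python
from xml.dom import minidom

# A made-up purchase order; the element names are assumptions for this sketch.
ORDER_XML = "<ORDER><ITEM>xml100</ITEM><QUANTITY>1</QUANTITY></ORDER>"


def set_quantity(xml_text, quantity):
    """Parse the order, change the QUANTITY text node, and serialize it again."""
    dom = minidom.parseString(xml_text)
    node = dom.getElementsByTagName("QUANTITY")[0]
    node.firstChild.data = str(quantity)  # firstChild is the text node
    return dom.documentElement.toxml()


updated = set_quantity(ORDER_XML, 3)
print(updated)
```

The receiving application gets ordinary XML back, so it needs no knowledge of the program that changed it.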
XML for Data, HTML for Display
• Client-side XML: XML data is sent down to the client (your browser), and the browser uses a stylesheet (extra information that tells the browser how to translate XML to HTML) to produce the HTML that is displayed.
• Server-side XML: The XML data is kept at the server side and is converted to HTML on the fly before it is sent to a browser.
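The server-side case can be sketched in a few lines. This is not XSL: it is a hand-written conversion in Python (an assumption, using the standard xml.etree.ElementTree and html modules) that turns a cut-down version of the EMAIL sample into the kind of HTML shown earlier. The function name email_to_html is hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

EMAIL_XML = """
<EMAIL>
  <SENDER>John</SENDER>
  <RECEIVER>Harry</RECEIVER>
  <SUBJECT>Hello World</SUBJECT>
  <BODY>Dear John……</BODY>
</EMAIL>
"""


def email_to_html(xml_text):
    """Convert the XML data to an HTML fragment, as a server might do on the fly."""
    root = ET.fromstring(xml_text)
    rows = []
    for label, tag in (("Subject", "SUBJECT"), ("Sender", "SENDER"), ("Receiver", "RECEIVER")):
        # escape() keeps characters such as & and < safe inside HTML
        value = escape(root.findtext(tag, default=""))
        rows.append("<p><strong>%s:</strong> %s</p>" % (label, value))
    rows.append("<hr>")
    rows.append("<p>%s</p>" % escape(root.findtext("BODY", default="")))
    return "\n".join(rows)


html_page = email_to_html(EMAIL_XML)
print(html_page)
```

A stylesheet expresses the same mapping declaratively, so it can be changed without touching program code.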
The Design Goals for XML
• XML shall be straightforwardly usable over the Internet.
• XML shall support a wide variety of applications.
• XML shall be compatible with SGML.
• It shall be easy to write programs which process XML documents.
• The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
• XML documents should be human-legible and reasonably clear.
• The XML design should be prepared quickly.
• The design of XML shall be formal and concise.
• XML documents shall be easy to create.
• Terseness in XML markup is of minimal importance.
Rules for Well-Formed XML
• Third: Quoted attribute values
• All attribute values must be enclosed in single or double quotation marks.
• Legal: <tag attribute="value"> or <tag attribute='value'>
• Illegal: <font size=6>
• XML tags are case sensitive: <myTAG> and <Mytag> are different.
Rules for Well-Formed XML
• Fourth: Single tag elements
• Singleton tags (called empty elements, or tags without content) must be written in an abbreviated form using special XML syntax.
• Legal: <BR/> <HR/>
• <TITLE></TITLE> is equivalent to <TITLE/>
• Illegal: <BR> <HR>
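These rules can be tried out against a real parser. The sketch below assumes Python's standard xml.etree.ElementTree module, which rejects any document that is not well formed; the helper name is_well_formed and the small wrapper documents around each tag are made up for the test.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Each slide example is wrapped in a minimal document so it can be parsed.
results = {
    "double quotes": is_well_formed('<tag attribute="value"></tag>'),
    "single quotes": is_well_formed("<tag attribute='value'></tag>"),
    "unquoted value": is_well_formed("<font size=6></font>"),
    "empty element": is_well_formed("<P>line<BR/>line</P>"),
    "unclosed BR": is_well_formed("<P>line<BR>line</P>"),
    "case mismatch": is_well_formed("<myTAG></Mytag>"),
}

for name, ok in results.items():
    print(name, "->", "well formed" if ok else "rejected")
```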
Source Document
<PRODUCTLIST>
  <PRODUCT>
    <ID>html100</ID>
    <NAME>Introduction to HTML</NAME>
    <PRICE>$200</PRICE>
  </PRODUCT>
  <PRODUCT>
    <ID>xml100</ID>
    <NAME>Introduction to XML</NAME>
    <PRICE>$250</PRICE>
  </PRODUCT>
</PRODUCTLIST>
Elements vs. Attributes: When to Use Them?
• Elements can control the structure of their content.
• Attributes have (limited) data typing control.
• Some styling approaches (CSS) prefer elements.
• HTML uses elements for content, attributes for semantics
- <a href="http://….">Hot words</a>
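To make the choice concrete, here is the same product written both ways and read back with Python's standard xml.etree.ElementTree module. The attribute form is a hypothetical alternative, not something the slides define; the values are taken from the product list sample.

```python
import xml.etree.ElementTree as ET

# The same record as child elements and as attributes.
# The attribute form is an assumption made for this comparison.
AS_ELEMENTS = "<PRODUCT><ID>xml100</ID><PRICE>$250</PRICE></PRODUCT>"
AS_ATTRIBUTES = '<PRODUCT ID="xml100" PRICE="$250"/>'

by_element = ET.fromstring(AS_ELEMENTS)
by_attribute = ET.fromstring(AS_ATTRIBUTES)

# Child elements are looked up by path and can hold further structure.
element_id = by_element.findtext("ID")
# Attributes are flat name/value pairs on the element itself.
attribute_id = by_attribute.get("ID")

print(element_id, attribute_id)
print(len(by_element), len(by_attribute))  # number of child elements
```

An element such as ID could later grow children of its own, whereas an attribute value can only ever be a string.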
XML Document: Validation
• You can enforce rules about these tags. There are two ways to define the rules for a document. These rules can be used to validate an XML document, or used by XML authoring tools to guide the creation of an XML document.
• Document Type Definition (DTD): This is used to define a grammar for the tags and attributes. This syntax is supported, but deprecated, by Microsoft. It uses a special non-XML grammar.
• XML Schema (XML-Data): This is a much richer and more extensible way to describe the rules for the content of a document, and it uses XML itself as a grammar.
XML Document
The XML document type declaration contains or points to markup declarations that provide a grammar for a class of documents. This grammar is known as a document type definition (DTD).
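Anatomy of a Tag
• <H1 ALIGN="CENTRE"> XML Tutorial </H1>
- Start tag: <H1 ALIGN="CENTRE">
- Attribute name and value: ALIGN and "CENTRE"
- Content: XML Tutorial
- End tag: </H1>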
DTD (Document Type Definition)
• A DTD defines the valid structure of an XML document, with rules such as:
• Valid element names: <BOOK> <TITLE> <AUTHOR> <PRICE> <ISBN>
• Valid attribute names and values: <AUTHOR id="234" gender="F">
• Relationships between elements:
<BOOK>
  <TITLE>…</TITLE>
  <AUTHOR>…</AUTHOR>
  …
</BOOK>
• A DTD can be included within the XML document or referenced from another file/document.
It’s Complex, but More Powerful
• With a DTD you can…
- Define your own more meaningful tags: easy to read, easy to search, easy to transform to other formats
- Validate whether your document is structurally correct: important for business transaction documents, and it prevents errors and miscommunication
• An XML document is valid if it has an associated document type declaration and if the document complies with the constraints expressed in it.
• The document type declaration must appear before the first element in the document.
Documents and Comments
• Document
• [1] document ::= prolog element Misc*
• S (white space) consists of one or more space characters: S ::= (#x20 | #x9 | #xD | #xA)+
• An example of a comment:
• <!-- declaration for <head> & <Body> -->
Display XML Documents
• XML specifies syntax, not semantics. XML tags have no predefined behavior or appearance; that has to be supplied in operational terms by programs or scripts, or in declarative terms by style sheets.
• You will have to supply both the content of a document (expressed in XML) and its treatment, which you must specify either programmatically (with scripts) or declaratively (with style sheets).
• Cascading Style Sheets (CSS), the style sheet language developed for HTML, can be used to apply styles to XML documents, but it doesn’t have the power to transform and generate structures (such as tables of contents) needed for XML-based publishing in general.
XSL
• A CSS (Cascading Style Sheet) defines the specification for an HTML document’s presentation and appearance.
• Similarly, XSL (Extensible Style Language) defines the specification for an XML document’s presentation and appearance.
• XSL is used to transform XML-based data into HTML or other presentation formats.
• XSL is a subset of DSSSL (Document Style Semantics and Specification Language). DSSSL is a style language used primarily with SGML.
• XML is a subset of SGML.
• XSL is the proposed style language for XML documents.
DTD for Booklist
<!DOCTYPE PUBLIST [
<!ELEMENT PUBLIST (ITEM+)>
<!ELEMENT ITEM (CODE, CATEGORY, RELEASE_DATE, TITLE, AUTHORLIST?, SALES?)>
<!ATTLIST ITEM ITEMTYPE (BOOK | ARTICLE | PERIODICAL) "BOOK">
<!ELEMENT CODE (#PCDATA)>
<!ELEMENT CATEGORY (#PCDATA)>
<!ELEMENT RELEASE_DATE (#PCDATA)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT AUTHORLIST (AUTHOR*)>
<!ELEMENT SALES (#PCDATA)>
<!ELEMENT AUTHOR (FIRST_NAME, LAST_NAME)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT LAST_NAME (#PCDATA)>
]>
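To show what the content models mean in practice, the sketch below checks a document against hand-translated versions of four of the declarations. It is not a DTD validator: Python's standard library does not validate against DTDs, so each content model is rewritten by hand as a regular expression over the child element names. The sketch assumes the author element is named AUTHOR, as the AUTHORLIST content model references it. The sample documents and their values are made up, and #PCDATA elements and the ITEMTYPE attribute are not checked.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models. Each regex is matched against the names
# of an element's children, each followed by a comma.
CONTENT_MODELS = {
    "PUBLIST": r"(ITEM,)+",
    "ITEM": r"CODE,CATEGORY,RELEASE_DATE,TITLE,(AUTHORLIST,)?(SALES,)?",
    "AUTHORLIST": r"(AUTHOR,)*",
    "AUTHOR": r"FIRST_NAME,LAST_NAME,",
}


def violations(element):
    """Return the names of elements whose children break their content model."""
    found = []
    model = CONTENT_MODELS.get(element.tag)
    children = "".join(child.tag + "," for child in element)
    if model is not None and re.fullmatch(model, children) is None:
        found.append(element.tag)
    for child in element:
        found.extend(violations(child))
    return found


# Made-up sample data for illustration only.
GOOD = (
    "<PUBLIST><ITEM><CODE>c1</CODE><CATEGORY>web</CATEGORY>"
    "<RELEASE_DATE>1999</RELEASE_DATE><TITLE>t</TITLE>"
    "<AUTHORLIST><AUTHOR><FIRST_NAME>A</FIRST_NAME><LAST_NAME>B</LAST_NAME>"
    "</AUTHOR></AUTHORLIST></ITEM></PUBLIST>"
)
# Same item without the required TITLE element.
BAD = (
    "<PUBLIST><ITEM><CODE>c1</CODE><CATEGORY>web</CATEGORY>"
    "<RELEASE_DATE>1999</RELEASE_DATE></ITEM></PUBLIST>"
)

good_result = violations(ET.fromstring(GOOD))
bad_result = violations(ET.fromstring(BAD))
print(good_result, bad_result)
```

A validating parser does this work from the DTD itself, which is why the declarations must be syntactically exact.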
Myth
• Myth:
- XML can drive Web browsers by itself
• Answer:
- XML cannot drive a Web browser by itself.
- IE 5.0 can display an XML document in a predefined format.
- Any other presentation needs a style language.
Style Language
• Contents of documents
• Style language standard: http://www.w3.org/TR/WD-xsl
Sample XSL and XML
greeting.xml
<?xml version="1.0" standalone="yes"?>
<?xml-stylesheet href="greeting.xsl" type="text/xsl"?>
<greeting>
  Hello World!
</greeting>
greeting.xsl
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="/">
    <H1><xsl:value-of/></H1>
  </xsl:template>
</xsl:stylesheet>