
XML

todd-roy

Presentation Transcript


  1. XML Enterprise Applications CE00465-M

  2. XML Overview
     • Extensible Mark-up Language (XML) is a meta-language that describes the content of a document (self-describing data)
         Java = Portable Programs
         XML = Portable Data
     • XML does not specify the tag set or grammar of the language
         • Tag set: mark-up tags that have meaning to a language processor
         • Grammar: defines correct usage of a language's tags

  3. Applications of XML
     • Configuration files
         • Used extensively in J2EE architectures
     • Media for data interchange
         • A better alternative to proprietary data formats
     • Business-to-business (B2B) transactions on the Web
         • Electronic business orders (ebXML)
         • Financial Exchange (IFX)
         • Messaging exchange (SOAP)

  4. XML versus HTML
     • XML fundamentally separates content (data and language) from presentation; HTML allows a mix of content and presentation
     • HTML explicitly defines a set of legal tags as well as the grammar (intended meaning)
         • <TABLE>…</TABLE>
     • XML allows any tags or grammar to be used, hence eXtensible
         • <STUDENT>…</STUDENT>
     • Note: Both are based on Standard Generalized Markup Language (SGML)

  5. Simple XML Example
     Student.xml
         <?xml version="1.0" encoding="UTF-8"?>
         <!DOCTYPE XmlStudents SYSTEM "Student.dtd">
         <XmlStudents>
           <XmlStudent>
             <img src="harry.gif"></img>
             <StdNo>2</StdNo>
             <FirstName>Harry</FirstName>
             <LastName>Nelson</LastName>
             <Phone>250</Phone>
             <StartDate>2005-09-28</StartDate>
             <Award>Computer Science</Award>
             <Faculty>VP</Faculty>
           </XmlStudent>
         </XmlStudents>

  6. Parsers & well-formed XML Documents
     • A software program called an XML parser (or an XML processor) is required to process an XML document
     • The parser reads the document, checks its syntax and reports any errors
     • XML documents are well formed if they are syntactically correct
     • An XML document requires a single root element, a start and end tag for each element, properly nested tags and attribute values in quotes
     • Parsers can support the Document Object Model (DOM), the Simple API for XML (SAX), or both
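
The difference between the two parser styles can be sketched with Python's standard library, which ships both a DOM implementation (xml.dom.minidom) and a SAX implementation (xml.sax). The sample document is a shortened, hypothetical version of Student.xml with the DOCTYPE left out, and the helper names are illustrative only.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical, shortened version of Student.xml (no DOCTYPE, so no DTD lookup).
STUDENT_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<XmlStudents>
  <XmlStudent>
    <StdNo>2</StdNo>
    <FirstName>Harry</FirstName>
    <LastName>Nelson</LastName>
  </XmlStudent>
</XmlStudents>"""


def first_names_dom(xml_bytes):
    """DOM: build the whole tree in memory, then query it."""
    doc = minidom.parseString(xml_bytes)
    return [node.firstChild.data for node in doc.getElementsByTagName("FirstName")]


class FirstNameHandler(xml.sax.ContentHandler):
    """SAX: react to parse events as the document streams past."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._buffer = None  # collects text only while inside <FirstName>

    def startElement(self, name, attrs):
        if name == "FirstName":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "FirstName":
            self.names.append("".join(self._buffer))
            self._buffer = None


def first_names_sax(xml_bytes):
    handler = FirstNameHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.names
```

Both helpers return the same list of first names. DOM is convenient when the document fits in memory and needs random access; SAX suits large documents that are read once from start to end.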

  7. XML Components
     • Prolog
         • Defines the XML version, entity definitions and DOCTYPE
     • Components of the document
         • Tags and attributes
         • CDATA (character data)
         • Entities
         • Processing instructions
         • Comments

  8. XML prolog
     • XML files start with a prolog
         <?xml version="1.0" encoding="UTF-8"?>
         <!DOCTYPE XmlStudents SYSTEM "Student.dtd">
     • The version is required in the XML declaration
     • The encoding identifies the character encoding of the document
     • The prolog can contain entity declarations and DTD definitions

  9. Document Type Definitions (DTDs)
     • Introduced into an XML document by the <!DOCTYPE> declaration
     • DTDs define an XML document's structure (i.e. what elements are allowed in the document)
     • An XML document doesn't have to have a DTD
         • But a DTD is often recommended to ensure document conformity
         • especially in B2B transactions
     • Parsers are classified as validating or non-validating
         • Validating parsers check the XML document against the DTD
         • If the XML document conforms to the DTD it is said to be valid
         • If it fails to conform to the DTD but is syntactically correct it is well formed but not valid
     • A valid document must also be well formed

  10. Example of a DTD
     Student.dtd
         <?xml version="1.0" encoding="UTF-8"?>
         <!ELEMENT XmlStudents (XmlStudent+)>
         <!ELEMENT XmlStudent (img,StdNo,FirstName,LastName,Phone,
                               StartDate,Award,Faculty)>
         <!ELEMENT img (#PCDATA)>
         <!-- src is used by Student.xml, so it must be declared for the document to be valid -->
         <!ATTLIST img src CDATA #REQUIRED>
         <!ELEMENT StdNo (#PCDATA)>
         <!ELEMENT FirstName (#PCDATA)>
         <!ELEMENT LastName (#PCDATA)>
         <!ELEMENT Phone (#PCDATA)>
         <!ELEMENT StartDate (#PCDATA)>
         <!ELEMENT Award (#PCDATA)>
         <!ELEMENT Faculty (#PCDATA)>

  11. Defining Elements
     • <!ELEMENT name definition/type>
         <!ELEMENT XmlStudents (XmlStudent+)>
         <!ELEMENT XmlStudent (img,StdNo,FirstName,LastName,Phone,StartDate,
                               Award,Faculty)>
         <!ELEMENT img (#PCDATA)>
     • Types
         EMPTY             Element cannot contain any text or child elements
         #PCDATA           Only character data permitted
         ANY               Any well-formed XML data
         Element content   List of legal child elements (no character data)
         Mixed content     May contain character data and/or child elements (cannot define order and number of child elements)

  12. Defining Elements
     • Cardinality
         [none]   Default (one and only one instance)
         ?        0 or 1
         *        0, 1, …, N
         +        1, 2, …, N
     • List operators
         ,        Sequence (in order)
         |        Choice (one of several)

  13. Grouping Elements
     • A set of elements can be grouped within parentheses
     • (Elem1?, Elem2?)+
         • Elem1 can occur 0 or 1 times, followed by 0 or 1 occurrences of Elem2
         • The group (sequence) must occur 1 or more times
     • OR
     • ((Elem1, Elem2) | Elem3)*
         • Either the group of Elem1, Elem2 is present (in order) or Elem3 is present, 0 or more times
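
DTD content models read much like regular expressions over the sequence of child element names, so the two groups above can be mimicked with Python's re module. This is only an illustration of the cardinality and list operators, not how a validating parser is implemented; the helper matches and the pattern names are hypothetical.

```python
import re


def matches(pattern, children):
    """Return True if the sequence of child element names fits the pattern."""
    # Each name is followed by a space so that adjacent names cannot run together.
    text = "".join(name + " " for name in children)
    return re.fullmatch(pattern, text) is not None


# (Elem1?, Elem2?)+  : optional Elem1, then optional Elem2, group repeated 1..N times
GROUP_A = r"(?:(?:Elem1 )?(?:Elem2 )?)+"

# ((Elem1, Elem2) | Elem3)*  : either the pair in order, or Elem3, repeated 0..N times
GROUP_B = r"(?:(?:Elem1 Elem2 )|(?:Elem3 ))*"
```

With these patterns, ["Elem1", "Elem2", "Elem3"] fits GROUP_B, while ["Elem1", "Elem3"] does not, because Elem1 may only appear as part of the Elem1, Elem2 pair.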

  14. Element Example
         <?xml version="1.0" encoding="UTF-8"?>
         <!DOCTYPE Person [
           <!ELEMENT Person ( (Mr|Ms|Miss)?, FirstName, MiddleName*, LastName, (Jr|Sr)? )>
           <!ELEMENT FirstName (#PCDATA)>
           <!ELEMENT MiddleName (#PCDATA)>
           <!ELEMENT LastName (#PCDATA)>
           <!ELEMENT Mr EMPTY>
           <!ELEMENT Ms EMPTY>
           <!ELEMENT Miss EMPTY>
           <!ELEMENT Jr EMPTY>
           <!ELEMENT Sr EMPTY>
         ]>
         <Person>
           <Mr/>
           <FirstName>Harry</FirstName>
           <LastName>Nelson</LastName>
         </Person>
     Note: The DTD has been embedded in the DOCTYPE, rather than being in a separate file

  15. Defining Attributes
     • <!ATTLIST element attrName type modifier>
     • Example
         <!ELEMENT Customer (#PCDATA)>
         <!ATTLIST Customer id CDATA #IMPLIED>
         <!ELEMENT Product (#PCDATA)>
         <!ATTLIST Product cost CDATA #FIXED "200"
                           id   CDATA #REQUIRED>

  16. Attribute Types
     • CDATA
         • Essentially any text; plain character data
           <!ATTLIST Customer id CDATA #IMPLIED>
     • Enumeration
         • attribute (value1|value2|value3) [modifier]
     • Eight other attribute types
         • ID, IDREF, IDREFS, NMTOKEN, NMTOKENS, ENTITY, ENTITIES, NOTATION

  17. Attribute Modifiers
     • #IMPLIED
         • Attribute is not required
           <!ATTLIST Customer id CDATA #IMPLIED>
     • #REQUIRED
         • Attribute must be present
           <!ATTLIST Customer id CDATA #REQUIRED>
     • #FIXED "value"
         • Attribute is present and always has this value
           <!ATTLIST Customer id CDATA #FIXED "EN">
     • Default value (applies to enumeration)
           <!ATTLIST car colour (red | white | blue) "white">
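
The effect of a default value can be observed with Python's xml.parsers.expat module: when the DTD is in the internal subset, the parser reports a defaulted attribute as if it had been written in the document. A minimal sketch follows, using the car example with the DTD embedded in the DOCTYPE; the <!ELEMENT car EMPTY> declaration and the helper read_attributes are assumptions added for the illustration.

```python
import xml.parsers.expat

# The car example with its DTD embedded in the DOCTYPE.
# <!ELEMENT car EMPTY> is an assumed declaration, not taken from the slides.
DTD = """<!DOCTYPE car [
  <!ELEMENT car EMPTY>
  <!ATTLIST car colour (red | white | blue) "white">
]>
"""
CAR_DEFAULT = DTD + "<car/>"
CAR_RED = DTD + '<car colour="red"/>'


def read_attributes(xml_text):
    """Return {element name: attributes} as reported by the expat parser."""
    seen = {}
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        # attrs includes values derived from ATTLIST defaults, because
        # parser.specified_attributes is left at its default (off).
        seen[name] = dict(attrs)

    parser.StartElementHandler = start
    parser.Parse(xml_text, True)
    return seen
```

Note that expat is a non-validating parser: it applies the declared default, but it would not reject a colour outside the enumeration.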

  18. XML DOCTYPE
     • Document Type Declarations
         • Specify the location of the DTD defining the syntax and structure of elements in the document
     • Common forms
         • <!DOCTYPE root [DTD]>
         • <!DOCTYPE root SYSTEM URL>
         • <!DOCTYPE root PUBLIC FPI-identifier URL>
     • The root identifies the starting element (root element) of the document
     • The DTD can be external to the XML document, referenced by a SYSTEM or PUBLIC URL
         • SYSTEM URL refers to a private DTD
             • Located on the local file system or an HTTP server
         • PUBLIC URL refers to a DTD intended for public use

  19. XML DOCTYPE
     • Specifying a PUBLIC DTD
         <!DOCTYPE root PUBLIC FPI-identifier URL>
     • The Formal Public Identifier (FPI) has four parts:
         • Connection of the DTD to a formal standard
         • Group responsible for the DTD
         • Description and type of document
         • Language used in the DTD
     • Examples:
         <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
                   "http://www.w3.org/TR/html4/strict.dtd">
         <!DOCTYPE XmlStudents SYSTEM "Student.dtd">
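
A DOM parser exposes the pieces of a DOCTYPE declaration, which makes the four-part structure of an FPI easy to inspect. The sketch below uses Python's xml.dom.minidom; the one-element documents and the helper names are hypothetical, and it assumes (as is the default behaviour) that minidom records the identifiers without fetching the external DTD.

```python
from xml.dom import minidom

# Hypothetical one-element documents carrying the two DOCTYPE forms shown above.
HTML_DOC = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
            '"http://www.w3.org/TR/html4/strict.dtd"><HTML/>')
STUDENT_DOC = '<!DOCTYPE XmlStudents SYSTEM "Student.dtd"><XmlStudents/>'


def describe_doctype(xml_text):
    """Return (root name, public identifier, system identifier) of a document."""
    doctype = minidom.parseString(xml_text).doctype
    return doctype.name, doctype.publicId, doctype.systemId


def fpi_parts(fpi):
    """Split a Formal Public Identifier into its '//'-separated fields."""
    return fpi.split("//")
```

For the HTML 4.01 identifier, fpi_parts yields the registration marker "-" (not a formally registered standard), the owner "W3C", the description "DTD HTML 4.01" and the language "EN".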

  20. XML Comments
     • The same as HTML comments
         <!-- This is an XML and HTML comment -->

  21. Processing Instructions
     • Application-specific instruction to the XML processor
     • <?target instruction?>
     • Example
         <?xml version="1.0" encoding="UTF-8"?>
         <?xml-stylesheet type="text/xsl" href="list.xsl"?>
         <!DOCTYPE XmlStudents SYSTEM "Student.dtd">
         <XmlStudents>
           <XmlStudent>
             <img src="harry.gif"></img>
             <StdNo>2</StdNo>
             <FirstName>Harry</FirstName>
             <LastName>Nelson</LastName>
             <Phone>250</Phone>
             <StartDate>2005-09-28</StartDate>
             <Award>Computer Science</Award>
             <Faculty>VP</Faculty>
           </XmlStudent>
         </XmlStudents>
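
An application can read processing instructions through the DOM, where each one appears as a node with a target and a data string. The following sketch uses Python's xml.dom.minidom with the W3C-defined target xml-stylesheet; the shortened document and the helper name are hypothetical.

```python
from xml.dom import minidom

# Hypothetical, shortened document with a stylesheet processing instruction.
DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="list.xsl"?>
<XmlStudents/>"""


def processing_instructions(xml_text):
    """Return (target, data) for each top-level processing instruction."""
    doc = minidom.parseString(xml_text)
    return [(node.target, node.data)
            for node in doc.childNodes
            if node.nodeType == node.PROCESSING_INSTRUCTION_NODE]
```

The XML declaration on the first line looks like a processing instruction but is not reported as one; only the stylesheet instruction is returned.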

  22. XML Root Element
     • Required for XML-aware applications to recognise the beginning and end of the document
     • Example
         <?xml version="1.0" encoding="UTF-8"?>
         <?xml-stylesheet type="text/xsl" href="list.xsl"?>
         <!DOCTYPE XmlStudents SYSTEM "Student.dtd">
         <XmlStudents>
           <XmlStudent>
             <img src="harry.gif"></img>
             <StdNo>2</StdNo>
             <FirstName>Harry</FirstName>
             <LastName>Nelson</LastName>
             <Phone>250</Phone>
             <StartDate>2005-09-28</StartDate>
             <Award>Computer Science</Award>
             <Faculty>VP</Faculty>
           </XmlStudent>
         </XmlStudents>

  23. XML Tags
     • Tag names:
         • Case sensitive
         • Start with a letter or underscore
         • After the first character, digits, '-' and '.' are also allowed
         • Cannot contain whitespace
         • Avoid use of colon except for indicating namespaces
     • For a well-formed XML document
         • Every start tag must have a matching end tag, or be written as an empty-element tag
             • <element1>…</element1>
             • <element2/>
         • All tags must be properly nested (tag order can't be mixed)
     • Tags can also have attributes
         <img src="harry.gif"></img>
         • Attributes provide metadata for the element
         • Every attribute value must be enclosed in quotes, with no commas between attributes
         • Same naming convention as elements
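
These rules can be checked by handing a document to any XML parser and seeing whether it reports an error. A minimal sketch with Python's xml.etree.ElementTree follows; the helper is_well_formed and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One sample per rule; the comment names the rule each bad sample breaks.
GOOD = '<element1><element2 src="harry.gif"/></element1>'
BAD_NESTING = "<a><b></a></b>"          # tags overlap instead of nesting
BAD_CASE = "<Student></student>"        # names are case sensitive
BAD_QUOTES = "<img src=harry.gif/>"     # attribute value not quoted
BAD_NAME = "<1student/>"                # name starts with a digit
BAD_ROOT = "<a/><b/>"                   # more than one root element
```

Only GOOD is accepted; each of the other samples makes the parser raise ParseError, which the helper turns into False.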

  24. Document Entities
     • Entities refer to a data item, typically text
     • General entity references start with & and end with ;
     • The entity reference is replaced by its true value when parsed
     • The characters < > & ' " have predefined entity references to avoid conflicts with the XML application (parser)
         &lt; &gt; &amp; &apos; &quot;
     • Entities are user definable
         <?xml version="1.0" encoding="UTF-8"?>
         <!DOCTYPE university [
           <!ELEMENT university (faculty)>
           <!ELEMENT faculty (#PCDATA)>
           <!ENTITY myfaculty "Faculty of Computing Engineering and Technology">
         ]>
         <university>
           <faculty>My Faculty is,&myfaculty;</faculty>
         </university>
     • Note that the DTD is included in the prolog of the above code, and that white space is needed between the DOCTYPE name and the opening [
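
Entity expansion can be seen directly by parsing the university example and reading back the element text. The sketch below uses Python's xml.etree.ElementTree for parsing and xml.sax.saxutils.escape for the reverse direction (turning special characters into entity references). The <!ELEMENT faculty (#PCDATA)> declaration is an assumption added so that the embedded DTD declares every element, and the helper name is illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE university [
  <!ELEMENT university (faculty)>
  <!ELEMENT faculty (#PCDATA)>
  <!ENTITY myfaculty "Faculty of Computing Engineering and Technology">
]>
<university>
  <faculty>My Faculty is,&myfaculty;</faculty>
</university>"""


def faculty_text(xml_bytes):
    """Parse the document and return the text of <faculty> after expansion."""
    root = ET.fromstring(xml_bytes)
    return root.find("faculty").text


# Going the other way: escape() replaces &, < and > with entity references.
escaped = escape("if (a < b && c > d)")
```

After parsing, the text no longer contains &myfaculty; but its replacement value, and escaped holds a string that is safe to place inside element content.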

  25. Document Entities
     • Internet Explorer displays: [screenshot not reproduced in this transcript]

  26. CDATA Sections
     • CDATA (character data) is not parsed
     • Example
         <?xml version = "1.0"?>
         <book title = "C++ How to Program" edition = "3">
           <sample>
             // C++ comment
             if ( this-&gt;getX() &lt; 5 &amp;&amp; value[ 0 ] != 3 )
               cerr &lt;&lt; this-&gt;displayError();
           </sample>
           <sample>
             <![CDATA[
             // C++ comment
             if ( this->getX() < 5 && value[ 0 ] != 3 )
               cerr << this->displayError();
             ]]>
           </sample>
           C++ How to Program by Deitel &amp; Deitel
         </book>
         <!-- source: Deitel & Deitel -->
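
The two sample elements above carry the same text: one escapes each special character, the other wraps the raw text in a CDATA section. A small check with Python's xml.etree.ElementTree makes this concrete; the single-line snippets and the helper sample_text are a hypothetical reduction of the slide's example.

```python
import xml.etree.ElementTree as ET

# The C++ condition from the slide, as the application should finally see it.
CODE = "if ( this->getX() < 5 && value[ 0 ] != 3 )"

# Variant 1: special characters written as entity references.
ESCAPED = "<sample>if ( this-&gt;getX() &lt; 5 &amp;&amp; value[ 0 ] != 3 )</sample>"

# Variant 2: raw text inside a CDATA section, which the parser does not interpret.
CDATA = "<sample><![CDATA[" + CODE + "]]></sample>"


def sample_text(xml_text):
    """Return the character data of the <sample> element."""
    return ET.fromstring(xml_text).text
```

Both variants yield exactly CODE once parsed, so the choice between them is a matter of readability for whoever edits the XML by hand.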

  27. CDATA Sections
     • Internet Explorer displays: [screenshot not reproduced in this transcript]

  28. Extensible Stylesheet Language (XSL)
     • Consists of two parts
         • XSL Transformations (XSLT)
             • Used to transform an XML document from one form to another
         • XSL Formatting Objects (XSL-FO)
             • Provide an alternative to CSS for formatting and styling an XML document
     • More information
         • XML: http://www.w3schools.com/xml/default.asp
         • DTD: http://www.w3schools.com/dtd/default.asp
         • XSLT: http://www.w3schools.com/xsl/default.asp
