XML Enterprise Applications CE00465-M
XML Overview
• Extensible Mark-up Language (XML) is a meta-language that describes the content of a document (self-describing data)
  • Java = Portable Programs
  • XML = Portable Data
• XML does not specify the tag set or grammar of the language
  • Tag set: mark-up tags that have meaning to a language processor
  • Grammar: defines correct usage of a language’s tags
Applications of XML
• Configuration files
  • Used extensively in J2EE architectures
• Media for data interchange
  • A better alternative to proprietary data formats
• Business-to-business (B2B) transactions on the Web
  • Electronic business orders (ebXML)
  • Financial Exchange (IFX)
  • Messaging exchange (SOAP)
XML versus HTML
• XML fundamentally separates content (data and language) from presentation; HTML allows a mix of content and presentation
• HTML explicitly defines a set of legal tags as well as the grammar (intended meaning)
  • <TABLE>…</TABLE>
• XML allows any tags or grammar to be used, hence eXtensible
  • <STUDENT>…</STUDENT>
• Note: Both are based on Standard Generalized Markup Language (SGML)
Simple XML Example
Student.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE XmlStudents SYSTEM "Student.dtd">
    <XmlStudents>
      <XmlStudent>
        <img src="harry.gif"></img>
        <StdNo>2</StdNo>
        <FirstName>Harry</FirstName>
        <LastName>Nelson</LastName>
        <Phone>250</Phone>
        <StartDate>2005-09-28</StartDate>
        <Award>Computer Science</Award>
        <Faculty>VP</Faculty>
      </XmlStudent>
    </XmlStudents>
Parsers & well-formed XML Documents
• A software program called an XML parser (or XML processor) is required to process an XML document
• The parser reads the document, checks its syntax and reports any errors
• An XML document is well formed if it is syntactically correct
• A well-formed XML document requires a single root element, a start and end tag for each element, properly nested tags and attribute values in quotes
• Parsers can support the Document Object Model (DOM), the Simple API for XML (SAX), or both
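The points above can be tried out with the parsers in Python's standard library. This is only a minimal sketch: the helper names (`is_well_formed`, `first_names_dom`, `first_names_sax`) and the shortened student document are assumptions made for illustration, and none of these parsers validates against a DTD.

```python
import xml.sax
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Shortened, hypothetical version of the student document.
STUDENTS = (
    "<XmlStudents><XmlStudent>"
    "<FirstName>Harry</FirstName><LastName>Nelson</LastName>"
    "</XmlStudent></XmlStudents>"
)


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def first_names_dom(text):
    """DOM style: build the whole tree in memory, then query it."""
    doc = minidom.parseString(text)
    return [node.firstChild.data for node in doc.getElementsByTagName("FirstName")]


class FirstNameHandler(xml.sax.ContentHandler):
    """SAX style: react to events while the document streams past."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._buffer = None  # collects text while inside <FirstName>

    def startElement(self, name, attrs):
        if name == "FirstName":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "FirstName":
            self.names.append("".join(self._buffer))
            self._buffer = None


def first_names_sax(text):
    handler = FirstNameHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.names


print(is_well_formed(STUDENTS))           # single root, properly nested
print(is_well_formed("<a><b></a></b>"))   # tags overlap, so not well formed
print(first_names_dom(STUDENTS), first_names_sax(STUDENTS))
```

DOM keeps the whole document in memory and suits random access; SAX keeps only what the handler stores, which suits large documents.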
XML Components
• Prolog
  • Defines the XML version, entity definitions and DOCTYPE
• Components of the document
  • Tags and attributes
  • CDATA (character data)
  • Entities
  • Processing instructions
  • Comments
XML prolog
• XML files normally start with a prolog
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE XmlStudents SYSTEM "Student.dtd">
• The version is required in the XML declaration
• The encoding identifies the character encoding of the document
• The prolog can contain entities and DTD definitions
Document Type Definitions (DTDs)
• Introduced into an XML document by the <!DOCTYPE …> declaration
• DTDs define an XML document’s structure (i.e. what elements are allowed in the document)
• An XML document doesn’t have to have a DTD
  • But one is often recommended to ensure document conformity
  • Especially in B2B transactions
• Parsers are classified as validating or non-validating
  • Validating parsers check the XML document against the DTD
  • If the XML document conforms to the DTD it is said to be valid
  • If it fails to conform to the DTD but is syntactically correct it is well formed but not valid
  • A valid document must also be well formed
Example of a DTD
Student.dtd
    <?xml version="1.0" encoding="UTF-8"?>
    <!ELEMENT XmlStudents (XmlStudent+)>
    <!ELEMENT XmlStudent (img,StdNo,FirstName,LastName,Phone,StartDate,Award,Faculty)>
    <!ELEMENT img (#PCDATA)>
    <!ATTLIST img src CDATA #IMPLIED>
    <!ELEMENT StdNo (#PCDATA)>
    <!ELEMENT FirstName (#PCDATA)>
    <!ELEMENT LastName (#PCDATA)>
    <!ELEMENT Phone (#PCDATA)>
    <!ELEMENT StartDate (#PCDATA)>
    <!ELEMENT Award (#PCDATA)>
    <!ELEMENT Faculty (#PCDATA)>
• Note: the src attribute used on img in Student.xml must be declared for the document to be valid
Defining Elements
• <!ELEMENT name definition/type>
    <!ELEMENT XmlStudents (XmlStudent+)>
    <!ELEMENT XmlStudent (img,StdNo,FirstName,LastName,Phone,StartDate,Award,Faculty)>
    <!ELEMENT img (#PCDATA)>
• Types
  • EMPTY: element cannot contain any text or child elements
  • #PCDATA: only character data permitted
  • ANY: any well-formed XML data
  • Element content, e.g. (a, b): list of legal child elements (no character data)
  • Mixed content, e.g. (#PCDATA | a | b)*: may contain character data and/or child elements (cannot define order and number of child elements)
Defining Elements
• Cardinality
  • [none]  Default (one and only one instance)
  • ?  0 or 1
  • *  0, 1, …, N
  • +  1, 2, …, N
• List operators
  • ,  Sequence (in order)
  • |  Choice (one of several)
Grouping Elements
• A set of elements can be grouped within parentheses
• (Elem1?, Elem2?)+
  • Elem1 can occur 0 or 1 times, followed by 0 or 1 occurrences of Elem2
  • The group (sequence) must occur 1 or more times
• OR
• ((Elem1, Elem2) | Elem3)*
  • Either the group of Elem1, Elem2 is present (in order) or Elem3 is present, 0 or more times
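One way to get a feel for cardinality and grouping is to translate a content model into a regular expression over the sequence of child element names. The sketch below is an illustration only, not how a validating parser is implemented; the function names are hypothetical, and it handles element content only (no #PCDATA, EMPTY or ANY).

```python
import re


def content_model_to_regex(model):
    """Translate a DTD element-content model into a Python regex.

    Each child name is encoded as the name followed by one space, so
    '(a, b)' becomes '(?:(?:a )(?:b ))'. The operators ? * + | keep
    their regex meaning; ',' becomes plain concatenation.
    """
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.\-]*|[(),|?*+]", model):
        if token == "(":
            parts.append("(?:")
        elif token == ",":
            continue  # sequence is just concatenation in a regex
        elif token in ")|?*+":
            parts.append(token)
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return "".join(parts)


def children_match(model, children):
    """Check a list of child element names against a content model."""
    encoded = "".join(name + " " for name in children)
    return re.fullmatch(content_model_to_regex(model), encoded) is not None


print(children_match("((Elem1, Elem2) | Elem3)*", ["Elem1", "Elem2", "Elem3"]))
print(children_match("((Elem1, Elem2) | Elem3)*", ["Elem1", "Elem3"]))
print(children_match("(Elem1?, Elem2?)+", ["Elem1", "Elem2", "Elem2"]))
```

Because both members of (Elem1?, Elem2?)+ are optional and the group repeats, that model accepts any mix of Elem1 and Elem2, including none at all.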
Element Example
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE Person [
      <!ELEMENT Person ( (Mr|Ms|Miss)?, FirstName, MiddleName*, LastName, (Jr|Sr)? )>
      <!ELEMENT FirstName (#PCDATA)>
      <!ELEMENT MiddleName (#PCDATA)>
      <!ELEMENT LastName (#PCDATA)>
      <!ELEMENT Mr EMPTY>
      <!ELEMENT Ms EMPTY>
      <!ELEMENT Miss EMPTY>
      <!ELEMENT Jr EMPTY>
      <!ELEMENT Sr EMPTY>
    ]>
    <Person>
      <Mr/>
      <FirstName>Harry</FirstName>
      <LastName>Nelson</LastName>
    </Person>
• Note: The DTD has been embedded in the DOCTYPE, rather than being in a separate file
Defining Attributes
• <!ATTLIST element attrName type modifier>
• Example
    <!ELEMENT Customer (#PCDATA)>
    <!ATTLIST Customer id CDATA #IMPLIED>
    <!ELEMENT Product (#PCDATA)>
    <!ATTLIST Product cost CDATA #FIXED "200"
                      id   CDATA #REQUIRED>
Attribute Types
• CDATA
  • Essentially anything; any character string
    <!ATTLIST Customer id CDATA #IMPLIED>
• Enumeration
  • Attribute(value1|value2|value3)[Modifier]
• Eight other attribute types
  • ID, IDREF, IDREFS, NMTOKEN, NMTOKENS, ENTITY, ENTITIES, NOTATION
Attribute Modifiers
• #IMPLIED
  • Attribute is not required
    <!ATTLIST Customer id CDATA #IMPLIED>
• #REQUIRED
  • Attribute must be present
    <!ATTLIST Customer id CDATA #REQUIRED>
• #FIXED "value"
  • Attribute always has this value; if it is omitted, the parser supplies it
    <!ATTLIST Customer id CDATA #FIXED "EN">
• Default value (shown here with an enumeration)
  • Used when the attribute is omitted
    <!ATTLIST car colour (red | white | blue) "white">
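The effect of default and #FIXED values can be observed with a non-validating parser, provided the ATTLIST sits in the internal DTD subset. The sketch below uses Python's xml.etree.ElementTree (backed by expat); the garage document and its lang attribute are hypothetical and exist only for this illustration. Note that such a parser fills in defaults but does not enforce #REQUIRED or the enumeration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with the DTD embedded in the DOCTYPE.
DOC = """<!DOCTYPE garage [
  <!ELEMENT garage (car*)>
  <!ELEMENT car EMPTY>
  <!ATTLIST car colour (red | white | blue) "white"
                lang   CDATA #FIXED "EN"
                id     CDATA #IMPLIED>
]>
<garage>
  <car/>
  <car colour="red" id="c2"/>
</garage>"""

cars = ET.fromstring(DOC).findall("car")

# First car: nothing written in the document, so the parser supplies
# the default colour and the fixed lang; the #IMPLIED id stays absent.
print(cars[0].attrib)

# Second car: the explicit colour wins over the default.
print(cars[1].attrib)
```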
XML DOCTYPE
• Document Type Declarations
  • Specify the location of the DTD defining the syntax and structure of elements in the document
• Common forms
  • <!DOCTYPE root [DTD]>
  • <!DOCTYPE root SYSTEM URL>
  • <!DOCTYPE root PUBLIC FPI-identifier URL>
• The root identifies the starting element (root element) of the document
• The DTD can be external to the XML document, referenced by a SYSTEM or PUBLIC URL
  • SYSTEM URL refers to a private DTD
    • Located on the local file system or an HTTP server
  • PUBLIC URL refers to a DTD intended for public use
XML DOCTYPE
• Specifying a PUBLIC DTD
    <!DOCTYPE root PUBLIC FPI-identifier URL>
• The Formal Public Identifier (FPI) has four parts, separated by //:
  • Connection of DTD to a formal standard
  • Group responsible for the DTD
  • Description and type of document
  • Language used in the DTD
• Examples:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
    <!DOCTYPE XmlStudents SYSTEM "Student.dtd">
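The parts of a DOCTYPE can be read back from a parsed document. A minimal sketch using Python's xml.dom.minidom follows; the helper name `doctype_info` and the empty root elements are assumptions for illustration, and minidom does not fetch or validate against the external DTD.

```python
from xml.dom import minidom


def doctype_info(text):
    """Return (root name, public identifier, system identifier)."""
    doctype = minidom.parseString(text).doctype
    return doctype.name, doctype.publicId, doctype.systemId


system_doc = '<!DOCTYPE XmlStudents SYSTEM "Student.dtd"><XmlStudents/>'
public_doc = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd"><HTML/>'
)

name, public_id, system_id = doctype_info(public_doc)

# The four parts of the FPI are separated by '//'.
standard, owner, description, language = public_id.split("//")
print(name, system_id)
print(standard, owner, description, language)
print(doctype_info(system_doc))
```

Here the leading "-" means the DTD is not tied to a formal standard, W3C is the responsible group, "DTD HTML 4.01" describes the document type and EN is the language of the DTD.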
XML Comments
• The same as HTML comments
    <!-- This is an XML and HTML comment -->
Processing Instructions
• Application-specific instruction to the XML processor
• <?target instruction?>
• Example
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="list.xsl"?>
    <!DOCTYPE XmlStudents SYSTEM "Student.dtd">
    <XmlStudents>
      <XmlStudent>
        <img src="harry.gif"></img>
        <StdNo>2</StdNo>
        <FirstName>Harry</FirstName>
        <LastName>Nelson</LastName>
        <Phone>250</Phone>
        <StartDate>2005-09-28</StartDate>
        <Award>Computer Science</Award>
        <Faculty>VP</Faculty>
      </XmlStudent>
    </XmlStudents>
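A parser hands each processing instruction to the application as a target plus an uninterpreted data string. The sketch below, using Python's xml.dom.minidom, lists the processing instructions that appear before the root element; the shortened document is an assumption for illustration. The XML declaration itself is not reported as a processing instruction.

```python
from xml.dom import minidom

DOC = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/xsl" href="list.xsl"?>'
    "<XmlStudents/>"
)

doc = minidom.parseString(DOC)

# Collect (target, data) for every processing instruction at document level.
instructions = [
    (node.target, node.data)
    for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]
print(instructions)
```

The data part looks like attributes but is plain text; it is up to the application (here, a stylesheet-aware browser) to interpret it.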
XML Root Element
• Required for XML-aware applications to recognise the beginning and end of the document
• Example (root element: XmlStudents)
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="list.xsl"?>
    <!DOCTYPE XmlStudents SYSTEM "Student.dtd">
    <XmlStudents>
      <XmlStudent>
        <img src="harry.gif"></img>
        <StdNo>2</StdNo>
        <FirstName>Harry</FirstName>
        <LastName>Nelson</LastName>
        <Phone>250</Phone>
        <StartDate>2005-09-28</StartDate>
        <Award>Computer Science</Award>
        <Faculty>VP</Faculty>
      </XmlStudent>
    </XmlStudents>
XML Tags
• Tag names:
  • Case sensitive
  • Start with a letter or underscore
  • After the first character, numbers, ‘-’ and ‘.’ are allowed
  • Cannot contain whitespace
  • Avoid use of colon except for indicating namespaces
• For a well-formed XML document
  • Every start tag must have a matching end tag, or be written as an empty-element tag
    • <element1>…</element1>
    • <element2/>
  • All tags must be properly nested (tag order can’t be mixed)
• Tags can also have attributes
    <img src="harry.gif"></img>
  • Attributes provide metadata for the element
  • Every attribute value must be enclosed in quotes, with no commas between attributes
  • Same naming convention as elements
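The naming rules listed above can be written as a small check. This is a deliberately simplified sketch: the function name is hypothetical, it covers ASCII names only, and it rejects colons outright, whereas the XML specification also permits many non-ASCII letters and allows colons (reserved for namespaces).

```python
import re

# First character: letter or underscore; then letters, digits, '_', '-' or '.'.
SIMPLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_simple_xml_name(name):
    """Return True if name follows the simplified ASCII naming rules."""
    return SIMPLE_NAME.fullmatch(name) is not None


for candidate in ["StdNo", "_private", "first-name.v2", "2ndName", "first name"]:
    print(candidate, is_simple_xml_name(candidate))
```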
Document Entities
• Entities refer to a data item, typically text
• General entity references start with & and end with ;
• The entity reference is replaced by its true value when parsed
• The characters < > & ' " have predefined entity references to avoid conflicts with the XML application (parser); < and & must always be escaped in content
    &lt; &gt; &amp; &apos; &quot;
• Entities are user definable
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE university [
      <!ELEMENT university (faculty)><!--needs a white space-->
      <!ELEMENT faculty (#PCDATA)>
      <!ENTITY myfaculty "Faculty of Computing Engineering and Technology">
    ]>
    <university>
      <faculty>My Faculty is,&myfaculty;</faculty>
    </university>
• Note we’ve included the DTD in the prolog of the above code
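Entity replacement can be seen by parsing the example and reading the text back. The sketch below uses Python's xml.etree.ElementTree, which expands entities declared in the internal DTD subset as well as the five predefined ones; the variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE university [
  <!ELEMENT university (faculty)>
  <!ELEMENT faculty (#PCDATA)>
  <!ENTITY myfaculty "Faculty of Computing Engineering and Technology">
]>
<university>
  <faculty>My Faculty is,&myfaculty;</faculty>
</university>"""

# The user-defined entity is replaced by its value during parsing.
faculty_text = ET.fromstring(DOC).find("faculty").text
print(faculty_text)

# The five predefined entities need no declaration.
predefined = ET.fromstring("<t>&lt; &gt; &amp; &apos; &quot;</t>").text
print(predefined)
```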
Document Entities
• Internet Explorer displays: [screenshot of the parsed document]
CDATA Sections
• Text inside a CDATA (character data) section is not parsed as markup
• Example
    <?xml version = "1.0"?>
    <book title = "C++ How to Program" edition = "3">
      <sample>
        // C++ comment
        if ( this-&gt;getX() &lt; 5 &amp;&amp; value[ 0 ] != 3 )
          cerr &lt;&lt; this-&gt;displayError();
      </sample>
      <sample>
        <![CDATA[
        // C++ comment
        if ( this->getX() < 5 && value[ 0 ] != 3 )
          cerr << this->displayError();
        ]]>
      </sample>
      C++ How to Program by Deitel &amp; Deitel
    </book>
    <!-- Source: Deitel & Deitel -->
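Escaping with entity references and wrapping in a CDATA section are two spellings of the same character data. A minimal sketch with Python's standard library follows; the one-line C++ fragment is shortened from the example for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

CODE = "if ( x < 5 && y != 3 )"

# Spelling 1: escape the markup characters (escape() handles &, < and >).
escaped_doc = "<sample>" + escape(CODE) + "</sample>"

# Spelling 2: wrap the raw text in a CDATA section.
cdata_doc = "<sample><![CDATA[" + CODE + "]]></sample>"

escaped_text = ET.fromstring(escaped_doc).text
cdata_text = ET.fromstring(cdata_doc).text
print(escaped_doc)
print(escaped_text == cdata_text == CODE)

# Without either, the raw '<' and '&' make the document not well formed.
try:
    ET.fromstring("<sample>" + CODE + "</sample>")
    raw_is_well_formed = True
except ET.ParseError:
    raw_is_well_formed = False
print(raw_is_well_formed)
```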
CDATA Sections
• Internet Explorer displays: [screenshot of the parsed document]
Extensible Stylesheet Language (XSL)
• Consists of two parts
  • XSL Transformations (XSLT)
    • Used to transform an XML document from one form to another
  • XSL formatting objects
    • Provide an alternative to CSS for formatting and styling an XML document
• More information
  • XML: http://www.w3schools.com/xml/default.asp
  • DTD: http://www.w3schools.com/dtd/default.asp
  • XSLT: http://www.w3schools.com/xsl/default.asp