Extensible Markup Language: XML

tuckera

Presentation Transcript


  1. Extensible Markup Language: XML
     • HTML: portable, widely supported markup language for describing how to format data
     • XML: portable, widely supported markup language for describing data
     • XML is quickly becoming a standard for data exchange between applications

  2. XML Documents
     • XML marks up data using tags, which are names enclosed in angle brackets < >
     • Tags appear in pairs: a start tag <myTag> and a matching end tag </myTag>
     • Elements: units of data (i.e., a start tag, its corresponding end tag, and everything between them)
     • The root element contains all other document elements
     • Tag pairs cannot be interleaved: <a><b></a></b> is not allowed; it must be <a><b></b></a>
     • Nested elements form trees
     What defines an XML document is not its tag names but the fact that its tags are formatted in this way.
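The nesting rule on this slide can be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed is made up for this illustration and is not part of the original slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for interleaved tags, missing end tags, several roots, etc.
        return False


nested = "<a><b></b></a>"        # properly nested tag pairs
interleaved = "<a><b></a></b>"   # tag pairs overlap: not XML
two_roots = "<a></a><b></b>"     # no single root element

for sample in (nested, interleaved, two_roots):
    print(sample, "->", is_well_formed(sample))
```

Only the first sample is accepted; the parser rejects the other two before any data can be extracted from them.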

  3. The optional XML declaration includes a version information parameter; if present, it MUST be the very first line of the file.
     • The root element (here, article) contains all other document elements
     • Because of the nice <tag> .. </tag> structure, the data can be viewed as organized in a tree:
         article
           title
           date
           author
             firstName
             lastName
           summary
           content
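Both points on this slide can be tried out in a few lines. This is only a sketch, assuming the standard xml.etree.ElementTree module; the shortened article document and the helper outline are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# A shortened, hypothetical version of the article document.
good = (
    '<?xml version="1.0"?>\n'
    "<article><title>Simple XML</title>"
    "<author><firstName>John</firstName><lastName>Doe</lastName></author>"
    "</article>"
)
# Same document, but the declaration is no longer the very first thing.
bad = "\n" + good


def outline(element, depth=0):
    """Return the tag names of element and its descendants, indented by depth."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(good)
print("\n".join(outline(root)))

try:
    ET.fromstring(bad)
    declaration_ok = True
except ET.ParseError as err:
    declaration_ok = False
    print("rejected:", err)
```

The outline printed for the first document mirrors the tree drawn on the slide, restricted to the elements included here.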

  4. An I-sequence might be structured as XML like this:

         <?xml version = "1.0"?>
         <!-- I-sequence structured with XML. -->
         <SEQUENCEDATA>
            <TYPE>dna</TYPE>
            <SEQ>
               <NAME>Aspergillus awamori</NAME>
               <ID>U03518</ID>
               <DATA>aacctgcggaaggatcattaccgagtgcgggtcctttgggccca
                     acctcccatccgtgtctattgtaccctgttgcttcgg
                     cgggcccgccgcttgtcggccgccgggggggcgcctctg
                     ccccccgggcccgtgcccgccggagaccccaacacgaac
                     actgtctgaaagcgtgcagtctgagttgattgaatgcaat
                     cagttaaaactttcaacaatggatctcttggttccggc
               </DATA>
            </SEQ>
         </SEQUENCEDATA>

     The corresponding tree:
         comment
         SEQUENCEDATA
           TYPE
           SEQ
             NAME
             ID
             DATA
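To show what reading such a file could look like, here is a hedged sketch using the standard xml.etree.ElementTree module. The tag names come from the slide; the DATA content is cut down to the first two lines of the sequence, and the variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide's document (only part of the DATA is kept).
doc = """<?xml version="1.0"?>
<!-- I-sequence structured with XML. -->
<SEQUENCEDATA>
  <TYPE>dna</TYPE>
  <SEQ>
    <NAME>Aspergillus awamori</NAME>
    <ID>U03518</ID>
    <DATA>aacctgcggaaggatcattaccgagtgcgggtcctttgggccca
          acctcccatccgtgtctattgtaccctgttgcttcgg
    </DATA>
  </SEQ>
</SEQUENCEDATA>"""

root = ET.fromstring(doc)
seq_type = root.findtext("TYPE")

records = []
for seq in root.findall("SEQ"):
    # Whitespace inside DATA is layout only, so strip it out of the sequence.
    residues = "".join(seq.findtext("DATA").split())
    records.append((seq.findtext("ID"), seq.findtext("NAME"), residues))

for seq_id, name, residues in records:
    print(seq_type, seq_id, name, len(residues))
```

No format-specific parsing code is needed: the element names alone are enough to pull out the type, the identifier, the name and the sequence.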

  5. Parsing and displaying XML
     • XML is just another data format
     • Do we need to write yet another parser? No more filters, please!
     • No! XML is becoming a standard
     • Many different systems can read XML; not many systems can read our I-sequence format
     • Thus, parsers already exist

  6. XML document opened in Internet Explorer
     • Standard browsers can format XML documents nicely!
     • Each parent element/node can be expanded and collapsed (marked with a minus sign or a plus sign)

  7. XML document opened in Mozilla
     • Again, each parent element/node can be expanded and collapsed (here by pressing the minus sign, not the element)

  8. Attributes
     • Data can also be placed in attributes: name/value pairs, with the value in quotes
     • Example from letter.xml: the element contact has the attribute type, which has the value "to"
     • Empty elements are elements with no character data between the tags. The tags of an empty element may be written as one, like this: <myTag />
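A short sketch of how attributes and empty elements look from code, assuming the standard xml.etree.ElementTree module. The letter fragment below is hypothetical (it is not the letter.xml shown on the slide); only the contact element with type="to" and the <myTag /> shorthand are taken from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the spirit of letter.xml.
letter = ET.fromstring(
    "<letter>"
    '<contact type="to"><name>John Doe</name></contact>'
    "<myTag />"
    "</letter>"
)

contact = letter.find("contact")
print(contact.get("type"))   # value of the attribute named "type"
print(contact.attrib)        # all attributes as a name -> value dict

empty = letter.find("myTag")
print(empty.text, len(empty))  # no character data and no child elements

# A start tag immediately followed by its end tag is the same empty element,
# so it serializes to the one-tag form.
print(ET.tostring(ET.fromstring("<myTag></myTag>")))
```

Attributes are stored on the element itself rather than as child nodes of the tree, which is why they are read with get instead of find.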

  9. Parsers and trees
     • We've already seen that XML markup can be displayed as a tree
     • Some XML parsers exploit this. They
         • parse the file
         • extract the data
         • return it organized in a tree data structure called a Document Object Model
     (Figure: the article tree; article contains title, date, author (with firstName and lastName), summary and content.)

  10. Document Object Model (DOM)
     • A DOM parser retrieves data from an XML document
     • It returns a tree structure called a DOM tree
     • Each component of an XML document is represented as a tree node
     • Parent nodes contain child nodes
     • Sibling nodes have the same parent
     • A single root (or document) node contains all other document nodes
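The parent, child and sibling relations listed above can be inspected directly. This is a minimal sketch, assuming the DOM parser in Python's standard xml.dom.minidom module; the two-element article is a made-up example written without whitespace between the tags so that only element nodes appear.

```python
from xml.dom.minidom import parseString

document = parseString(
    "<article><title>Simple XML</title><author>John Doe</author></article>"
)

root = document.documentElement      # the root element, <article>
title, author = root.childNodes      # its two child nodes

print(document.nodeName)             # the document node sits above the root
print(root.parentNode is document)   # the root's parent is the document node
print(title.parentNode is root)      # parent nodes contain child nodes
print(title.nextSibling is author)   # siblings share the same parent
print(author.previousSibling is title)
```

Every relation is available from either side, so a program can walk down from the document node or back up from any leaf.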

  11. Python provides a DOM parser!
     • All nodes have a name (the tag name, for elements) and a value
     • Text (including whitespace) is represented in nodes with the name #text
     DOM tree of the article document:
         article
           #text
           title
             #text  Simple XML
           #text
           date
             #text  Dec..2001
           #text
           author
             #text
             firstName
               #text  John
             #text
             lastName
               #text  Doe
             #text
           #text
           summary
             #text  XML..easy.
           #text
           content
             #text  In this..XML.
           #text
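The extra #text nodes in the tree come from the line breaks and indentation between tags. A small sketch, assuming xml.dom.minidom and a two-line article invented for the purpose, makes them visible:

```python
from xml.dom.minidom import parseString

# The newline and indentation around <title> become text nodes of their own.
xml_text = """<article>
  <title>Simple XML</title>
</article>"""

root = parseString(xml_text).documentElement

# Collect (name, value) for every child of the root element.
pairs = [(node.nodeName, node.nodeValue) for node in root.childNodes]
for name, value in pairs:
    print(repr(name), repr(value))

title = root.getElementsByTagName("title")[0]
print(title.firstChild.nodeName, title.firstChild.nodeValue)
```

Element nodes report their tag name and have no value of their own; the characters inside an element live in a #text child node, and whitespace between elements produces further #text nodes.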

  12. NB: Changes since book! (fig16_04revised.py)
     • Parse the XML document and load the data into the variable document
     • The documentElement attribute refers to the root node
     • nodeName refers to an element's tag name
     • Various node attributes: firstChild, nextSibling, nodeValue, parentNode
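The listing for fig16_04revised.py is not part of this transcript. The following is not that program but a rough sketch of the same steps, assuming xml.dom.minidom; the article document is embedded as a string, and the text of date, summary and content is placeholder text, since the slides elide it.

```python
from xml.dom.minidom import parseString

# Hypothetical stand-in for the article file; placeholder texts are invented.
ARTICLE_XML = """<?xml version="1.0"?>
<article>
   <title>Simple XML</title>
   <date>placeholder date</date>
   <author>
      <firstName>John</firstName>
      <lastName>Doe</lastName>
   </author>
   <summary>placeholder summary</summary>
   <content>placeholder content</content>
</article>"""

# Parse the XML document and load the data into the variable document.
document = parseString(ARTICLE_XML)

# documentElement refers to the root node; nodeName is its tag name.
root = document.documentElement
print("Here is the root element of the document:", root.nodeName)

print("The following are its child elements:")
child_names = [node.nodeName for node in root.childNodes]
for name in child_names:
    print(name)

# firstChild is the whitespace text node before <title>.
first = root.firstChild
print("The first child of root element is:", first.nodeName)

# nextSibling moves on to the <title> element.
title = first.nextSibling
print("whose next sibling is:", title.nodeName)

# The text inside <title> is the nodeValue of its #text child.
title_text = title.firstChild.nodeValue
print('Text inside "title" tag is', title_text)

# parentNode leads back up to the root element.
parent_name = title.parentNode.nodeName
print("Parent node of title is:", parent_name)
```

Because the whitespace between tags is kept as #text nodes, firstChild of the root is a text node, and nextSibling is needed to reach the first element.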

  13. Program output

         Here is the root element of the document: article
         The following are its child elements:
         #text
         title
         #text
         date
         #text
         author
         #text
         summary
         #text
         content
         #text
         The first child of root element is: #text
         whose next sibling is: title
         Text inside "title" tag is Simple XML
         Parent node of title is: article

     (Figure: the DOM tree of the article document with its #text nodes, as on slide 11.)
