
XML DOM Tutorial






Presentation Transcript


  1. XML DOM Tutorial CSC 309 By: Meng Lou

  2. DOM • Introduction • Overview • Steps for DOM parsing • Examples • DOM or SAX? • Summary

  3. Introduction • DOM supports navigating and modifying XML documents. • Hierarchical tree representation of documents • Language neutral (C++, Java, CORBA) • www.w3c.org/DOM

  4. Pros and Cons • Advantages: Robust API for the DOM tree; Relatively simple to modify the data structure and extract data • Disadvantages: Stores the entire document in memory; Because DOM was designed to be language neutral, its method naming conventions don’t follow standard Java conventions

  5. Overview of steps

  6. Steps for parsing • Specify parser • Create a document builder • Invoke the parser to create a Document representing the XML document • Normalize • Obtain the root node • Modify and examine the properties of nodes

  7. Specifying a Parser • Use the command-line java -D option • In the program, use System.setProperty, e.g. System.setProperty("javax.xml.parsers.DocumentBuilderFactory", "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
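
The parser-selection step above can be sketched as follows. This is a minimal illustration, not the presenter's code: the class name `ParserSelection` and the helper `resolvedFactoryName` are made up for the example, and the Xerces class from the slide is left commented out because it only works when Xerces is on the classpath.

```java
import javax.xml.parsers.DocumentBuilderFactory;

class ParserSelection {
    // Standard JAXP lookup key, as named on the slide.
    static final String FACTORY_KEY = "javax.xml.parsers.DocumentBuilderFactory";

    /** Returns the class name of the factory that JAXP resolves right now. */
    static String resolvedFactoryName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // Null unless the property was set with -D or System.setProperty.
        System.out.println("Configured: " + System.getProperty(FACTORY_KEY));
        System.out.println("Resolved:   " + resolvedFactoryName());

        // To force a specific implementation, set the property BEFORE the
        // first call to newInstance(). Assumes Xerces is on the classpath:
        // System.setProperty(FACTORY_KEY,
        //         "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
        //
        // Command-line equivalent (MyApp is a hypothetical main class):
        // java -Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl MyApp
    }
}
```

If the property names a class that cannot be loaded, `DocumentBuilderFactory.newInstance()` throws a `FactoryConfigurationError`, so it is worth printing the resolved name once when changing parsers.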

  8. Create a Document Builder • Create an instance of the builder factory, then use it to create a DocumentBuilder object DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance(); DocumentBuilder builder = builderFactory.newDocumentBuilder();

  9. Create a Document • Call the parse method Document doc = builder.parse(someInputStream); • The Document class represents the parsed result in a tree structure
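
Putting the builder and parse steps together might look like the sketch below. The class name `ParseDemo`, the helper `parse`, and the sample XML are invented for illustration; an in-memory stream stands in for `someInputStream`.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class ParseDemo {
    /** Parses an XML string into a DOM Document using the default JAXP parser. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = builderFactory.newDocumentBuilder();
        // Any InputStream works here (file, socket, ...); this one is in memory.
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            return builder.parse(in);
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document.
        Document doc = parse("<course code=\"CSC309\"><topic>DOM</topic></course>");
        System.out.println(doc.getDocumentElement().getNodeName()); // prints "course"
    }
}
```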

  10. Normalize the Tree • Normalization has two effects: - Combines adjacent text nodes (for example, text that spans multiple lines) - Eliminates empty text nodes doc.getDocumentElement().normalize();
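
Both effects of normalization can be seen on a small hand-built tree. This is a sketch under assumptions: the class `NormalizeDemo`, the helper `fragmentedElement`, and the element name `note` are hypothetical, and the fragmented text nodes are created by hand rather than by a parser.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NormalizeDemo {
    /** Builds an element whose text is split across three text nodes, one of them empty. */
    static Element fragmentedElement() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element note = doc.createElement("note");
        doc.appendChild(note);
        note.appendChild(doc.createTextNode("Hello, "));
        note.appendChild(doc.createTextNode(""));      // empty text node
        note.appendChild(doc.createTextNode("world"));
        return note;
    }

    public static void main(String[] args) throws Exception {
        Element note = fragmentedElement();
        System.out.println(note.getChildNodes().getLength()); // 3 separate text nodes

        note.normalize(); // merge adjacent text nodes, drop empty ones

        System.out.println(note.getChildNodes().getLength()); // 1 merged text node
        System.out.println(note.getFirstChild().getNodeValue()); // Hello, world
    }
}
```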

  11. Obtain the root node • Traversing begins at the root node Element rootElement = doc.getDocumentElement(); - Element is a subinterface of the more general Node interface and represents an XML element - Node represents all the various components of an XML document, e.g. Document, Element, Attribute, Entity…

  12. Examine and Modify Nodes • Various properties and methods: - getNodeName - getNodeType - getAttributes - getChildNodes - setNodeValue - appendChild - removeChild - replaceChild
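
The methods listed above can be exercised on a small document. The sketch below is illustrative only: `ModifyDemo`, `sample`, `describe`, `modify`, and the `library`/`book`/`magazine` data are all invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ModifyDemo {
    /** Parses a made-up two-book document (no whitespace between elements). */
    static Document sample() throws Exception {
        String xml = "<library><book id=\"1\">DOM</book><book id=\"2\">SAX</book></library>";
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Summarizes a node using the read-only accessors. */
    static String describe(Node node) {
        NamedNodeMap attrs = node.getAttributes(); // null for non-element nodes
        return node.getNodeName()
                + " type=" + node.getNodeType()
                + " attrs=" + (attrs == null ? 0 : attrs.getLength())
                + " children=" + node.getChildNodes().getLength();
    }

    /** Applies each of the modifying methods once. */
    static void modify(Document doc) {
        Element root = doc.getDocumentElement();
        Node first = root.getFirstChild();    // <book id="1">
        Node second = first.getNextSibling(); // <book id="2">

        Element added = doc.createElement("book");
        added.setAttribute("id", "3");
        added.appendChild(doc.createTextNode("JAXP"));
        root.appendChild(added);                                   // appendChild

        first.getFirstChild().setNodeValue("DOM Level 2");         // setNodeValue on a text node
        root.replaceChild(doc.createElement("magazine"), second);  // replaceChild
        root.removeChild(first);                                   // removeChild
    }

    public static void main(String[] args) throws Exception {
        Document doc = sample();
        System.out.println(describe(doc.getDocumentElement().getFirstChild()));
        modify(doc);
        for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            System.out.println(describe(n));
        }
    }
}
```

After `modify` runs, the root holds the new `magazine` element followed by the appended `book` with `id="3"`.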

  13. Sample Code Bits

```java
import org.w3c.dom.Attr;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Walk the DOM tree and print as you go.
public void walk(Node node) {
    int type = node.getNodeType();
    switch (type) {
        case Node.DOCUMENT_NODE: {
            System.out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
            break;
        } // end of document
        case Node.ELEMENT_NODE: {
            System.out.print('<' + node.getNodeName());
            NamedNodeMap nnm = node.getAttributes();
            if (nnm != null) {
                int len = nnm.getLength();
                Attr attr;
                for (int i = 0; i < len; i++) {
                    attr = (Attr) nnm.item(i);
                    System.out.print(' ' + attr.getNodeName()
                            + "=\"" + attr.getNodeValue() + '"');
                }
            }
            System.out.print('>');
            break;
        } // end of element
        case Node.ENTITY_REFERENCE_NODE: {
            System.out.print('&' + node.getNodeName() + ';');
            break;
        } // end of entity
        case Node.CDATA_SECTION_NODE: {
            System.out.print("<![CDATA[" + node.getNodeValue() + "]]>");
            break;
        }
        case Node.TEXT_NODE: {
            System.out.print(node.getNodeValue());
            break;
        }
    } // end of switch

    // Recurse into the children.
    for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
        walk(child);
    }

    // Without this, the closing tags would be missing.
    if (type == Node.ELEMENT_NODE) {
        System.out.print("</" + node.getNodeName() + ">");
    }
} // end of walk
```

  14. DOM or SAX? • DOM - Suitable for small documents - Easily modify document - Memory intensive • SAX (Simple API for XML) - Suitable for large documents - Only traverses the document once - Event driven, saves memory
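
For contrast with the DOM examples, an event-driven SAX pass can be sketched as below. It counts elements in a single traversal without building a tree. The class `SaxCountDemo` and the helper `countElements` are hypothetical names; the sample XML is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxCountDemo {
    /** Counts start-element events in one pass; no document tree is kept in memory. */
    static int countElements(String xml) throws Exception {
        int[] count = {0}; // array so the anonymous handler can update it
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++; // called once per opening tag as the parser streams by
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><b/></a>")); // prints 3
    }
}
```

The handler sees each element only once and cannot go back, which is why SAX suits large, read-only documents while DOM suits smaller documents that need editing.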

  15. Summary • DOM is a tree representation of an XML document in memory • JAXP provides a vendor-neutral interface to the underlying parser • Every component of the XML document is a Node • Use normalization to combine adjacent text nodes, such as text that spans multiple lines
