
Java and XML Platform independence meets language independence!

Java and XML: Platform independence meets language independence! CC432 / Short Course 507. Lecturer: Simon Lucas, University of Essex, Spring 2002. Main topics: Introduction; Reading and Writing XML; SAX; DOM and JDOM; Serializing Objects to XML; XMLC; Concluding remarks.


Presentation Transcript


  1. Java and XML: Platform independence meets language independence! CC432 / Short Course 507 • Lecturer: Simon Lucas • University of Essex • Spring 2002

  2. Main Topics • Introduction • Reading and Writing XML • SAX • DOM and JDOM • Serializing Objects to XML • XMLC • Concluding remarks

  3. Introduction • Java is a platform-independent language – it runs anywhere there is a JVM • And it is well connected – powerful java.net library • Yet many people persist in using other languages – C/C++, VB etc.!

  4. Why Java and XML? • The common format that allows applications written in any language to communicate is XML • Therefore, very important to make Java read and write XML • Can also design object models in Java – and translate them into XML • Leverage powerful design tools such as Together for this purpose

  5. Reading and Writing XML • To gain an insight into what this involves – we’ll work through a simplified model of XML • Our simplified model is as follows: • A tree of elements • Each element either has: • Text, • OR • A set of Child Elements

  6. Element.java • The Element class defines the object model for this kind of document • It also includes some String constants that dictate what characters will be used to delimit the elements • These are chosen to be standard XML characters • Currently, no checking that node text does not contain these special characters!!!

  7. Element.java - I

      package xml.serial;

      import java.util.*;
      import java.io.*;

      public class Element {
          static String TAG_OPEN = "<";
          static String TAG_CLOSE = ">";
          static String END_TAG_OPEN = "</";
          static int TAB = 2;
          static int INIT_INDENT = 0;
          static char SPACE = ' ';

          protected Vector children;
          protected StringBuffer text;
          protected String name;

          public Element( String name ) {
              this.name = name;
              children = null;
              text = null;
          }

  8. Element.java II

          final public Vector getChildren() {
              return children;
          }

          final public String getText() {
              // text is null for a node that holds child elements
              return (text == null) ? null : text.toString();
          }

          final public String getName() {
              return name;
          }

  9. Element.java III

          final public void setText(String text) throws Exception {
              // should substitute for any nasty characters
              // e.g. at least < and >
              if ( children == null) {
                  this.text = new StringBuffer( text );
              }
              else {
                  throw new Exception(
                      "Cannot add text to a node that already has child elements");
              }
          }
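The comment in setText() notes that special characters ought to be substituted before the text is stored. A minimal sketch of such a substitution is shown here. The class and method names (XmlEscape, escape) are hypothetical and are not part of the original Element class.

```java
// Hypothetical helper, not part of the original Element class.
final class XmlEscape {
    private XmlEscape() {}

    /** Replaces the characters that would otherwise be read as markup. */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;");  break;
                case '>': sb.append("&gt;");  break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("The kid <bold> is cool </bold>"));
    }
}
```

setText() could pass its argument through such a method before wrapping it in the StringBuffer. A reader would then need the inverse mapping.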

  10. Element.java IV

          final public void addChild(Element child) throws Exception {
              if ( text == null ) {
                  if (children == null) {
                      children = new Vector();
                  }
                  children.addElement( child );
              }
              else {
                  throw new Exception(
                      "Cannot add elements to a node that already has text");
              }
          }

  11. Reading and Writing Elements • Given this simple Element class • We can now write code to serialize a tree of these elements to an XML doc • And to de-serialize such a document back to the tree of Elements in memory • Hence, we get to write a simple parser for this subset of XML! • ElementTest creates an element-only document and writes it to a file

  12. ElementTest.java

      package xml.serial;

      import java.io.*;

      public class ElementTest {
          public static void main(String[] args) throws Exception {
              Element el = new Element("object");
              PrintWriter pw = new PrintWriter( System.out );
              // el.write( pw );
              Element value = new Element( "value" );
              value.setText( "Hello" );
              el.addChild( value );
              el.write( pw );
              pw.println( "And now the static version..." );
              ElementWriter.write( el , pw );
              pw.flush();
          }
      }
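ElementTest calls el.write( pw ) and ElementWriter.write( el , pw ), neither of which appears in this transcript. The following is only a sketch of how a recursive writer for the simplified model might look. TreeNode and TreeWriter are hypothetical stand-ins, and the layout (one tag per line, two spaces per level) is an assumption rather than the course's actual output format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Element: a node holds text OR child nodes.
class TreeNode {
    final String name;
    String text;                                   // stays null when the node has children
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name) { this.name = name; }
}

// Hypothetical writer: serializes a tree of TreeNode objects to indented XML.
class TreeWriter {
    static final int TAB = 2;                      // spaces per nesting level (assumed)

    static String toXml(TreeNode root) {
        StringBuilder sb = new StringBuilder();
        append(root, sb, 0);
        return sb.toString();
    }

    private static void append(TreeNode node, StringBuilder sb, int indent) {
        pad(sb, indent).append('<').append(node.name).append(">\n");
        if (node.text != null) {
            pad(sb, indent + TAB).append(node.text).append('\n');
        }
        for (TreeNode child : node.children) {
            append(child, sb, indent + TAB);       // recurse one level deeper
        }
        pad(sb, indent).append("</").append(node.name).append(">\n");
    }

    private static StringBuilder pad(StringBuilder sb, int count) {
        for (int i = 0; i < count; i++) {
            sb.append(' ');
        }
        return sb;
    }

    public static void main(String[] args) {
        TreeNode root = new TreeNode("object");
        TreeNode value = new TreeNode("value");
        value.text = "Hello";
        root.children.add(value);
        System.out.print(toXml(root));
    }
}
```

Building the string first and printing it afterwards keeps the recursion independent of any particular output stream.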

  13. Running ElementTest >java xml.serial.ElementTest <object> <value> Hello </value> </object>
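Reading such a document back into memory needs a small parser. The sketch below is one possible recursive-descent reader for the simplified model only (no attributes, comments, entities or XML declaration). MiniNode and MiniParser are hypothetical names and do not come from the course code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Element: a node holds text OR child nodes, never both.
class MiniNode {
    final String name;
    String text;                                   // stays null when the node has children
    final List<MiniNode> children = new ArrayList<>();

    MiniNode(String name) { this.name = name; }
}

// Hypothetical recursive-descent reader for the simplified model.
class MiniParser {
    private final String src;
    private int pos = 0;

    private MiniParser(String src) { this.src = src; }

    static MiniNode parse(String xml) {
        MiniParser p = new MiniParser(xml);
        p.skipWhitespace();
        MiniNode root = p.parseElement();
        p.skipWhitespace();
        if (p.pos != xml.length()) {
            throw new IllegalArgumentException("Unexpected content after the root element");
        }
        return root;
    }

    private MiniNode parseElement() {
        expect("<");
        int close = src.indexOf('>', pos);
        if (close < 0) {
            throw new IllegalArgumentException("Unterminated tag at position " + pos);
        }
        MiniNode node = new MiniNode(src.substring(pos, close).trim());
        pos = close + 1;

        // Everything up to the next '<' is either the node text or whitespace between tags.
        int next = src.indexOf('<', pos);
        if (next < 0) {
            throw new IllegalArgumentException("Missing end tag for <" + node.name + ">");
        }
        String leading = src.substring(pos, next);
        pos = next;

        if (src.startsWith("</", pos)) {
            node.text = leading.trim();            // text-only node
        } else {
            if (!leading.trim().isEmpty()) {
                throw new IllegalArgumentException("Mixed text and elements in <" + node.name + ">");
            }
            while (!src.startsWith("</", pos)) {
                node.children.add(parseElement()); // throws if no '<' follows
                skipWhitespace();
            }
        }
        expect("</");
        expect(node.name);
        skipWhitespace();
        expect(">");
        return node;
    }

    private void expect(String token) {
        if (!src.startsWith(token, pos)) {
            throw new IllegalArgumentException("Expected '" + token + "' at position " + pos);
        }
        pos += token.length();
    }

    private void skipWhitespace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }
}
```

Calling MiniParser.parse on the document printed by ElementTest would yield an "object" node with one "value" child whose text is "Hello".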

  14. SAX Event-based XML processing

  15. SAX – Main Features • Serial processing of an XML document • Register an event handler • The SAX parser then reads the XML document from start to end • Calls the methods of the event handler in response to various parts of the document

  16. Example Events • startDocument() • startElement() • characters() • endElement() • endDocument() • + many others!

  17. SAX-based program pattern • Define a class that implements the ContentHandler interface • Easiest way is to extend DefaultHandler • DefaultHandler provides NO-OP implementations of all the methods in the ContentHandler interface • Override whichever methods you need to for your application
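To make the pattern concrete, here is a minimal sketch of a handler that extends DefaultHandler and records each callback it receives. EventLogger and its log method are hypothetical names. The parser is obtained through the JDK's JAXP SAXParserFactory rather than a named Xerces class.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records each callback so the event order can be inspected.
class EventLogger extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("startElement:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver text in several chunks; whitespace-only chunks are skipped.
        String chunk = new String(ch, start, length).trim();
        if (!chunk.isEmpty()) {
            events.add("characters:" + chunk);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement:" + qName);
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    /** Parses the given XML text and returns the recorded events in order. */
    static List<String> log(String xml) {
        EventLogger handler = new EventLogger();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        for (String event : log("<greeting>hello</greeting>")) {
            System.out.println(event);
        }
    }
}
```

Methods that are not overridden keep the no-op behaviour inherited from DefaultHandler.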

  18. Using your Custom ContentHandler • Import the necessary packages • Create a new SAXParser • Get an XMLReader from the Parser • Set the ContentHandler for the XMLReader to be your own Customized ContentHandler • Set up an ErrorHandler for the XMLReader – this is a class to handle any parsing errors • Call the XMLReader to parse an XML Document
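The steps above can be sketched with the JAXP classes that ship with the JDK. NameCollector and SaxSetup are hypothetical names. The factory is switched to namespace-aware mode on the assumption that the handler reads localName, which a non-namespace-aware parser may leave empty.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical custom handler: collects the local name of every start tag.
class NameCollector extends DefaultHandler {
    final List<String> names = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        names.add(localName);
    }
}

// Hypothetical driver that follows the steps listed on the slide.
class SaxSetup {
    static List<String> elementNames(String xml) {
        try {
            // Create a SAXParser (namespace-aware so that localName is populated).
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            SAXParser parser = factory.newSAXParser();

            // Get an XMLReader from the parser.
            XMLReader reader = parser.getXMLReader();

            // Register the custom ContentHandler and an ErrorHandler.
            NameCollector handler = new NameCollector();
            reader.setContentHandler(handler);
            reader.setErrorHandler(handler);   // DefaultHandler also implements ErrorHandler

            // Ask the XMLReader to parse the document.
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.names;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b/><b/></a>"));
    }
}
```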

  19. Counting Node Types • This program is the Hello World of SAX • At the start of the document we create a Hashtable to count the occurrences of each type of element • We override startElement() to update the count in the Hashtable with each element name we see • Override endDocument() to print a summary

  20. SAXTest Program Structure • SAXTest uses CountNodes • CountNodes extends DefaultHandler • [Class diagram: SAXTest, CountNodes, DefaultHandler]

  21. SAXTest

      package courses.xml;

      import javax.xml.parsers.*;
      import org.xml.sax.*;
      import org.xml.sax.helpers.*;

      public class SAXTest extends DefaultHandler {
          static String parserClass = "org.apache.xerces.parsers.SAXParser";

          public static void main(String[] args) throws Exception {
              XMLReader reader = XMLReaderFactory.createXMLReader( parserClass );
              reader.setContentHandler( new CountNodes() );
              reader.setErrorHandler( new SimpleErrorHandler(System.err));
              reader.parse( args[0] );
          }
      }
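SAXTest installs a SimpleErrorHandler, but that class is not shown in this transcript. The sketch below is an assumption about what such a class might look like. Only the class name and the PrintStream constructor argument come from the slide. The message format and the decision to rethrow fatal errors are illustrative choices.

```java
import java.io.PrintStream;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Sketch only: the course's own SimpleErrorHandler is not shown in the slides.
class SimpleErrorHandler implements ErrorHandler {
    private final PrintStream out;

    SimpleErrorHandler(PrintStream out) {
        this.out = out;
    }

    @Override
    public void warning(SAXParseException e) {
        report("Warning", e);
    }

    @Override
    public void error(SAXParseException e) {
        report("Error", e);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        report("Fatal error", e);
        throw e;                      // parsing cannot usefully continue after a fatal error
    }

    private void report(String level, SAXParseException e) {
        out.println(level + " at line " + e.getLineNumber()
                + ", column " + e.getColumnNumber() + ": " + e.getMessage());
    }
}
```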

  22. CountNodes • We shall override the following: • startDocument() • startElement() • endDocument()

  23. CountNodes - declaration

      package courses.xml;

      import org.xml.sax.*;
      import org.xml.sax.helpers.*;
      import java.util.*;

      public class CountNodes extends DefaultHandler {
          private Hashtable tags;
          // …

  24. CountNodes: startDocument() • Create a new hashtable for each new document

          public void startDocument() throws SAXException {
              tags = new Hashtable();
          }

  25. CountNodes: startElement()

          public void startElement(String namespaceURI, String localName,
                                   String rawName, Attributes atts)
                  throws SAXException {
              String key = localName;
              Object value = tags.get(key);
              if (value == null) {
                  // Add a new entry
                  tags.put(key, new Integer(1));
              }
              else {
                  // Get the current count and increment it
                  int count = ((Integer)value).intValue();
                  count++;
                  tags.put(key, new Integer(count));
              }
          }

  26. CountNodes: endDocument() • Summarise the Hashtable contents

          public void endDocument() throws SAXException {
              Enumeration e = tags.keys();
              while (e.hasMoreElements()) {
                  String tag = (String)e.nextElement();
                  int count = ((Integer) tags.get(tag)).intValue();
                  System.out.println( "Tag <" + tag + "> occurs " + count + " times");
              }
          }

  27. Running SAXTest: Hello.xml

      <?xml version="1.0" ?>
      <greetings>
        <greeting lang="english"> hello </greeting>
        <greeing> bonjour </greeing>
        <greeting> hola! </greeting>
      </greetings>

  28. Output

      >java courses.xml.SAXTest courses\xml\hello.xml
      Tag <greeing> occurs 1 times
      Tag <greetings> occurs 1 times
      Tag <greeting> occurs 2 times

  29. Notes on CountNodes • Note the parameters to startElement() • We get direct access to that element only – that is, its: • Namespace URI • Attributes • Element Name (local name) • Raw Name (qualified name: namespace prefix + local name) • We must work for any access beyond this!
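One common way to "work" for context beyond the current element is to keep a stack of the elements that are currently open. The sketch below records a parent/child pair for every non-root element. ParentTracker is a hypothetical name and the "parent/child" string format is purely illustrative.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: tracks the open elements so that each element knows its parent.
class ParentTracker extends DefaultHandler {
    private final Deque<String> open = new ArrayDeque<>();
    final List<String> pairs = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (!open.isEmpty()) {
            pairs.add(open.peek() + "/" + qName);  // top of the stack is the parent
        }
        open.push(qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();                                // this element is now closed
    }

    /** Parses the XML text and returns "parent/child" for every non-root element. */
    static List<String> pairs(String xml) {
        ParentTracker handler = new ParentTracker();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        return handler.pairs;
    }

    public static void main(String[] args) {
        System.out.println(pairs("<greetings><greeting><greeting/></greeting></greetings>"));
    }
}
```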

  30. SAX Exercise • By overriding: • startElement() • endElement() • startDocument() • endDocument() • provide a ContentHandler that prints out how many times a greeting element was the child of another greeting element

  31. SAX Filter Pipelines • In the Count Nodes example, the XMLReader read from an XML document source • Also possible to read from the output of a ContentHandler • In this way can plug together modular filters to achieve complex effects
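In SAX 2 the usual building block for such a pipeline is org.xml.sax.helpers.XMLFilterImpl, which sits between an XMLReader and the final ContentHandler. The sketch below is an illustration under assumptions, not the course's own code. RenameFilter, TagCounter and FilterPipeline are hypothetical names. The filter corrects the misspelt greeing tag from Hello.xml before the counting handler sees it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: renames one element and passes every other event through unchanged.
class RenameFilter extends XMLFilterImpl {
    private final String from;
    private final String to;

    RenameFilter(XMLReader parent, String from, String to) {
        super(parent);
        this.from = from;
        this.to = to;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        super.startElement(uri, rename(localName), rename(qName), atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        super.endElement(uri, rename(localName), rename(qName));
    }

    private String rename(String name) {
        return from.equals(name) ? to : name;
    }
}

// Final stage of the pipeline: counts start tags by qualified name.
class TagCounter extends DefaultHandler {
    final Map<String, Integer> counts = new TreeMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        counts.merge(qName, 1, Integer::sum);
    }
}

class FilterPipeline {
    /** Parser -> RenameFilter -> TagCounter. */
    static Map<String, Integer> count(String xml) {
        try {
            XMLReader source = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            RenameFilter filter = new RenameFilter(source, "greeing", "greeting");
            TagCounter counter = new TagCounter();
            filter.setContentHandler(counter);
            filter.parse(new InputSource(new StringReader(xml)));  // the filter drives the parser
            return counter.counts;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }
}
```

Because a filter is itself an XMLReader, further filters can be stacked by passing one filter as the parent of the next.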

  32. DOM and JDOM Document Object Model and Java Document Object Model

  33. DOM • A language-independent object model of XML documents • Memory-based • The entire document is parsed – read in to memory • This allows direct access to any part of the document • But limits the size of document that can be handled
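As an illustration of direct access, the sketch below uses the W3C DOM API bundled with the JDK (javax.xml.parsers and org.w3c.dom) to jump straight to the n-th element with a given tag name. DomDemo and textOf are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example: the whole document is parsed into memory, then queried directly.
class DomDemo {
    /** Returns the trimmed text of the index-th element named tag, or null if there is none. */
    static String textOf(String xml, String tag, int index) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        NodeList matches = doc.getElementsByTagName(tag);
        Node node = matches.item(index);           // null when index is out of range
        return (node == null) ? null : node.getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml = "<greetings><greeting> hello </greeting>"
                + "<greeting> hola! </greeting></greetings>";
        System.out.println(textOf(xml, "greeting", 1));
    }
}
```

No event handling is needed, at the cost of holding the whole tree in memory.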

  34. JDOM • Because DOM is a language-independent spec., there are features that seem awkward from a Java perspective • JDOM is a Java-based system, developed by Brett McLaughlin and Jason Hunter • It aims to offer most of the features of DOM, but make them easier for Java programmers to exploit

  35. Hello JDOM World • We’ll look at a program that • creates a document • adds a few elements to it • writes it to an output stream

  36. package xml.jdom;

      import org.jdom.Element;
      import org.jdom.Document;
      import org.jdom.output.XMLOutputter;

      public class HelloWorld {
          public static void main(String[] args) throws Exception {
              Element root = new Element("Greeting");
              root.setText("Hello world!");
              Element child = new Element("Gday");
              child.setText("The kid <bold> is \"cool </bold>");
              child.addAttribute( "color" , "red" );
              root.addContent( child );
              Document doc = new Document(root);

  37.         XMLOutputter output = new XMLOutputter( " " , true );
              output.output( doc, new java.io.PrintWriter( System.out ) );
              String text = root.getText();
          }
      }

  38. Reading XML into JDOM

      package xml.jdom;

      import org.jdom.Document;
      import org.jdom.DocType;
      import org.jdom.Element;
      import org.jdom.input.SAXBuilder;
      import org.jdom.output.XMLOutputter;

      public class InputTest {
          public static void main(String[] args) throws Exception {
              String filename1 = "xml/slides/slides.xml";
              SAXBuilder builder = new SAXBuilder();
              System.out.println("Building...");
              Document doc = builder.build( filename1 );
              System.out.println( doc );
          }
      }

  39. Processing XML with JDOM • Now we have the document tree in memory • Processing is typically much simpler than with SAX • Though for simple programs, this is not always so • Let’s begin by considering how to write the Count Nodes program with JDOM

  40. Some API • Commonly used functions: • getChildren() – gets all the child elements • getContent() – gets all the content of a node – PIs (processing instructions), entities, child elements etc. • addContent() – adds any kind of content to a node • addChild() • get/setText() – deals with the text of a node • getParent() – does what you expect!

  41. Count Nodes in JDOM • Strategy: • Create a hashtable • Read in the document • Walk the tree, keeping count in the hashtable • We walk the tree by recursively visiting all the children of a node

  42. CountNodes - Structure • CountTest.java reads in the XML doc as a JDOM Document • Creates an instance of CountNodes • Calls the walkTree method of CountNodes on the document root element • CountNodes defines a constructor and three methods • Constructor – initialises the Hashtable • walkTree – recursively walks the document • count – updates entries in the Hashtable • printSummary – prints the counts • Compare this with the SAX implementation

  43. CountTest.java

      package xml.jdom;

      import org.jdom.*;
      import org.jdom.input.SAXBuilder;

      public class CountTest {
          public static void main(String[] args) throws Exception {
              String filename1 = "courses/xml/hello.xml";
              SAXBuilder builder = new SAXBuilder();
              Document doc = builder.build( filename1 );
              CountNodes counter = new CountNodes();
              counter.walkTree( doc.getRootElement() );
              counter.printSummary( System.out );
          }
      }

  44. CountNodes.java

      package xml.jdom;

      import java.util.*;
      import java.io.*;
      import org.jdom.*;

      public class CountNodes {
          Hashtable h;

          public CountNodes() {
              h = new Hashtable();
          }
          // … continued

  45. CountNodes – walkTree()

          public void walkTree(Element el) {
              count( el.getName() );
              List children = el.getChildren();
              for (Iterator i = children.iterator(); i.hasNext() ; ) {
                  walkTree( (Element) i.next() );
              }
          }

  46. CountNodes – count()

          public void count(String key) {
              Object value = h.get(key);
              if (value == null) {
                  // Add a new entry
                  h.put(key, new Integer(1));
              }
              else {
                  // Get the current count and increment it
                  int count = ((Integer) value).intValue();
                  count++;
                  h.put(key, new Integer(count));
              }
          }

  47. CountNodes – printSummary()

          public void count(String key) {
              Object value = h.get(key);
              if (value == null) {
                  // Add a new entry
                  h.put(key, new Integer(1));
              }
              else {
                  // Get the current count and increment it
                  int count = ((Integer) value).intValue();
                  count++;
                  h.put(key, new Integer(count));
              }
          }
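The code shown for this slide repeats count(), so the body of printSummary() itself is not in the transcript. CountTest calls counter.printSummary( System.out ), so a plausible sketch, modelled on the endDocument() loop of the SAX version, might look like the following. TagCounts is a hypothetical, JDOM-free stand-in for CountNodes, and the output wording is borrowed from the SAX example as an assumption.

```java
import java.io.PrintStream;
import java.util.Enumeration;
import java.util.Hashtable;

// Hypothetical stand-in for CountNodes, without the JDOM tree walk.
class TagCounts {
    private final Hashtable<String, Integer> h = new Hashtable<>();

    /** Adds one to the count stored for key, starting from zero. */
    void count(String key) {
        Integer value = h.get(key);
        h.put(key, value == null ? 1 : value + 1);
    }

    /** Prints one line per tag name; the wording mirrors the SAX endDocument() example. */
    void printSummary(PrintStream out) {
        Enumeration<String> e = h.keys();
        while (e.hasMoreElements()) {
            String tag = e.nextElement();
            out.println("Tag <" + tag + "> occurs " + h.get(tag) + " times");
        }
    }

    public static void main(String[] args) {
        TagCounts counts = new TagCounts();
        counts.count("greetings");
        counts.count("greeting");
        counts.count("greeting");
        counts.printSummary(System.out);
    }
}
```

A Hashtable does not guarantee iteration order, so the lines may appear in any order.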

  48. JDOM Exercise • Write a JDOM program to print out how many times a greeting element was the child of another greeting element • (e.g. given a doc like Hello.xml – see above) • (same task that we previously attempted with SAX)

  49. JDOM Exercise Hints • Consider the following methods: • getParent() • getName() • getChildren()

  50. Serializing Objects to XML Homebrew version JSX
