95-733 Week 5 Basic SAX Example From Chapter 5 of XML and Java Working with XML SAX Filters as described in Chapter 5
Finding a Pattern using SAX

department.xml

<?xml version="1.0" encoding="utf-8"?>
<department>
  <employee id="J.D">
    <name>John Doe</name>
    <email>John.Doe@foo.com</email>
  </employee>
  <employee id="B.S">
    <name>Bob Smith </name>
    <email>Bob.Smith@foo.com</email>
  </employee>
</department>
TextMatch.java

import java.io.IOException;
import java.util.Stack;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

public class TextMatch extends DefaultHandler {
    StringBuffer buffer;
    String pattern;
    Stack context;

    public TextMatch(String pattern) {
        this.buffer = new StringBuffer();
        this.pattern = pattern;
        this.context = new Stack();
    }

    protected void flushText() {
        if (this.buffer.length() > 0) {
            String text = new String(this.buffer);
            if (pattern.equals(text)) {
                System.out.print("Pattern '" + this.pattern
                                 + "' has been found around ");
                for (int i = 0; i < this.context.size(); i++) {
                    System.out.print("/" + this.context.elementAt(i));
                }
                System.out.println("");
            }
        }
        this.buffer.setLength(0);
    }

    public void characters(char[] ch, int start, int len)
            throws SAXException {
        this.buffer.append(ch, start, len);
    }

    public void ignorableWhitespace(char[] ch, int start, int len)
            throws SAXException {
        this.buffer.append(ch, start, len);
    }

    public void processingInstruction(String target, String data)
            throws SAXException {
        // Nothing to do because PI does not affect the meaning
        // of a document.
    }

    public void startElement(String uri, String local,
                             String qname, Attributes atts)
            throws SAXException {
        this.flushText();
        this.context.push(local);
    }

    public void endElement(String uri, String local, String qname)
            throws SAXException {
        this.flushText();
        this.context.pop();
    }

    public static void main(String[] argv) {
        if (argv.length != 2) {
            System.out.println("TextMatch <pattern> <document>");
            System.exit(1);
        }
        try {
            XMLReader xreader = XMLReaderFactory.createXMLReader(
                "org.apache.xerces.parsers.SAXParser");
            xreader.setContentHandler(new TextMatch(argv[0]));
            xreader.parse(argv[1]);
        } catch (IOException ioe) {
            ioe.printStackTrace();
        } catch (SAXException se) {
            se.printStackTrace();
        }
    }
}

The XMLReader interface declares setContentHandler and parse.

Looking for Bob.Smith@foo.com in department.xml

D:\McCarthy\www\95-733\examples\chap05>java TextMatch "Bob.Smith@foo.com" Department.xml
Pattern 'Bob.Smith@foo.com' has been found around /department/employee/email
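The same matching idea can be tried without Xerces on the classpath. The following is only a sketch under some assumptions: it swaps in the JDK's built-in parser via SAXParserFactory, parses an in-memory string, and collects matches in a list instead of printing them. The class name TextMatchSketch and the find helper are hypothetical. Because the default JDK parser is not namespace-aware, the sketch pushes the qualified name rather than the local name.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical variant of TextMatch that records element paths in a list.
public class TextMatchSketch extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final Deque<String> context = new ArrayDeque<>();
    private final List<String> hits = new ArrayList<>();
    private final String pattern;

    TextMatchSketch(String pattern) {
        this.pattern = pattern;
    }

    // Compare the text gathered since the last tag with the pattern.
    private void flushText() {
        if (buffer.length() > 0 && pattern.equals(buffer.toString())) {
            StringBuilder path = new StringBuilder();
            // push() adds at the head, so walk from the tail to get root first
            Iterator<String> it = context.descendingIterator();
            while (it.hasNext()) {
                path.append('/').append(it.next());
            }
            hits.add(path.toString());
        }
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int len) {
        buffer.append(ch, start, len);
    }

    @Override
    public void startElement(String uri, String local, String qname, Attributes atts) {
        flushText();
        context.push(qname);
    }

    @Override
    public void endElement(String uri, String local, String qname) {
        flushText();
        context.pop();
    }

    // Returns the path of every element whose text equals the pattern exactly.
    static List<String> find(String pattern, String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            TextMatchSketch handler = new TextMatchSketch(pattern);
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.hits;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<department><employee id=\"B.S\"><name>Bob Smith </name>"
                   + "<email>Bob.Smith@foo.com</email></employee></department>";
        System.out.println(find("Bob.Smith@foo.com", xml));
    }
}
```

Note that the comparison is an exact string match on the buffered text, so searching for "Bob Smith" would not match the name element above, whose content ends with a space.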
Filtering XML

Perhaps we would like to modify an existing XML document. Or perhaps we would like to generate an XML document from a flat file or a database. We'll look at six examples that will make the filtering process clear.
XMLReader Notes from JDK 1.4 Documentation
org.xml.sax Interface XMLReader

XMLReader is the interface that an XML parser's SAX2 driver must implement. This interface allows an application to set and query features and properties in the parser, to register event handlers for document processing, and to initiate a document parse.
org.xml.sax Interface XMLReader

Two example methods declared in this interface are:

void setDTDHandler(DTDHandler handler)
    Allow an application to register a DTD event handler.
void parse(InputSource input)
    Parse an XML document.
XMLReader

Create XMLReader.
Tell it what to parse.
Tell it where its contentHandler is.
Tell it to parse.

[Diagram labels: XML source, XMLReader (parse, setContentHandler), contentHandler]
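The steps on this slide can be sketched as follows. This is a minimal illustration, not the course's code: it assumes the JDK's built-in parser obtained through SAXParserFactory rather than Xerces, and the ReaderSteps and ElementCounter names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class ReaderSteps {

    // Hypothetical handler that simply counts startElement events.
    static class ElementCounter extends DefaultHandler {
        int count;

        @Override
        public void startElement(String uri, String local, String qname, Attributes atts) {
            count++;
        }
    }

    static int countElements(String xml) {
        try {
            // Create the XMLReader (assumption: the JDK's default parser)
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // Tell it what to parse
            InputSource source = new InputSource(new StringReader(xml));
            // Tell it where its contentHandler is
            ElementCounter handler = new ElementCounter();
            reader.setContentHandler(handler);
            // Tell it to parse; the reader calls back into the handler
            reader.parse(source);
            return handler.count;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><b/></a>"));
    }
}
```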
XMLFilter Notes from JDK 1.4 Documentation
org.xml.sax Interface XMLFilter

An XML filter is like an XML reader, except that it obtains its events from another XML reader rather than a primary source like an XML document or database. Filters can modify a stream of events as they pass on to the final application.

For example, the filter might set its own contentHandler on the parser. The parser will call that one. This intervening handler can be programmed to call the application's handler. Thus, the calls from the parser to the handler are filtered.
XMLFilter

package org.xml.sax;

public interface XMLFilter extends XMLReader {

    // This method allows the application to link
    // the filter to a parent reader (which may
    // be another filter). The argument may not be null.
    public void setParent(XMLReader parent);

    // This method allows the application to query the
    // parent reader (which may be another filter).
    // It is generally a bad idea to perform any
    // operations on the parent reader directly:
    // they should all pass through this filter.
    public XMLReader getParent();
}
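Because a parent may itself be a filter, filters can be chained with setParent. The sketch below is illustrative only: the FilterChain and Tagging classes are hypothetical, the JDK's built-in parser stands in for Xerces, and each filter just records its name when startDocument passes through it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class FilterChain {

    // Hypothetical filter that logs its name, then passes the event on.
    static class Tagging extends XMLFilterImpl {
        private final String name;
        private final List<String> log;

        Tagging(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void startDocument() throws SAXException {
            log.add(name);
            super.startDocument();
        }
    }

    // Builds parser <- inner <- outer and returns the order in which
    // the filters saw the startDocument event.
    static List<String> run(String xml) {
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            List<String> log = new ArrayList<>();

            XMLFilterImpl inner = new Tagging("inner", log);
            inner.setParent(parser);      // inner takes its events from the parser
            XMLFilterImpl outer = new Tagging("outer", log);
            outer.setParent(inner);       // outer takes its events from inner

            outer.setContentHandler(new DefaultHandler());
            outer.parse(new InputSource(new StringReader(xml)));
            return log;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("<a/>"));
    }
}
```

The application only ever talks to the outermost filter; events reach the filter nearest the parser first and the application's handler last.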
XMLFilter

XMLReader Interface: 14 methods
XMLFilter Interface: the 14 XMLReader methods + 2

XMLFilter

[Diagram labels: XMLReader Object, XMLFilter Object]

All methods of XMLReader are here. They may block, pass on, or modify the calls to the parent.
org.xml.sax.helpers Class XMLFilterImpl

All Implemented Interfaces: ContentHandler, DTDHandler, EntityResolver, ErrorHandler, XMLFilter, XMLReader

All XMLReader methods are defined. These methods, by default, pass calls to the parent XMLReader. By default, the XMLReader is set to call methods defined here, in XMLFilterImpl, for XML content.
org.xml.sax.helpers Class XMLFilterImpl

This class is designed to sit between an XMLReader and the client application's event handlers. By default, it does nothing but pass requests up to the reader and events on to the handlers unmodified, but subclasses can override specific methods to modify the event stream or the configuration requests as they pass through.

A Constructor:

XMLFilterImpl(XMLReader parent)
    Construct an XML filter with the specified parent.
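As an illustration of overriding specific methods to modify the event stream, the sketch below blocks one element and everything inside it while passing all other events on. It is a hypothetical example, not part of the course material: the DropElementFilter class and the survivingText helper are invented names, and the JDK's built-in parser is assumed.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter that suppresses one element and all of its content.
public class DropElementFilter extends XMLFilterImpl {
    private final String blocked;
    private int depth;   // greater than 0 while inside a blocked element

    DropElementFilter(XMLReader parent, String blocked) {
        super(parent);
        this.blocked = blocked;
    }

    @Override
    public void startElement(String uri, String local, String qname, Attributes atts)
            throws SAXException {
        if (depth > 0 || qname.equals(blocked)) {
            depth++;                       // block: do not call the handler
            return;
        }
        super.startElement(uri, local, qname, atts);   // pass on
    }

    @Override
    public void endElement(String uri, String local, String qname) throws SAXException {
        if (depth > 0) {
            depth--;
            return;
        }
        super.endElement(uri, local, qname);
    }

    @Override
    public void characters(char[] ch, int start, int len) throws SAXException {
        if (depth == 0) {
            super.characters(ch, start, len);
        }
    }

    // Returns the character data that still reaches the application's handler.
    static String survivingText(String xml, String blocked) {
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            DropElementFilter filter = new DropElementFilter(parser, blocked);
            StringBuilder text = new StringBuilder();
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int len) {
                    text.append(ch, start, len);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
            return text.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<employee><name>Bob</name><email>b@foo.com</email></employee>";
        System.out.println(survivingText(xml, "email"));
    }
}
```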
Some Examples Using Filters

// Filter demo 1
// A very simple SAX program
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import java.io.IOException;
import org.xml.sax.SAXException;

public class MainDriver {

    public static void main(String[] argv)
            throws SAXException, IOException {

        // Get a parser
        XMLReader parser = XMLReaderFactory.createXMLReader(
            "org.apache.xerces.parsers.SAXParser");

        // Get a handler
        MyHandler myHandler = new MyHandler();

        // Tell the parser about the handler
        parser.setContentHandler(myHandler);

        // Parse the input document
        parser.parse(argv[0]);
    }
}

class MyHandler extends DefaultHandler {

    // Handle events from the parser
    public void startDocument() throws SAXException {
        System.out.println("startDocument is called:");
    }

    public void endDocument() throws SAXException {
        System.out.println("endDocument is called:");
    }
}

D:\McCarthy\www\95-733\examples\xmlfilter>java MainDriver department.xml
startDocument is called:
endDocument is called:
Filter Demo 2

// Filter demo 2
// Adding an XMLFilterImpl that does nothing but supply
// an object that acts as an intermediary.
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.IOException;
import org.xml.sax.SAXException;

public class MainDriver2 {

    public static void main(String[] argv)
            throws SAXException, IOException {

        // Get a parser
        XMLReader parser = XMLReaderFactory.createXMLReader(
            "org.apache.xerces.parsers.SAXParser");

        // Get a handler
        MyHandler myHandler = new MyHandler();

        // Get a filter and pass it a reference to the parser
        XMLFilterImpl myFilter = new XMLFilterImpl(parser);

        // After we create the XMLFilterImpl, all of the calls we make
        // on the parser will go through the filter. For example, we will
        // call setContentHandler on the filter and not the parser.
        // When we create the filter (it implements many interfaces),
        // the parser will call filter methods first. These methods will,
        // in turn, call our methods.

        // Tell the XMLFilterImpl about the handler
        myFilter.setContentHandler(myHandler);

        // Parse the input document
        myFilter.parse(argv[0]);
    }
}

class MyHandler extends DefaultHandler {

    // Handle events from the parser
    public void startDocument() throws SAXException {
        System.out.println("startDocument is called:");
    }

    public void endDocument() throws SAXException {
        System.out.println("endDocument is called:");
    }
}

D:\McCarthy\www\95-733\examples\xmlfilter> java MainDriver2 department.xml
startDocument is called:
endDocument is called:
Filter Demo 3

// Filter demo 3
// Adding an XMLFilterImpl
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.IOException;
import org.xml.sax.SAXException;

class MyCoolFilterImpl extends XMLFilterImpl {

    public MyCoolFilterImpl(XMLReader parser) {
        super(parser);
    }

    // There are two startDocument methods in this
    // class. This one overrides the inherited method.
    // The inherited method calls the outside
    // contentHandler.
    // The parser calls this method, this method calls
    // the base class method which calls the outside handler.
    public void startDocument() throws SAXException {
        System.out.println("Inside filter");
        super.startDocument();
        System.out.println("Leaving filter");
    }

    // Pass the end-of-document event on to the outside handler.
    public void endDocument() throws SAXException {
        System.out.println("Inside filter");
        super.endDocument();
        System.out.println("Leaving filter");
    }
}

public class MainDriver3 {

    public static void main(String[] argv)
            throws SAXException, IOException {

        // Get a parser
        XMLReader parser = XMLReaderFactory.createXMLReader(
            "org.apache.xerces.parsers.SAXParser");

        // Get a handler
        MyHandler myHandler = new MyHandler();

        // Get a filter that we will treat as a parser
        XMLFilterImpl myFilter = new MyCoolFilterImpl(parser);

        // Tell the XMLFilterImpl about the handler
        myFilter.setContentHandler(myHandler);

        // Parse the input document
        myFilter.parse(argv[0]);
    }
}

class MyHandler extends DefaultHandler {

    // Handle events from the parser
    public void startDocument() throws SAXException {
        System.out.println("startDocument is called:");
    }

    public void endDocument() throws SAXException {
        System.out.println("endDocument is called:");
    }
}

D:\McCarthy\www\95-733\examples\xmlfilter> java MainDriver3 department.xml
Inside filter
startDocument is called:
Leaving filter
Inside filter
endDocument is called:
Leaving filter
Filter Demo 4

// Filter demo 4
// Passing xml to an XMLSerializer
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.FileOutputStream;
import java.io.IOException;
import org.xml.sax.SAXException;
import org.apache.xml.serialize.XMLSerializer;   // not standard
import org.apache.xml.serialize.OutputFormat;    // not standard

public class MainDriver4 {

    public static void main(String[] argv)
            throws SAXException, IOException {

        // Get a parser
        XMLReader parser = XMLReaderFactory.createXMLReader(
            "org.apache.xerces.parsers.SAXParser");

        // we need to write to a file
        FileOutputStream fos = new FileOutputStream("Filtered.xml");

        // An XMLSerializer can collect SAX events
        XMLSerializer xmlWriter = new XMLSerializer(fos, null);

        // Tell the parser about the handler (XMLSerializer)
        parser.setContentHandler(xmlWriter);

        // Parse the input document
        // The parser sends events to the XMLSerializer
        parser.parse(argv[0]);
    }
}

D:\McCarthy\www\95-733\examples\xmlfilter> java MainDriver4 department.xml
D:\McCarthy\www\95-733\examples\xmlfilter>type filtered.xml
<?xml version="1.0"?>
<department>
  <employee id="J.D">
    <name>John Doe</name>
    <email>John.Doe@foo.com</email>
  </employee>
  <employee id="B.S">
    <name>Bob Smith</name>
    <email>Bob.Smith@foo.com</email>
  </employee>
  <employee id="A.M">
    <name>Alice Miller</name>
    <url href="http://www.foo.com/~amiller/"/>
  </employee>
</department>
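The XMLSerializer used above is marked "not standard". As a possible substitute (an assumption on my part, not what the slides use), the JDK's javax.xml.transform package offers a TransformerHandler, which is also a ContentHandler and writes the SAX events it receives to a Result. The sketch below uses the hypothetical names SerializeSketch and roundTrip and writes to a string instead of Filtered.xml.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class SerializeSketch {

    // Parses the XML text and re-serializes the resulting SAX events.
    static String roundTrip(String xml) {
        try {
            // Get a parser (assumption: the JDK's default, namespace-aware)
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader parser = spf.newSAXParser().getXMLReader();

            // A TransformerHandler with no stylesheet copies events to its Result
            SAXTransformerFactory tf =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler xmlWriter = tf.newTransformerHandler();
            xmlWriter.getTransformer()
                     .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            xmlWriter.setResult(new StreamResult(out));

            // Tell the parser about the handler, then parse
            parser.setContentHandler(xmlWriter);
            parser.parse(new InputSource(new StringReader(xml)));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("<department><name>John Doe</name></department>"));
    }
}
```

To write to a file as in the demo, the StreamResult could wrap a FileOutputStream instead of the StringWriter.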
Filter Demo 5

// Filter demo 5
// Placing a filter between the parser and the
// XMLSerializer
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.FileOutputStream;
import java.io.IOException;
import org.xml.sax.SAXException;
import org.apache.xml.serialize.XMLSerializer;   // not standard
import org.apache.xml.serialize.OutputFormat;    // not standard

public class MainDriver5 {

    public static void main(String[] argv)
            throws SAXException, IOException {

        // Get a parser
        XMLReader parser = XMLReaderFactory.createXMLReader(
            "org.apache.xerces.parsers.SAXParser");

        // we need to write to a file
        FileOutputStream fos = new FileOutputStream("Filtered.xml");

        // An XMLSerializer can collect SAX events
        XMLSerializer xmlWriter = new XMLSerializer(fos, null);

        // Get a filter
        XMLFilterImpl myFilter = new AnotherCoolFilterImpl(parser);

        // Tell the XMLFilterImpl about the handler (XMLSerializer)
        myFilter.setContentHandler(xmlWriter);

        // Parse the input document
        myFilter.parse(argv[0]);
    }
}

class AnotherCoolFilterImpl extends XMLFilterImpl {

    public AnotherCoolFilterImpl(XMLReader parser) {
        super(parser);
    }

    public void startDocument() throws SAXException {
        System.out.println("Inside filter");
        super.startDocument();
        System.out.println("Leaving filter");
    }

    public void endDocument() throws SAXException {
        System.out.println("Inside filter");
        super.endDocument();
        System.out.println("Leaving filter");
    }
}

D:\McCarthy\www\95-733\examples\xmlfilter> java MainDriver5 department.xml
Inside filter
Leaving filter
Inside filter
Leaving filter

Filtered.xml is as before.
Filter Demo 6

// Filter demo 6
// Writing our own parser and passing calls to a filter
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.FileOutputStream;
import java.io.IOException;
import org.xml.sax.SAXException;
import org.apache.xml.serialize.XMLSerializer;   // not standard
import org.apache.xml.serialize.OutputFormat;    // not standard

public class MainDriver6 {

    public static void main(String[] argv)
            throws SAXException, IOException {

        // Get a parser
        XMLReader parser = new MyCoolParser();

        // we need to write to a file
        FileOutputStream fos = new FileOutputStream("Filtered.xml");

        // An XMLSerializer can collect SAX events
        XMLSerializer xmlWriter = new XMLSerializer(fos, null);

        // Tell the parser about the handler (XMLSerializer)
        parser.setContentHandler(xmlWriter);

        // Parse the input document
        parser.parse("Some query or file name or ...");
    }
}

class MyCoolParser extends XMLFilterImpl {

    public MyCoolParser() {
    }

    public void parse(String aFileNameOrSQLQuery)
            throws IOException, SAXException {

        char[] ch = new char[10];
        ch[0] = 'H';
        ch[1] = 'i';

        // go to a file or go to a DBMS with a query
        // make calls to call back methods when this
        // code feels it's appropriate
        startDocument();
        startElement("", "MyNewTag", "", null);
        characters(ch, 0, 2);
        endElement("", "MyNewTag", "");
        endDocument();
    }
}
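The same idea can generate XML from flat data, one of the motivations given earlier for filtering. The sketch below makes several assumptions that are not in the slides: the input is comma-separated "name,email" lines, the element names mirror department.xml, the class name FlatFileReader and the toXml helper are hypothetical, an empty AttributesImpl is passed instead of null, and the JDK's TransformerHandler replaces the non-standard XMLSerializer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical "parser" that turns name,email lines into SAX events.
public class FlatFileReader extends XMLFilterImpl {
    private static final Attributes NO_ATTS = new AttributesImpl();

    @Override
    public void parse(InputSource input) throws IOException, SAXException {
        // Assumption: the InputSource carries a character stream
        BufferedReader lines = new BufferedReader(input.getCharacterStream());
        startDocument();
        startElement("", "department", "department", NO_ATTS);
        String line;
        while ((line = lines.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue;                 // skip blank lines
            }
            String[] fields = line.split(",", 2);
            startElement("", "employee", "employee", NO_ATTS);
            textElement("name", fields[0].trim());
            textElement("email", fields.length > 1 ? fields[1].trim() : "");
            endElement("", "employee", "employee");
        }
        endElement("", "department", "department");
        endDocument();
    }

    // Emits <tag>value</tag> as three SAX events.
    private void textElement(String tag, String value) throws SAXException {
        startElement("", tag, tag, NO_ATTS);
        characters(value.toCharArray(), 0, value.length());
        endElement("", tag, tag);
    }

    static String toXml(String flat) {
        try {
            SAXTransformerFactory tf =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler xmlWriter = tf.newTransformerHandler();
            xmlWriter.getTransformer()
                     .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            xmlWriter.setResult(new StreamResult(out));

            FlatFileReader reader = new FlatFileReader();
            reader.setContentHandler(xmlWriter);
            reader.parse(new InputSource(new StringReader(flat)));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml("John Doe,John.Doe@foo.com\nBob Smith,Bob.Smith@foo.com\n"));
    }
}
```

Since the class extends XMLFilterImpl, it could also serve as the parent of another filter, so generated events can be filtered before they are serialized.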