Learn about decoupling data access components that provide access to data stored in XML files and how to transform data between object-oriented domain models and XML models.
Data persistency • Methods/Technologies: Files, Serialization, Databases • Problems: • Decoupling components that provide access to data (using a particular persistence technology) from application domain components • Abstracting data access • Transforming data between the OO domain model and the model used by a usually non-OO persistence technology (relational DB, files with different structures)
Outline • Technologies • XML: • JAXP: Java API for XML Processing (SAX, DOM) • JAXB: Java Architecture for XML Binding • JDBC: Java API for interacting (query, update) with relational databases • Patterns for mapping object-oriented to relational concepts • OO concepts: aggregation, inheritance, association • Relational model: tables, foreign key references to other tables • Data Access Patterns: • How data is accessed depends on the type of data storage (database, file, etc.) • Components that access data depend heavily on the data technologies used and should be decoupled from the business domain components • Data Access Object pattern • Variants (Data Mapper, Table Data Gateway, Active Record)
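The decoupling idea behind the Data Access Object pattern can be sketched as follows. This is a minimal illustration, not code from the course: the names `DaoSketch`, `Dot`, `DotDao`, `InMemoryDotDao` and `DotService` are all hypothetical, and the in-memory implementation merely stands in for an XML- or JDBC-backed one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the DAO idea; every name here is an assumption.
final class DaoSketch {

    // Domain object: knows nothing about XML, files, or databases.
    static final class Dot {
        final int x;
        final int y;
        Dot(int x, int y) { this.x = x; this.y = y; }
    }

    // Data access interface: the only thing domain code depends on.
    interface DotDao {
        List<Dot> loadAll();
        void saveAll(List<Dot> dots);
    }

    // One interchangeable implementation. An XML- or JDBC-backed class
    // would implement the same interface.
    static final class InMemoryDotDao implements DotDao {
        private final List<Dot> store = new ArrayList<>();
        @Override public List<Dot> loadAll() { return new ArrayList<>(store); }
        @Override public void saveAll(List<Dot> dots) {
            List<Dot> copy = new ArrayList<>(dots);
            store.clear();
            store.addAll(copy);
        }
    }

    // Domain-side component: receives its DAO from outside.
    static final class DotService {
        private final DotDao dao;
        DotService(DotDao dao) { this.dao = dao; }
        void addDot(int x, int y) {
            List<Dot> dots = dao.loadAll();
            dots.add(new Dot(x, y));
            dao.saveAll(dots);
        }
        int count() { return dao.loadAll().size(); }
    }

    public static void main(String[] args) {
        DotService service = new DotService(new InMemoryDotDao());
        service.addDot(32, 100);
        service.addDot(17, 14);
        System.out.println(service.count());
    }
}
```

Swapping the storage technology then means passing a different `DotDao` implementation to `DotService`; the domain code itself does not change.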
Case study: XML technologies for Java • JAXP: Java API for XML Processing (SAX, DOM) • Bibliography: • Tutorial: http://download.oracle.com/javase/tutorial/jaxp/index.html • JAXB: Java Architecture for XML Binding • Bibliography: • https://docs.oracle.com/javase/tutorial/jaxb/intro/
XML • XML is a standard for describing the structure of documents • eXtensible Markup Language • Text format => easy to use • Standardized => there are APIs that can be used for parsing (the syntactic aspects of data representation) • Applications that use XML only have to establish the semantics of the data representation
XML Tags • Tags • Represent metainformation included in text • Similar to HTML tags • Difference between HTML tags and XML tags: HTML tags describe how data is presented (e.g. <B>), while XML tags describe data structure and semantics • XML tags are case-sensitive • May contain text or other tags • Tags come in pairs, a start tag and an end tag: • <tag> </tag> • If there is no content between the start and end tag: <tag /> • Tag Attributes • Define name-value pairs inside a tag • <dot x="72" y="13" />
Special characters • Examples: • < is encoded as &lt; • > is encoded as &gt; • & is encoded as &amp; • " is encoded as &quot; • ' is encoded as &apos;
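The encodings above can be applied by hand when an application generates XML text itself. The following is a minimal sketch; the class and method names (`XmlEscape`, `escape`) are hypothetical helpers, not part of JAXP.

```java
// Hypothetical helper that replaces the five characters
// for which XML predefines an entity.
final class XmlEscape {

    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: x &lt; 5 &amp;&amp; y &gt; &quot;3&quot;
        System.out.println(escape("x < 5 && y > \"3\""));
    }
}
```

When XML is produced through DOM and a Transformer rather than by string concatenation, the serializer performs this escaping itself.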
Structure of an XML document • A document starts with: <?xml version='1.0' encoding='utf-8'?> • Has a tree form: • There is exactly one root element • Other elements are nested • An element is everything from a start tag to its matching end tag <person> <firstname>Ion</firstname> <lastname>Popescu</lastname> <age>30</age> <ssn>2711130345678</ssn> </person>
Representing application data with XML • Example: representing point coordinates: XML
Data representation styles: Tags or attributes? • There are 2 methods: • Using attributes: <dot x="25" y="33" /> • Using nested tags: <dot> <x>25</x> <y>33</y> </dot> • Which method? • Attributes: if the data text is short: • <dot x="65" y="23" /> • Tags: if the data text is long: • <description>This program is very useful to everyone</description> • Tags: if an object has a variable number of attributes: • <polygon> <point> .. </point> <point>..</point> <point>..</point> </polygon>
Example 1: XML document – data in attributes • Dots – a set of points with coord (x,y) • Root node: “dots” • Child nodes: “dot”, with attributes x and y dots.xml <?xml version="1.0" encoding="UTF-8" ?> <dots> <dot x="32" y="100" /> <dot x="17" y="14" /> <dot x="18" y="58" > </dot> </dots>
Example 2: XML document – data in nested tags • Root node: "points" • Child nodes: "point", each having child nodes "x" and "y" • An XML document can be: well-formed, and valid points.xml <?xml version="1.0" encoding="UTF-8" ?> <points> <point> <x>12</x> <y>24</y> </point> <point> <x>22</x> <y>11</y> </point> </points>
XML documents: Well-formed • "well-formed": the document adheres to the syntax rules specified by the XML specification • All XML elements must have a closing tag. • XML tags are case-sensitive. • All XML elements must be properly nested. • All XML documents must have a root element. • Attribute values must be in quotes.
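Well-formedness can be checked by handing the text to any XML parser, since a parser must reject input that breaks these rules. The sketch below uses the JAXP `DocumentBuilder`; the class and method names (`WellFormedCheck`, `isWellFormed`) are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: true if the text parses as well-formed XML.
final class WellFormedCheck {

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser
            // from printing its own diagnostics.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a syntax rule was violated
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<dots><dot x='1' y='2'/></dots>")); // true
        System.out.println(isWellFormed("<dots><dot x='1' y='2'></dots>"));  // false: dot is never closed
    }
}
```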
XML documents: Valid • "Valid": a well-formed document which additionally respects some user-defined structural constraints • Structural constraints can be expressed with the help of: • XML DTD (Document Type Definition) • XML Schema (XSD) http://www.w3schools.com/schema/default.asp • Examples of structural constraints: which tags are allowed, in which order, how many times, which attributes, etc. • Validating XML parsers check documents against these constraints
Example 1: XML Schema • Example 1 - Constraints for XML files representing points: • The root element is called dots <xs:element name="dots"> • It can contain any number of dot elements • It is a complex type because it contains other elements <xs:complexType> • It contains a sequence of elements <xs:sequence> • Each element dot has 2 attributes, x and y, of integer type <xs:attribute name="x" type="xs:integer" />
Example 1: XML Schema dots.xsd
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="dots">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="dot" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="x" type="xs:integer" use="required"/>
            <xs:attribute name="y" type="xs:integer" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Example: XML doc with schema dots.xml
<?xml version="1.0" encoding="UTF-8" ?>
<dots xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="dots.xsd">
  <dot x="32" y="100" />
  <dot x="17" y="14" />
  <dot x="18" y="58" > </dot>
</dots>
Example 2: XML Schema • Example 2 - Constraints for XML files representing points: • The root element is called points <xs:element name="points"> • It can contain any number of point elements • Each point element is of complex type, as a sequence of 2 elements x and y <xs:element name="x" type="xs:integer" /> • Elements x and y are simple elements (contain only text, not other elements or attributes)
Example 2 XML Schema points.xsd <?xml version="1.0" encoding="utf-8"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="points"> <xs:complexType> <xs:sequence> <xs:element maxOccurs="unbounded" name="point"> <xs:complexType> <xs:sequence> <xs:element name="x" type="xs:integer" /> <xs:element name="y" type="xs:integer" /> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
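One way to check a document against such a schema from Java is the `javax.xml.validation` API. The sketch below is an illustration under assumptions: it embeds the points schema as a string instead of reading points.xsd from disk, and the names `PointsValidator` and `isValid` are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper: validates a document against the points schema.
final class PointsValidator {

    // Same constraints as points.xsd, inlined so the example is self-contained.
    static final String POINTS_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='points'><xs:complexType><xs:sequence>"
        + "<xs:element name='point' maxOccurs='unbounded'><xs:complexType><xs:sequence>"
        + "<xs:element name='x' type='xs:integer'/>"
        + "<xs:element name='y' type='xs:integer'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(POINTS_XSD)));
            // With no ErrorHandler set, validate() throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // schema violation (or input that is not well-formed)
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(
            "<points><point><x>12</x><y>24</y></point></points>")); // true
        System.out.println(isValid(
            "<points><point><x>12</x></point></points>"));          // false: y is missing
    }
}
```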
Support for creating XML files • XML files are text files that can be edited as such with any general-purpose text editor • It can be easier to use special XML editors, which help you build a well-formed and even a valid document • XMLSpy • IDEs such as Eclipse, NetBeans, Visual Studio offer editors and support for working with XML
Support for XML processing in Java • JAXP (Java API for XML Processing) • Supports processing of XML data from applications written in Java • Support for XML parsing: different standards: • SAX (Simple API for XML): while parsing, events are generated to announce the parsed elements; the application handles these events by providing callback methods • DOM (Document Object Model): while parsing, an in-memory data structure is built. • Support for transforming XML documents: • XSLT (Extensible Stylesheet Language Transformations).
SAX http://download.oracle.com/javase/tutorial/jaxp/intro/simple.html
Reading XML data with SAX • Example: XMLDotReader – reads data from file dots.xml
    // standard imports for SAX
    import java.io.*;
    import java.util.*;
    import javax.xml.parsers.*;
    import org.xml.sax.*;
    import org.xml.sax.helpers.*;

    // Implementing a ContentHandler to handle the SAX events
    public class XMLDotReader extends DefaultHandler {
        …
    }
Constructing the SAX parser
    // create an instance of the ContentHandler
    DefaultHandler handler = new XMLDotReader();
    SAXParserFactory factory = SAXParserFactory.newInstance();
    try {
        // use the default non-validating parser
        SAXParser saxParser = factory.newSAXParser();
        saxParser.parse(new File("dots.xml"), handler);
    } catch (Exception ex) {
        ex.printStackTrace();
    }
Types of events in SAX
    public void startDocument() throws SAXException;
    public void endDocument() throws SAXException;
    // Called at the start of each element
    public void startElement(String namespaceURI, String localName,
                             String qName, Attributes atts) throws SAXException;
    // Called at the end of each element
    public void endElement(String uri, String localName, String qName) throws SAXException;
    // Called for character data between tags
    public void characters(char[] buf, int offset, int len) throws SAXException;
Handling SAX events
    public class XMLDotReader extends DefaultHandler {
        …
        public void startElement(String namespaceURI, String localName,
                                 String qName, Attributes atts) throws SAXException {
            System.out.println("start element:" + qName);
            if (qName.equals("dot")) {
                x = Integer.parseInt(atts.getValue("x"));
                y = Integer.parseInt(atts.getValue("y"));
                System.out.println(x + ", " + y);
            }
        }
    }
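When the data sits in nested tags (as in points.xml) rather than in attributes, the values arrive through `characters()` and the handler has to remember which element it is inside. A minimal sketch of such a handler follows; `XMLPointReader` and its `read` method are hypothetical names, and the input is passed as a string to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical SAX handler for the nested-tag format of points.xml.
final class XMLPointReader extends DefaultHandler {

    final List<int[]> points = new ArrayList<>();        // collected {x, y} pairs
    private final StringBuilder text = new StringBuilder();
    private int x;
    private int y;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // forget text that belonged to the parent element
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        // May be called several times for one text node, so accumulate.
        text.append(buf, offset, len);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        switch (qName) {
            case "x":     x = Integer.parseInt(text.toString().trim()); break;
            case "y":     y = Integer.parseInt(text.toString().trim()); break;
            case "point": points.add(new int[] {x, y}); break;
            default:      break;
        }
    }

    static List<int[]> read(String xml) {
        try {
            XMLPointReader handler = new XMLPointReader();
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.points;
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        for (int[] p : read("<points><point><x>12</x><y>24</y></point></points>")) {
            System.out.println(p[0] + ", " + p[1]); // prints: 12, 24
        }
    }
}
```

The `text`, `x` and `y` fields are exactly the kind of application-managed state that SAX leaves to the programmer.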
Example1 – source code • Course web page: XMLDotsReader.java • http://staff.cs.upt.ro/~ioana/arhit-engl/2017/xml/XMLDotReader.java
Program output
    startDocument
    start element:dots
    start element:dot
    dot: 32, 100
    end element:dot
    start element:dot
    dot: 17, 14
    end element:dot
    start element:dot
    dot: 18, 58
    end element:dot
    end element:dots
    endDocument
dots.xml
    <?xml version="1.0" encoding="UTF-8" ?>
    <dots>
      <dot x="32" y="100" />
      <dot x="17" y="14" />
      <dot x="18" y="58" > </dot>
    </dots>
Program output • If we modify dots.xml such that it is not well-formed (remove an end-tag)
    startDocument
    start element:dots
    start element:dot
    dot: 32, 100
    end element:dot
    start element:dot
    dot: 17, 14
    start element:dot
    dot: 18, 58
    end element:dot
    org.xml.sax.SAXParseException: The end-tag for element type "dot" must end with a '>' delimiter.
        at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
Program output • If we modify file dots.xml by replacing a tag dot with dotu (not compliant with the schema), the non-validating parser does not complain
    startDocument
    start element:dots
    start element:dot
    dot: 32, 100
    end element:dot
    start element:dotu
    end element:dotu
    start element:dot
    dot: 18, 58
    end element:dot
    end element:dots
    endDocument
Validating parser
    public class ValidatingXMLDotReader extends DefaultHandler {
        // changes to create a validating parser
        static final String JAXP_SCHEMA_LANGUAGE =
            "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
        static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";
        …
        // where the parser is created (handler is an instance of this class):
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            factory.setValidating(true);
            factory.setNamespaceAware(true);
            SAXParser saxParser = factory.newSAXParser();
            saxParser.setProperty(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
            saxParser.parse(new File("dots.xml"), handler);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
        …
    }
Validating parser - error events
    // Called by the validating parser for each validity error
    public void error(SAXParseException e)
            throws SAXParseException {
        throw e;
    }
SAX exceptions
    try {
        …
    } catch (SAXParseException spe) {
        // Error generated by the parser
        System.out.println("Parsing Error: line " + spe.getLineNumber() + " , " + spe.getMessage());
    } catch (SAXException sxe) {
        // Error generated by application or parser initialization
    } catch (ParserConfigurationException pce) {
        // A parser with the specified options cannot be built
    } catch (IOException ioe) {
    } catch (Throwable t) {
    }
Example 2 - code • Course web page: ValidatingXMLDotsReader.java • http://staff.cs.upt.ro/~ioana/arhit-engl/2017/xml/ValidatingXMLDotReader.java
Program output • Running ValidatingXMLDotReader on an invalid dots.xml file
    startDocument
    start element:dots
    start element:dot
    32, 100
    end element:dot
    ** Parsing error, line 5, uri file:/C:/Documents%20and%20Settings/user/Desktop/xml-marti/dots.xml
    cvc-complex-type.2.4.a: Invalid content was found starting with element 'dotu'. One of '{dot}' is expected.
    org.xml.sax.SAXParseException: cvc-complex-type.2.4.a: Invalid content was found starting with element 'dotu'. One of '{dot}' is expected.
        at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
        at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(Unknown Source)
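Instead of rethrowing in `error()`, a handler can record each validity error and let parsing continue. The sketch below illustrates this under stated assumptions: it attaches the schema with `SAXParserFactory.setSchema` rather than the `schemaLanguage` property used in the slides, it inlines the dots schema as a string, and the names `CollectingDotValidator` and `validate` are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that collects validity errors instead of throwing.
final class CollectingDotValidator extends DefaultHandler {

    // Same constraints as dots.xsd, inlined so the example is self-contained.
    static final String DOTS_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='dots'><xs:complexType><xs:sequence>"
        + "<xs:element name='dot' maxOccurs='unbounded'><xs:complexType>"
        + "<xs:attribute name='x' type='xs:integer' use='required'/>"
        + "<xs:attribute name='y' type='xs:integer' use='required'/>"
        + "</xs:complexType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    final List<String> problems = new ArrayList<>();

    @Override
    public void error(SAXParseException e) {
        // Validity errors are recoverable: record them and keep parsing.
        problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
    }

    // Returns the recorded validity errors (empty if the document is valid).
    // Input that is not well-formed still aborts with an exception.
    static List<String> validate(String xml) {
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(DOTS_XSD)));
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setSchema(schema); // assumption: replaces the schemaLanguage property
            CollectingDotValidator handler = new CollectingDotValidator();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.problems;
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        String invalid = "<dots><dot x='32' y='100'/><dotu x='17' y='14'/></dots>";
        for (String problem : validate(invalid)) {
            System.out.println(problem);
        }
    }
}
```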
DOM http://download.oracle.com/javase/tutorial/jaxp/intro/dom.html
Element Node • An element/node corresponds to a section between <tag>… </tag> • A node may contain child nodes • A node may have attributes • A document has a root node
Read XML data with DOM
    // Standard imports for XML
    import javax.xml.parsers.*;
    import org.xml.sax.*;
    import org.xml.sax.helpers.*;
    import org.w3c.dom.*;
    ....
Construct XML DOM parser
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
        // Gets the default non-validating parser
        DocumentBuilder db = dbf.newDocumentBuilder();
        // Parse the XML to build the whole doc tree
        Document doc = db.parse(new File("dots.xml"));
    } catch (SAXParseException spe) {
        // Error handling code the same as with SAX
    } catch (SAXException sxe) {
    } catch (ParserConfigurationException pce) {
    } catch (IOException ioe) {
    } catch (Throwable t) {
    }
Walk through the DOM
    // Get root node of document
    Element root = doc.getDocumentElement();
    // Get list of descendant elements with the given tag name
    NodeList list = root.getElementsByTagName("dot");
    // Number of elements in list
    int len = list.getLength();
    // Get nth element
    Element elem = (Element) list.item(n);
    // Get an attribute out of an element
    // (returns "" if there is no such attribute)
    String s = elem.getAttribute("x");
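Put together, the DOM calls above can read all dots from a document. This is a minimal sketch assuming the attribute-based dots format; `DomDotReader` and `read` are hypothetical names, and the XML is passed as a string to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: reads {x, y} pairs from a dots document via DOM.
final class DomDotReader {

    static List<int[]> read(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The whole tree is built in memory before we look at any node.
            Document doc = db.parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            NodeList list = root.getElementsByTagName("dot");
            List<int[]> dots = new ArrayList<>();
            for (int i = 0; i < list.getLength(); i++) {
                Element dot = (Element) list.item(i);
                dots.add(new int[] {
                    Integer.parseInt(dot.getAttribute("x")),
                    Integer.parseInt(dot.getAttribute("y"))
                });
            }
            return dots;
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        for (int[] d : read("<dots><dot x='32' y='100'/><dot x='17' y='14'/></dots>")) {
            System.out.println(d[0] + ", " + d[1]);
        }
    }
}
```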
Modify the DOM in memory
    // Create a new node (still needs to be added to the tree)
    Element elem = doc.createElement("dot");
    // Append a child node to an existing node
    node.appendChild(elem);
    // Set an attribute/value binding in a node
    elem.setAttribute("x", "12");
XSLT (The Extensible Stylesheet Language Transformations APIs) http://download.oracle.com/javase/tutorial/jaxp/xslt/index.html • The XSLT API lets you transform XML into other forms • A TransformerFactory object is instantiated and used to create a Transformer • The source object is the input to the transformation process; it can be created from a SAX reader, from a DOM, or from an input stream • The result object is the result of the transformation process; it can be a SAX event handler, a DOM, or an output stream
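As an illustration of a Transformer created from a stylesheet, the sketch below converts the attribute-based dots format into the nested-tag points format. The stylesheet and the names `DotsToPoints` and `transform` are assumptions made for this example, not part of the course material.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example: an XSLT stylesheet applied through the JAXP API.
final class DotsToPoints {

    // Assumed stylesheet: maps <dot x=".." y=".."/> to <point><x/><y/></point>.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/dots'>"
        + "<points><xsl:apply-templates select='dot'/></points>"
        + "</xsl:template>"
        + "<xsl:template match='dot'>"
        + "<point><x><xsl:value-of select='@x'/></x><y><xsl:value-of select='@y'/></y></point>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String dotsXml) {
        try {
            // The source of the stylesheet is itself a stream source.
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(dotsXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<dots><dot x='32' y='100'/></dots>"));
    }
}
```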
Write a DOM from memory to XML
    import java.io.*;
    import javax.xml.transform.*;
    import javax.xml.transform.dom.*;
    import javax.xml.transform.stream.*;
    // Document doc exists already in memory at this point
    …
    try {
        TransformerFactory tranFact = TransformerFactory.newInstance();
        // A Transformer created without a stylesheet copies the source unchanged
        Transformer tran = tranFact.newTransformer();
        DOMSource source = new DOMSource(doc);
        StreamResult result = new StreamResult(new FileOutputStream("copie.xml"));
        tran.transform(source, result);
    } catch (TransformerConfigurationException tce) {
    } catch (TransformerException te) {
    } catch (FileNotFoundException fnfe) {
        // thrown by the FileOutputStream constructor
    }
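The steps of building a DOM in memory and serializing it can be combined in one small sketch. It writes to a string instead of a file so that it is self-contained; `DomDotWriter` and `write` are hypothetical names.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: builds a dots document in memory and serializes it.
final class DomDotWriter {

    static String write(int[][] dots) {
        try {
            // Start from an empty document rather than a parsed one.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("dots");
            doc.appendChild(root);
            for (int[] d : dots) {
                Element dot = doc.createElement("dot");
                dot.setAttribute("x", Integer.toString(d[0]));
                dot.setAttribute("y", Integer.toString(d[1]));
                root.appendChild(dot);
            }
            // Identity transformation: DOM in, XML text out.
            Transformer tran = TransformerFactory.newInstance().newTransformer();
            tran.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            tran.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(write(new int[][] {{12, 7}, {3, 4}}));
    }
}
```

Replacing the `StringWriter` with a `FileOutputStream` gives the file-writing variant shown on the slide.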
Example3 - code • Course web page: XMLDotsDOM.java • http://staff.cs.upt.ro/~ioana/arhit-engl/2017/xml/XMLDotsDOM.java
Conclusions: XML: Advantages/disadvantages • Standard format • Text files – editable, readable directly • Big and Slow: • Data representation in text format needs a lot of space ! • Good for: configuration files, data transport format • Not good for: storing a big database
Conclusions: SAX vs DOM • SAX: • “de facto standard” by the XML-DEV community http://www.saxproject.org/ • Parsing is done “online” while reading the document • Parsing is quick and does not need a lot of memory • Applications must manage their own data model • Easy to use in a state-independent case, difficult for state-dependent processing • DOM: • Standard model defined by W3C (model independent of language) http://www.w3.org/DOM/ • Parsing builds a model of the entire document in memory => slow, needs memory • The model in memory can be transformed and saved again as an XML file
Conclusions: JAXP • JAXP: Java API for XML processing: • http://download.oracle.com/javase/tutorial/jaxp/index.html • Translates reference models for XML parsing into Java APIs • Does not impose an implementation for the parsers (the parser factory can be configured through the system properties javax.xml.parsers.SAXParserFactory and javax.xml.parsers.DocumentBuilderFactory)
Conclusions: XML: other methods for parsing... • StAX: • Standard JSR 173: Streaming API for XML http://jcp.org/en/jsr/detail?id=173 • Implemented in JAXP as well http://download.oracle.com/javase/tutorial/jaxp/stax/index.html • Event-driven, like SAX • Produces output incrementally • Unlike SAX (which is push parsing), StAX is based on a pull-parsing model • Bidirectional: supports both reading (parsing) and writing (generating) XML documents • JDOM: http://jdom.org/ • DOM4J: http://dom4j.sourceforge.net/
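The pull-parsing model can be sketched with the StAX cursor API: the application asks for the next event instead of receiving callbacks. The class and method names (`StaxDotReader`, `read`) are hypothetical, and the input is assumed to be in the attribute-based dots format.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: reads {x, y} pairs from a dots document with StAX.
final class StaxDotReader {

    static List<int[]> read(String xml) {
        List<int[]> dots = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            // The loop pulls events; the parser never calls back into our code.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals("dot")) {
                    dots.add(new int[] {
                        Integer.parseInt(reader.getAttributeValue(null, "x")),
                        Integer.parseInt(reader.getAttributeValue(null, "y"))
                    });
                }
            }
            reader.close();
        } catch (XMLStreamException ex) {
            throw new IllegalStateException(ex);
        }
        return dots;
    }

    public static void main(String[] args) {
        for (int[] d : read("<dots><dot x='32' y='100'/><dot x='17' y='14'/></dots>")) {
            System.out.println(d[0] + ", " + d[1]);
        }
    }
}
```

Because the application drives the loop, it can stop reading early or keep its state in local variables, which is harder to arrange with SAX callbacks.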
Conclusions: XML parsing or ... • The context: XML is used for data persistency: • The state of objects is saved to and restored from XML => this can be done by generating and, respectively, parsing XML • Problem: the programmer writes a lot of "standard" code => could this code be generated automatically? [Figure: Class Dots, Class Dots instance, XML Dots schema, XML file complying with Dots schema]
XML data binding • XML data binding: refers to a means of representing information in an XML document as a business object in memory • Tools can automate the process of XML data binding: they create mappings between elements in an XML schema and the fields of a class • Example: JAXB [Figure: Class Dots, Class Dots instance, XML Dots schema, XML file complying with Dots schema]
JAXB • The Java Architecture for XML Binding (JAXB) • “provides a fast and convenient way to bind between XML schemas and Java representations” • http://download.oracle.com/javase/6/docs/technotes/guides/xml/jaxb/index.html http://www.oracle.com/technetwork/articles/javase/index-140168.html