Storing XML in ORDBMS Amine Kaddara Supervisor: Dr Haddouti
Outline • Motivation • Benefits of using ORDBMS for storing XML • Storage techniques using XORator algorithm • JDOM API (JavaDOM) • JDOM Examples • JDO API (Java Data Objects) • JDO Examples
Motivation • First, most database vendors today offer universal database products that combine their relational DBMS and ORDBMS offerings into a single product. • Second, an ORDBMS has a more expressive type system than an RDBMS. • Third, an ORDBMS is better suited for storing and querying XML documents that may use a richer set of data types.
Motivation: Applications • Computer-Aided Design (CAD) • Computer-Aided Manufacturing (CAM) • Computer-Aided Software Engineering (CASE) • Network Management Systems • Office Information Systems (OIS) and Multimedia Systems • Digital Publishing • Geographic Information Systems (GIS) • Interactive and Dynamic Web sites • Other applications with complex and interrelated objects and procedural data.
Motivation: RDBMS weaknesses • Poor Representation of “Real World” Entities • Normalization leads to relations that do not correspond to entities in the “real world”. • Semantic Overloading • The relational model has only one construct for representing both data and data relationships: the relation. • As a result, the relational model is semantically overloaded. • Difficulty Handling Recursive Queries • RDBMSs are poor at navigational access to data. • Limited Operations • RDBMSs have only a fixed set of operations, which is difficult to extend.
Motivation: ORDBMS Advantages • Add object storage facilities to relational database • Greater flexibility than strict relational • Easier to introduce into organisation than full OO • Backwards compatible with strict relational applications, SQL etc • Relational paradigm retained • Tables with rows of values • But attributes can contain objects, sets, arrays, tuples etc
Motivation: ORDBMS Advantages • Code held within database, as functions, procedures or methods • common functionality can be centralised rather than re-implemented by every application that uses the data • BLOBs (Binary Large Objects) and CLOBs (Character Large Objects) are used to store large unstructured values within database • allows storage of complex data, e.g. multimedia
Motivation: ORDBMS Advantages • The ability to directly manipulate data stored in a relational database using an object programming language is called transparent persistence • Object-relational mapping means less code to write • Higher performance than embedded SQL or a call-level interface (JDBC, ODBC)
XORator mapping • The XORator (XML to OR Translator) algorithm is a practical demonstration of the use of XML data types • It takes advantage of the features an ORDBMS offers over an RDBMS. • XORator uses Document Type Definitions (DTDs) to map XML documents to tables in an ORDBMS. • An important part of this mapping is the assignment of a fragment of an XML document to a new XML data type, called XADT (XML Abstract Data Type).
XORator: DTD -> OR schema • Reducing the DTD complexity • Building DTD graph • Mapping DTD to OR schema • Defining XADT(XML Abstract Data Types)
XORator: DTD -> OR schema
<!ELEMENT PLAY (INDUCT?, ACT+)>
<!ELEMENT INDUCT (TITLE, SUBTITLE*, SCENE+)>
<!ELEMENT ACT (SCENE+, TITLE, SUBTITLE*, SPEECH+, PROLOGUE?)>
<!ELEMENT SCENE (TITLE, SUBTITLE*, (SPEECH | SUBHEAD)+)>
<!ELEMENT SPEECH (SPEAKER, LINE)+>
<!ELEMENT PROLOGUE (#PCDATA)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT SUBTITLE (#PCDATA)>
<!ELEMENT SUBHEAD (#PCDATA)>
<!ELEMENT SPEAKER (#PCDATA)>
<!ELEMENT LINE (#PCDATA)>
XORator: DTD complexity • Simplify the DTD information to a form that makes the mapping process easier. • A set of transformations reduces the number of nested expressions and the number of element items: • Flattening (to convert a nested definition into a flat representation): (e1, e2)* -> e1*, e2* • Simplification (to reduce multiple unary operators into a single unary operator): e1** -> e1* • Grouping (to group subelements that have the same name): e0, e1*, e1*, e2 -> e0, e1*, e2 • In addition, e+ is transformed to e*.
XORator: DTD -> OR schema • The simplified version of the previous DTD
<!ELEMENT PLAY (INDUCT?, ACT*)>
<!ELEMENT INDUCT (TITLE, SUBTITLE*, SCENE*)>
<!ELEMENT ACT (SCENE*, TITLE, SUBTITLE*, SPEECH*, PROLOGUE?)>
<!ELEMENT SCENE (TITLE, SUBTITLE*, SPEECH*, SUBHEAD*)>
<!ELEMENT SPEECH (SPEAKER*, LINE*)>
<!ELEMENT PROLOGUE (#PCDATA)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT SUBTITLE (#PCDATA)>
<!ELEMENT SUBHEAD (#PCDATA)>
<!ELEMENT SPEAKER (#PCDATA)>
<!ELEMENT LINE (#PCDATA)>
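The simplification rules can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `DtdSimplifier`, `Item`, `simplifyOp`, `flatten` and `group` are hypothetical names (not part of XORator), a content model is assumed to be already parsed into a flat list of items, and a name that appears more than once is assumed to become `*` when grouped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of XORator.
final class DtdSimplifier {

    /** One item of a flat content model: element name plus operator ("", "?", "*" or "+"). */
    static final class Item {
        final String name;
        final String op;
        Item(String name, String op) { this.name = name; this.op = op; }
        @Override public String toString() { return name + op; }
    }

    /** Simplification: collapse stacked unary operators (e** -> e*) and turn + into *. */
    static String simplifyOp(String ops) {
        if (ops.isEmpty()) return "";
        return (ops.contains("*") || ops.contains("+")) ? "*" : "?";
    }

    /** Flattening: push the operator of a group onto each member, e.g. (e1, e2)* -> e1*, e2*. */
    static List<Item> flatten(List<Item> group, String groupOp) {
        List<Item> out = new ArrayList<>();
        for (Item it : group) {
            out.add(new Item(it.name, simplifyOp(it.op + groupOp)));
        }
        return out;
    }

    /** Grouping: merge items with the same name, keeping first-occurrence order. */
    static List<Item> group(List<Item> items) {
        Map<String, String> ops = new LinkedHashMap<>();
        for (Item it : items) {
            // Assumption: a name listed more than once may repeat, so it becomes "*".
            ops.merge(it.name, simplifyOp(it.op), (a, b) -> "*");
        }
        List<Item> out = new ArrayList<>();
        ops.forEach((name, op) -> out.add(new Item(name, op)));
        return out;
    }

    public static void main(String[] args) {
        // SPEECH (SPEAKER, LINE)+ from the PLAY DTD
        System.out.println(flatten(Arrays.asList(new Item("SPEAKER", ""), new Item("LINE", "")), "+"));
        // e0, e1*, e1*, e2
        System.out.println(group(Arrays.asList(
                new Item("e0", ""), new Item("e1", "*"), new Item("e1", "*"), new Item("e2", ""))));
    }
}
```

Applied to `SPEECH (SPEAKER, LINE)+`, flattening followed by the `+` to `*` rule yields `SPEAKER*, LINE*`, which matches the simplified DTD.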
XORator: DTD -> OR schema • We build a DTD graph to represent the structure of the DTD. • Nodes in the DTD graph are elements, attributes, and operators. • In the DTD graph, elements that contain character data are duplicated so that no such node is shared between parents.
XORator: DTD -> OR schema • Given a DTD graph, a relation is created for each node that satisfies any of the following conditions: 1) nodes that have an in-degree of zero 2) recursive nodes with an in-degree greater than one 3) one node among mutually recursive nodes with in-degree one. • All remaining nodes (nodes not mapped to a relation) are inlined as attributes under the relation created for their closest ancestor node in the DTD graph.
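The first two conditions can be checked with a plain in-degree count and a reachability test. The sketch below is a simplified illustration, not the XORator implementation: the class `DtdGraphRelations`, its methods, and the small BOOK/SECTION graph are all hypothetical, operator nodes are left out, and condition 3 (mutually recursive nodes) is not handled.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: decides which DTD graph nodes get their own relation.
final class DtdGraphRelations {

    /** A made-up graph (parent -> children): SECTION may contain itself. */
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("BOOK", Arrays.asList("TITLE", "SECTION"));
        g.put("SECTION", Arrays.asList("PARA", "SECTION"));
        return g;
    }

    /** Counts incoming edges for every node that appears in the graph. */
    static Map<String, Integer> inDegrees(Map<String, List<String>> graph) {
        Map<String, Integer> deg = new TreeMap<>();
        for (String node : graph.keySet()) deg.putIfAbsent(node, 0);
        for (List<String> children : graph.values()) {
            for (String child : children) deg.merge(child, 1, Integer::sum);
        }
        return deg;
    }

    /** A node is recursive if it can be reached again by following edges from itself. */
    static boolean isRecursive(Map<String, List<String>> graph, String node) {
        Deque<String> todo = new ArrayDeque<>(graph.getOrDefault(node, Collections.emptyList()));
        Set<String> seen = new HashSet<>();
        while (!todo.isEmpty()) {
            String cur = todo.pop();
            if (cur.equals(node)) return true;
            if (seen.add(cur)) todo.addAll(graph.getOrDefault(cur, Collections.emptyList()));
        }
        return false;
    }

    /** Conditions 1 and 2 only: in-degree zero, or recursive with in-degree greater than one. */
    static Set<String> relationNodes(Map<String, List<String>> graph) {
        Set<String> relations = new TreeSet<>();
        for (Map.Entry<String, Integer> e : inDegrees(graph).entrySet()) {
            int d = e.getValue();
            if (d == 0 || (d > 1 && isRecursive(graph, e.getKey()))) relations.add(e.getKey());
        }
        return relations;
    }

    /** Every node that did not get a relation is inlined under an ancestor's relation. */
    static Set<String> inlinedNodes(Map<String, List<String>> graph) {
        Set<String> rest = new TreeSet<>(inDegrees(graph).keySet());
        rest.removeAll(relationNodes(graph));
        return rest;
    }

    public static void main(String[] args) {
        System.out.println("relations: " + relationNodes(sampleGraph()));
        System.out.println("inlined:   " + inlinedNodes(sampleGraph()));
    }
}
```

In the sample graph, BOOK has in-degree zero and SECTION is recursive with in-degree two, so both become relations, while TITLE and PARA are inlined.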
XORator: DTD -> OR schema • An XADT attribute can store a fragment of an XML document • The XORator algorithm allows an entire subtree of the DTD graph to be mapped to a single attribute of type XADT.
XORator: XADT • One storage option is a compressed representation of each XML fragment. • Element tag names are mapped to integer codes, and the tags in the fragment are replaced by these codes. • A small dictionary is stored along with the XML fragment to record the mapping between the integer codes and the actual element tag names. • There are two implementations of the XADT: one that uses compression, and one that does not.
XORator: XADT • The choice between the two XADT implementations is made during the document transformation process by monitoring the effectiveness of the compression technique. • Compression is used only if the space saving is above a certain threshold value.
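A rough sketch of the tag-dictionary idea and the threshold check follows. It is not the actual XADT storage format: `XadtCompression`, the `code=name;` dictionary layout and the 0.2 threshold used in `main` are assumptions made for illustration, and the regular expression only handles plain element tags (attributes, comments and CDATA sections are ignored).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of dictionary-based tag compression for an XML fragment.
final class XadtCompression {

    // Matches the start of an opening or closing tag: "<NAME" or "</NAME".
    private static final Pattern TAG = Pattern.compile("<(/?)([A-Za-z_][A-Za-z0-9_.-]*)");

    /** Replaces every tag name by an integer code and records name -> code in the map. */
    static String encode(String xml, Map<String, Integer> codes) {
        Matcher m = TAG.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Integer code = codes.get(m.group(2));
            if (code == null) {
                code = codes.size() + 1;          // next unused code
                codes.put(m.group(2), code);
            }
            m.appendReplacement(out, Matcher.quoteReplacement("<" + m.group(1) + code));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Serializes the dictionary that must be stored next to the encoded fragment. */
    static String dictionary(Map<String, Integer> codes) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : codes.entrySet()) {
            sb.append(e.getValue()).append('=').append(e.getKey()).append(';');
        }
        return sb.toString();
    }

    /** True if encoded fragment plus dictionary saves more than the given fraction of space. */
    static boolean useCompression(String xml, double threshold) {
        Map<String, Integer> codes = new LinkedHashMap<>();
        String encoded = encode(xml, codes);
        int stored = encoded.length() + dictionary(codes).length();
        double saving = 1.0 - (double) stored / xml.length();
        return saving > threshold;
    }

    public static void main(String[] args) {
        String speech = "<SPEECH><SPEAKER>HAMLET</SPEAKER><LINE>My friend</LINE></SPEECH>";
        Map<String, Integer> codes = new LinkedHashMap<>();
        System.out.println(encode(speech, codes));
        System.out.println(dictionary(codes));
        System.out.println(useCompression(speech, 0.2)); // threshold value is an assumption
    }
}
```

For a short fragment the dictionary overhead cancels most of the saving, so the uncompressed form would be kept; fragments with many repeated tags benefit more.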
XORator: XADT
• XADT getElm(XADT inXML, VARCHAR rootElm, VARCHAR searchElm, VARCHAR searchKey, INTEGER level):
  This method returns all rootElm elements that have a searchElm element within a depth of level from rootElm.
• INTEGER findKeyInElm(XADT inXML, VARCHAR searchElm, VARCHAR searchKey):
  This method examines all elements with the tag name searchElm in inXML and returns 1 if any of them has content that matches the searchKey keyword.
• XADT getElmIndex(XADT inXML, VARCHAR parentElm, VARCHAR childElm, INTEGER startPos, INTEGER endPos):
  This method returns all childElm elements that are children of parentElm elements and whose sibling order lies between positions startPos and endPos.
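The behaviour of two of these methods can be approximated in plain Java with the JDK's DOM parser. This is a sketch of the described semantics only, not the XADT implementation: `XadtMethodsSketch` is a hypothetical class, fragments are passed as strings, "matches" is taken to mean substring containment, a return value of 0 for "not found" is assumed, positions are assumed to be 1-based among the childElm siblings, and `getElmIndex` returns text content rather than an XADT value.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch of the semantics of findKeyInElm and getElmIndex.
final class XadtMethodsSketch {

    /** Parses a fragment; a synthetic root is added because a fragment may have several roots. */
    private static Document parse(String fragment) {
        try {
            String wrapped = "<xadt>" + fragment + "</xadt>";
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wrapped)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed XML fragment", e);
        }
    }

    /** Returns 1 if some searchElm element contains searchKey in its text, else 0 (assumed). */
    static int findKeyInElm(String inXML, String searchElm, String searchKey) {
        NodeList hits = parse(inXML).getElementsByTagName(searchElm);
        for (int i = 0; i < hits.getLength(); i++) {
            if (hits.item(i).getTextContent().contains(searchKey)) return 1;
        }
        return 0;
    }

    /** Text of childElm children of each parentElm whose 1-based position is in [startPos, endPos]. */
    static List<String> getElmIndex(String inXML, String parentElm, String childElm,
                                    int startPos, int endPos) {
        List<String> out = new ArrayList<>();
        NodeList parents = parse(inXML).getElementsByTagName(parentElm);
        for (int i = 0; i < parents.getLength(); i++) {
            int pos = 0;
            for (Node c = parents.item(i).getFirstChild(); c != null; c = c.getNextSibling()) {
                if (c instanceof Element && c.getNodeName().equals(childElm)) {
                    pos++;
                    if (pos >= startPos && pos <= endPos) out.add(c.getTextContent());
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String speech = "<SPEECH><SPEAKER>HAMLET</SPEAKER>"
                + "<LINE>My good friend</LINE><LINE>Farewell</LINE></SPEECH>";
        System.out.println(findKeyInElm(speech, "LINE", "friend"));
        System.out.println(getElmIndex(speech, "SPEECH", "LINE", 2, 2));
    }
}
```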
XORator: XADT • This query retrieves lines that are spoken in acts by the ‘SPEAKER’ ‘HAMLET’ and have the keyword ‘friend’ in the line.
JDOM • JDOM is an open source, tree-based (DOM-like), pure Java API for parsing, creating, manipulating, and serializing XML documents • JDOM represents an XML document as a tree composed of elements, attributes, comments, processing instructions, text nodes, CDATA sections, etc. • JDOM is written in and for Java. It consistently uses the Java coding conventions and the class library, and it implements the Cloneable and Serializable interfaces
JDOM • Xerces 1.4.4 is bundled with JDOM to parse XML documents. • A JDOM tree is fully read-write. All parts of the tree can be moved, deleted, and added to, subject to the usual restrictions of XML. • Unlike DOM, there are no annoying read-only sections of the tree that one can’t change.
JDOM Example
<person>
  <name>Michael Owen</name>
  <address>222 Bazza Lane, Liverpool, MN</address>
  <ssn>111-222-3333</ssn>
  <email>michael@owen.com</email>
  <home-phone>720.111.2222</home-phone>
  <work-phone>111.222.3333</work-phone>
</person>
JDOM Example
public class Person {
    private String name;
    private String address;
    private String ssn;
    private String email;
    private String homePhone;
    private String workPhone;

    // -- allows us to create a Person
    public Person(String name, String address, String ssn,
                  String email, String homePhone, String workPhone) {
        this.name = name;
        this.address = address;
        this.ssn = ssn;
        this.email = email;
        this.homePhone = homePhone;
        this.workPhone = workPhone;
    }

JDOM Example
    // -- used by the data-binding framework
    public Person() { }

    // -- accessors
    public String getName() { return name; }
    public String getAddress() { return address; }
    public String getSsn() { return ssn; }
    public String getEmail() { return email; }
    public String getHomePhone() { return homePhone; }
    public String getWorkPhone() { return workPhone; }

    // -- mutators
    public void setName(String name) { this.name = name; }
    public void setAddress(String address) { this.address = address; }
    public void setSsn(String ssn) { this.ssn = ssn; }
    public void setEmail(String email) { this.email = email; }
    public void setHomePhone(String homePhone) { this.homePhone = homePhone; }
    public void setWorkPhone(String workPhone) { this.workPhone = workPhone; }
}
JDOM Example
// Note: Unmarshaller comes from the Castor XML data-binding library
// (org.exolab.castor.xml), which maps person.xml onto the Person class.
import org.exolab.castor.xml.*;
import java.io.FileReader;

public class ReadPerson {
    public static void main(String args[]) {
        try {
            // -- read person.xml into a Person object
            Person person = (Person) Unmarshaller.unmarshal(
                    Person.class, new FileReader("person.xml"));
            System.out.println("Person Attributes");
            System.out.println("-----------------");
            System.out.println("Name: " + person.getName());
            System.out.println("Address: " + person.getAddress());
            System.out.println("SSN: " + person.getSsn());
            System.out.println("Email: " + person.getEmail());
            System.out.println("Home Phone: " + person.getHomePhone());
            System.out.println("Work Phone: " + person.getWorkPhone());
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}
JDOM Example
import org.exolab.castor.xml.*;
import java.io.FileWriter;

public class CreatePerson {
    public static void main(String args[]) {
        try {
            // -- create a person to work with
            Person person = new Person("Bob Harris", "123 Foo Street",
                    "222-222-2222", "bob@harris.org",
                    "(123) 123-1234", "(123) 123-1234");

            // -- marshal the person object out as a <person>
            FileWriter file = new FileWriter("bob_person.xml");
            Marshaller.marshal(person, file);
            file.close();
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}
JDOM Example
import org.exolab.castor.xml.*;
import java.io.FileWriter;
import java.io.FileReader;

public class ModifyPerson {
    public static void main(String args[]) {
        try {
            // -- read in the person
            Person person = (Person) Unmarshaller.unmarshal(
                    Person.class, new FileReader("person.xml"));

            // -- change the name
            person.setName("David Beckham");

            // -- marshal the changed person back to disk
            FileWriter file = new FileWriter("person.xml");
            Marshaller.marshal(person, file);
            file.close();
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}
JDO • Sun's Java Data Objects (JDO) standard. • JDO allows you to persist Java objects. • It supports transactions and multiple users. It differs from JDBC in that you don't have to think about SQL and "all that database stuff." • It differs from serialization as it allows multiple users and transactions. • It allows Java developers to use their object model as a data model. There is no need to spend time going between the "data" side and the "object" side.
JDO: Example
package addressbook;

import java.util.*;
import javax.jdo.*;
import com.prismt.j2ee.connector.jdbc.ManagedConnectionFactoryImpl;

public class PersonPersist {
    private final static int SIZE = 3;
    private PersistenceManagerFactory pmf = null;
    private PersistenceManager pm = null;
    private Transaction transaction = null;
    private Person[] people;
    // Vector of current object identifiers
    private Vector id = new Vector(SIZE);

    public PersonPersist() {
        try {
            // -- select the vendor's PersistenceManagerFactory implementation
            Properties props = new Properties();
            props.setProperty("javax.jdo.PersistenceManagerFactoryClass",
                    "com.prismt.j2ee.jdo.PersistenceManagerFactoryImpl");
            pmf = JDOHelper.getPersistenceManagerFactory(props);
            pmf.setConnectionFactory(createConnectionFactory());
        } catch (Exception ex) {
            ex.printStackTrace();
            System.exit(1);
        }
    }
JDO: Example
    public static Object createConnectionFactory() {
        ManagedConnectionFactoryImpl mcfi = new ManagedConnectionFactoryImpl();
        Object connectionFactory = null;
        try {
            mcfi.setUserName("scott");
            mcfi.setPassword("tiger");
            mcfi.setConnectionURL("jdbc:oracle:thin:@localhost:1521:thedb");
            mcfi.setDBDriver("oracle.jdbc.driver.OracleDriver");
            connectionFactory = mcfi.createConnectionFactory();
        } catch (Exception e) {
            e.printStackTrace();
            System.exit(1);
        }
        return connectionFactory;
    }
JDO: Example
    public void persistPeople() {
        // create an array of Persons
        people = new Person[SIZE];

        // create three people
        people[0] = new Person("Gary Segal", "123 Foobar Lane",
                "123-123-1234", "gary@segal.com",
                "(608) 294-0192", "(608) 029-4059");
        people[1] = new Person("Michael Owen", "222 Bazza Lane, Liverpool, MN",
                "111-222-3333", "michael@owen.com",
                "(720) 111-2222", "(303) 222-3333");
        people[2] = new Person("Roy Keane", "222 Trafford Ave, Manchester, MN",
                "234-235-3830", "roy@keane.com",
                "(720) 940-9049", "(303) 309-7599");

        // persist the array of people inside a transaction
        pm = pmf.getPersistenceManager();
        transaction = pm.currentTransaction();
        transaction.begin();
        pm.makePersistentAll(people);
        transaction.commit();

        // retrieve the object ids for the persisted objects
        for (int i = 0; i < people.length; i++) {
            id.add(pm.getObjectId(people[i]));
        }

        // close current persistence manager to ensure that
        // objects are read from the db not the persistence
        // manager's memory cache.
        pm.close();
    }
JDO: Example
    public void change() {
        Person person;

        // retrieve objects from datastore
        pm = pmf.getPersistenceManager();
        transaction = pm.currentTransaction();
        transaction.begin();

        // change the name of the second persisted object
        person = (Person) pm.getObjectById(id.elementAt(1), false);
        person.setName("Steve Gerrard");

        // commit the change and close the persistence manager
        transaction.commit();
        pm.close();
    }
JDOM Example
<addressbook name="Manchester United Address Book">
  <person name="Roy Keane">
    <address>23 Whistlestop Ave</address>
    <ssn>111-222-3333</ssn>
    <email>roykeane@manutd.com</email>
    <home-phone>720.111.2222</home-phone>
    <work-phone>111.222.3333</work-phone>
  </person>
  <person name="Juan Sebastian Veron">
    <address>123 Foobar Lane</address>
    <ssn>222-333-444</ssn>
    <email>juanveron@manutd.com</email>
    <home-phone>720.111.2222</home-phone>
    <work-phone>111.222.3333</work-phone>
  </person>
</addressbook>
JDOM: Example
import java.util.List;
import java.util.ArrayList;

public class Addressbook {
    private String addressBookName;
    private List persons = new ArrayList();

    public Addressbook() { }

    // -- manipulate the List of Person
    public void addPerson(Person person) { persons.add(person); }
    public List getPersons() { return persons; }

    // -- manipulate the name of the address book
    public String getName() { return addressBookName; }
    public void setName(String name) { this.addressBookName = name; }
}
JDOM Example
<?xml version="1.0"?>
<mapping>
  <description>A mapping file for our Address Book application</description>
  <class name="Person">
    <field name="name" type="string">
      <bind-xml name="name" node="attribute" />
    </field>
    <field name="address" type="string" />
    <field name="ssn" type="string" />
    <field name="email" type="string" />
    <field name="homePhone" type="string" />
    <field name="workPhone" type="string" />
  </class>
  <class name="Addressbook">
    <field name="name" type="string">
      <bind-xml name="name" node="attribute" />
    </field>
    <field name="persons" type="Person" collection="collection" />
  </class>
</mapping>
JDOM Example
import org.exolab.castor.xml.*;
import org.exolab.castor.mapping.*;
import java.io.FileReader;
import java.util.List;
import java.util.Iterator;

public class ViewAddressbook {
    public static void main(String args[]) {
        try {
            // -- Load a mapping file
            Mapping mapping = new Mapping();
            mapping.loadMapping("mapping.xml");
            Unmarshaller un = new Unmarshaller(Addressbook.class);
            un.setMapping(mapping);

            // -- Read in the Addressbook using the mapping
            FileReader in = new FileReader("addressbook.xml");
            Addressbook book = (Addressbook) un.unmarshal(in);
            in.close();
JDOM Example
            // -- Display the addressbook
            System.out.println(book.getName());
            List persons = book.getPersons();
            Iterator iter = persons.iterator();
            while (iter.hasNext()) {
                Person person = (Person) iter.next();
                System.out.println("\n" + person.getName());
                System.out.println("-----------------------------");
                System.out.println("Address = " + person.getAddress());
                System.out.println("SSN = " + person.getSsn());
                System.out.println("Home Phone = " + person.getHomePhone());
            }
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}