XJ: Facilitating XML Processing in Java

This presentation discusses XML syntax, semantics, applications, XML Schema, XPath query, and the integration of XML and Java in the XJ framework.

Presentation Transcript


  1. XJ: Facilitating XML Processing in Java • Presented by: Tamar Aizikowitz, Winter 2006/2007 • Authors: Matthew Harren, Mukund Raghavachari, Oded Shmueli, Michael Burke, Rajesh Bordawekar, Igor Pechtchanski, Vivek Sarkar • 14th World Wide Web Conference (WWW2005), Chiba, Japan

  2. XML • Markup language • Tags define elements • Elements contain other elements • Elements contain data • Syntax: <person> <first>John</first> <last>Lennon</last> </person> • Semantics: a tree whose root node person has two children, first (containing John) and last (containing Lennon) • Applications: The future web? XHTML? RSS? • Problem: Supposedly human readable and writable, but not really…

  3. XML Schema • XML-based alternative to DTDs. • Describes the structure of an XML document. • The programmer defines the valid structure of the data by defining element types. • Support for standard and user-defined types. • Example: <xs:element name="person" type="personInfo"/> <xs:complexType name="personInfo"> <xs:sequence> <xs:element name="first" type="xs:string"/> <xs:element name="last" type="xs:string"/> </xs:sequence> </xs:complexType>

  4. XPath • Query language for selecting a sequence of nodes from an XML document. • Filtering of result nodes using predicates. • Example: //person[last="Lennon"]/first • [Figure: an XPath query processor takes an XPath query and an XML tree as input and produces an XML node sequence.]
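
For comparison, the same query can be run in plain Java with the JDK's `javax.xml.xpath` API. This is only a sketch of the query-processor idea on this slide, not XJ code; the class name `XPathSketch` and its helper method are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of XJ.
class XPathSketch {
    // The person document from the XML slide.
    static final String PERSON_XML =
        "<person><first>John</first><last>Lennon</last></person>";

    // Evaluates an XPath 1.0 query and returns the text of each selected node.
    static List<String> select(String xml, String query) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                query, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad query or document: " + query, e);
        }
    }

    public static void main(String[] args) {
        // The predicate keeps only person elements whose last child is "Lennon".
        System.out.println(select(PERSON_XML, "//person[last=\"Lennon\"]/first"));
        // A predicate that matches nothing yields an empty node sequence.
        System.out.println(select(PERSON_XML, "//person[last=\"McCartney\"]/first"));
    }
}
```

Note that the query is an ordinary string here, so a typo in it is only discovered when the program runs.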

  5. XJ Introduction • Developed at the IBM Watson Research Center. • More information: http://www.research.ibm.com/xj/ • [Figure labels: Java 1.0, Java 1.1, Java 1.4, Java 1.5, XJ; xjc compiler; xj runtime environment]

  6. XJ Holy Grail: Smooth Java/XML integration • XML trees: just like 3, "Hello", and other values. • XML Schema: just like Java classes. • XPath queries: just like [], ?:, and other Java operators. • Smart compiler: optimization and improved efficiency.

  7. Example: Music Library • [Figure: tree rooted at musicLibrary with several album children; each album has a title (string), stars ([1-5]), and one or more artist children (string).]

  8. Music Library Schema <?xml version="1.0" encoding="UTF-8"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="musicLibrary"> <xs:complexType> <xs:sequence> <xs:element name="album" maxOccurs="unbounded"> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>

  9. Music Library Schema - Album <xs:complexType> <xs:sequence> <xs:element name="title" type="xs:string"/> <xs:element name="stars"> <xs:simpleType> <xs:restriction base="xs:integer"> <xs:pattern value="[1-5]"/> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="artist" type="xs:string" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType>

  10. Music Library Data <?xml version="1.0" encoding="UTF-8"?> <musicLibrary> <album> <title>Abbey Road</title> <stars>4</stars> <artist>The Beatles</artist> </album> <album> <title>Sounds of Silence</title> <stars>4</stars> <artist>Paul Simon</artist> </album> </musicLibrary>

  11. The XJ Type Hierarchy • [Figure: class hierarchy rooted at java.lang.Object, containing com.ibm.xj.XMLObject (with subclasses com.ibm.xj.XMLElement, the parent of all element classes, and com.ibm.xj.XMLAtomic, the parent of all atomic classes), com.ibm.xj.Sequence, com.ibm.xj.XMLCursor, com.ibm.xj.io.XMLOutputStream, and com.ibm.xj.io.XMLDocumentOutputStream.]

  12. The XMLObject Class and Subclasses • XMLObject corresponds to an XML node. • Schema import creates subclasses of XMLElement and XMLAtomic for every element declaration. • XPath expressions are evaluated on instances of these classes. • [Figure: com.ibm.xj.XMLObject with subclasses com.ibm.xj.XMLElement (all element classes) and com.ibm.xj.XMLAtomic (all atomic classes)]

  13. XMLSequence and XMLCursor • An instance of Sequence is an ordered list of XMLObject instances. • The result of an XPath expression is an instance of Sequence. • XMLCursor implements java.util.Iterator and is used to iterate over instances of Sequence. • Both support limited genericity (as defined in Java 5.0) for type checking. • [Figure: com.ibm.xj.Sequence and com.ibm.xj.XMLCursor under java.lang.Object]

  14. Importing Schema Definitions • The integration of XML Schema in XJ is built on the following correspondence: • XML Schema ~ Java Package • XML Element ~ Logical Class • Nested (local) Element ~ Nested Class • Atomic types ~ Class + Auto Unboxing

  15. Schema ~ Package • Element declarations are integrated into the Java type system as "logical classes". • XML documents are well-typed XML values that are instances of these classes. • Syntax: import musicLibrary.*;

  16. XML Element ~ Class • Elements are represented as subclasses of XMLObject. • They may be used wherever a class type is expected. • They are constructed with the new operator. • Nested elements are represented as nested classes. • Syntax: musicLibrary ml = new musicLibrary(...); musicLibrary.album a = new musicLibrary.album(...);

  17. Atomic Types • Support for XML Schema built-in atomic types such as xsd:integer and xsd:string. • Represented as subclasses of XMLAtomic. • Syntax: xsd.integer • Subtyping: xsd.short s = ...; xsd.integer i = s; • Automatic unboxing: xsd.string xstr = ...; String s = xstr;

  18. Creating XML Objects • Mechanisms for constructing XML: • External source • Literal XML embedded in an XJ program • XMLElement constructors: • XMLElement(java.io.InputStream) • XMLElement(java.io.File) • XMLElement(java.net.URL) • XMLElement(literal XML)

  19. Inline Construction of XML • XML data construction using literal XML. • Any well-formed XML block can be used. • Example: title a = new title(<title>Greatest Hits</title>); • { and } are used to insert runtime values: title buildTitle(String t) { title newT = new title(<title>{t}</title>); return newT; }

  20. XML Type Validation • Example: album a = new album(<album> <title>Let It Be</title> <stars>4</stars> <band>The Beatles</band> </album>); • To construct untyped XML, use the literal XML constructor for XMLElement. • [Figure: flowchart. Literal XML is passed to the XML parser. XML? No: compilation error. Yes: schema validator. Valid XML? No: compilation error. Yes: typed XML object.]
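
XJ performs this check at compile time. In plain Java the closest equivalent is a runtime check with the JDK's `javax.xml.validation` API, sketched below. The schema string is an assumption: it promotes the album type from the Music Library schema to a global `album` element so that a single album can be validated on its own. The class and method names are illustrative only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of XJ.
class AlbumValidationSketch {
    // Assumption: the album type from the slides, declared as a global element.
    static final String ALBUM_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"album\"><xs:complexType><xs:sequence>"
        + "<xs:element name=\"title\" type=\"xs:string\"/>"
        + "<xs:element name=\"stars\"><xs:simpleType>"
        + "<xs:restriction base=\"xs:integer\"><xs:pattern value=\"[1-5]\"/></xs:restriction>"
        + "</xs:simpleType></xs:element>"
        + "<xs:element name=\"artist\" type=\"xs:string\" maxOccurs=\"unbounded\"/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    static Schema compileSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(ALBUM_XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("the schema itself is broken", e);
        }
    }

    // Returns true only if xml is well formed and valid against ALBUM_XSD.
    static boolean isValidAlbum(String xml) {
        try {
            compileSchema().newValidator()
                .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // parse error or schema violation
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String withBand = "<album><title>Let It Be</title><stars>4</stars>"
            + "<band>The Beatles</band></album>";
        String withArtist = withBand.replace("band>", "artist>");
        System.out.println(isValidAlbum(withBand));   // band is not declared in the schema
        System.out.println(isValidAlbum(withArtist)); // matches the declared sequence
    }
}
```

The album with a band child is rejected here only when `isValidAlbum` runs, whereas the slide's flowchart reports the same mistake as a compilation error.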

  21. Executing XPath Queries • Syntax: context[|query|] • query = valid XPath 1.0 expression. • context = XML element. Specifies the context for query evaluation. • XPath expressions evaluate to Sequence<T>. • $ refers to variables. • Example: String band = "The Beatles"; musicLibrary m = new musicLibrary(...); Sequence<album> b = m[|/album[artist[1]=$band]|];
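
Standard Java offers a similar variable mechanism through `XPath.setXPathVariableResolver`. The sketch below binds `$band` that way and returns album titles rather than typed album objects. It is an approximation under stated assumptions: the path starts at the document root (`/musicLibrary/album`) instead of XJ's context element, and the class name is made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of XJ.
class XPathVariableSketch {
    // The music library data from the slides.
    static final String LIBRARY_XML =
        "<musicLibrary>"
        + "<album><title>Abbey Road</title><stars>4</stars>"
        + "<artist>The Beatles</artist></album>"
        + "<album><title>Sounds of Silence</title><stars>4</stars>"
        + "<artist>Paul Simon</artist></album>"
        + "</musicLibrary>";

    // Titles of all albums whose first artist equals the given band.
    static List<String> titlesByFirstArtist(String band) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve $band to the Java value; any other variable name is unknown.
        xpath.setXPathVariableResolver(
            name -> "band".equals(name.getLocalPart()) ? band : null);
        try {
            NodeList titles = (NodeList) xpath.evaluate(
                "/musicLibrary/album[artist[1]=$band]/title",
                new InputSource(new StringReader(LIBRARY_XML)),
                XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < titles.getLength(); i++) {
                result.add(titles.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titlesByFirstArtist("The Beatles"));
        System.out.println(titlesByFirstArtist("Paul Simon"));
    }
}
```

The result is an untyped `NodeList`, so the caller has to know that the nodes are title elements; in XJ that knowledge is carried by the `Sequence<T>` type.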

  22. XPath Static Semantics • XPath expressions evaluate to Sequence<T>. • T is the most specific subtype of XMLObject that the compiler can determine. • Worst case: Sequence<XMLObject> is returned. • If the query result is always empty, a static error is generated. • This is identified using the Schema definition. • Example: title t = ...; Sequence<album> a = t[|/album|]; (static error: title has no album children)

  23. XPath Runtime Semantics • Evaluated with respect to the value of the context specifier. • If the context specifier is a Sequence, each member is used as a context node in turn. • The value is the union of the results. • Example: musicLibrary m = new musicLibrary(...); Sequence<album> albums = m[|/album|]; Sequence<artist> artists = albums[|/artist|]; • If the result is not a node set, a sequence of the appropriate type is returned. • For example: Sequence<xsd.boolean>.

  24. Updating XML Data • Reference semantics • Although more difficult to implement… • Result: in-place updates, as opposed to copy based ones. • Two types of updates are supported: • Value assignments including complex types • Tree structure updates

  25. Value Assignments • XPath expressions used as lvalues for assignment: album a = new album(...); a[|/title|] = "New Title"; • Bulk assignments: musicLibrary m = new musicLibrary(...); m[|/album[artist[1]="The Beatles"]/stars|] = 5; • Bulk assignment advantages: • Possible optimizations → efficient updates • Clear, concise code.
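
Without XJ, a bulk assignment amounts to selecting the target nodes and updating each one in place. The following sketch does that with the standard DOM and XPath APIs; the class and method names are illustrative, and unlike XJ it performs no schema check on the new value.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of XJ.
class BulkUpdateSketch {
    static final String LIBRARY_XML =
        "<musicLibrary>"
        + "<album><title>Abbey Road</title><stars>4</stars>"
        + "<artist>The Beatles</artist></album>"
        + "<album><title>Sounds of Silence</title><stars>4</stars>"
        + "<artist>Paul Simon</artist></album>"
        + "</musicLibrary>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Sets the text of every node selected by query; returns how many were updated.
    static int setAll(Document doc, String query, String value) {
        try {
            NodeList targets = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
            for (int i = 0; i < targets.getLength(); i++) {
                targets.item(i).setTextContent(value); // in-place update, no validation
            }
            return targets.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // String value of the first node selected by query.
    static String text(Document doc, String query) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(query, doc);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(LIBRARY_XML);
        setAll(doc, "/musicLibrary/album[artist[1]='The Beatles']/stars", "5");
        System.out.println(text(doc, "/musicLibrary/album[1]/stars"));
        System.out.println(text(doc, "/musicLibrary/album[2]/stars"));
    }
}
```

Because `setAll` accepts any string, nothing stops a caller from writing a rating outside the 1 to 5 range; keeping the updated tree well typed is the part that XJ adds.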

  26. Tree Structure Update • Methods for structural changes: • insertAfter() • insertBefore() • insertAsFirst() • insertAsLast() • Example: artist currArtist = m[|/album[title="Sounds of Silence"]/artist[1]|]; artist newArtist = new artist(<artist>Art Garfunkel</artist>); currArtist.insertAfter(newArtist);
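
The W3C DOM in the JDK has `insertBefore` but no `insertAfter`, so the slide's example can be approximated by inserting before the reference node's next sibling. The helper below is a sketch with made-up names, not the XJ method.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of XJ.
class InsertAfterSketch {
    static final String LIBRARY_XML =
        "<musicLibrary>"
        + "<album><title>Abbey Road</title><stars>4</stars>"
        + "<artist>The Beatles</artist></album>"
        + "<album><title>Sounds of Silence</title><stars>4</stars>"
        + "<artist>Paul Simon</artist></album>"
        + "</musicLibrary>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    static NodeList select(Document doc, String query) {
        try {
            return (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Inserts newNode as the sibling immediately following reference.
    // When reference is the last child, getNextSibling() is null and
    // insertBefore appends at the end, which is still "after".
    static void insertAfter(Node reference, Node newNode) {
        reference.getParentNode().insertBefore(newNode, reference.getNextSibling());
    }

    // Artist names of the album at the given 1-based position, in document order.
    static List<String> artistsOf(Document doc, int albumIndex) {
        NodeList artists = select(doc, "/musicLibrary/album[" + albumIndex + "]/artist");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < artists.getLength(); i++) {
            names.add(artists.item(i).getTextContent());
        }
        return names;
    }

    public static void main(String[] args) {
        Document doc = parse(LIBRARY_XML);
        Node currArtist = select(doc,
            "/musicLibrary/album[title='Sounds of Silence']/artist[1]").item(0);
        Element newArtist = doc.createElement("artist");
        newArtist.setTextContent("Art Garfunkel");
        insertAfter(currArtist, newArtist);
        System.out.println(artistsOf(doc, 2));
    }
}
```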

  27. Update Issues – Tree Structure • Duplicate parents and acyclicity • After performing tree structure updates, the resulting graph must remain a tree. • Example: attaching an element that already has a parent. • A problematic update in XJ results in a runtime exception. • This can be avoided by always detaching a node before attaching it.
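
For contrast, the standard W3C DOM handles the same two hazards differently from what this slide describes for XJ: attaching a node that already has a parent silently moves it, while an attachment that would create a cycle raises a `DOMException`. The sketch below demonstrates both behaviours; the class and helper names are illustrative.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of XJ.
class TreeInvariantSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Tries to attach child under parent; returns true if the DOM refused
    // because the result would no longer be a tree.
    static boolean attachRejected(Node parent, Node child) {
        try {
            parent.appendChild(child);
            return false;
        } catch (DOMException e) {
            if (e.code == DOMException.HIERARCHY_REQUEST_ERR) {
                return true;
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<musicLibrary><album><artist>The Beatles</artist>"
            + "</album><album/></musicLibrary>");
        Node library = doc.getDocumentElement();
        Node firstAlbum = library.getFirstChild();
        Node secondAlbum = library.getLastChild();
        Node artist = firstAlbum.getFirstChild();

        // Duplicate parent: the DOM detaches artist from firstAlbum first.
        secondAlbum.appendChild(artist);
        System.out.println(firstAlbum.hasChildNodes());
        System.out.println(artist.getParentNode() == secondAlbum);

        // Cycle: library is an ancestor of artist, so this is refused.
        System.out.println(attachRejected(artist, library));
    }
}
```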

  28. Update Issues – Complex Types • Need to validate that new value is still well typed after update. • Problem: Cannot always be done statically. • Example: • Schema states that element a can contain between 2 and 5 instances of element b. • What happens after attach() or detach()? • Solution: • Runtime check inserted at compile time.

  29. Update Issues – Covariant Subtyping • XML Schema allows declaration of subtypes by restriction. • This causes problems when updating subtype values through the base class interface. • Example: xsd.integer i; stars s = m[|//stars[1]|]; i = s; i = 10; (illegal value for a stars element) • Covariant subtyping already exists in Java arrays. • The problem would arise in any language attempting to support updates on XML Schema types.
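
The Java array analogy mentioned on this slide can be shown in a few lines of plain Java. An `Integer[]` may be viewed as an `Object[]`, and a store through that wider view is only checked at run time. The class name is illustrative; the mapping to the stars example is an analogy, not XJ behaviour.

```java
// Hypothetical demo class, not part of XJ.
class CovarianceSketch {
    // Tries to store value into the first slot; returns true if the JVM
    // rejected the store with an ArrayStoreException.
    static boolean storeRejected(Object[] target, Object value) {
        try {
            target[0] = value;
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Integer[] stars = { 4 };
        // Legal at compile time: arrays are covariant, like a stars value
        // being viewed through the wider xsd.integer type.
        Object[] widened = stars;

        // The compiler accepts both stores; only the runtime check can tell
        // that a String does not belong in an Integer[].
        System.out.println(storeRejected(widened, "ten")); // true
        System.out.println(storeRejected(widened, 5));     // false
    }
}
```

As with the runtime checks XJ inserts for updates, the price of allowing the wider view is a check on every store.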

  30. Summary – XJ Benefits • XML objects as typed values • XML Schema integration • Static type checking • Typed XPath • Compiler optimizations

  31. XJ - The Future? • Full support for Schema types • XPath expressions as independent values • Not tied to context specifier • Operators on XPath values • Composition, conjunction, disjunction… • Typed methods and fields • Example: musicLibrary m = new musicLibrary(...); m.album[2].title = "New Title";
