XJ: Facilitating XML Processing in Java

XJ: Facilitating XML Processing in Java. Written by: Matthew Harren, Mukund Raghavachari, Oded Shmueli, Michael Burke, Rajesh Bordawekar, Igor Pechtchanski, Vivek Sarkar


  1. XJ: Facilitating XML Processing in Java
     Written by: Matthew Harren, Mukund Raghavachari, Oded Shmueli, Michael Burke, Rajesh Bordawekar, Igor Pechtchanski, Vivek Sarkar
     Conference: The 14th International World Wide Web Conference (WWW2005), Chiba, Japan, May 10-14, 2005
     • Karawan Shahla, Seminar Lecture 236803

  2. Agenda
     • Some files
     • Main Idea
     • Introduction to XJ
     • XJ Type System
     • XJ Expressions
     • XJ Updates
     • XJ Problems
     • Conclusion

  3. Schema file (file: technioncatalog.xsd)
     <?xml version="1.0" encoding="UTF-8"?>
     <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
       <xs:element name="catalog">
         <xs:complexType>
           <xs:sequence>
             <xs:element name="course" maxOccurs="unbounded">
               <xs:complexType>
                 <xs:sequence>
                   <xs:element name="points" type="xs:int"/>
                   <xs:element name="number" type="xs:int"/>
                   <xs:element name="name" type="xs:string"/>
                   <xs:element name="teacher" type="xs:string"/>
                 </xs:sequence>
               </xs:complexType>
             </xs:element>
           </xs:sequence>
         </xs:complexType>
       </xs:element>
     </xs:schema>

  4. XML document (file: short.xml)
     <?xml version="1.0" encoding="UTF-8"?>
     <catalog>
       <course>
         <points>3</points>
         <number>234319</number>
         <name>Programming Languages</name>
         <teacher>Ron Pinter</teacher>
       </course>
       <course>
         <points>3</points>
         <number>234141</number>
         <name>Combinatorics for CS</name>
         <teacher>Ran El-Yaniv</teacher>
       </course>
     </catalog>

  5. XJ program file
     import java.io.*;
     import technioncatalog.*;
     public class Demo1 {
       public static void main(String[] args) throws Throwable {
         catalog cat = new catalog(new File("short.xml"));
         catalog.course c = cat [| /course[2] |];
         printCourse(c);
       }
       private static void printCourse(catalog.course c) {
         String name = c [| /name |];
         String teacher = c [| /teacher |];
         int points = c [| /points |];
         int id = c [| /number |];
         System.out.println(name + " (" + id + ") by " + teacher + ", " + points + " credit points");
       }
     }
     Output: “Combinatorics for CS (234141) by Ran El-Yaniv, 3 credit points”

  6. Main Idea
     • XML is becoming increasingly popular.
     • High-level languages should provide adequate support for manipulating XML.
     • Let’s go through the existing APIs.

  7. Traditional XML processing (DOM, XPath APIs)
     public static void main(String[] args) throws Throwable {
       DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
       DocumentBuilder db = dbf.newDocumentBuilder();
       Document doc = db.parse(new java.io.File("short.xml"));
       XPath xp = XPathFactory.newInstance().newXPath();
       // NodeList is the public DOM interface; DTMNodeList is an implementation-internal class
       NodeList nodes = (NodeList) xp.evaluate("//course", doc, XPathConstants.NODESET);
       printCourse(nodes.item(1));
     }
     • The types of the XML objects (Node, Document) do not reflect the schema
     • XPath is a plain string. It may be:
       • Syntactically incorrect
       • Incompatible with the document
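To make the point about plain-string XPath concrete, here is a minimal sketch that uses only the standard javax.xml.xpath API. The class name StringXPathDemo, the helper count, and the inline one-course document are illustrative assumptions, not part of the original slides. A misspelled element name compiles and simply selects nothing, and a malformed query is rejected only when it is evaluated.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class StringXPathDemo {
    // Hypothetical inline document with a single course (shortened from short.xml).
    static final String XML =
        "<catalog><course><points>3</points><number>234319</number>"
        + "<name>Programming Languages</name><teacher>Ron Pinter</teacher></course></catalog>";

    // Returns how many nodes the query selects, or -1 if the query is rejected at run time.
    static int count(String query) {
        XPath xp = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xp.evaluate(
                query, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            // The compiler could not help: the query is just a String.
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//course"));  // well-formed and compatible with the document
        System.out.println(count("//cuorse"));  // well-formed but incompatible: selects nothing
        System.out.println(count("//course[")); // syntactically incorrect: fails during evaluation
    }
}
```

Both failure modes surface only while the program runs, which is the gap a typed XPath is meant to close.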

  8. Traditional XML processing (DOM APIs)
     private static void printCourse(Node n) {
       NodeList nodes = n.getChildNodes();
       System.out.println(nodes.item(5).getTextContent()
         + " (" + nodes.item(3).getTextContent()
         + ") by " + nodes.item(7).getTextContent()
         + ", " + nodes.item(1).getTextContent() + " credit points");
     }
     • Assumption: 3rd child is the course number
     • Assumption: 2nd child has no child elements
     • Assumption: Four child nodes must exist
     • These assumptions will not hold if the schema is changed
       • => run-time errors
       • Problems remain, even if we identify nodes by name
     • Possible Schema changes:
       • Allowing a new optional <students> sub-element
       • Changing the order of the sub-elements
     • What about reading the numeric value of an element?
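The remark that problems remain even when nodes are identified by name can be sketched with plain DOM calls. The class NamedAccessDemo, its helpers, and the two inline documents are hypothetical names introduced for illustration. Name-based lookup survives a reordering of the sub-elements, but a missing element or a non-numeric value is still discovered only at run time, and numeric values must be parsed by hand.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class NamedAccessDemo {
    // Same course as in short.xml, but with the sub-elements in a different order.
    static final String REORDERED =
        "<catalog><course><name>Combinatorics for CS</name><teacher>Ran El-Yaniv</teacher>"
        + "<number>234141</number><points>3</points></course></catalog>";
    // A course that lacks the <teacher> sub-element.
    static final String NO_TEACHER =
        "<catalog><course><points>3</points><number>234141</number>"
        + "<name>Combinatorics for CS</name></course></catalog>";

    static Element firstCourse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return (Element) doc.getElementsByTagName("course").item(0);
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Looks a sub-element up by name; item(0) is null when the element is absent.
    static String childText(Element parent, String name) {
        return parent.getElementsByTagName(name).item(0).getTextContent();
    }

    static String describe(Element course) {
        // Numeric values arrive as text and must be converted explicitly.
        int points = Integer.parseInt(childText(course, "points").trim());
        int number = Integer.parseInt(childText(course, "number").trim());
        return childText(course, "name") + " (" + number + ") by "
            + childText(course, "teacher") + ", " + points + " credit points";
    }

    public static void main(String[] args) {
        System.out.println(describe(firstCourse(REORDERED)));
        try {
            describe(firstCourse(NO_TEACHER));
        } catch (NullPointerException e) {
            System.out.println("missing <teacher> detected only at run time");
        }
    }
}
```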

  9. Shaping the future • What XML-related facilities do we want? • Typed XML objects • Seamless translation of a Schema/DTD into a Java type • Two composition techniques • XML notation • Java’s object creation syntax • Two decomposition techniques • Typed XPath • Typed, named methods/fields • XPath expressions as first-class-values

  10. XJ: offered solution
      • XJ is an extension of Java.
      • We will give an overview of the constructs offered by XJ.
      • Available at: http://www.research.ibm.com/xj

  11. XJ Type System

  12. Integration with Schema
      • The rationale:
        • An OO program is a collection of class definitions
        • A Schema file is a collection of type definitions
        • => let’s integrate these definitions
      • Any Schema type is also an XJ type
        • The XJ compiler generates a “logical class” for each such type
      • Schema file == package name
      • Using a schema == import schema_file_name;

  13. XML literal in XJ code
      • Invalid XML content triggers a compile-time error
      • Resulting elements are typed!
        import technioncatalog.*;
        public class Demo2 {
          public static void main(String[] args) throws Throwable {
            String x = "Algorithms 1";
            int y = 234247;
            catalog cat = buildCatalog(new catalog.course(
              <course><points>3</points>
              <number>{y}</number><name>{x}</name>
              <teacher>Shlomo Moran</teacher></course>));
          }
          private static catalog buildCatalog(catalog.course c) {
            return new catalog(<catalog>{c}</catalog>);
          }
        }

  14. An ill-typed program
      ...
      course c = new course(<course>
        <teacher>Shlomo Moran</teacher></course>);   // error: wrong <course> element
      buildCatalog(c);
      XMLObject x = new course.teacher(
        <teacher>Shlomo Moran</teacher>);
      buildCatalog(x);   // error: an XMLObject cannot be passed as a course element
      ...
      private static catalog buildCatalog(catalog.course c) {
        return new catalog(<catalog>{c}</catalog>);
      }

  15. Embedding XPath Queries in XJ
      • Syntax: XmlExpr[| XPathQuery |]
      • Requires a context provider:
        • An XML element over which the XPath query is invoked
        • (see the cat variable in the sample)
        course doSomething(catalog cat, int courseNum) {
          return cat [| /course[./number = $courseNum] |];
        }
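For comparison, the same parameterized query can be approximated in standard Java with javax.xml.xpath and an XPathVariableResolver. This is a sketch of the untyped counterpart, not of XJ itself; the class XPathVariableDemo, the method courseName, and the inline document are hypothetical. Note that the binding of $courseNum happens by name at run time, whereas in XJ the variable is an ordinary Java variable in scope.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathVariableDemo {
    // Inline copy of short.xml without whitespace.
    static final String XML =
        "<catalog>"
        + "<course><points>3</points><number>234319</number>"
        + "<name>Programming Languages</name><teacher>Ron Pinter</teacher></course>"
        + "<course><points>3</points><number>234141</number>"
        + "<name>Combinatorics for CS</name><teacher>Ran El-Yaniv</teacher></course>"
        + "</catalog>";

    // Returns the name of the course with the given number, or "" if there is none.
    static String courseName(int courseNum) {
        XPath xp = XPathFactory.newInstance().newXPath();
        // The variable is resolved by its name, as a string, while the query runs.
        xp.setXPathVariableResolver(
            name -> "courseNum".equals(name.getLocalPart()) ? Integer.valueOf(courseNum) : null);
        try {
            return xp.evaluate("/catalog/course[number = $courseNum]/name",
                               new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(courseName(234141));
        System.out.println(courseName(999999).isEmpty());
    }
}
```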

  16. XPath Semantics • Problem: resulting type is sometimes not so clear • Two options • Sequence<T> • If the compiler determines that all result elements are of type T • Sequence<XMLObject> • (Otherwise) • Automatic conversion from a singleton sequence • Static check of XPath queries • If result is always empty => compile-time error

  17. XJ Updates (Introduction)
      • XJ provides three kinds of updates:
        1) Simple assignment
        2) Bulk assignment
        3) Structural updates
      • XJ updates are chosen to be consistent with Java’s reference semantics.

  18. XJ Updates (syntax and semantics)
      • Simple assignment: the XPath expression returns a reference to the existing element to be updated.
        public static void changePoint(catalog.course c, int p) {
          c [| /points |] = p;
        }
      • Bulk assignment: the XPath expression denotes a sequence, and the assignment is applied to every element in it. Here we double the credit points of each course.
        public static void doublePoints(catalog cat) {
          cat [| //points |] *:= 2;
        }

  19. XJ Updates (syntax and semantics)
      • Structural updates
        public static void addCourse(catalog cat) {
          course c = new course(<course><points>4</points>
            <number>234111</number><name>Introduction to CS</name>
            <teacher>Roy Friedman</teacher></course>);
          cat.insertAsLast(c);
        }
      • Class XMLObject also defines methods such as:
        • insertAfter()
        • insertBefore()
        • insertAsFirst()
        • insertAsLast()
        • detach()

  20. XJ Updates Problems: Cycles
      • Updates may cause cycles, e.g. a node that has more than one parent.
      • This raises a run-time exception.
      • The run-time check ensures that the root is never inserted into one of its descendants.
      • Why are cycles bad? Can you think of a solution?
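The standard DOM API performs a comparable run-time check, which gives a feel for the behavior described above. The sketch below is plain Java DOM, not XJ; the class CycleDemo and its methods are hypothetical names. Appending a node to one of its own descendants is rejected with a DOMException whose code is HIERARCHY_REQUEST_ERR.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class CycleDemo {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Tries to insert an ancestor below its own descendant and reports whether DOM refused.
    static boolean insertingAncestorFails() {
        Document doc = newDocument();
        Element catalog = doc.createElement("catalog");
        Element course = doc.createElement("course");
        doc.appendChild(catalog);
        catalog.appendChild(course);
        try {
            course.appendChild(catalog); // would make <catalog> a descendant of itself
            return false;
        } catch (DOMException e) {
            return e.code == DOMException.HIERARCHY_REQUEST_ERR;
        }
    }

    public static void main(String[] args) {
        System.out.println(insertingAncestorFails());
    }
}
```

Without such a check the tree would stop being a tree: traversal and serialization of the element would never terminate.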

  21. XJ Updates Problems : Type Consistency • Definitions • An XML update operation, u, is a mapping over XML values • u: T1 -> T2 • An update is consistent if T1 = T2 • Ideally, a compile-time error should be triggered for each inconsistent update in the program • Unfortunately, this cannot be promised • The solution: Additional run-time check Can you think of an example ?
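One way to picture the additional run-time check is to revalidate the tree against the schema after an update. The sketch below does this with the standard javax.xml.validation API; it is an analogy under stated assumptions, not a description of how the XJ runtime is implemented. The class ConsistencyCheckDemo, its methods, and the reduced single-course schema are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class ConsistencyCheckDemo {
    // Reduced schema: a single <course> with the four children used in the slides.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='course'><xs:complexType><xs:sequence>"
        + "<xs:element name='points' type='xs:int'/>"
        + "<xs:element name='number' type='xs:int'/>"
        + "<xs:element name='name' type='xs:string'/>"
        + "<xs:element name='teacher' type='xs:string'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";
    static final String XML =
        "<course><points>3</points><number>234141</number>"
        + "<name>Combinatorics for CS</name><teacher>Ran El-Yaniv</teacher></course>";

    static Document parseCourse() {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true); // schema validation expects a namespace-aware DOM
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // The "update": appends a second <points> element, which the schema does not allow.
    static void addExtraPoints(Document doc) {
        Element extra = doc.createElementNS(null, "points");
        extra.setTextContent("4");
        doc.getDocumentElement().appendChild(extra);
    }

    // Run-time consistency check: does the tree still conform to the schema?
    static boolean isValid(Document doc) {
        Validator validator;
        try {
            validator = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD))).newValidator();
        } catch (SAXException e) {
            throw new IllegalStateException("the schema itself is broken", e);
        }
        try {
            validator.validate(new DOMSource(doc));
            return true;
        } catch (SAXException e) {
            return false; // the document no longer matches its declared type
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseCourse();
        System.out.println(isValid(doc));
        addExtraPoints(doc);
        System.out.println(isValid(doc));
    }
}
```

In the slide's terms, the update maps a value of type course to a value that is no longer of type course, so T1 differs from T2 and the update is inconsistent.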

  22. XJ Updates Problems: Covariant subtyping (the problem)
      • Covariance: the change of type in a signature is in the same direction as that of the inheritance
        class X { }
        class A { public void m(X x) { } }
        class X1 extends X { }
        class A1 extends A { public void m(X1 x) { } }
        ...
        A a = new A1();
        a.m(new X());
      • A1.m() is “spoiled”: it accepts only X1 objects
      • Which method should be invoked: A.m() or A1.m()?
      • Java favors type safety: a method with covariant arguments is considered an overload rather than an override
      • The same approach is taken by C++ and C#
      • But covariance is allowed for arrays
        • Array assignments may fail at run time
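Both claims about Java can be checked with a few lines of ordinary Java. The class CovarianceDemo and its method names are illustrative; the nested classes mirror the X, X1, A and A1 of the slide, with m returning a label so the chosen method is visible.

```java
public class CovarianceDemo {
    static class X { }
    static class X1 extends X { }

    static class A {
        String m(X x) { return "A.m"; }
    }

    static class A1 extends A {
        // Different parameter type, so this overloads A.m(X) instead of overriding it.
        String m(X1 x) { return "A1.m"; }
    }

    // The call is resolved against the static type A, which only declares m(X).
    static String callWithX() {
        A a = new A1();
        return a.m(new X());
    }

    // Even with an X1 argument the static type A selects m(X), and A1 does not override it.
    static String callWithX1() {
        A a = new A1();
        return a.m(new X1());
    }

    // Arrays are covariant, so the unsafe store is caught only at run time.
    static boolean arrayStoreFails() {
        Object[] objects = new String[1];
        try {
            objects[0] = Integer.valueOf(1);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(callWithX());
        System.out.println(callWithX1());
        System.out.println(arrayStoreFails());
    }
}
```

The array case is the closest analogue to XJ updates through XMLObject: the static type permits the operation and a run-time check rejects it.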

  23. XJ Updates Problems: Covariant subtyping (example)
      (Now let us get back to our technioncatalog schema…)
      • A <course> value is also spoiled
        • It requires unique children: <points>, <name>, etc.
      • But it also has an unspoiled super-class: XMLObject
        • All updates to XMLObject are legal at compile time
      • The following code compiles successfully:
        public static void trick(course c) {
          XMLObject x = c;
          points p = new points(<points>4</points>);
          x.insertAsLast(p);   // run-time error is here!
        }

  24. Shaping the future (revisited) • Language constructs seen so far • Typed XML objects • Seamless translation of a Schema/DTD into a Java type • Two composition techniques • XML notation • Java’s object creation syntax • Two decomposition techniques • Typed XPath • Typed, named methods/fields • XPath expressions as first-class-values

  25. XPath expressions as first-class values
      • What is a first-class value?
        • A value that can be used “naturally” in the program
        • Passed as an argument
        • Stored in a variable/field
        • Returned from a method
        • Created
      • In XJ, XPath expressions do not meet these conditions
      • The main obstacle: the XPath part of the expression cannot be separated from its context provider

  26. XPath expression as first-class-values • Operators on XPath values • Composition • Conjunction • Disjunction • These operators will allow the developer to easily create a rich array of safe XPath values • The compiler must keep track of the type of each such value • Basically an XPath value is a function T -> R, where both T,R are subclasses of XMLObject • When two XPath values are composed, the result type is deduced from the types of the operands
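The idea of an XPath value as a typed function T -> R can be illustrated in plain Java with java.util.function.Function, where composition is andThen and the compiler deduces the result type from the operands. This is only an analogy for the proposed feature, which XJ does not provide; the classes TypedPathDemo, Catalog and Course and the constants below are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class TypedPathDemo {
    static final class Course {
        final int number;
        final String teacher;
        Course(int number, String teacher) {
            this.number = number;
            this.teacher = teacher;
        }
    }

    static final class Catalog {
        final List<Course> courses;
        Catalog(List<Course> courses) {
            this.courses = courses;
        }
    }

    // A "path value" from Catalog to its courses, roughly /course.
    static final Function<Catalog, List<Course>> COURSES = cat -> cat.courses;

    // A "path value" from courses to their teachers, roughly /teacher.
    static final Function<List<Course>, List<String>> TEACHERS = courses -> {
        List<String> out = new ArrayList<>();
        for (Course c : courses) {
            out.add(c.teacher);
        }
        return out;
    };

    // Composition: the type Catalog -> List<String> is deduced from the operand types.
    static final Function<Catalog, List<String>> CATALOG_TEACHERS = COURSES.andThen(TEACHERS);

    static Catalog sample() {
        return new Catalog(Arrays.asList(
            new Course(234319, "Ron Pinter"),
            new Course(234141, "Ran El-Yaniv")));
    }

    public static void main(String[] args) {
        System.out.println(CATALOG_TEACHERS.apply(sample()));
    }
}
```

Because each path is an ordinary value, it can be stored in a field, passed as an argument and returned from a method, independently of any context provider.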

  27. Typed, named methods/fields
      • Usually, values aggregated by a Java object are accessed by fields/methods
      • Can we access XML sub-elements this way?
      • (The following code IS NOT a legal XJ program)
        import technioncatalog.*;
        void printTeachers(catalog cat) {
          for (int i = 0; i < cat.courses.length; ++i) {
            catalog.course c = cat.courses[i];
            System.out.println(c.teacher);
          }
        }

  28. Typed, named methods/fields • Some of the difficulties: • Sub-elements are not always named • Schema supports optional types: <xsd:choice> • How can Java express an “optional” field? • Observation: Java’s typing mechanisms cannot capture the wealth of Schema/DTD types • Missing features: virtual fields, inheritance without polymorphism • Other features can be found in Functional languages • E.g.: Variant types, immutability, structural conformance • But, their popularity lags behind

  29. Conclusion • XJ is a Java extension that has built in support for XML • Type safety: Many things are checked at compile time • Ease of use • OO languages are not powerful enough (in terms of typing) • Some type information is lost in the transition Schema -> Java
