Exploring Remote Object Coherence in XML Web Services
Robert van Engelen and Wei Zhang, Florida State University
Madhusudhan Govindaraju, State University of New York at Binghamton
ICWS 2006
Outline
• Motivation
• “O/X” impedance mismatch
• Object-level coherence requirements
• Mapping programming language types to XML
• Static-dynamic algorithms for XML (de)serialization
• Results and conclusions
Motivation
Can object-level coherence be achieved, so that data structures and object graphs preserve their state and structure when they are moved in XML (de)serialization format?
O/X Impedance Mismatch (1)
• Programming languages cannot express derivation by restriction, such as restricted values and patterns
• XML names cannot be mapped to identifiers in all cases
• Unsupported XML schema components
• Unsupported types
• No naming mechanism to implement XML namespaces
• Serializing object graphs
Namespaces vs Packages
• The XML namespace concept cannot be translated to C++ namespaces or Java packages
Schema snippet:
  <complexType name="X">
    <sequence>
      <element name="a:y" type="a:Y_TYPE"/>
      <element ref="b:x"/>
    </sequence>
    …
  </complexType>
C++ class:
  using namespace a;
  class X {
    Y_TYPE y;
    // What goes here for namespace b?
    …
  };
Serializing a graph of objects requires object-level coherence to ensure application-level interoperability
XML Web Services
[Diagram: request/response round trip between client and server]
• Client: object graph x → XML serialization → send request
• Server: receive request → XML deserialization → object graph y
• Server: transform y into y'
• Server: object graph y' → XML serialization → send response
• Client: receive response → XML deserialization → object graph x'
Web Service Tools: Lossy Serialization in XML
[Diagram: object graph x (internal binary format) → XML serialization → XML → XML deserialization → object graph y]
Serialization in XML can be lossy: object graph x cannot be recovered from its serialized XML form unless the mapping is chosen carefully.
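The loss described on this slide can be illustrated with a minimal C sketch. The `Pair` record, the element names, and all function names below are hypothetical and do not come from the slides or from any toolkit; the serializer is deliberately naive, copying every pointee by value without id or ref attributes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record with two pointer fields (not from the slides). */
struct Pair { int *first; int *second; };

/* Naive tree-style serializer: copies each pointee by value, no id or ref. */
int naive_serialize(const struct Pair *p, char *buf, size_t n) {
    assert(p->first != NULL && p->second != NULL);
    return snprintf(buf, n, "<Pair><first>%d</first><second>%d</second></Pair>",
                    *p->first, *p->second);
}

/* Matching deserializer: allocates a fresh int for every element it reads. */
int naive_deserialize(const char *xml, struct Pair *out) {
    int a, b;
    if (sscanf(xml, "<Pair><first>%d</first><second>%d</second></Pair>",
               &a, &b) != 2)
        return -1;
    out->first = malloc(sizeof *out->first);
    out->second = malloc(sizeof *out->second);
    if (out->first == NULL || out->second == NULL) {
        free(out->first);
        free(out->second);
        return -1;
    }
    *out->first = a;
    *out->second = b;
    return 0;
}

/* Returns 1 if a round trip keeps first and second aliased, else 0. */
int roundtrip_preserves_sharing(void) {
    int shared = 7;
    struct Pair x = { &shared, &shared };   /* both fields alias one int */
    struct Pair y;
    char buf[128];
    naive_serialize(&x, buf, sizeof buf);
    if (naive_deserialize(buf, &y) != 0) return 0;
    int still_shared = (y.first == y.second);
    free(y.first);
    free(y.second);
    return still_shared;
}
```

Calling `roundtrip_preserves_sharing()` returns 0: the values survive the round trip, but the two pointers no longer alias one object, so the deserialized graph has a different structure from the original.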
Basic Requirement: Object Graph Isomorphism
[Diagram: mapping M takes object graph x (representation X) to object graph y (representation Y); inverse mapping M⁻¹ takes y back to x]
Object graphs x and y are isomorphic if M is bijective, so that an inverse M⁻¹ exists.
Object-level coherence requires bijective mappings, so that distributed and serialized object graphs are always isomorphic.
Object Coherence Requires Lossless XML Serialization
[Diagram: serialization method M takes object graph x (internal binary format) to y (XML); deserialization method M⁻¹ takes y back to x]
Serialization is lossless if an object graph x can be recovered from its serialized XML form y = M(x) by deserializing x = M⁻¹(y).
We consider structural equivalence only (location in memory is irrelevant), e.g. as in Java RMI, IIOP, and DCOM.
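Structural equivalence can be made concrete with a small C sketch. It assumes a simplified, hypothetical node type `struct N` with one value and one outgoing pointer, so every graph is a chain that either ends or closes into a cycle; `same_structure` and `MAX_NODES` are illustrative names, not part of the presented work.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical node: one value and one outgoing pointer. */
struct N { int val; struct N *next; };

#define MAX_NODES 64

/* Returns 1 if the graphs reachable from a and b are structurally equal. */
int same_structure(const struct N *a, const struct N *b) {
    const struct N *seen_a[MAX_NODES], *seen_b[MAX_NODES];
    size_t n = 0;
    while (a != NULL && b != NULL) {
        for (size_t i = 0; i < n; i++) {
            /* Revisit on either side: both walks must close the same cycle. */
            if (seen_a[i] == a || seen_b[i] == b)
                return seen_a[i] == a && seen_b[i] == b;
        }
        if (a->val != b->val) return 0;
        assert(n < MAX_NODES);   /* sketch: fixed capacity */
        seen_a[n] = a;
        seen_b[n] = b;
        n++;
        a = a->next;
        b = b->next;
    }
    return a == NULL && b == NULL;   /* both chains must end together */
}
```

The function compares values and the shape of the pointer structure, never raw addresses, which matches the "location in memory is irrelevant" criterion: a two-node cycle equals any copy of itself, but differs from a chain whose second node points to itself, even when the values agree.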
Static Structure Analysis for SOAP RPC Encoding
[Figure: example object graphs with nodes labeled X, Y, and Z]
Static Structure Analysis for Document/Literal Encoding
[Figure: example object graphs with nodes labeled X, Y, and Z]
Mapping Programming Language Types to XML (1)
• Problems with RPC encoding
  • Multi-ref serialization with href and id attributes violates XML schema validation constraints
  • Serialization of nil references, multi-referenced objects, and (sparse) multi-dimensional arrays is not precisely defined
  • Multiple structures may map to the same XML
Mapping Programming Language Types to XML (2)
• Requirements for mapping to the SOAP RPC encoding style
  • Mapping primitive types to built-in primitive XSD types and vice versa
  • Mapping records (structs, classes) to xs:complexType
  • Mapping arrays to SOAP arrays, with support for SOAP 1.1 partial and sparse arrays
  • Object graphs must be serialized with multi-reference encoding
  • Choosing a naming mechanism for XML namespace resolution to avoid name clashes
    • prefix_name
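The prefix-based naming mechanism can be sketched in C as follows. The double underscore separator mirrors the convention used in gSOAP header files; the prefixes `a` and `b`, the placeholder URIs, the field types, and the `namespace_of` helper are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical prefix table; both URIs are placeholders. */
struct Namespace { const char *prefix; const char *uri; };
static const struct Namespace namespaces[] = {
    { "a", "urn:example:a" },
    { "b", "urn:example:b" },
};

/* C names carrying their XML namespace prefix, separated by "__". */
typedef int a__Y_TYPE;          /* placeholder for schema type a:Y_TYPE */
struct X {
    a__Y_TYPE a__y;             /* element a:y */
    int       b__x;             /* element b:x, type assumed for the sketch */
};

/* Returns the namespace URI for an identifier of the form prefix__name. */
const char *namespace_of(const char *ident) {
    assert(ident != NULL);
    const char *sep = strstr(ident, "__");
    if (sep == NULL) return NULL;           /* unqualified name */
    size_t len = (size_t)(sep - ident);
    for (size_t i = 0; i < sizeof namespaces / sizeof namespaces[0]; i++) {
        if (strlen(namespaces[i].prefix) == len &&
            strncmp(namespaces[i].prefix, ident, len) == 0)
            return namespaces[i].uri;
    }
    return NULL;                            /* unknown prefix */
}
```

Because the prefix is part of the identifier, members from two XML namespaces can sit side by side in one struct without a clash, which is what C++ namespaces and Java packages cannot express for a single class.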
Mapping Programming Language Types to XML (3)
• Requirements for mapping to the SOAP document/literal encoding style
  • Mapping XML attribute definitions within an xs:complexType
  • Support for repetitions of xs:element in xs:complexType sequences
  • Support for use of xs:choice and xs:any
  • Support for xs:group and xs:attributeGroup
  • Support for substitution groups
  • Support for top-level element and attribute definitions and references
  • Support for mixed content
gSOAP project: A static-dynamic approach to XML (de)serialization
Static Structure Analysis for Compiled XML Serialization
• Construct a data model
  • Compiler determines which data type instances can refer to other data type instances at run time
• Generate type-specific points-to analysis algorithm
  • Only needed for pointer types and structs/classes with pointer fields
• Generate type-specific optimized serialization code
  • Serializer uses id-ref linking based on pointer analysis
[Diagram: from the source code type definitions, the compiler generates the points-to analysis and the optimized serializer, which share a pointer hash table (graph edges)]
Constructing a Plausible Data Model from Source Code
Source code type definitions:
  typedef int SSN;
  struct Node {
    int val;
    int *ptr;
    float num;
    struct Node *next;
  };
[Figure: data model graph over Node (fields val, ptr, num, next) and SSN; arcs denote all possible pointer references]
Generating an XML Schema Definition for Serialized XML
  <simpleType name="SSN">
    <restriction base="int"/>
  </simpleType>
  <complexType name="Node">
    <sequence>
      <element name="val" type="int"/>
      <element name="ptr" type="int" minOccurs="0"/>
      <element name="num" type="float"/>
      <element name="next" type="tns:Node" minOccurs="0"/>
    </sequence>
  </complexType>
[Figure: data model graph over Node (fields val, ptr, num, next) and SSN]
Generating Code for Runtime Points-To Analysis
  void serialize_pointerToint(int *p)
  {
    if (p != NULL) {
      // lookup and mark (p,TYPE_int)
      // as target in ptr hash table
    }
  }
  void serialize_Node(struct Node *p)
  {
    if (p != NULL) {
      // lookup and mark (p,TYPE_Node)
      // as target in ptr hash table
      mark_embedded(&p->val, TYPE_int);
      serialize_pointerToint(p->ptr);
      // skip p->num
      serialize_pointerToNode(p->next);
    }
  }
[Figure: data model graph over Node (fields val, ptr, num, next) and SSN]
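The generated analysis on this slide leaves the hash table abstract. The sketch below fills it in under stated assumptions: a fixed-size linear table stands in for the pointer hash table, recursion stops when a node is revisited, and `mark_target`, `needs_id`, and `reset_table` are hypothetical helpers. It is an illustration of the idea, not the code that the gSOAP compiler emits.

```c
#include <assert.h>
#include <stddef.h>

struct Node { int val; int *ptr; float num; struct Node *next; };
enum { TYPE_int, TYPE_Node };

/* One entry per (address, type) pair; a linear table stands in for a hash. */
struct Entry { const void *addr; int type; int refs; int embedded; };
#define TABLE_MAX 64
static struct Entry table[TABLE_MAX];
static size_t table_len;

/* Finds or creates the entry for (addr, type). */
static struct Entry *lookup(const void *addr, int type) {
    for (size_t i = 0; i < table_len; i++)
        if (table[i].addr == addr && table[i].type == type)
            return &table[i];
    assert(table_len < TABLE_MAX);       /* sketch: fixed capacity */
    struct Entry *e = &table[table_len++];
    e->addr = addr; e->type = type; e->refs = 0; e->embedded = 0;
    return e;
}

/* Records one pointer reference; returns 1 on the first visit only. */
static int mark_target(const void *addr, int type) {
    return ++lookup(addr, type)->refs == 1;
}

static void mark_embedded(const void *addr, int type) {
    lookup(addr, type)->embedded = 1;
}

void serialize_pointerToint(const int *p) {
    if (p != NULL) mark_target(p, TYPE_int);
}

void serialize_Node(const struct Node *p) {
    if (p != NULL && mark_target(p, TYPE_Node)) {   /* stop on revisits */
        mark_embedded(&p->val, TYPE_int);
        serialize_pointerToint(p->ptr);
        /* p->num is skipped: no pointer type in the model can target it */
        serialize_Node(p->next);
    }
}

/* An id is needed for multi-referenced targets and for embedded fields
   that some pointer refers to. */
int needs_id(const void *addr, int type) {
    for (size_t i = 0; i < table_len; i++)
        if (table[i].addr == addr && table[i].type == type)
            return table[i].refs > 1 ||
                   (table[i].embedded && table[i].refs > 0);
    return 0;
}

void reset_table(void) { table_len = 0; }
```

Keying entries on the (address, type) pair matters because the first field of a struct shares its address with the struct itself: `&node->val` and `node` compare equal as raw addresses but denote different targets.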
Generating Serialization Code
  void put_pointerToint(int *p)
  {
    if (p != NULL) {
      // lookup (p,TYPE_int)
      // if embedded, then output "ref"
      // if single, then output value
      // if multi, then output value
      // with "id" and mark embedded
    }
  }
  void put_Node(struct Node *p)
  {
    if (p != NULL) {
      // lookup (p,TYPE_Node)
      // if embedded, then output "ref"
      // if single, then output value
      // if multi, then output value
      // with "id" and mark embedded
      put_int(&p->val);
      put_pointerToint(p->ptr);
      put_float(&p->num);
      put_pointerToNode(p->next);
    }
  }
[Figure: data model graph over Node (fields val, ptr, num, next) and SSN]
Serialization Example
Object graph to serialize:
• Node at loc=A: val=123, ptr=B, num=1.4, next=B
• Node at loc=B: val=456, ptr=C, num=2.3, next=A
• SSN at loc=C: 789
Ptr hash table after runtime points-to analysis by the generated algorithm constructed from the data model:
• Node at A: multi, id=1
• Node at B: single
• int field val at B (target of A's ptr): embedded, id=2
• SSN at C: single
Serialized XML, emitted one element at a time in document order:
  <Node id="_1">
    <val>123</val>
    <ptr ref="#_2"/>
    <num>1.4</num>
    <next>
      <val id="_2">456</val>
      <ptr>789</ptr>
      <num>2.3</num>
      <next ref="#_1"/>
    </next>
  </Node>
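The serialization example can be reproduced with a compact two-pass sketch in C. Everything beyond the `struct Node` definition and the `put_*` naming is an assumption: the fixed-size table, the `written` flag, lazy id assignment in output order, and output without indentation. It follows the slide's scheme but is not gSOAP's generated code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct Node { int val; int *ptr; float num; struct Node *next; };
enum { TYPE_int, TYPE_Node };

/* One pointer-table entry per (address, type) pair seen during analysis. */
struct Entry { const void *addr; int type, refs, embedded, id, written; };
static struct Entry table[64];
static size_t table_len;
static int next_id;
static char out[512];                 /* serialized XML accumulates here */

static struct Entry *lookup(const void *addr, int type) {
    for (size_t i = 0; i < table_len; i++)
        if (table[i].addr == addr && table[i].type == type) return &table[i];
    assert(table_len < 64);           /* sketch: fixed capacity, no growth */
    table[table_len] = (struct Entry){ addr, type, 0, 0, 0, 0 };
    return &table[table_len++];
}

/* Pass 1: count pointer references and flag fields embedded in a struct. */
static void analyze(const struct Node *p) {
    if (p == NULL || ++lookup(p, TYPE_Node)->refs > 1) return;
    lookup(&p->val, TYPE_int)->embedded = 1;
    if (p->ptr != NULL) lookup(p->ptr, TYPE_int)->refs++;
    analyze(p->next);
}

static void emit(const char *fmt, ...) {
    size_t used = strlen(out);
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(out + used, sizeof out - used, fmt, ap);
    va_end(ap);
}

static int id_of(struct Entry *e) {   /* assign ids lazily, in output order */
    if (e->id == 0) e->id = ++next_id;
    return e->id;
}

/* Embedded int field: carries an id only if some pointer refers to it. */
static void put_int(const char *tag, const int *v) {
    struct Entry *e = lookup(v, TYPE_int);
    if (e->refs > 0) emit("<%s id=\"_%d\">%d</%s>", tag, id_of(e), *v, tag);
    else emit("<%s>%d</%s>", tag, *v, tag);
}

static void put_pointerToint(const char *tag, const int *p) {
    if (p == NULL) return;            /* minOccurs="0": omit the element */
    struct Entry *e = lookup(p, TYPE_int);
    if (e->embedded || e->written) {  /* value lives elsewhere: emit a ref */
        emit("<%s ref=\"#_%d\"/>", tag, id_of(e));
    } else if (e->refs > 1) {         /* first of several references */
        e->written = 1;
        emit("<%s id=\"_%d\">%d</%s>", tag, id_of(e), *p, tag);
    } else {
        emit("<%s>%d</%s>", tag, *p, tag);
    }
}

static void put_Node(const char *tag, const struct Node *p) {
    if (p == NULL) return;
    struct Entry *e = lookup(p, TYPE_Node);
    if (e->written) { emit("<%s ref=\"#_%d\"/>", tag, id_of(e)); return; }
    if (e->refs > 1) { e->written = 1; emit("<%s id=\"_%d\">", tag, id_of(e)); }
    else emit("<%s>", tag);
    put_int("val", &p->val);
    put_pointerToint("ptr", p->ptr);
    emit("<num>%g</num>", p->num);
    put_Node("next", p->next);
    emit("</%s>", tag);
}

/* Runs both passes on the graph rooted at root; returns the XML text. */
const char *serialize(const struct Node *root) {
    table_len = 0; next_id = 0; out[0] = '\0';
    analyze(root);
    put_Node("Node", root);
    return out;
}
```

For the graph in the example (node A pointing into node B, and B pointing back to A), `serialize(&a)` yields the same element sequence as the slides: A carries `id="_1"`, A's `ptr` becomes a forward `ref="#_2"` to the embedded `val` of B, and B's `next` becomes `ref="#_1"`.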
Compiled XML Deserialization
• For each data type, generate optimized deserialization code:
  • Generate a recursive descent LL(1) XML parser that is optimized for the data type
  • Embed deserialization operations as semantic actions in the parser
  • Invoke id hash table lookups as semantic actions to resolve id-ref references on the fly
• The pointer remapper ensures the consistency of pointers in the object graph when nodes are moved in the construction process
[Diagram: from the source code type definitions, the compiler generates the LL(1) parser and deserializer, which use a pointer remapper and an id hash table (to resolve id-ref)]
Deserialization Example
Recursive descent parsing with object deserialization operations as semantic actions. Input:
  <Node id="_1">
    <val>123</val>
    <ptr ref="#_2"/>
    <num>1.4</num>
    <next>
      <val id="_2">456</val>
      <ptr>789</ptr>
      <num>2.3</num>
      <next ref="#_1"/>
    </next>
  </Node>
State of the object graph after each parsed element:
• <Node id="_1">: first Node (A) created with id=1; val=?, ptr=?, num=?, next=?
• <val>123</val>: A.val=123
• <ptr ref="#_2"/>: A.ptr=? with ref=2 (unresolved)
• <num>1.4</num>: A.num=1.4
• <next>: second Node (B) created; A.next=B; B.val=?, B.ptr=?, B.num=?, B.next=?
• <val id="_2">456</val>: B.val=456 with id=2; the unresolved ref=2 is resolved, so A.ptr=B
• <ptr>789</ptr>: SSN (C) created with value 789; B.ptr=C
• <num>2.3</num>: B.num=2.3
• <next ref="#_1"/>: id=1 is already known, so B.next=A
• </next></Node>: the object graph is complete
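The id table lookups in this walk-through can be sketched in C as a pair of semantic actions. The sketch assumes small positive integer ids, fixed-size tables, and hypothetical function names (`define_id`, `resolve_ref`, `unresolved_refs`, `id_reset`); it models only forward-reference patching and leaves out the pointer remapper that handles nodes moved during construction.

```c
#include <assert.h>
#include <stddef.h>

struct Node { int val; int *ptr; float num; struct Node *next; };
enum { TYPE_int, TYPE_Node };

#define MAX_IDS 16
#define MAX_PENDING 16

/* slot points at an int* field (TYPE_int) or a struct Node* field. */
struct Pending { int id; int type; void *slot; };

static void *id_addr[MAX_IDS];            /* id -> address, NULL if unseen */
static struct Pending pending[MAX_PENDING];
static size_t pending_len;

/* Stores addr into the pointer field behind slot, using its real type. */
static void patch(int type, void *slot, void *addr) {
    if (type == TYPE_int) *(int **)slot = addr;
    else *(struct Node **)slot = addr;
}

void id_reset(void) {
    for (size_t i = 0; i < MAX_IDS; i++) id_addr[i] = NULL;
    pending_len = 0;
}

/* Action for ref="#_id": fill the pointer now, or queue it as unresolved. */
void resolve_ref(int id, int type, void *slot) {
    assert(id > 0 && id < MAX_IDS && slot != NULL);
    patch(type, slot, id_addr[id]);        /* NULL while still unresolved */
    if (id_addr[id] == NULL) {
        assert(pending_len < MAX_PENDING); /* sketch: fixed capacity */
        pending[pending_len++] = (struct Pending){ id, type, slot };
    }
}

/* Action for id="_id": record the address and patch queued forward refs. */
void define_id(int id, void *addr) {
    assert(id > 0 && id < MAX_IDS && addr != NULL);
    id_addr[id] = addr;
    size_t kept = 0;
    for (size_t i = 0; i < pending_len; i++) {
        if (pending[i].id == id) patch(pending[i].type, pending[i].slot, addr);
        else pending[kept++] = pending[i];
    }
    pending_len = kept;
}

size_t unresolved_refs(void) { return pending_len; }
```

In terms of the example, a parser would call `define_id(1, &a)` at `<Node id="_1">`, `resolve_ref(2, TYPE_int, &a.ptr)` at the forward reference, and `define_id(2, &b.val)` at `<val id="_2">`, at which point the queued pointer in A is patched. The later backward reference `ref="#_1"` resolves immediately.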
Performance Results
• Measured on a roundtrip message containing a struct of size 1.2 KB
[Chart: number of roundtrip messages per second of a benchmark client/server application on various platforms]
Performance Results (2)
• Measured on echoString with various array sizes
[Chart: end-to-end performance (in milliseconds) of a gSOAP benchmark application with and without multi-ref encoding]
The data show that gSOAP’s multi-ref implementation adds only 2% to 3% overhead.
Performance Comparison to Other XML Parsers
• Measured on echoString of size 1 KB
[Chart: comparison with other XML parsers, including a group labeled "Experimental parsers"]
Conclusions
• Object-level coherence in XML Web services can be achieved with specialized algorithms
• The design space of mapping issues is large
• Algorithms to (de)serialize non-primitive data structures ensure object-level coherence and provide performance guarantees
  • They work best with SOAP 1.2 RPC encoding
  • For the SOAP document/literal style, we suggest the use of id-ref
Thank You
Implementation: the gSOAP Toolkit
• Web service applications are built in two stages:
  • Generate service definitions in familiar C/C++ header file format
  • Generate client/server stubs, skeleton source code, and serialization code
• Note: the soapcpp2 compiler can also be used to generate WSDL from header files
[Diagram: service.wsdl (service definitions) → wsdl2h tool → service.h (header file definitions) → soapcpp2 compiler → soapClient.cpp, soapServer.cpp, soapC.cpp]