XML to XML through XML
Pim Lemmens, Geert-Jan Houben
Eindhoven University of Technology, Dept. of Computer Science
WebNet 2001

XML to XML through XML
How Adaptation of XML Transformations May Be Done by the Same Tool As Used for the Transformations Themselves

Hera Architecture
[Figure: Hera architecture. Labels in the diagram: data sources (Relational Database, Object-Oriented Database, XML Database, …); RDB-XML Wrapper and ODB-XML Wrapper; Mediator/Integrator; Query; layers Information Retrieval, User/Platform Adaptation and Hypermedia Presentation; Logical Presentation; Logical-HTML, Logical-WML and Logical-SMIL Presentation; HTML, WML and SMIL Presentation, …]

XML Transformations
• Document screening
• Data retrieval, querying
• Applying style sheets
• Adaptation of those transformations to
  • User profiles
  • Platform requirements
  • Producer preferences
  • ..

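Transformations in HERA
[Figure: data flow in HERA. Query adaptation turns the generic query into an adapted query, which is applied to the source data and yields the filtered data. Stylesheet adaptation turns the generic stylesheet into an adapted stylesheet, which is applied to the filtered data and yields the output data.]
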
Transformation Tools
• XSLT
• XQuery
• Something different (for adaptation): HTL

Single Tool for All Transformations
• Transformation of documents
• Transformation of transformation specifications
• Transformation specifications ~ documents
• Transformation specification: schema
• Document: instance
• Schema ~ instance

Transformation and Document

House query (schema):
    <house>
      <price> 120,000 </price>
    </house>

House doc (instance):
    …
    <house id="h123">
      <price> 100,000 </price>
      <address>..</address>
    </house>
    <house id="h234">
    …

Requirements for HERA Transformations
• Isomorphism between schema and instance
• Separate items separately accessible
• Separation of input and output document specifications
• One uniform mechanism for specifying structure, values and operations, independent of their use

Req 1: Isomorphism Schema - Instance
• Schema element/attribute name = instance element/attribute name
• Schema children or attributes of element = instance children or attributes of element
• Schema element/attribute values = instance element/attribute values
• Arbitrary-length list in instance represented by, e.g., a special kind of loop in the schema

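The isomorphism requirement can be illustrated with a small structural matcher. This is only a sketch, not the HTL implementation: the function name `matches` is hypothetical, values are deliberately ignored (the slides do not say how the value in a schema is compared with the value in an instance), and the `garage` element is made up so that there is something that fails to match.

```python
import xml.etree.ElementTree as ET


def matches(schema: ET.Element, instance: ET.Element) -> bool:
    """Return True if `instance` has the shape described by `schema`.

    Assumption: only names are compared, not values.
    """
    # Schema element name must equal instance element name.
    if schema.tag != instance.tag:
        return False
    # Every attribute named in the schema must exist on the instance.
    if any(name not in instance.attrib for name in schema.attrib):
        return False
    # Every schema child must be matched by at least one instance child.
    return all(
        any(matches(s_child, i_child) for i_child in instance)
        for s_child in schema
    )


schema = ET.fromstring("<house><price/></house>")
doc = ET.fromstring(
    "<houses>"
    '<house id="h123"><price>100,000</price><address>..</address></house>'
    '<garage id="g1"><price>9,000</price></garage>'
    "</houses>"
)

hits = [el.get("id") for el in doc if matches(schema, el)]
print(hits)
```

Because the schema has the same element structure as the instances it selects, the matcher is a plain recursive comparison of two trees; no separate path language is involved.
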
Req 2: Separate Items Separately Accessible
All HTL entities should be accessible to HTL:
• Items to be found in input
• Values to be changed
• Items to be used for output
• Operands to be used in calculations
• DOM nodes:
  • Elements
  • Attributes
  • Text nodes

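DOM Nodes

    <house year="1980">
      <price> 120,000 </price>
      Nice house in good condition
    </house>

[Figure: DOM tree of the fragment above. The element node house has an attribute node year (value "1980"), an element node price (text "120,000") and a text node "Nice .."]
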
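The three node kinds on this slide can be listed with any DOM implementation. The sketch below uses Python's standard `xml.dom.minidom`; the helper name `list_nodes` is hypothetical, and whitespace-only text nodes are skipped to keep the result readable.

```python
from xml.dom import minidom

SOURCE = (
    '<house year="1980"><price>120,000</price>'
    "Nice house in good condition</house>"
)


def list_nodes(node, found):
    """Collect (kind, description) pairs for element, attribute and text nodes."""
    if node.nodeType == node.ELEMENT_NODE:
        found.append(("element", node.tagName))
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            found.append(("attribute", f"{attr.name}={attr.value}"))
    elif node.nodeType == node.TEXT_NODE:
        text = node.data.strip()
        if text:  # ignore whitespace-only text nodes
            found.append(("text", text))
    for child in node.childNodes:
        list_nodes(child, found)
    return found


nodes = list_nodes(minidom.parseString(SOURCE).documentElement, [])
for kind, description in nodes:
    print(kind, description)
```
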
Req 3: Separation of Input & Output Spec
• Output spec not limited by input spec requirements
• Input spec not limited by output spec requirements
• Requires links between input and output spec
• Output and input spec may use same formalism

Req 4: One Mechanism for All
• Output specification (structure, values, names)
• Input selection (structures, values, names)
• Calculation (operators, operands)
• All described using templates

Input and Output Templates

Input template:
    <htl:template>
      <house htl:id="123">
        <price/>
        <address/>
        <city/>
      </house>
    </htl:template>

Output template:
    …<htl:template>
    …
      <htl:select idref="123">
        <expenses>
          <htl:value-of>
            <price/>
          </htl:value-of>
        </expenses>
    …………

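One possible reading of how the two templates cooperate is sketched below. Everything beyond the element names on the slide is an assumption: the namespace URI `urn:example:htl`, the helper names (`fits`, `build`, `transform`), the sample document, and the interpretation that `htl:select` instantiates its content once per input node matching the template with the referenced `htl:id`.

```python
import xml.etree.ElementTree as ET

HTL = "urn:example:htl"  # hypothetical namespace URI, not given in the slides


def q(name):
    """Qualified name of an HTL element or attribute."""
    return f"{{{HTL}}}{name}"


INPUT_TEMPLATE = ET.fromstring(
    f'<htl:template xmlns:htl="{HTL}">'
    '<house htl:id="123"><price/><address/><city/></house>'
    "</htl:template>"
)
OUTPUT_TEMPLATE = ET.fromstring(
    f'<htl:template xmlns:htl="{HTL}">'
    '<htl:select idref="123"><expenses>'
    "<htl:value-of><price/></htl:value-of>"
    "</expenses></htl:select></htl:template>"
)
# Made-up sample data, for illustration only.
DOCUMENT = ET.fromstring(
    "<houses>"
    "<house><price>100,000</price><address>A</address><city>X</city></house>"
    "<house><price>250,000</price><address>B</address><city>Y</city></house>"
    "<shed><price>5,000</price></shed>"
    "</houses>"
)


def fits(pattern, node):
    """Assumed matching rule: same name, and every pattern child is present."""
    return pattern.tag == node.tag and all(
        node.find(child.tag) is not None for child in pattern
    )


def build(spec, node):
    """Instantiate one output-template element for one matched input node."""
    if spec.tag == q("value-of"):
        # htl:value-of wraps the name of the child whose text is wanted.
        return (node.findtext(spec[0].tag) or "").strip()
    out = ET.Element(spec.tag)
    for child in spec:
        built = build(child, node)
        if isinstance(built, str):
            out.text = (out.text or "") + built  # simplification: no mixed content
        else:
            out.append(built)
    return out


def transform(input_tpl, output_tpl, doc):
    patterns = {
        el.get(q("id")): el for el in input_tpl.iter() if el.get(q("id"))
    }
    results = []
    for select in output_tpl.iter(q("select")):
        pattern = patterns[select.get("idref")]  # link between the two specs
        for node in doc.iter(pattern.tag):
            if fits(pattern, node):
                results.extend(build(spec, node) for spec in select)
    return results


output = [
    ET.tostring(el, encoding="unicode")
    for el in transform(INPUT_TEMPLATE, OUTPUT_TEMPLATE, DOCUMENT)
]
print(output)
```

The only coupling between the input and output templates in this sketch is the `htl:id` / `idref` pair, which is what lets each template keep its own structure.
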
Requirements for HERA Transformations: Recap
• Isomorphism between schema and instance
• Separate items separately accessible
• Separation of input and output document specifications
• One mechanism for specifying structure, values and operations

Why Not Use XSLT?
• Well tested, efficient and widely available
• Structure of specification very different from structure of documents
• Paths defined using XPath: expressions hard to transform
• Input driven: processing (and output) order determined by input document

Why Not Use XQuery?
• Separate query and output generation parts
• Output spec has XML structure
• But the input spec does not have an XML structure: the query part needs string parsing

HTL: HERA Transformation Language
• Templates: used for input spec, output spec and calculation spec
• Separate output spec & input spec templates
• Templates may use sub-templates
• Filter specification: input data to be used
• Selection specification: retrieved data to be inserted in output or used for calculations

Templates and Sub-templates

    <a>
      <b> .. </b>
      <c>
        <htl:param idref="p"/>
      </c>
      <d/>
    </a>

    <htl:template htl:id="p">
      <q/>
      <r/>
    </htl:template>

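A sub-template reference can be resolved by substituting the body of the referenced template for the `htl:param` element. The sketch below assumes exactly that semantics, which the slide suggests but does not spell out; the namespace URI, the wrapping `root` element, the empty `q` and `r` elements and the function name `expand_params` are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

HTL = "urn:example:htl"  # hypothetical namespace URI

SOURCE = f"""
<root xmlns:htl="{HTL}">
  <a><b>..</b><c><htl:param idref="p"/></c><d/></a>
  <htl:template htl:id="p"><q/><r/></htl:template>
</root>
"""


def expand_params(root):
    """Replace each htl:param by a copy of the referenced template's children."""
    templates = {
        t.get(f"{{{HTL}}}id"): t for t in root.iter(f"{{{HTL}}}template")
    }
    # Snapshot the element list first, because the tree is modified below.
    for parent in list(root.iter()):
        # Walk children right to left so earlier indexes stay valid.
        for index, child in reversed(list(enumerate(parent))):
            if child.tag == f"{{{HTL}}}param":
                body = [copy.deepcopy(n) for n in templates[child.get("idref")]]
                parent[index:index + 1] = body
    return root


root = expand_params(ET.fromstring(SOURCE))
expanded = [child.tag for child in root.find("a/c")]
print(expanded)
```
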
Filtering and Selection
[Figure: diagram relating the output specification to the filter specification]

Templates and Path Expressions

    <htl:template>
      <a>
        <b htl:id="xyz">
          <d/> <e/> <f/>
        </b>
      </a>
    </htl:template>

Equivalent XPath expression when accessed through idref "xyz": a/b[d e f]

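The correspondence between a template and a path expression can be made concrete by deriving the path mechanically. This is a sketch under assumptions: the namespace URI and the function name `path_to` are hypothetical, and the children of the marked element are joined with `and`, because XPath 1.0 needs a boolean operator where the slide's shorthand `a/b[d e f]` simply lists the names.

```python
import xml.etree.ElementTree as ET

HTL = "urn:example:htl"  # hypothetical namespace URI
ID = f"{{{HTL}}}id"

TEMPLATE = ET.fromstring(
    f'<htl:template xmlns:htl="{HTL}">'
    '<a><b htl:id="xyz"><d/><e/><f/></b></a>'
    "</htl:template>"
)


def path_to(template, target_id):
    """Return an XPath string for the element marked with htl:id=target_id."""

    def walk(el, steps):
        steps = steps + [el.tag]
        if el.get(ID) == target_id:
            path = "/".join(steps)
            kids = [child.tag for child in el]
            if kids:
                # Required children become a predicate on the last step.
                path += "[" + " and ".join(kids) + "]"
            return path
        for child in el:
            found = walk(child, steps)
            if found:
                return found
        return None

    # The htl:template wrapper itself is not part of the path.
    for top in template:
        found = walk(top, [])
        if found:
            return found
    return None


xpath = path_to(TEMPLATE, "xyz")
print(xpath)
```

Going in this direction is straightforward; the slides' point is that the reverse, transforming an XPath string, would require parsing it first, whereas the template is already an XML tree.
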
Templates and Calculations

    <htl:attribute name="a">
      <htl:from>
        <htl:operator op="+">
          <price/> 10
        </htl:operator>
      </htl:from>
    </htl:attribute>

XPath: [@a > (price + 10)]

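The calculation template can be turned into the XPath predicate shown on the slide by a short recursive walk. The sketch assumes that `htl:from` denotes an exclusive lower bound (consistent with the `>` on the slide) and, by analogy only, that `htl:to` would denote an exclusive upper bound; the namespace URI and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET

HTL = "urn:example:htl"  # hypothetical namespace URI

SOURCE = ET.fromstring(
    f'<htl:attribute xmlns:htl="{HTL}" name="a">'
    '<htl:from><htl:operator op="+"><price/> 10 </htl:operator></htl:from>'
    "</htl:attribute>"
)


def operands(op_el):
    """Operands in document order: leading text, then each child and its tail."""
    parts = []
    if op_el.text and op_el.text.strip():
        parts.append(op_el.text.strip())
    for child in op_el:
        parts.append(to_expr(child))
        if child.tail and child.tail.strip():
            parts.append(child.tail.strip())
    return parts


def to_expr(el):
    """Translate an operand element into an XPath sub-expression."""
    if el.tag == f"{{{HTL}}}operator":
        return "(" + f" {el.get('op')} ".join(operands(el)) + ")"
    return el.tag  # a plain element is referred to by its name


def bound(el):
    """Content of htl:from / htl:to: a nested expression or a literal."""
    return to_expr(el[0]) if len(el) else (el.text or "").strip()


def attribute_predicate(attr_el):
    name = attr_el.get("name")
    tests = []
    lower = attr_el.find(f"{{{HTL}}}from")
    upper = attr_el.find(f"{{{HTL}}}to")
    if lower is not None:
        tests.append(f"@{name} > {bound(lower)}")
    if upper is not None:  # assumed by analogy with htl:from
        tests.append(f"@{name} < {bound(upper)}")
    return "[" + " and ".join(tests) + "]"


predicate = attribute_predicate(SOURCE)
print(predicate)
```
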
List of HTL Elements
• htl:transform
• htl:template
• htl:output
• htl:attribute
• htl:from
• htl:to
• htl:operator
• htl:param
• htl:any
• htl:descendant
• htl:select
• htl:name-of
• htl:value-of
• htl:copy
• (attribute) htl:id

Implementation: XSLT
• Proven technology
• HTL-to-XSLT conversion by XSLT
• Semantics defined in terms of XSLT
• Use of XPath operators
• Additional translation step required
• Use of separate XSLT templates and modes not possible

Conclusions
• Tool for XML transformations that allows transformations on transformations
• Schema isomorphic to instance
• Basic entities correspond to DOM nodes
• Separation of input & output specification
• All expressions represented by XML structures
• Implementation: XSLT

Contact Information
W.J.M. Lemmens
Eindhoven University of Technology
P.O. Box 513
NL 5600 MB Eindhoven
E-mail: W.J.M.Lemmens@tue.nl
Phone: (+31)(0)402473755