Exchanging Intensional XML Data
Tova Milo, INRIA & Tel Aviv University
Serge Abiteboul, INRIA & Xyleme S.A.
Bernd Amann, CNAM
Omar Benjelloun, INRIA
Frederic Dang Ngoc, INRIA
SIGMOD 2003 – San Diego
Introduction Omar Benjelloun - Exchanging AXML Data - SIGMOD 2003
Intensional documents
• Early days of the web
  • Extensional data (static HTML)
  • CGI scripts (perl, …): code is executed to generate data.
• “Intensional” data
  • HTML with embedded code (php, jsp, …): embedded code is executed before sending data.
  • XML with embedded calls to Web services: calls are still evaluated before sending (Jelly, MX, …).
  • Active XML: calls do not have to be evaluated before sending data.
• Advantages of intensional data
  • More information: it shows how data is generated
  • Dynamic: it provides the means, e.g., to refresh data
• Control the exchange of intensional data (to call or not to call?).
Web services in a nutshell
• A number of standards
  • XML, SOAP, WSDL, UDDI, …
  • Means to provide, invoke and describe remote functions with XML input/output.
• They make intensional documents exchangeable.
Context: Active XML (AXML)
• A language: XML with embedded service calls
• A peer-to-peer system
  • Each peer
    • Repository of intensional (AXML) documents
    • Server: provides Web services (XQuery)
    • Client: invokes the embedded service calls
  • And many more cool features
    • distribution and replication
    • continuous services
    • etc.
• AXML peers exchange intensional data.
Outline
• Introduction
• Intensional data
• Schema-controlled exchange of intensional data
• Safe rewriting algorithm
• Conclusion
Intensional data
Materialization
[Figure: tree view of the newspaper document (title, date, GetTemp, GetEvents) next to its XML source; the calls to Yahoo.GetTemp and TimeOut.GetEvents are replaced by their results.]

<?xml version="1.0" ?>
<newspaper>
  <title>Le Monde</title>
  <date>06/10/2003</date>
  <call svc="Yahoo.GetTemp">
    <city>Paris</city>
  </call>
  <call svc="TimeOut.GetEvents"> exhibits </call>
</newspaper>

Result of Yahoo.GetTemp:
<temp>16°C</temp>

Result of TimeOut.GetEvents:
<exhibits>
  <call svc="Yahoo.GetExhibits">
    <city>Paris</city>
  </call>
</exhibits>

• Materialization: replacing a service call by its result.
• It is a recursive process: a result may itself contain service calls.
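The recursion in materialization can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Active XML implementation: the three services are local stubs standing in for the remote Web services, the result of Yahoo.GetExhibits is invented (the slide does not show it), and `expand` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

DOC = """<newspaper>
  <title>Le Monde</title>
  <date>06/10/2003</date>
  <call svc="Yahoo.GetTemp"><city>Paris</city></call>
  <call svc="TimeOut.GetEvents"> exhibits </call>
</newspaper>"""

# Stub services (assumptions): each takes the <call> element and returns a
# list of result elements. Real services would be invoked remotely.
def get_temp(call):
    return [ET.fromstring("<temp>16°C</temp>")]

def get_events(call):
    # The result is itself intensional: it embeds another call.
    return [ET.fromstring(
        '<exhibits><call svc="Yahoo.GetExhibits"><city>Paris</city></call></exhibits>'
    )]

def get_exhibits(call):
    # Invented placeholder result; not taken from the slides.
    return [ET.fromstring("<exhibit><title>(stub exhibit)</title></exhibit>")]

SERVICES = {
    "Yahoo.GetTemp": get_temp,
    "TimeOut.GetEvents": get_events,
    "Yahoo.GetExhibits": get_exhibits,
}

def expand(node, services):
    """Return the fully materialized replacement(s) for `node`."""
    # First materialize the children (for a call, these are its parameters).
    node[:] = [out for child in list(node) for out in expand(child, services)]
    if node.tag != "call":
        return [node]
    # Invoke the service, then materialize whatever it returned.
    results = services[node.get("svc")](node)
    return [out for r in results for out in expand(r, services)]

[newspaper] = expand(ET.fromstring(DOC), SERVICES)
print(ET.tostring(newspaper, encoding="unicode"))
```

The sketch always materializes everything and would not terminate if services kept returning new calls forever; deciding which calls to invoke, and when to stop, is exactly what the rest of the talk addresses.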
To call or not to call?
[Figure: two copies of the newspaper tree (title “Le Monde”, date “06/10/2003”, GetTemp with city “Paris”, temp “16°C”, GetEvents “Exhibits”), illustrating materialization on the sender side or on the receiver side.]
• Materialization can be performed
  • by the sender, before sending a document…
  • or by the receiver, after receiving it.
Why control the materialization of calls?
• For added functionality, e.g.
  • Intensional data lets the receiver get up-to-date information.
• For security reasons or capabilities, e.g.
  • I don’t trust this Web service/domain,
  • I don’t have the right credentials to invoke it,
  • It costs money,
  • Maybe the receiver doesn’t know Active XML!
• For performance reasons, e.g.
  • A proxy can invoke all the services on behalf of a PDA.
• … and many more reasons you can think of!
How to control it? Using types
[Figure: a sender and a receiver, each with its own capabilities, ACL, cost, ..., exchanging document trees (with function nodes f, g, q, r) that must conform to a data exchange schema.]
• We extend XML Schema with intensional types: XML Schema_int
• Static analysis algos use signatures of services: WSDL_int
Schema-controlled exchange
The extended schema language
[Figure: the newspaper tree with children title (“Le Monde”), date (“06/10/2003”), GetTemp (city “Paris”) and GetEvents (“Exhibits”).]
To simplify, we use here a DTD-like syntax.
• Data
  newspaper = title.date.(GetTemp|temp).(GetEvents|exhibit*)
  title = data
  date = data
  temp = data
  city = data
  exhibit = title.(GetDate|date)
• Functions
  GetTemp(city) -> temp
  GetEvents(data) -> (exhibit|performance)*
  GetDate(title) -> date
• Rewriting: replace call(s) by an arbitrary output of the service.
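To make the DTD-like notation concrete, here is a minimal Python sketch that checks a document tree against the types and function signatures listed on this slide. The encoding is an assumption made for illustration: trees are `(label, content)` pairs, each content model is translated to a regular expression over space-terminated child labels, and only the input side of each signature is checked. `compile_model` and `conforms` are hypothetical names.

```python
import re

# Content models copied from the slide ("data" means a text leaf).
TYPES = {
    "newspaper": "title.date.(GetTemp|temp).(GetEvents|exhibit*)",
    "title": "data",
    "date": "data",
    "temp": "data",
    "city": "data",
    "exhibit": "title.(GetDate|date)",
}

# Input types of the function signatures (parameters of a call node).
INPUTS = {"GetTemp": "city", "GetEvents": "data", "GetDate": "title"}

def compile_model(model):
    """Translate 'a.(b|c*)' into a regex over words like 'a b ' or 'a c c '."""
    body = re.sub(r"\w+", lambda m: f"(?:{m.group()} )", model)
    return re.compile(body.replace(".", ""))

def conforms(node):
    """Check a (label, content) tree; content is a string or a list of trees."""
    label, content = node
    model = INPUTS[label] if label in INPUTS else TYPES[label]
    if model == "data":
        return isinstance(content, str)
    if isinstance(content, str):
        return False
    word = "".join(child[0] + " " for child in content)
    return bool(compile_model(model).fullmatch(word)) and all(
        conforms(child) for child in content
    )

# The intensional newspaper document from the figure.
DOC = ("newspaper", [
    ("title", "Le Monde"),
    ("date", "06/10/2003"),
    ("GetTemp", [("city", "Paris")]),
    ("GetEvents", "Exhibits"),
])
print(conforms(DOC))
```

Because the newspaper type allows either the call or its result at each position, both the intensional document and its materialized forms can be accepted by the same schema.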
Rewritings
• The goal: given
  • an intensional document d
  • a schema s,
  can we rewrite d so that it matches s?
• Safe rewriting: one that for sure leads to s (we know without making any call).
• Possible rewriting: one that possibly leads to s (depending on the answer of the service).
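The difference between safe and possible rewritings can be illustrated on a finite sample of service answers. The sketch below is an approximation for intuition only: real output types describe infinitely many answers, so checking a sample proves nothing about safety. The sample answers, the target type (taken from a later slide of this talk) and the names `rewritings` and `classify` are assumptions.

```python
import re
from itertools import product

WORD = ["title", "date", "GetTemp", "GetEvents"]

# Target type title.date.temp.(GetEvents|exhibit*), over space-joined labels.
TARGET = re.compile(r"title date temp(?: GetEvents|(?: exhibit)*)")

# Hypothetical sample answers per service, each consistent with its signature:
# GetTemp -> temp, GetEvents -> (exhibit|performance)*.
SAMPLES = {
    "GetTemp": [["temp"]],
    "GetEvents": [[], ["exhibit"], ["exhibit", "performance"]],
}

def rewritings(word, to_call, samples):
    """Yield every word obtained by invoking the calls at positions `to_call`."""
    slots = [samples[s] if i in to_call else [[s]] for i, s in enumerate(word)]
    for combo in product(*slots):
        yield [label for part in combo for label in part]

def classify(word, to_call, target, samples):
    """Say whether all, some or none of the sampled outcomes match the target."""
    hits = [bool(target.fullmatch(" ".join(w)))
            for w in rewritings(word, to_call, samples)]
    if all(hits):
        return "all samples match"
    if any(hits):
        return "some samples match"
    return "no sample matches"

print(classify(WORD, {2}, TARGET, SAMPLES))     # invoke GetTemp only
print(classify(WORD, {2, 3}, TARGET, SAMPLES))  # invoke GetTemp and GetEvents
```

Invoking only GetTemp behaves like a safe rewriting, since every answer fits the target. Invoking GetEvents as well behaves like a possible rewriting: it succeeds only if the service happens to return no performance element.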
Difficulties
• Infinite search space
  • Vertical
  • Horizontal
• Main problem
  • The result of a Web service call is unknown,
  • We just know a signature (input/output types)
  • We want a very efficient solution.
• Foundations of the problem
  • tree automata,
  • with existential and universal transitions.
Results
• Restrictions on the considered rewritings
  • Left-to-right: no “going back and forth”
  • K-depth: bound on the nesting of function calls
  (Search space still infinite but finitely representable)
• Under these restrictions
  • We have algorithms to find safe/possible rewritings.
  • They are PTIME (for deterministic schemas).
  • We can also do it between schemas.
• Recent follow-up work by [MSS03]
  • The general problem is undecidable.
  • Some complexity results.
Safe rewriting algorithm
Safe rewriting algorithm
• Sketch
  • Deal with function parameters,
  • Traverse the tree top-down,
  • For each data node, rewrite its children.
Rewriting the input parameters of calls
• To invoke a service, the parameters must match its signature.
• Start from the deepest calls
• Move recursively upward
• Finish by rewriting the document.
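The bottom-up order described above corresponds to a post-order traversal: every call nested in the parameters of another call is visited before the enclosing call. A minimal sketch, assuming trees are `(label, children)` pairs; the nested GetCity call and the zip element are hypothetical and do not appear in the talk.

```python
def calls_bottom_up(node, is_call):
    """Yield call nodes in post-order, so nested calls come before their callers."""
    label, children = node
    for child in children:
        yield from calls_bottom_up(child, is_call)
    if is_call(label):
        yield node

FUNCTIONS = {"GetTemp", "GetEvents", "GetCity"}  # GetCity is hypothetical

doc = ("newspaper", [
    ("title", []),
    ("date", []),
    ("GetTemp", [("GetCity", [("zip", [])])]),  # a call inside a parameter
    ("GetEvents", []),
])

order = [label for label, _ in calls_bottom_up(doc, FUNCTIONS.__contains__)]
print(order)
```

In this order, the parameters of GetCity would be rewritten first, then those of GetTemp, and the newspaper document itself comes last.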
Rewriting a node’s children
• We have
  • The children: title.date.GetTemp.GetEvents
  • The type to match: title.date.temp.(GetEvents|exhibit*)
  • Output types of services:
    • GetTemp -> temp
    • GetEvents -> (exhibit | performance)*
• Three steps
  • Build an FSA that accepts all k-depth rewritings of the word.
  • Build an FSA that recognizes the complement of the type.
  • Compute their intersection to find a safe rewriting.
• Smarter algo in the system: lazy automata construction.
Rewriting a node’s children
[Figure: automaton over states q0 to q7 with transitions labeled title, date, GetTemp, GetEvents, temp, exhibit and performance.]
• This automaton accepts all k-depth rewritings of the word.
• This is for title.date.GetTemp.GetEvents
• Output types of services
  GetTemp -> temp
  GetEvents -> (exhibit | performance)*
Rewriting a node’s children (2)
[Figure: automaton over states p0 to p6 with transitions labeled title, date, temp, GetEvents, exhibit and *.]
• This is the complement automaton for the target type.
  newspaper = title.date.temp.(GetEvents|exhibit*)
Rewriting a node’s children (3)
[Figure: the product automaton over state pairs such as (q0,p0), (q1,p1), (q2,p2), (q3,p3), (q4,p4), (q5,p2), (q6,p3), ..., with transitions labeled title, date, GetTemp, GetEvents, temp, exhibit and performance.]
A safe rewriting exists!
title.date.GetTemp.GetEvents rewrites to title.date.temp.GetEvents
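The newspaper example can be replayed with a small Python sketch. It is a simplification, not the construction from the talk: instead of building the complement automaton and intersecting, it plays the equivalent left-to-right game directly on a hand-written DFA for the target type, and it is limited to depth 1 (calls inside a result are kept, never invoked). The state numbering and the names `DFA`, `reachable_after_call` and `safe_strategy` are assumptions.

```python
from dataclasses import dataclass
from functools import lru_cache

DEAD = None  # sink state: the target type can no longer be matched

@dataclass
class DFA:
    start: int
    accepting: frozenset
    delta: dict  # state -> {symbol: next state}

    def step(self, state, symbol):
        if state is DEAD:
            return DEAD
        return self.delta.get(state, {}).get(symbol, DEAD)

# Target type: title.date.temp.(GetEvents|exhibit*)
NEWSPAPER = DFA(0, frozenset({3, 4, 5}), {
    0: {"title": 1}, 1: {"date": 2}, 2: {"temp": 3},
    3: {"GetEvents": 4, "exhibit": 5}, 5: {"exhibit": 5},
})

# Output types: GetTemp -> temp, GetEvents -> (exhibit|performance)*
OUTPUTS = {
    "GetTemp": DFA(0, frozenset({1}), {0: {"temp": 1}}),
    "GetEvents": DFA(0, frozenset({0}), {0: {"exhibit": 0, "performance": 0}}),
}

def reachable_after_call(target, p, out):
    """Target states reachable from p by reading any word of the output type."""
    seen, stack, results = {(p, out.start)}, [(p, out.start)], set()
    while stack:
        tp, op = stack.pop()
        if op in out.accepting:
            results.add(tp)
        for sym, op2 in out.delta.get(op, {}).items():
            nxt = (target.step(tp, sym), op2)
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return results

def safe_strategy(word, target, outputs):
    """Return {(position, target state): 'keep' | 'call'}, or None if unsafe."""
    strategy = {}

    @lru_cache(maxsize=None)
    def win(i, p):
        if p is DEAD:
            return False
        if i == len(word):
            return p in target.accepting
        sym = word[i]
        if win(i + 1, target.step(p, sym)):  # keep the symbol as it is
            if sym in outputs:
                strategy[(i, p)] = "keep"
            return True
        if sym in outputs:  # invoke: every possible answer must still win
            after = reachable_after_call(target, p, outputs[sym])
            if after and all(win(i + 1, q) for q in after):
                strategy[(i, p)] = "call"
                return True
        return False

    return strategy if win(0, target.start) else None

print(safe_strategy(("title", "date", "GetTemp", "GetEvents"), NEWSPAPER, OUTPUTS))
```

On this input the strategy invokes GetTemp and keeps GetEvents, which agrees with the rewriting shown on the slide. Invoking GetEvents would not be safe, because the service may return a performance element that the target type rejects.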
Other algorithms (in the paper)
• Possible rewriting
• Schema compatibility
  • Verifies that all instances of a schema safely rewrite to instances of another schema.
  • Key idea: it is sufficient to check a finite number of instance representatives.
Conclusion
• Schema-controlled exchange of intensional data
  • Implemented as part of the Active XML system
• Fun applications
  • Easy customization of Web services (VLDB’03 demo): types form the basis to match client preferences
  • Surveillance of an AXML application (call tracing)
• Perspectives
  • Extend with automatic data conversion
  • Further optimize the algorithm (notably, for simple cases)
Shameless advertisement
• Active XML
  • a language and peer-to-peer system based on XML with embedded calls to Web services
  • VLDB’02 demo
  • SIGMOD session, tomorrow morning
  • http://www-rocq.inria.fr/verso/Gemo/Projects/axml (or google://ActiveXML)
• Lots of cool applications
  • mobile computing, network configuration, warehouse of web resources…
Thank you