
Ontology-based Integration of XML Web resources

Ontology-based Integration of XML Web resources. Irini Fundulaki CNAM-Paris, INRIA-Futurs (France) Bernd Amann, Michel Scholl CNAM-Paris, INRIA-Futurs (France) Catriel Beeri The Hebrew University, Jerusalem. The World according to XML.


Presentation Transcript


  1. Ontology-based Integration of XML Web resources Irini Fundulaki CNAM-Paris, INRIA-Futurs (France) Bernd Amann, Michel Scholl CNAM-Paris, INRIA-Futurs (France) Catriel Beeri The Hebrew University, Jerusalem

  2. The World according to XML • XML is the standard for the representation and exchange of Web data • Success of XML : “Semantic” Tags → Structured Querying • But : • “Semantic” tags are not always appropriate • Semantics is hidden in the document structure • XML DTDs can be very complex • Solution : Ontologies → Semantic Querying

  3. Outline • Problems for querying XML sources • The STYX approach for querying and integrating XML Web sources • The Ontology • Publishing XML sources • Answering Queries • Semantic Keys • Conclusions and Contributions

  4. The XML World : A simple example • DTD : <!ELEMENT Film (Crew)> <!ATTLIST Film Title CDATA #REQUIRED> <!ELEMENT Crew (Member*)> <!ELEMENT Member EMPTY> <!ATTLIST Member Name CDATA > • Document tree : a Film element with Title ‘Intervention Divine’ contains one Crew element, which contains three Member elements with Name ‘Suleiman’, ‘Yitzak’ and ‘Khader’

  5. XML World : What about Semantics and Querying ? • What about querying ? • Be aware of the XML query language supported by the source • Be aware of the structure and the semantics • Where are the semantics ? • Some in the DTD : element names and parent/child relationships • a Film element “contains” a Title attribute and a Crew element • Some in the XML document structure : • the first Member element of Crew represents the film’s director • the second Member element of Crew represents the film’s assistant director • Ask God for the semantics ! (i.e. the source administrator)

  6. Querying the XML World • Simple Query : «The director and assistant director of the film ‘Intervention Divine’» • Simple (?) XQuery expression : FOR $a IN document(‘URL’)/Film, $b IN $a/Crew/Member[1], $c IN $b/following-sibling::*[1] WHERE $a/@Title = ‘Intervention Divine’ RETURN $b/@Name, $c/@Name
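To make the point of this slide concrete, here is a minimal Python sketch of the same positional lookup, using only the standard library. The serialized document is an assumption reconstructed from the tree on slide 4, and the function name `director_and_assistant` is hypothetical. Note that the roles "director" and "assistant director" appear nowhere in the data: the code has to know that position 1 and position 2 carry that meaning.

```python
import xml.etree.ElementTree as ET

# Assumed serialization of the slide-4 document tree (not given verbatim in the slides).
SAMPLE = """
<Film Title="Intervention Divine">
  <Crew>
    <Member Name="Suleiman"/>
    <Member Name="Yitzak"/>
    <Member Name="Khader"/>
  </Crew>
</Film>
"""

def director_and_assistant(xml_text, title):
    """Return (director, assistant director) for the film with this title, else None."""
    film = ET.fromstring(xml_text)
    if film.tag != "Film" or film.get("Title") != title:
        return None
    members = film.findall("Crew/Member")
    if len(members) < 2:
        return None
    # Position carries the semantics: Member[1] is the director and its
    # first following sibling is the assistant director.
    return members[0].get("Name"), members[1].get("Name")

print(director_and_assistant(SAMPLE, "Intervention Divine"))
```

Anyone writing this query needs the same out-of-band knowledge about element order that the XQuery expression above relies on.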

  7. From the XML World to the Semantic Web • XML alone does not meet the needs of the Semantic Web • Need for richer models that make the semantics of XML data precise : rich domain schemas (e.g. ontologies) • Applications of the Semantic Web : • Querying and • Data Integration

  8. The STYX approach for integrating and querying XML Web resources • Integrating XML resources : • Integration schema (Ontology) : conceptual schema with semantic keys, symmetric relationships and inheritance • XML resources are described by mapping rules between paths in the XML tree (XPath location paths) and ontology paths • Query Mediation : • User queries are defined in terms of the ontology • Query rewriting using mapping rules • Query evaluation over multiple sources • Joining the results using semantic keys

  9. A ‘Simple’ World Assumption • Domain of interest contains: • Entities, semantic relationships between entities and properties of entities • The STYX Ontology models the domain of interest and is comprised of : • Concepts • symmetric binary roles between concepts • attributes of concepts and • inheritance relations to model commonality of structures and subset relationships between concepts

  10. Example of a (simple) STYX Ontology • Concepts : PLACE, EVENT, FILM, PERSON, POLITICAL FILM • Inheritance Relations : POLITICAL FILM is a sub-concept of FILM • Roles (Inverse Roles in parentheses) : took place at (place of), actor (played in), directed by (directed), assisted by (assisted), filmed (filming of) • Attributes : has title (String), has name (String), took place in (Integer) • Semantics ? No need to ask God !
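As an illustration of what such an ontology looks like as data, the following Python sketch encodes the concepts, roles with their inverses, attributes and the inheritance relation from this slide. The dictionary layout and the names `SUPER`, `ROLES`, `ATTRIBUTES`, `is_a` and `end_type` are hypothetical. The source and target concepts of each role are assumptions read off the queries later in the deck (for example `filmed.took place in` starting from FILM); the endpoints of `took place at` in particular are a guess.

```python
# Inheritance: sub-concept -> super-concept.
SUPER = {"POLITICAL FILM": "FILM"}

# Role name -> (source concept, target concept, inverse role name).
# Endpoints are assumptions; the slide only shows the diagram.
ROLES = {
    "directed by": ("FILM", "PERSON", "directed"),
    "actor": ("FILM", "PERSON", "played in"),
    "assisted by": ("PERSON", "PERSON", "assisted"),
    "filmed": ("FILM", "EVENT", "filming of"),
    "took place at": ("EVENT", "PLACE", "place of"),
}

# Attribute name -> (concept, value type).
ATTRIBUTES = {
    "has title": ("FILM", "String"),
    "has name": ("PERSON", "String"),
    "took place in": ("EVENT", "Integer"),
}

def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or inherits from it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUPER.get(concept)
    return False

def end_type(start, path):
    """Follow an ontology path from `start`; return the concept or value type reached."""
    current = start
    for step in path:
        if step in ROLES:
            source, target, _inverse = ROLES[step]
        elif step in ATTRIBUTES:
            source, target = ATTRIBUTES[step]
        else:
            raise ValueError(f"unknown step: {step}")
        if not is_a(current, source):
            raise ValueError(f"{step!r} does not apply to {current}")
        current = target
    return current

print(end_type("POLITICAL FILM", ["filmed", "took place in"]))
```

Because POLITICAL FILM inherits from FILM, every path that is valid from FILM is also valid from POLITICAL FILM, which is what lets a source publishing political films answer queries about films.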

  11. Querying in STYX • Simple Query : «The director and assistant director of the film ‘Intervention Divine’» • STYX query : SELECT e, f FROM FILM a, a.has_title b, a.directed_by c, c.assisted_by d, c.has_name e, d.has_name f WHERE b = ‘Intervention Divine’ • Reading the query : • FILM a : get the film • a.has_title b : get its title • a.directed_by c : get the director • c.assisted_by d : get the assistant director • c.has_name e, d.has_name f : get their names • WHERE : check the title • SELECT : return the requested values

  12. Publishing XML sources in STYX • Mapping rules relate paths in the source tree (Film, Crew, Title, Member, Name) to the ontology • R1 : URL/Film as u1 → POLITICAL FILM

  13. Publishing XML sources in STYX • R1 : URL/Film as u1 → POLITICAL FILM • R2 : u1/@Title as u2 → has title

  14. Publishing XML sources in STYX • R1 : URL/Film as u1 → POLITICAL FILM • R2 : u1/@Title as u2 → has title • R3 : u1/Crew/Member[1] as u3 → directed by

  15. Publishing XML sources in STYX • R1 : URL/Film as u1 → POLITICAL FILM • R2 : u1/@Title as u2 → has title • R3 : u1/Crew/Member[1] as u3 → directed by • R4 : u3/following-sibling::*[1] as u4 → assisted by

  16. Publishing XML sources in STYX • R1 : URL/Film as u1 → POLITICAL FILM • R2 : u1/@Title as u2 → has title • R3 : u1/Crew/Member[1] as u3 → directed by • R4 : u3/following-sibling::*[1] as u4 → assisted by • R5 : u3/@Name as u5 → has name • R6 : u4/@Name as u6 → has name
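The six rules above form a chain: each rule after R1 is relative to the variable bound by an earlier rule. The Python sketch below shows one possible way to store them and to expand a rule into its full XPath location path and its full ontology path. The tuple layout and the helper names `parent_rule`, `absolute_xpath` and `ontology_path` are hypothetical, not part of STYX.

```python
# Rule id -> (context variable or None, relative XPath, bound variable, ontology step).
RULES = {
    "R1": (None, "URL/Film", "u1", "POLITICAL FILM"),
    "R2": ("u1", "@Title", "u2", "has title"),
    "R3": ("u1", "Crew/Member[1]", "u3", "directed by"),
    "R4": ("u3", "following-sibling::*[1]", "u4", "assisted by"),
    "R5": ("u3", "@Name", "u5", "has name"),
    "R6": ("u4", "@Name", "u6", "has name"),
}

def parent_rule(rule_id, rules=RULES):
    """Return the id of the rule that binds this rule's context variable, or None."""
    context = rules[rule_id][0]
    if context is None:
        return None
    return next(rid for rid, rule in rules.items() if rule[2] == context)

def absolute_xpath(rule_id, rules=RULES):
    """Concatenate the relative XPaths from the root rule down to this rule."""
    parent = parent_rule(rule_id, rules)
    xpath = rules[rule_id][1]
    return xpath if parent is None else absolute_xpath(parent, rules) + "/" + xpath

def ontology_path(rule_id, rules=RULES):
    """Concatenate the ontology steps from the root concept down to this rule."""
    parent = parent_rule(rule_id, rules)
    step = rules[rule_id][3]
    return [step] if parent is None else ontology_path(parent, rules) + [step]

print(absolute_xpath("R6"))
print(ontology_path("R6"))
```

Expanding R6 this way shows that the name of the assistant director is reached in the source by `URL/Film/Crew/Member[1]/following-sibling::*[1]/@Name` and in the ontology by the path POLITICAL FILM, directed by, assisted by, has name.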

  17. Querying in STYX • Queries are simple tree queries expressed in terms of the STYX ontology • No joins, restructuring, aggregation • Query Evaluation over multiple sources • A source returns only a subset of the possible answers for the query • To get additional answers, we must evaluate the query over all published sources • The partial results are finally processed by the mediator

  18. Querying one source in STYX • To evaluate a query over a source : • find the mapping rules that give answers to the query variables → binding variables to rules • rewrite the query into an XML query expressed in the schema of the XML source • the XML query is evaluated by the source • and the answers are returned to the STYX mediator

  19. Query Rewriting in STYX • Query : «The director and assistant director of the film ‘Intervention Divine’» • Query tree : FILM a, b = a.has title, c = a.directed by, d = c.assisted by, e = c.has name, f = d.has name • Mapping rules : R1 to R6 as published for the source • Variable to Rule Bindings : [a → R1]

  20. Query Rewriting in STYX • Same query and rules • Variable to Rule Bindings : [a → R1], then [a → R1, b → R2]

  21. Query Rewriting in STYX • Same query and rules • Variable to Rule Bindings : [a → R1, b → R2], then [a → R1, b → R2, c → R3]

  22. Query Rewriting in STYX • Same query and rules • Variable to Rule Bindings, Full Binding : [a → R1, b → R2, c → R3, d → R4, e → R5, f → R6]
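The step-by-step binding shown on slides 19 to 22 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the STYX algorithm: every query edge is a single ontology step, each variable is matched to at most one rule, and the names `QUERY`, `RULES`, `SUBCONCEPT_OF` and `bind` are hypothetical.

```python
# Rule id -> (context variable or None, relative XPath, bound variable, ontology step).
RULES = {
    "R1": (None, "URL/Film", "u1", "POLITICAL FILM"),
    "R2": ("u1", "@Title", "u2", "has title"),
    "R3": ("u1", "Crew/Member[1]", "u3", "directed by"),
    "R4": ("u3", "following-sibling::*[1]", "u4", "assisted by"),
    "R5": ("u3", "@Name", "u5", "has name"),
    "R6": ("u4", "@Name", "u6", "has name"),
}

# Query variable -> (parent variable or None, concept or ontology step).
# Parents are listed before their children.
QUERY = {
    "a": (None, "FILM"),
    "b": ("a", "has title"),
    "c": ("a", "directed by"),
    "d": ("c", "assisted by"),
    "e": ("c", "has name"),
    "f": ("d", "has name"),
}

SUBCONCEPT_OF = {"POLITICAL FILM": "FILM"}

def bind(query, rules):
    """Return {query variable: rule id} for the variables this source can answer."""
    binding = {}
    for var, (parent, step) in query.items():
        for rule_id, (context, _xpath, _bound, label) in rules.items():
            if parent is None:
                # Root: a rule without context whose concept is the queried
                # concept or one of its direct sub-concepts.
                matches = context is None and (
                    label == step or SUBCONCEPT_OF.get(label) == step
                )
            else:
                # Child: a rule that continues from the variable bound by
                # the parent's rule and carries the same ontology step.
                matches = (
                    parent in binding
                    and context == rules[binding[parent]][2]
                    and label == step
                )
            if matches:
                binding[var] = rule_id
                break
    return binding

print(bind(QUERY, RULES))
```

A variable whose step has no matching rule, and every variable below it, simply stays out of the returned dictionary, which corresponds to a partial binding.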

  23. Rewriting to XQuery expression • Bindings with their location paths : a → R1 (URL/Film), b → R2 (a/@Title), c → R3 (a/Crew/Member[1]), d → R4 (c/following-sibling::*[1]), e → R5 (c/@Name), f → R6 (d/@Name) • Resulting XQuery expression : FOR $a IN document(‘URL’)/Film, $b IN $a/@Title, $c IN $a/Crew/Member[1], $d IN $c/following-sibling::*[1], $e IN $c/@Name, $f IN $d/@Name WHERE $b = ‘Intervention Divine’ RETURN $e, $f
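Once every query variable has a location path, producing the XQuery text is a mechanical step. The Python sketch below assembles the expression from this slide; it mirrors the slide's FOR / WHERE / RETURN syntax and is only an illustration. The names `PATHS` and `to_xquery` are hypothetical.

```python
# (query variable, context variable or None, location path), in dependency order.
PATHS = [
    ("a", None, "document('URL')/Film"),
    ("b", "a", "@Title"),
    ("c", "a", "Crew/Member[1]"),
    ("d", "c", "following-sibling::*[1]"),
    ("e", "c", "@Name"),
    ("f", "d", "@Name"),
]

def to_xquery(paths, where, returns):
    """Build a FOR / WHERE / RETURN expression from bound location paths."""
    clauses = []
    for var, context, path in paths:
        source = path if context is None else f"${context}/{path}"
        clauses.append(f"${var} IN {source}")
    condition = " AND ".join(f"${var} = '{value}'" for var, value in where.items())
    projected = ", ".join(f"${var}" for var in returns)
    return (
        "FOR " + ",\n    ".join(clauses)
        + f"\nWHERE {condition}"
        + f"\nRETURN {projected}"
    )

XQUERY = to_xquery(PATHS, {"b": "Intervention Divine"}, ["e", "f"])
print(XQUERY)
```

The sketch does no escaping of the selection value, so a title containing a quote character would need extra handling.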

  24. What about queries that cannot be answered by a source ? • Query : «The director and assistant director of the film ‘Intervention Divine’ and its year of creation ?» • Query tree : FILM a, b = a.has title, c = a.directed by, d = c.assisted by, e = c.has name, f = d.has name, g = a.filmed.took place in • Variable to Rule Bindings, Partial Binding : [a → R1, b → R2, c → R3, d → R4, e → R5, f → R6] (no rule of the source binds g)

  25. Partial Bindings • To get a full answer, we need to evaluate the sub-query that the source cannot answer over the other sources and then join the partial results • To obtain this sub-query (or these sub-queries) we need to decompose the query into : • a prefix query that the source answers • and one or more suffix queries (sub-queries) that are possibly answered by the other sources • To join, we need keys !
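The decomposition described above can be sketched in Python as a split of the query tree driven by the partial binding. This is an illustration under the assumption that the query root is bound and that a variable is only bound if its parent is; the names `QUERY`, `BINDING` and `decompose` are hypothetical.

```python
# Query variable -> (parent variable or None, concept or ontology path).
# Parents are listed before their children.
QUERY = {
    "a": (None, "FILM"),
    "b": ("a", "has title"),
    "c": ("a", "directed by"),
    "d": ("c", "assisted by"),
    "e": ("c", "has name"),
    "f": ("d", "has name"),
    "g": ("a", "filmed.took place in"),
}

# Partial binding from slide 24: the source has no rule for g.
BINDING = {"a": "R1", "b": "R2", "c": "R3", "d": "R4", "e": "R5", "f": "R6"}

def decompose(query, binding):
    """Split a query into the prefix a source answers and the remaining suffix queries.

    Returns (prefix, suffixes), where suffixes maps each bound variable that
    roots a suffix query to the unbound part of the tree hanging below it.
    """
    prefix = {var: edge for var, edge in query.items() if var in binding}
    suffixes = {}
    owner = {}  # unbound variable -> bound variable that roots its suffix query
    for var, (parent, step) in query.items():
        if var in binding:
            continue
        if parent is None:
            raise ValueError("the source cannot answer the query root")
        root = parent if parent in binding else owner[parent]
        owner[var] = root
        suffixes.setdefault(root, {})[var] = (parent, step)
    return prefix, suffixes

prefix, suffixes = decompose(QUERY, BINDING)
print(sorted(prefix))
print(suffixes)
```

Here the single suffix query is rooted at the film variable `a`, so the prefix and suffix results both describe films and can be matched once a key for FILM is available.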

  26. Semantic Keys in STYX : Ontology Revisited • XML keys • Local ID/IDREF attributes (internal pointers) • XML Schema keys are defined in terms of local element/attribute values • No formal agreement ! • Solution : define keys at the ontology level !

  27. Semantic Keys in STYX : Ontology Revisited • Semantic Keys defined in concepts of the ontology independently of any possible keys defined at the XML sources • A key for a concept is a set of attribute paths • Example : a film is identified by its title • Instances of concepts are identified by the values of the keys obtained by the mapping rules

  28. Decomposing the query • Query tree : FILM a, b = a.has title, c = a.directed by, d = c.assisted by, e = c.has name, f = d.has name, g = a.filmed.took place in • PREFIX QUERY : FILM a, b = a.has title, c = a.directed by, d = c.assisted by, e = c.has name, f = d.has name • Variables to Rules Binding : [a → R1, b → R2, c → R3, d → R4, e → R5, f → R6]

  29. Decomposing the query • PREFIX QUERY : FILM a, b = a.has title, c = a.directed by, d = c.assisted by, e = c.has name, f = d.has name • SUFFIX QUERY : FILM a, g = a.filmed.took place in • Variables to Rules Binding : [a → R1, b → R2, c → R3, d → R4, e → R5, f → R6]

  30. After Decomposition : Add Keys • PREFIX QUERY : FILM a, b = a.has title, c = a.directed by, d = c.assisted by, e = c.has name, f = d.has name, plus the key t = a.has title • SUFFIX QUERY : FILM a, g = a.filmed.took place in, plus the key t = a.has title • The join between the prefix and the suffix queries is the join between values of variable t
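The final join at the mediator can be sketched in Python as a hash join on the key variable t. The result rows below are illustrative values, not data from the slides: the names come from the sample document on slide 4, while the year and the second film are made up for the example. The function name `join_on_key` is hypothetical.

```python
def join_on_key(prefix_rows, suffix_rows, key="t"):
    """Join two lists of variable bindings on the value of the key variable."""
    index = {}
    for row in suffix_rows:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in prefix_rows:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined

# Prefix results from the source that knows directors and assistants.
PREFIX_ROWS = [
    {"t": "Intervention Divine", "e": "Suleiman", "f": "Yitzak"},
]

# Suffix results from another source that knows years (illustrative values).
SUFFIX_ROWS = [
    {"t": "Intervention Divine", "g": 2002},
    {"t": "Some Other Film", "g": 1999},
]

RESULT = join_on_key(PREFIX_ROWS, SUFFIX_ROWS)
print(RESULT)
```

Because the key is defined on the ontology (a film is identified by its title), the two sources do not need to share any XML-level identifier for the join to work.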

  31. Conclusions and Contributions • Adding semantics to XML • Ontology = rich description of the domain of interest • Simple but powerful mapping language that associates XPath location paths to ontology paths • Semantic keys for XML data integration • Integration System for XML : STYX prototype • Implementation of the query rewriting and query decomposition algorithms • Web application
