
WISDOM (Web Intelligent Search based on DOMain ontologies): Demo

WISDOM (Web Intelligent Search based on DOMain ontologies): Demo. Sonia Bergamaschi [1] , Paolo Bouquet [2] , Paolo Ciaccia [3] , and Paolo Merialdo [4] [1] Università degli Studi di Modena e Reggio Emilia [2] Università degli Studi di Trento [3] Università degli Studi di Bologna


Presentation Transcript


  1. WISDOM (Web Intelligent Search based on DOMain ontologies): Demo. Sonia Bergamaschi [1], Paolo Bouquet [2], Paolo Ciaccia [3], and Paolo Merialdo [4]. [1] Università degli Studi di Modena e Reggio Emilia [2] Università degli Studi di Trento [3] Università degli Studi di Bologna [4] Università degli Studi Roma Tre. http://www.dbgroup.unimo.it/wisdom 6 December 2006

  2. Overview The WISDOM project aims at studying, developing and experimenting with methods and techniques for searching and querying data sources available on the Web. The goal of the project: definition of a software framework that allows computer applications to leverage the huge amount of information content offered by Web sources (typically, Web sites). The context: • the number of sources of interest might be extremely large • sources are independent of and autonomous from one another. These factors raise significant issues, in particular because such an information space implies heterogeneities at different levels of abstraction (format, logical, semantic). Providing effective and efficient methods for answering queries in such a scenario is the challenging task of the project.

  3. Overview In WISDOM, super-peers are built, each containing data from Web sources that refer to the same domain. Super-peers are connected by semantic mappings in a Super-peer Network. The end user formulates a query according to a specific super-peer. The answer will include data extracted from all the super-peers relevant to the query. From a functional point of view, the WISDOM project may be divided into two parts: A) Building a super-peer Network, where: • Web sources are grouped into super-peers; • Each super-peer exports a Semantic Peer Ontology synthesizing the knowledge of the involved sources; • The Semantic Peer Ontologies are related by means of simple semantic mappings. B) Querying a super-peer Network, where: • A graphical interface allows the user to formulate a query according to a semantic peer ontology; • The query is rewritten for each super-peer relevant to the answer; • The query is reformulated inside each super-peer according to the involved sources; • The query is locally executed and the results are provided to the user.

  4. Functional Flow-Diagram 1) Data from web-sites are extracted by means of wrappers 2) Data are annotated according to a lexical reference 3) A semantic peer ontology is created for each semantic peer 4) The semantic peer ontologies are related by means of mappings • http://dbgroup.unimo.it/wisdom/prototipi/D1.P4.html • http://dbgroup.unimo.it/wisdom/prototipi/D1.P5.html • http://dbgroup.unimo.it/wisdom/prototipi/D1.P1.html • http://dbgroup.unimo.it/wisdom/prototipi/D2.P1.html

  5. Building a Super-peer Network: the local sources We tested the system by using MOMIS (http://dbgroup.unimo.it/Momis), Road Runner (http://www.dia.uniroma3.it/db/roadRunner) and MELIS (http://dbgroup.unimo.it/wisdom-unimo/melis) to create a Super-peer Network composed of three peers, each one integrating 2-3 tourism Web sites: • Peer 1: bbitaly (http://www.bbitalia.it/default_eng.asp), touring (http://www.touring.it) • Peer 2: guidacampeggi (http://www.guidacampeggi.com), saperviaggiare (http://www.touringclub.com/ITA/viaggiatori/dove_mangiare), venere.com (http://www.venere.com) • Peer 3: bedandbreakfast (http://www.bed-and-breakfast.it), booking (http://www.booking.com), opificidigitali (http://www.opificidigitali.it)

  6. Demonstration Scenario Peer 1 abstracts the local classes hotels (bbitaly) and restaurants (touring) into the global classes hotels and restaurants.

  7. Demonstration Scenario Peer 2 abstracts the local classes hotels (venere), hotels (saperviaggiare), maps (venere), campings (guidacampeggi), facilities (guidacampeggi) and facilities (venere) into the global classes hotels, campings and facilities.

  8. Demonstration Scenario Peer 3 abstracts the local classes hotels (booking), judgement_hotel (booking), bedandbreakfast (bedandbreakfast), features (bedandbreakfast), restaurants (opificidigitali), features_bb (bedandbreakfast) and conditions_hotel (booking) into the global classes hotels, restaurants and features.

  9. Wrapping Web Sources Each super-peer is populated with data extracted by Road Runner: 1. Identify the sources (Web sites) to be wrapped 2. For each source, infer a site schema and collect the pages containing information of interest 3. From a set of sample pages, infer a wrapper library 4. Apply the wrapper library to the set(s) of pages collected in step 2. [Figure: 1 Identify sample pages; 2 Collect pages similar in structure to those of the sample set (All/InDesit: infer site schema); 3 Generate a wrapper library (wrapper generator); 4 Apply the wrapper library to extract data from pages (output data)]

  10. Wrapping Web Sources: the Demo The demonstration: • 11 Web sites, delivering information about hotels, campings, b&bs and restaurants • The Web site schema inference module (Indesit) was configured (when possible) to collect pages of interest from these sites • Indesit generated a Web schema for each Web site: the output description was used to collect about 8000 pages • 16 wrappers were inferred by means of the wrapper generation module RoadRunner • The extracted data were stored in 11 relational databases (one per source) • The Indesit Web schemas can be used to refresh the data

  11. Annotating Data Sources wrt a lexical reference MELIS: Meaning Elicitation and Lexical Integration System

  12. Annotating Data Sources wrt a lexical reference Meaning Elicitation Process For each element (class or property) in the Input Ontology, MELIS extracts all candidate senses from WordNet. After this step, it filters the candidate senses by using Domain Ontologies and a collection of heuristic rules. [Figure: elements of the Input Ontology and of the Domain Ontologies annotated with WordNet sense numbers: Hotel #1, B&B #1, Motel #1, Restaurant #2, Building #1, Edifice #2, Name #2, City #1, Address #3, Home page #1]
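
The two-step process on this slide (collect candidate senses, then filter them) can be illustrated with a minimal sketch. It does not reproduce MELIS: the sense inventory and the domain-ontology senses are small hypothetical dictionaries standing in for WordNet and for the Domain Ontologies, and the only heuristic shown is an assumed fallback that keeps every candidate when the domain ontology gives no hint.

```python
# Toy sense inventory standing in for WordNet (hypothetical data, not real WordNet content).
CANDIDATE_SENSES = {
    "hotel": [1],
    "name": [1, 2, 3],
    "address": [1, 2, 3, 4],
    "city": [1, 2],
}

# Senses already used by a (hypothetical) domain ontology for each lemma.
DOMAIN_SENSES = {
    "hotel": {1},
    "name": {2},
    "address": {3},
    "city": {1},
}


def elicit_meaning(label):
    """Return the candidate sense numbers kept for an ontology element label."""
    lemma = label.lower()
    candidates = CANDIDATE_SENSES.get(lemma, [])
    preferred = [s for s in candidates if s in DOMAIN_SENSES.get(lemma, set())]
    # Assumed heuristic: if the domain ontology does not help, keep every candidate.
    return preferred or candidates


annotations = {label: elicit_meaning(label) for label in ["Hotel", "Name", "Address", "City"]}
print(annotations)
```

With this toy data the result matches the sense numbers shown on the slide (Hotel #1, Name #2, Address #3, City #1).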

  13. Annotating Data Sources wrt a lexical reference OUTPUT Language: WISDOM-OWL, which combines OWL DL with DB annotations (db:PrimaryKey) and lexical annotations (lex:wnAnnotation): <owl:Class rdf:ID="BB.bed_and_breakfast"> <rdfs:label> bed_and_breakfast </rdfs:label> <db:PrimaryKey> BB.bed_and_breakfast.url </db:PrimaryKey> <lex:wnAnnotation rdf:parseType="Literal"> <lex:lemmaValue>bed_and_breakfast</lex:lemmaValue> <lex:lemmaSyntacticCategory>1</lex:lemmaSyntacticCategory> <lex:lemmaSenseNumber>1</lex:lemmaSenseNumber> </lex:wnAnnotation> </owl:Class>
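
As an illustration of how such an annotated class could be consumed, the sketch below reads the fragment with Python's standard `xml.etree.ElementTree`. The slide does not show namespace declarations, so the `db` and `lex` namespace URIs used here are placeholders (assumptions), and the surrounding `rdf:RDF` element is added only to make the fragment well-formed.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "db": "http://example.org/wisdom/db#",    # placeholder URI (assumption)
    "lex": "http://example.org/wisdom/lex#",  # placeholder URI (assumption)
}

xmlns = " ".join(f'xmlns:{prefix}="{uri}"' for prefix, uri in NS.items())
DOC = f"""<rdf:RDF {xmlns}>
  <owl:Class rdf:ID="BB.bed_and_breakfast">
    <rdfs:label> bed_and_breakfast </rdfs:label>
    <db:PrimaryKey> BB.bed_and_breakfast.url </db:PrimaryKey>
    <lex:wnAnnotation rdf:parseType="Literal">
      <lex:lemmaValue>bed_and_breakfast</lex:lemmaValue>
      <lex:lemmaSyntacticCategory>1</lex:lemmaSyntacticCategory>
      <lex:lemmaSenseNumber>1</lex:lemmaSenseNumber>
    </lex:wnAnnotation>
  </owl:Class>
</rdf:RDF>"""


def read_annotations(xml_text):
    """Extract id, label, primary key and WordNet sense for each owl:Class."""
    root = ET.fromstring(xml_text)
    result = []
    for cls in root.findall("owl:Class", NS):
        wn = cls.find("lex:wnAnnotation", NS)
        result.append({
            "id": cls.get(f"{{{NS['rdf']}}}ID"),
            "label": cls.findtext("rdfs:label", default="", namespaces=NS).strip(),
            "primary_key": cls.findtext("db:PrimaryKey", default="", namespaces=NS).strip(),
            "lemma": wn.findtext("lex:lemmaValue", namespaces=NS),
            "sense": int(wn.findtext("lex:lemmaSenseNumber", namespaces=NS)),
        })
    return result


classes = read_annotations(DOC)
print(classes[0])
```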

  14. Building a Super-Peer Ontology Super-peer Ontologies were built by means of the MOMIS system, extended for the specific purposes of the project. In particular, techniques for adding/removing sources to/from a created ontology without restarting the process from scratch were introduced. The MOMIS process for building a domain ontology is based on the following steps:

  15. Building Peer 2 Peer 2 was created by integrating the local sources venere (493 hotels), saperviaggiare (977 hotels) and guidacampeggi (183 campings). Source venere, local classes: hotels, maps, facilities

  16. Building Peer 2 Source guidacampeggi, local classes: campings, facilities. Source saperviaggiare, local class: hotels

  17. Peer 2 Ontology

  18. Creating inter-peer mappings For each node of a peer that is compatible with elements of other peers, we create a Mapping Element that describes the relationship between them. [Figure: the annotated Hotel #1 classes of Booking (PEER 3), BBItaly (PEER 1) and Venere (PEER 2), with attributes such as Name #2, Address #3, City #1, URL #1, Phone Number #1, e-mail #1, Logo #1 and Price #4, linked by mapping elements]

  19. Creating inter-peer mappings OUTPUT Mappings:
      <MappingElement> <MappingElementID>mappingElement#3</MappingElementID> <MappingType>Datatype2Datatype</MappingType> <SourceElement> </SourceElement> <TargetElement> </TargetElement> <Relations> <SemanticRelation> <RelationType>equivalent</RelationType> <RelationGrade>1</RelationGrade> </SemanticRelation> <RelationMeasure> </RelationMeasure> </Relations> </MappingElement>
      Detail of a source element:
      <SourceElement> <SourceElementID>Peer1.hotels.address </SourceElementID> <SourceElementLabel>address </SourceElementLabel> <AtomicMeaning>address#9#0#C </AtomicMeaning> <ContextualMeaning>address#9#0#C </ContextualMeaning> <DictionaryID>wordnet21</DictionaryID> <Senses> <Sense>107938889</Sense> </Senses> </SourceElement>
      Detail of the relational measure:
      <RelationalMeasure> <LessGeneralThan>0.0</LessGeneralThan> <MoreGeneralThan>0.0</MoreGeneralThan> <Equivalent> <SameGranularity>1.0</SameGranularity> <LowerGranularityThan>0.0</LowerGranularityThan> <HigherGranularityThan>0.0</HigherGranularityThan> </Equivalent> <Disjoint>0.0</Disjoint> <Overlapping>0.0</Overlapping> </RelationalMeasure>
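
To show how the relational measure of a mapping might be read, the following sketch flattens the `RelationalMeasure` element from this slide into a dictionary of scores and picks the highest one. Choosing the maximum score as "the" relation is an assumption of this example, not something the slides state.

```python
import xml.etree.ElementTree as ET

MEASURE_XML = """<RelationalMeasure>
  <LessGeneralThan>0.0</LessGeneralThan>
  <MoreGeneralThan>0.0</MoreGeneralThan>
  <Equivalent>
    <SameGranularity>1.0</SameGranularity>
    <LowerGranularityThan>0.0</LowerGranularityThan>
    <HigherGranularityThan>0.0</HigherGranularityThan>
  </Equivalent>
  <Disjoint>0.0</Disjoint>
  <Overlapping>0.0</Overlapping>
</RelationalMeasure>"""


def flatten_measures(element, prefix=""):
    """Map each leaf tag (qualified by its parent path) to its float score."""
    scores = {}
    for child in element:
        path = f"{prefix}{child.tag}"
        if len(child):  # nested group such as <Equivalent>
            scores.update(flatten_measures(child, prefix=path + "."))
        else:
            scores[path] = float(child.text)
    return scores


scores = flatten_measures(ET.fromstring(MEASURE_XML))
# Assumed selection rule: the relation with the highest score wins.
best_relation = max(scores, key=scores.get)
print(best_relation, scores[best_relation])
```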

  20. Querying a Super-Peer Network Querying the Super-Peer Network involves: • Formulating the query at a peer on the ontology local to that peer • Rewriting the query according to neighboring peers’ ontologies using the semantic mappings • Selecting the peers that are more relevant to the query (using content summaries and semantic information about the rewritings) • Sending the rewritten query to the relevant neighboring peers • Translating the query to execute it on the local sources • This involves relaxing preference expressions that are not directly manageable by the underlying query executor • Executing the query locally (by translating it using the local mappings) and sending the results to the local query processor • Collecting the results from both the local sources and the neighboring peers • Building the final result • This may involve performing additional computations to enforce the original preference relation that was relaxed to be performed locally • Presenting the result to the user

  21. Functional Flow-Diagram At peer p0, the user formulates a request using the M-FIRE interface, which produces a query Q in a SPARQL-like syntax. The semantic parser translates the query into an internal format that can be easily manipulated by the query processor (QP). The QP rewrites the query with respect to the local ontology (to send the query to the MOMIS Query Manager) or to a remote ontology (to send it to the QP of a neighboring peer). [Figure: the GUI (M-FIRE) produces Q; the Semantic Parser produces q; the Query Processor sends q0 to the MOMIS Query Manager (GVV0, L0) and qj to the QP of peer pj]

  22. Query Formulation with M-FIRE The user interface prototype The query formulation prototype implements the M-FIRE framework and includes two components: a client component for visual rendering and user interaction handling, and a server component implementing the M-FIRE representation and navigation engines. Metaphors M-FIRE makes it possible to define declaratively how a given RDF document is to be represented graphically, by supplying metaphors as parameters to a generic representation engine. Metaphors also determine how the user's actions on the delivered representation are translated into queries over the underlying knowledge base. Our prototype includes two metaphors, featuring: • Two alternative ways of representing the ontology schema • One single way of formulating conjunctive queries on the ontology schema • One single table-like view of the query results

  23. Query Formulation with M-FIRE Ontology schema representation After selecting a knowledge base to be explored and a metaphor for its presentation, the user is provided with a representation of the ontology schema. This is how the two alternative metaphors represent a schema. First metaphor: • Classes are rendered as tables, where the left pane shows an intuitive icon for the class and the right pane lists the properties which apply to that class • Each datatype property has a light yellow background and a black font • Each object property has a yellow background and a dark red font; moreover, on the left of the property name, an icon is shown for each class which is in the range of the property. Second metaphor: • Classes are rendered as tables, where the upper pane shows an intuitive icon for the class on the left of the class name and the lower pane lists the properties which apply to that class • Each datatype property has a cyan background, is written in italic and has a black font; on the right side, the name of each class in the property range is shown • Each object property has a light cyan background and a blue font; moreover, on the right side of the property name, the name of each class in the property range is shown

  24. Query Formulation with M-FIRE Query formulation Conjunctive queries are formulated by left-clicking or right-clicking on the properties to be included in the result (projection) or for which filters are to be specified (selection) • A left click on a datatype property selects that property for output (object properties cannot be selected for output) • A left click on an object property means that the clicked property will be used to perform a join between two classes (each class can participate in only one join with each other class) • A right click on a datatype property lets the user express a filter on that property (the comparison operator and the target value may be specified through a dialog box)
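
The click semantics above can be sketched as a small query builder. This is an illustrative model only, not the M-FIRE implementation: the class `QueryBuilder`, its method names and the exact rendering of the SPARQL-like text are hypothetical, and joins through object properties are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class QueryBuilder:
    """Collects click events on one class and renders a SPARQL-like conjunctive query."""
    class_name: str
    outputs: list = field(default_factory=list)  # datatype properties chosen by left click
    filters: list = field(default_factory=list)  # (property, operator, value) from right clicks

    def left_click(self, prop):
        # Projection: select the datatype property for output (once).
        if prop not in self.outputs:
            self.outputs.append(prop)

    def right_click(self, prop, operator, value):
        # Selection: record a filter on the datatype property.
        self.filters.append((prop, operator, value))

    def render(self):
        # Every property that is projected or filtered needs a triple pattern.
        props = list(self.outputs)
        for prop, _, _ in self.filters:
            if prop not in props:
                props.append(prop)
        patterns = " ; ".join(f"{self.class_name}.{p} ?{p}" for p in props)
        select = " ".join(f"?{p}" for p in self.outputs)
        query = f"SELECT {select} WHERE {{ {self.class_name} {patterns} ."
        if self.filters:
            conds = " && ".join(f"(?{p} {op} {v!r})" for p, op, v in self.filters)
            query += f" FILTER ({conds}) ."
        return query + " }"


builder = QueryBuilder("hotels")
builder.left_click("name")
builder.left_click("price")
builder.right_click("city", "=", "Rimini")
builder.right_click("price", "<", 80)
query_text = builder.render()
print(query_text)
```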

  25. Input to the Query Processor Queries with joins are formulated similarly to queries without joins, by specifying the correct path between join properties through mouse clicks. The query is finally passed to the local query processor.

  26. Query Processor Components Internally, the QP ranks the rewritten queries in order to execute only the “best” ones. Then, the optimizer produces the actual queries that will be sent to the local executor (e.g., by “relaxing” some preferences) or to neighboring peers. [Figure: components of the QP: Q (SPARQL-like) over Ont0, Semantic Parser, q (internal form), Rewriter (rq1 ... rqM), Ranker (rq1 ... rqk), Optimizer (Plan Treej with qj,0 ... qj,i ... qj,N), using Semantic Mappings, Peers' Metadata and Content Summaries] [Figure: RDF schema of a sample ontology with the classes Restaurant and Rating, properties such as name, phone, price, cuisine, closing-day, has-smoking-area, rates, guide and value, and the datatypes weekDay, cuisineType, xsd:boolean, xsd:string and xsd:integer]

  27. Inter-Peer Query Rewriting Mapping extension with scores RelationGrade: measures the similarity among the corresponding elements

  28. Inter-Peer Query Rewriting Query reformulation example. Source: Superpeer2
      BASE <http://www.wisdom.net/ontology#> SELECT ?N ?A ?C ?P ?S FROM Peer2 WHERE {Peer2.hotels Peer2.hotels.name ?N ; Peer2.hotels.address ?A ; Peer2.hotels.city ?C ; Peer2.hotels.price ?P ; Peer2.hotels.url ?HURL . Peer2.services Peer2.services.faciltity ?S ; Peer2.services.url ?SURL . FILTER ((?HURL=?SURL) && (?HURL='Rimini') && (?P>50) && (?P<80) && (?S = 'air conditioning')). } PREFERRING min(?P) LIMIT 50
      Output of the Rewriter. Target: Superpeer3
      BASE <http://www.wisdom.net/ontology#> SELECT ?N ?P ?S FROM Peer3 WHERE {Peer3.hotels Peer3.hotels.name ?N ; Peer3.hotels.price_single ?P ; Peer3.hotels.url ?HURL . Peer3.features Peer2.features.url ?SURL . FILTER ((?HURL=?SURL) && (?HURL='Rimini') && (?P>50) && (?P<80)). } PREFERRING min(?P) LIMIT 50
      UNION
      BASE <http://www.wisdom.net/ontology#> SELECT ?N ?P ?S FROM Peer3 WHERE {Peer3.hotels Peer3.hotels.name ?N ; Peer3.hotels.price_double ?P ; Peer3.hotels.url ?HURL . Peer3.features Peer2.features.url ?SURL . FILTER ((?HURL=?SURL) && (?HURL='Rimini') && (?P>50) && (?P<80)). } PREFERRING min(?P) LIMIT 50
      UNION
      BASE <http://www.wisdom.net/ontology#> SELECT ?N ?P ?S FROM Peer3 WHERE {Peer3.hotels Peer3.hotels.name ?N ; Peer3.hotels.price_triple ?P ; Peer3.hotels.url ?HURL . Peer3.features Peer2.features.url ?SURL . FILTER ((?HURL=?SURL) && (?HURL='Rimini') && (?P>50) && (?P<80)). } PREFERRING min(?P) LIMIT 50
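
In this example one source property (the Peer2 price) corresponds to three target properties (price_single, price_double, price_triple), which is why the rewriting is a union of three queries. A minimal sketch of that expansion follows; the mapping table, the grades and the rule of multiplying grades into a score are all assumptions made for illustration, and the sketch works on property lists rather than on full query text.

```python
from itertools import product

# Hypothetical mappings from Peer2 properties to Peer3 properties, each with an assumed grade.
MAPPINGS = {
    "Peer2.hotels.name": [("Peer3.hotels.name", 1.0)],
    "Peer2.hotels.price": [
        ("Peer3.hotels.price_single", 0.8),
        ("Peer3.hotels.price_double", 0.8),
        ("Peer3.hotels.price_triple", 0.8),
    ],
    "Peer2.hotels.url": [("Peer3.hotels.url", 1.0)],
}


def rewrite(source_properties, mappings):
    """Return one rewriting per combination of target properties.

    Source properties without a mapping are dropped. Each rewriting is a
    (target_properties, score) pair; the score is the product of the grades
    of the mappings used (an assumed combination rule).
    """
    mapped = [p for p in source_properties if p in mappings]
    rewritings = []
    for combo in product(*(mappings[p] for p in mapped)):
        targets = [target for target, _ in combo]
        score = 1.0
        for _, grade in combo:
            score *= grade
        rewritings.append((targets, score))
    return rewritings


source = ["Peer2.hotels.name", "Peer2.hotels.address", "Peer2.hotels.price", "Peer2.hotels.url"]
union = rewrite(source, MAPPINGS)
for targets, score in union:
    print(score, targets)
```

The unmapped property (address) disappears from the rewritings, much as ?A and ?C do not appear in the target queries on the slide.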

  29. Query reformulation: details of target output [Screenshot callouts: target schema; global union score; 1st rewritten query; 1st query score and percentages; 1st query terms rewriting details]

  30. Inter-Peer Query Rewriting The Ranker component of the query processor ranks all the available rewritings (according to user preferences) in order to only execute the “best” ones (e.g., in order to maximize the number of retrieved results, or the semantic similarity of the rewritten query wrt the original one)
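
A ranking step of this kind can be sketched as a weighted top-k selection. The scoring formula below (a weighted sum of semantic similarity and normalized expected result count) and the field names are assumptions chosen to mirror the two criteria mentioned on the slide; the actual Ranker may work differently.

```python
import heapq


def rank_rewritings(rewritings, k, weight_similarity=0.5):
    """Keep the k best rewritings.

    Each rewriting is a dict with 'similarity' (0..1) and 'expected_results'
    (an estimated result count). The weight expresses the user preference
    between semantic similarity and number of retrieved results.
    """
    max_results = max((r["expected_results"] for r in rewritings), default=0) or 1

    def score(r):
        coverage = r["expected_results"] / max_results
        return weight_similarity * r["similarity"] + (1 - weight_similarity) * coverage

    return heapq.nlargest(k, rewritings, key=score)


candidates = [
    {"id": "rq1", "similarity": 0.9, "expected_results": 10},
    {"id": "rq2", "similarity": 0.6, "expected_results": 100},
    {"id": "rq3", "similarity": 0.3, "expected_results": 20},
]
balanced = [r["id"] for r in rank_rewritings(candidates, k=2)]
by_similarity = [r["id"] for r in rank_rewritings(candidates, k=2, weight_similarity=1.0)]
print(balanced, by_similarity)
```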

  31. Inter-Peer Query Forwarding The QP of the neighboring peer receives the query, solves it locally and (possibly) forwards it to its neighbors (care is taken to ensure that each peer only receives a query once).
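
The slide does not say how duplicate deliveries are avoided. One common way to obtain the "each peer only receives a query once" property is to track the peers already reached, as in this centralized simulation; the topology and the function name are hypothetical.

```python
from collections import deque

# Hypothetical neighbor lists of a small super-peer network.
NEIGHBORS = {
    "p0": ["p1", "p2"],
    "p1": ["p0", "p2"],
    "p2": ["p0", "p1", "p3"],
    "p3": ["p2"],
}


def forward_query(start, neighbors):
    """Return the order in which peers receive the query (breadth-first)."""
    seen = {start}          # peers that already received the query
    queue = deque([start])
    order = []
    while queue:
        peer = queue.popleft()
        order.append(peer)  # the peer solves the query locally here
        for neighbor in neighbors.get(peer, []):
            if neighbor not in seen:  # never deliver the same query twice
                seen.add(neighbor)
                queue.append(neighbor)
    return order


delivery_order = forward_query("p0", NEIGHBORS)
print(delivery_order)
```

In a real distributed setting each peer would more likely remember the identifiers of the queries it has already processed, since no single node sees the whole network.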

  32. Query Reformulation for Local Execution The Optimizer component of the QP translates the rewritten query into SQL (e.g., preference expressions are relaxed into ORDER BY clauses). Finally, the Executor sends the query to the local query executor (MOMIS Query Manager) and waits for the results.
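
The relaxation of a preference expression into an ORDER BY clause can be sketched as follows. Only the `PREFERRING min(?X)` / `max(?X)` form that appears in this deck is handled; the function name and the variable-to-column dictionary are hypothetical.

```python
import re

PREFERRING = re.compile(r"PREFERRING\s+(min|max)\(\?(\w+)\)", re.IGNORECASE)


def relax_preference(preferring_clause, column_of):
    """Translate 'PREFERRING min(?P)' into an SQL ORDER BY clause.

    column_of maps query variables (without '?') to SQL column names.
    """
    match = PREFERRING.fullmatch(preferring_clause.strip())
    if match is None:
        raise ValueError(f"unsupported preference: {preferring_clause!r}")
    direction = "ASC" if match.group(1).lower() == "min" else "DESC"
    return f"ORDER BY {column_of[match.group(2)]} {direction}"


order_by = relax_preference("PREFERRING min(?P)", {"P": "H.price"})
print(order_by)
```

Ordering is weaker than the original preference, which is why the final result may need extra computation once all partial results have been collected.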

  33. Local Query Execution The MOMIS Query Manager reformulates the query taking into account the intra-peer mappings defined in a semantic peer between the local classes and the global classes of the GVV (Global Virtual View). The mappings are defined by using a GAV (Global as View) approach: each global class of the GVV is expressed by means of the full-disjunction operator over the local classes. • Query rewriting • GAV approach: the query is processed by means of unfolding • Fusion and Reconciliation of the local answers into the global answer • Object Identification: join conditions among local classes • Inconsistencies: resolution functions to deal with conflicts

  34. Query rewriting and execution [Figure: the query q0 = scqG1 join scqG2 on the Global Virtual View (GVV) is split into the single class queries scqG1 (global class Hotels) and scqG2 (global class Services). scqG1 is unfolded into the local queries L1scqG1, L2scqG1 and L3scqG1 (local classes hotels, hotels and map_hotels); scqG2 is unfolded into L1scqG2 and L2scqG2 (local classes facilities and facilities). The local queries are executed on the local schemas of the sources SAPERVIAGGIARE, GUIDACAMPEGGI and VENERE]

  35. Query rewriting
      q0 = SELECT H.name, H.address, H.city, H.price, S.facility, S.structure_name, S.structure_city FROM hotels as H, services as S WHERE H.city = S.structure_city and H.name = S.structure_name and H.city = 'rimini' and H.price > 50 and H.price < 80 and S.facility = 'air conditioning' order by H.price
      SINGLE CLASS QUERIES
      scqG1 = SELECT H.name , H.address , H.city , H.price FROM hotels as H WHERE (H.city = 'rimini' ) and (H.price > 50) and (H.price < 80)
      scqG2 = SELECT S.facility , S.structure_name , S.structure_city FROM services as S WHERE (S.facility = 'air conditioning')
      UNFOLDING of scqG1
      L1scqG1 = SELECT hotels.name, hotels.address, hotels.city FROM hotels WHERE (city) = ('rimini')
      L2scqG1 = SELECT maps_hotels.hotels_name2, maps_hotels.hotels_city FROM maps_hotels WHERE (hotels_city) = ('rimini')
      L3scqG1 = SELECT hotels.name2, hotels.address, hotels.price, hotels.city FROM hotels WHERE ((city) = ('rimini') and ((price) > (50) and (price) < (80)))
      UNFOLDING of scqG2
      L1scqG2 = SELECT facilities_hotels.hotel_name2, facilities_hotels.hotels_city, facilities_hotels.facility FROM facilities_hotels WHERE (facility) = ('air conditioning')
      L2scqG2 = SELECT facilities_campings.campings_name, facilities_campings.campings_city, facilities_campings.name FROM facilities_campings WHERE (name) = ('air conditioning')
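
The unfolding of scqG1 can be approximated with a short sketch: for each local class, only the global attributes that the class actually provides are projected, and only the predicates on those attributes are pushed down. The mapping table below is reconstructed from the column names visible on the slides and should be read as an assumption; the function is not the MOMIS Query Manager.

```python
# Assumed mapping table for the global class "hotels":
# local class -> {global attribute: local column}.
MAPPING_TABLE = {
    "saperviaggiare.hotels": {"name": "name", "address": "address", "city": "city"},
    "venere.maps_hotels": {"name": "hotels_name2", "city": "hotels_city"},
    "venere.hotels": {"name": "name2", "address": "address", "price": "price", "city": "city"},
}


def unfold(select_attrs, predicates, mapping_table):
    """Unfold a single class query into one SQL query per local class.

    predicates is a list of (global_attribute, operator, sql_literal) triples.
    """
    local_queries = {}
    for local_class, columns in mapping_table.items():
        table = local_class.split(".")[-1]
        # Project only the attributes this local class can provide.
        projected = [f"{table}.{columns[a]}" for a in select_attrs if a in columns]
        # Push down only the predicates on attributes this local class has.
        conditions = [f"({columns[a]}) {op} ({lit})" for a, op, lit in predicates if a in columns]
        sql = f"SELECT {', '.join(projected)} FROM {table}"
        if conditions:
            sql += " WHERE " + " and ".join(conditions)
        local_queries[local_class] = sql
    return local_queries


local_queries = unfold(
    ["name", "address", "city", "price"],
    [("city", "=", "'rimini'"), ("price", ">", "50"), ("price", "<", "80")],
    MAPPING_TABLE,
)
for source, sql in local_queries.items():
    print(source, "->", sql)
```

Predicates that a source cannot evaluate (the price range for the first two local classes) are simply not pushed down in this sketch.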

  36. Fusion and Reconciliation [Figure: the partial results returned by SAPERVIAGGIARE, GUIDACAMPEGGI and VENERE are the result sets of L1scqG1, L2scqG1, L3scqG1, L1scqG2 and L2scqG2. scqG1 result set = L1scqG1 full join L2scqG1 full join L3scqG1; scqG2 result set = L1scqG2 full join L2scqG2; q0 result set = scqG1 join scqG2]

  37. Fusion and Reconciliation
      scqG1 result set = L1scqG1 full join L2scqG1 full join L3scqG1:
      saperviaggiare.hotels full outer join venereEn.hotels on ( ((venereEn.hotels.name2) = (saperviaggiare.hotels.name) AND (venereEn.hotels.city) = (saperviaggiare.hotels.city))) full outer join venereEn.maps_hotels on ( ((venereEn.maps_hotels.hotels_name2) = (saperviaggiare.hotels.name) AND (venereEn.maps_hotels.hotels_city) = (saperviaggiare.hotels.city)) OR ((venereEn.maps_hotels.hotels_name2) = (venereEn.hotels.name2) AND (venereEn.maps_hotels.hotels_city) = (venereEn.hotels.city)))
      scqG2 result set = L1scqG2 full join L2scqG2:
      guidacampeggi.facilities full outer join venere.facilities on ( (venere.facilities.facility) = (guidacampeggi.facilities.name) AND (venere.facilities.hotels_city) = (guidacampeggi.facilities.campings_city) AND (venere.facilities.hotel_name2) = (guidacampeggi.facilities.campings_name))
      q0 = scqG1 result set join scqG2 result set:
      SELECT H.name , H.address , H.city , H.price , S.facility , S.structure_name , S.structure_city FROM hotels as H , facilities as S WHERE (H.city = S.structure_city ) AND (H.name = S.structure_name ) ORDER BY H.price ASC
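
The fusion step can be illustrated in plain Python as a full outer join on the object-identification key (name, city), with a very simple resolution function. The sample rows are invented, the "first non-missing value wins" rule is an assumption, and the sketch presumes that the key is unique within each source and that local columns have already been renamed to the global attribute names.

```python
def full_outer_join(left, right, key):
    """Full outer join of two lists of dicts on the given key attributes.

    Rows sharing a key are fused into one record. Conflicts are reconciled by
    keeping the first non-None value seen (an assumed resolution function).
    Attributes missing from every source are simply absent from the result.
    """
    def key_of(row):
        return tuple(row[k] for k in key)

    merged = {}
    for row in left + right:
        target = merged.setdefault(key_of(row), {})
        for attr, value in row.items():
            if target.get(attr) is None:
                target[attr] = value
    return list(merged.values())


# Invented partial results, already expressed with global attribute names.
saperviaggiare_hotels = [
    {"name": "Hotel Aurora", "city": "rimini", "address": "Via Roma 1"},
]
venere_hotels = [
    {"name": "Hotel Aurora", "city": "rimini", "price": 60},
    {"name": "Hotel Bella", "city": "rimini", "price": 75},
]

fused = full_outer_join(saperviaggiare_hotels, venere_hotels, key=("name", "city"))
for record in fused:
    print(record)
```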

  38. Local Query Execution The MOMIS Query Manager at work

  39. Building the Final Result Local results are forwarded by MOMIS to the query processor. The Executor component also retrieves results from neighboring peers, computes the overall result by taking into account the original user preferences, and forwards it to the M-FIRE interface.

  40. Showing Results in M-FIRE Results are finally shown in M-FIRE using a table-based form: • Solutions are listed vertically, each one with its own table • For each solution, variable bindings are listed vertically • For each binding a row is provided, where the property name corresponding to the binding variable is shown on the right side, and the (literal) value is shown on the left side
