240 likes | 378 Views
ISDSI 2009 – June 25th , 2009. Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*. Claudio Gennaro 1 , Federica Mandreoli 2,4 , Riccardo Martoglia 2 , Matteo Mordacchini 1 , Wilma Penzo 3,4 , and Simona Sassatelli 2
E N D
ISDSI 2009 –June 25th, 2009 Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS* Claudio Gennaro1, Federica Mandreoli2,4, Riccardo Martoglia2, Matteo Mordacchini1, Wilma Penzo3,4, and Simona Sassatelli2 1 ISTI – PI/CNR, Pisa 2 DII - Università degli Studi di Modena e Reggio Emilia 3DEIS - Università degli Studi di Bologna 4 IEIIT – BO/CNR, Bologna * This work is partially supported by the Italian co-founded Project NeP4B
Motivation • ICTsover the Web have become a strategic asset • Internet-based global market place where automatic cooperation and competition are allowed and enhanced NeP4B Project (Networked Peers for Business) • The aim: development of an advanced technological infrastructurefor SMEs to allow them to search for partners, exchange data and negotiate without limitations and constraints • The architecture: inspired by Peer Data Management Systems (PDMSs) ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
The Reference Scenario SELECT ?company WHERE { ?tile a Tile . ?tile company ?company . ?tile price ?price . ?tile origin ?origin . ?tile image ?image . FILTER ( (?price < 30) && (?origin = ‘Italy’) && LIKE (?image, ‘myimage.jpg’, 0.3))} LIMIT 100 • A peer: a single SME or a mediator • It keeps its data in an OWL ontology • Multimedia objects multimedia attributes in the ontology (e.g. image) • It is queried by exploiting a SPARQL-like query language similarity predicates (FILTER function LIKE ) Peer1 price image company ran dom dom dom Image origin Tile dom dom dom size material ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
The Reference Scenario (cont.) • Peers are connected by means of semantic mapping (with scores) • Peers collaborate in solving users queries Q’’ Q’ • Queries are formulated on the peer’s ontology • Answers can come from any peer that is connected through a semantic path of mappings Q How to effectively and efficiently answer a query? • By adopting effective and efficient query routing techniques • Flooding is not adequate • Overloadsthe network (traffic & computational effort) • Overwelmsthe querying peer (irrelevant results) ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Combining Semantic and Multimedia Query Routing Techniques • We leverage our distinct experiences on semantic [Mandreoli et al. WIDM 2006, Mandreoli et al. WISE 2007]and multimedia [Gennaro et al. DBISP2P 2007, Gennaro et al. SAC 2008]query routing • We propose to combine our approaches in order to design an innovative mechanism for a unified data retrieval • Two main aspects characterize our scenario: • the semantic heterogeneity of the peers’ ontologies • the execution of multimedia predicates • We pursue: • Effectiveness by selecting the semantically best suited subnetworks • Efficiency by promoting the network’s zones where potentially matching objects are more likely to be found, while pruning the others ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Outline • Motivation • Query Answering Semantics • Query Routing • Semantic Query Routing • Multimedia Query Routing • Combined Query Routing • Routing Strategies • Experimental Evaluation • Conclusions and Future Works ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Query Answering Semantics • Peer pihas an ontology Oi ={Ci1 , … , Cim} • A semantic mapping: a fuzzy relation M(Oi,Oj) OiOjwhere each instance (C,C’) has a membership grade μ(C,C’) [0,1] • Query formula: • f ::= <triple_pattern> <filter_pattern> • <triple_pattern> ::= triple | <triple_pattern> ∧ <triple_pattern> • <filter_pattern> ::= φ | <filter_pattern> ∧ <filter_pattern> | • <filter_pattern> ∨ <filter_pattern> | (<filter_pattern>) • φ is a relational (=,<,>, <=, >=) or similarity (~t) predicate Local query execution • evaluation of fon a local data instance i: s(f,i)[0,1] • s(f,i) = s( f(φ1, …, φn), i ) = sfunφ ( s(φ1,i), … , s(φn,i) ) • Boolean semantics for relational predicates • non-Boolean semantics for similarity predicates • t-norm ( resp. t-conorm) for scoring conjunctions (resp. disjunctions) • Local query answers: Ans(f,pi) = { (i,s(f,i)) | s(f,i)>0 } f Ans(f,pi) pi Oi ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Query Answering Semantics (cont.) (Ans(f,pj), s(f,pj)) One-step query reformulation (pipj) f • according to the mapping M(Oi,Oj) • s(f,pj) = sfunc ( μ(C1 ,C’1), … , μ(Cn,C’n) ) • sfuncis a t-norm M(Oi,Oj) pi pj • f undergoes a chain of reformulations ff1…fm • s(f, Pp…pm ) = sfunr (s(f, p1), s(f1, p2), … , s(fm-1, pm)) • sfunr is a t-norm Oj Oi Multi-step query reformulation (pp1…pm =Pp1…pm ) (Ans(f,pm), s(f,Pp…pm)) f fm f2 f1 p p1 pm … Query answering semantics • fsubmitted to p • P’ = {p1,…,pm} : set of accessed peers • Pp…pi : path used to reformulate f over each pi in P’ • Ans (f, {p} U P’) = Ans(f,p) U Ans(f, Pp…p1 ) U … U Ans(f, Pp1…pm ) • where : Ans(f, Pp…pi ) = (Ans(f,pi), s(f, Pp…pi )) ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Outline • Motivation • Query Answering Semantics • Query Routing • Semantic Query Routing • Multimedia Query Routing • Combined Query Routing • Routing Strategies • Experimental Evaluation • Conclusions and Future Works ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Semantic Query Routing • Whenever piforwards a query to one of its neighbors pj,the query might follow any of the semantic paths originating at pj, , i.e. in pj’ssubnetwork • A Generalized Semantic Mapping relates each each concept C in Oi to the set of concepts Csin Ojstaken from the mappings in Ppi…pjs according to an aggregated score which expresses the semantic similarity between C and Cs. • Main idea: introduction of a ranking approach for query routing which promotes pi’ neighbors whose subnetworksare the most semantically related to the query. Peer2s • Preliminaries: • pjs : the set of peers in the subnetwork rooted at pj • Ojs: set of schemas {Ojk | pjk in pjs} • Ppi…pjs: set of paths from pi to any peer in pj ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Semantic Query Routing (cont.) • Each peer p maintains a matrix named Semantic Routing Index (SRI) containing the membership grades given by the generalized semantic mappings between itself and its neighborhood Nb(p) • When a peer receives a query formula f, it exploits its SRI scores to determine a ranking for its neighborhood: • Rpsem (f)[pi] = sfunc(μ(C1,C1s), …, μ(Cn,Cns)) • SRI[i,j] represents how the j-th concept is semantically approximated by the subnetwork rooted at the i-th neighbor SRI-based Query Processing ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Outline • Motivation • Query Answering Semantics • Query Routing • Semantic Query Routing • Multimedia Query Routing • Combined Query Routing • Routing Strategies • Experimental Evaluation • Conclusions and Future Works ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Multimedia Query Routing • The execution of multimedia predicates is inherently costly (CPU and I/O) • Main idea:introduction of a ranking approach for query routing which promotes pi’ neighbors whose subnetworks contain the highest number of potentially matching objects. • Preliminaries • Each peer’s object is classified w.r.t. its distance (dissimilarity) to some reference objects(pivots) • E.g.: • object O, pivot P, with d(O, P) = 42 0 42 100 • Bit-vector 1 0 1 0 0 0 0 0 0 0 1 0 2 1 1 0 0 0 0 0 0 0 0 0 0 • Each peer maintains a condensed description of such a classification of its objects by using histograms (Peer Indices) } ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Multimedia Query Routing (cont.) • Each peer p also maintains a Multimedia Routing Index (MRI) containing the aggregated description of the resources available in its neighbors’ subnetworks • Any MRI row MRI(p,pis)is built by summing up the Peer Indices in the i-th neighbor’s subnetwork mink [QueryIdx(Q) * MRI(p,pis)] Rpmm(f)[pi]= Σj=1…n mink [QueryIdx(Q) * MRI (p,pjs)] MRI-based Query Processing • Similarity-based Range Queries over metrics objects • For each query Q, a vector representation QueryIdx(Q) is built by setting to 1 all the intervals covered by the requested range • When a peer p receives Q, it computes the percentage of potential matching objects in each neighbor’s subnetwork w.r.t the total objects in Nb(p): ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Outline • Motivation • Query Answering Semantics • Query Routing • Semantic Query Routing • Multimedia Query Routing • Combined Query Routing • Routing Strategies • Experimental Evaluation • Conclusions and Future Works ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Combined Query Routing • Both the semantic and multimedia scores induce a total order • They can be combinedby means of a meaningful aggregate function in order to obtain a global ranking: Rpcomb(f) = αRpsem(f) βRpmm(f) • We inspire to • Fagin, Lotem & Naor: “Optimal Aggregation Algorithms for Middleware”. Journal of Computer and System Sciences, 66: 47-58, 2003. • Optimal aggregation algorithms can only work with monotone aggregation functions • E.g. min, mean, sum… ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Combined Query Routing (cont.) E.g. SELECT ?company WHERE { ?tile a Tile . ?tile company ?company . ?tile price ?price . ?tile origin ?origin . ?tile image ?image . FILTER ( (?price < 30) && (?origin = ‘Italy’) && LIKE (?image, ‘myimage.jpg’, 0.3))} LIMIT 100 • Final ranking: Peer3-Peer2 • Peer3 roots the most promising subnetwork! ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Outline • Motivation • Query Answering Semantics • Query Routing • Semantic Query Routing • Multimedia Query Routing • Combined Query Routing • Routing Strategies • Experimental Evaluation • Conclusions and Future Works ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Routing Strategies • The adopted routing strategy determines the set of visited peers and induces an order on it • When a peer p receives Q it computes the ranking Rpcomb(f) on its neighbors • Different routing strategies relying on such ranking and having different performance priorities are possible: • Efficiency (Depth First Model) the best peer in one hop! • it exploits the only information provided by Rpcomb(f) • Effectiveness (GlobalModel) the best known peer! • it exploits the information provided by Up is visited Rpcomb(f)
Outline • Motivation • Query Answering Semantics • Query Routing • Semantic Query Routing • Multimedia Query Routing • Combined Query Routing • Routing Strategies • Experimental Evaluation • Conclusions and Future Works ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Experimental Evaluation Experimental setting • Simulation environment for SRI and MRI • Peers belong to different semantic categories • Ontologies: consisting of a small number of classes • Multimedia contents: hundreds of images taken from the Web, characterized by two MPEG7 standard features (scalable color & edge histogram) • Network topology: • randomly generated with the BRITE tool • in the size of few dozens of nodes • Queries: on randomlyselectedpeers • Routingstrategy: DF search • Aggregation function: mean • Stopping condition: number of retrieved results ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS • Queries: • on randomlyselectedpeers • Routingstrategy: • DF search • Aggregation function: • mean • Stopping condition: • number of retrieved results • Effectiveness evaluation • We measure the quality of the results (combined satisfaction) • Efficiency evaluation • We measure the number of performed hops
Effectiveness Evaluation • We measure the semantic quality of the obtained results (satisfaction) ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Outline • Motivation • Query Answering Semantics • Query Routing • Semantic Query Routing • Multimedia Query Routing • Combined Query Routing • Routing Strategies • Experimental Evaluation • Conclusions and Future Works ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Conclusions & Future Works • We presented a novel approach for processing queries effectively and efficiently in a distributed and heterogeneous environment, like the one of the NeP4B Project • We proposed an innovative query routing approach which exploits the semantic of the concepts in the peers’ontologies and the multimedia contents in the peers’repositories • We experimentally prove the effectiveness of our techniques with a series of exploratory tests • (In the future) we will: • perform new tests on larger and more complex scenarios • integrate our techniques in a more general framework for query routing (including latency, costs, etc.) ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS