
IST-2001-34825

IST-2001-34825. Technique for query answering in the context of one Brokering Agent Domenico Beneventano. Summary. The Mechanical scenario Brokering Agent (BA) Ontology, SINode Ontologies, Data Source Schemata Query Management How to write a query? How to answer a query?


Presentation Transcript


  1. IST-2001-34825 Technique for query answering in the context of one Brokering Agent Domenico Beneventano

  2. Summary • The Mechanical scenario • Brokering Agent (BA) Ontology, SINode Ontologies, Data Source Schemata • Query Management • How to write a query? • How to answer a query? • Final release of the prototype for Query Management in the context of one Brokering Agent

  3. The Mechanical Scenario Brokering Agent GVV Mapping m1 SINode GVVs Mapping m2 Source Schemata

  4. BA and SINode Ontologies: example of mappings BA GVV Mapping Table of Company (mapping m2) SINode SN2 SINode SN1 Mapping Table of SN2.Company (mapping m1) Source Schemata S1.aziende(ID,INDIRIZZO, ... ) S2.Company(COMPANY_ID, REGION, …) S3.Company(COMPANY_ID, ADDRESS, …) • Source S1 : TUTTOSTAMPI Source S2: DEFORMAZIONE Source S3: SUBFORN

  5. END USER QUERY TOOL The Query Management BROKERING AGENT QUERY AGENT PLAY MAKER EXPANDER BAOntology Give me the subcontracting companies in Veneto with a big capital stock in the plastic and rubber sector Librarian UNFOLDER SINodeAgent2 SEWASIE_DB SINodeAgent1

  6. End-User Query Tool • The query interface is meant to support a user in formulating a precise query – one which best captures her/his information needs – even in the case of complete ignorance of the vocabulary of the underlying information system holding the data • The final purpose of the tool is to generate a conjunctive query ready to be executed by the evaluation engine associated with the information system

  7. The role of the Ontology for the End-User • The intelligence of the interface is driven by an ontology describing the domain of the data in the information system • The ontology defines a vocabulary which is richer than the logical schema of the underlying data, and it is meant to be closer to the user’s rich vocabulary • The user can exploit the ontology’s vocabulary to formulate the query, and she/he is guided by such a richer vocabulary in order to understand how to express her/his information needs more precisely

  8. Intentional Navigation • It helps an unskilled user during query formulation by overcoming problems related to the lack of schema comprehension • Queries can be specified through an iterative refinement process supported by the ontology • Users may specify their requests using generic terms, refine some terms of the query or introduce new terms, and iterate the process • Users explore and discover general information about the domain, as classification gives an explicit meaning to a query and to its subparts

  9. Query END USER QUERY TOOL The Query Management BROKERING AGENT QUERY AGENT PLAY MAKER EXPANDER BAOntology Give me the subcontracting companies in Veneto with a big capital stock in the plastic and rubber sector Librarian UNFOLDER SINodeAgent2 SEWASIE_DB SINodeAgent1

  10. WP6: The End-User Query Tool summary • Technical challenges • A logic based framework • Reasoning support • Use of web standards • Innovation • A novel query formation paradigm • The role of the ontology • A linear paradigm for easy query formulation • Multi-language support

  11. Query Management: functional architecture

  12. Query Management: the three main components • The Query Agent – coordination of query processing. Accepts the query from the End User Query Tool, interacts with both the BA and the SINode Agents, and returns the result to the End User Query Tool • The Playmaker – reformulation w.r.t. m1. One of the modules of the Brokering Agent (BA): accepts a query and reformulates it according to the semantics of the BA Ontology • The SINode Query Manager – reformulation w.r.t. m2. One of the modules of the SINode Agent: accepts a query and reformulates it according to the semantics of the SINode Ontology, and returns the result to the Query Agent (Diagram: Brokering Agent GVV, Mapping m1, SINode GVVs, Mapping m2, Source Schemata)

  13. The playmaker: EXPANDER + UNFOLDER • EXPANDER (by UNIROMA) • Query expansion: The query is expanded by taking into account the constraints in the BA-GVV: all constraints in the ontology are “compiled into” the expansion, so that the expanded query (EXPQuery) can be processed by ignoring constraints – this is the first technique of this kind in the data integration literature, as all other approaches to GAV (Global as View) data integration are based on just unfolding (which is an incomplete technique in our case!) • Subquery identification: Relevant subqueries (EXPAtoms) are extracted from the expanded query. An EXPAtom is a Single Class Query, i.e., a query on a single Global Class of the BA-GVV. • UNFOLDER (by UNIMO) • Query unfolding: Each EXPAtom is unfolded by taking into account the mappings in the BA Ontology, so that it is rewritten w.r.t. the SINode GVVs. The unfolding is performed on the basis of the full disjunction operator, used to perform Object Fusion. The output is a SQL query (FDQuery) which computes the full disjunction; the atoms of FDQuery (FDAtoms) are Single Class Queries over the SINode GVV • Resolution Functions: the Resolution Functions needed to deal with conflicts among the attributes involved in the query are identified

  14. Query END USER QUERY TOOL The playmaker: EXPANDER BROKERING AGENT QUERY AGENT PLAY MAKER Query EXPANDER Expanded Query: EXPQuery BAOntology ExpAtoms Librarian UNFOLDER scq1: SELECT CATEGORY_ID FROM Mould_Making scq2: SELECT NAME,COMPANY_ID,CAPITAL_STOCK, REGION,SUBCONTRACTOR,ADDRESS FROM company WHERE CAPITAL_STOCK > 50 AND REGION LIKE 'VENETO' AND SUBCONTRACTOR LIKE 'yes' scq3: ... EXPQuery: SELECT r2.NAME,r2.ADDRESS,r2.NATION FROM scq1 r1,scq2 r2,scq3 r3 WHERE r1.CATEGORY_ID=r3.CATEGORY_ID AND r2.COMPANY_ID=r3.COMPANY_ID UNION SELECT r2.NAME,r2.ADDRESS,r2.NATION FROM scq4 r1,scq2 r2,scq3 r3 WHERE … UNION … SINodeAgent2 SEWASIE_DB SINodeAgent1

  15. END USER QUERY TOOL The playmaker : UNFOLDER scq2: SELECT NAME,COMPANY_ID,CAPITAL_STOCK, REGION,SUBCONTRACTOR,ADDRESS FROM company WHERE CAPITAL_STOCK > 50 AND REGION LIKE 'VENETO' AND SUBCONTRACTOR LIKE 'yes' BROKERING AGENT Full Disjunction: FDQuery: SELECT * FROM FDAtom1 OUTER JOIN FDAtom2 ON (FDAtom1.COMPANY_ID = FDAtom2.COMPANY_ID) QUERY AGENT PLAY MAKER Query Query EXPANDER Expanded Query: EXPQuery BAOntology ExpAtoms Librarian ExpAtoms Unfolding: FDQuery, FDAtoms, ResFunctions UNFOLDER • FDAtom2: • SELECT COMPANY_ID,NAME,REGION, ADDRESS, SUBCONTRACTOR FROM company WHERE ((REGION) like ('VENETO') and (SUBCONTRACTOR) like ('yes')) • FDAtom1: • ... Resolution Function: precedence(${SI-NMAgent2.company.ADDRESS},${SI-NMAgent1.company.ADDRESS}) SewasieRepository

  16. Object Fusion: Object Identification • Object fusion: grouping together information about the same real-world object stored in different sources (SINodes). • Merging data from different sources requires different representations of the same real-world object to be identified; this process is called object identification • In our system the object identification problem is solved by defining • Join Conditions among classes of the same Global Class. A Join Condition can be a generic expression, defined by using SQL or external functions. • In this prototype a simple equality condition is implemented. For example: JC(L1,L2) : L1.COMPANY_ID = L2.COMPANY_ID (Diagram: objects O1 in L1 and O2 in L2 are identified as the same object O through JC(L1,L2); L1 = SN1.Company, L2 = SN2.Company)
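
As an illustration of object identification with the equality Join Condition quoted on this slide, here is a minimal Python sketch. The sample rows, the `identify` helper, and the treatment of a missing `COMPANY_ID` as "no match" are assumptions made for the example; they are not taken from the prototype.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
JoinCondition = Callable[[Row, Row], bool]


def jc_company_id(l1: Row, l2: Row) -> bool:
    """JC(L1,L2): L1.COMPANY_ID = L2.COMPANY_ID (a missing id never matches)."""
    return l1.get("COMPANY_ID") is not None and l1.get("COMPANY_ID") == l2.get("COMPANY_ID")


def identify(l1_rows: List[Row], l2_rows: List[Row], jc: JoinCondition) -> List[Tuple[Row, Row]]:
    """Return the pairs of local objects that the join condition identifies as one object."""
    return [(a, b) for a in l1_rows for b in l2_rows if jc(a, b)]


# Hypothetical instances of L1 = SN1.Company and L2 = SN2.Company.
sn1_company = [{"COMPANY_ID": 1, "NAME": "Alfa"}, {"COMPANY_ID": 2, "NAME": "Beta"}]
sn2_company = [{"COMPANY_ID": 2, "NAME": "Beta S.r.l."}, {"COMPANY_ID": 3, "NAME": "Gamma"}]

matches = identify(sn1_company, sn2_company, jc_company_id)
```

Because the join condition is passed in as a function, a more general expression (for example an external string-similarity function) could be substituted without changing `identify`.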

  17. Object Fusion: Full Disjunction • A global class is expressed by means of the full disjunction of its local classes, which has been recognized as providing a natural semantics for data merging queries • Definition of full disjunction [Rajaraman, Ullman - PODS 1996]: “Computing the natural outerjoin of many relations in a way that preserves all possible connections among facts” • Given a global class G = { L1, L2, …, Ln }, its instance is the full disjunction of L1, L2, …, Ln (FDG(L1,L2, …, Ln)) computed on the basis of the Join Conditions • Example with L1 = SN1.Company and L2 = SN2.Company: FDG(L1,L2) : select S(L1) ∪ S(L2) from L1 outer join L2 on JC(L1,L2)
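
For two local classes the full disjunction is a full outer join on the Join Condition. The sketch below shows that case in plain Python. The column prefixes `L1.` and `L2.`, the sample rows, and the function names are assumptions made for illustration; the prefixes keep both copies of a shared attribute so that a resolution function can later choose between them.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


def prefixed(row: Optional[Row], name: str, attrs: List[str]) -> Row:
    """Qualify attribute names with the class name; pad a missing row with None."""
    return {f"{name}.{a}": (row.get(a) if row is not None else None) for a in attrs}


def full_disjunction_2(l1: List[Row], l2: List[Row], jc: Callable[[Row, Row], bool],
                       s1: List[str], s2: List[str]) -> List[Row]:
    """FDG(L1,L2): select S(L1) ∪ S(L2) from L1 outer join L2 on JC(L1,L2)."""
    out: List[Row] = []
    matched = set()  # indexes of L2 rows that joined with some L1 row
    for a in l1:
        partners = [j for j, b in enumerate(l2) if jc(a, b)]
        matched.update(partners)
        if partners:
            out.extend({**prefixed(a, "L1", s1), **prefixed(l2[j], "L2", s2)} for j in partners)
        else:
            out.append({**prefixed(a, "L1", s1), **prefixed(None, "L2", s2)})
    # L2 rows without a partner are preserved too: this is what makes the join "full".
    out.extend({**prefixed(None, "L1", s1), **prefixed(b, "L2", s2)}
               for j, b in enumerate(l2) if j not in matched)
    return out


def same_id(a: Row, b: Row) -> bool:
    return a["COMPANY_ID"] == b["COMPANY_ID"]


# Hypothetical local instances.
l1 = [{"COMPANY_ID": 1, "ADDRESS": "Via Roma 1"}, {"COMPANY_ID": 2, "ADDRESS": "Via Po 5"}]
l2 = [{"COMPANY_ID": 2, "REGION": "VENETO"}, {"COMPANY_ID": 3, "REGION": "LAZIO"}]
fd = full_disjunction_2(l1, l2, same_id, ["COMPANY_ID", "ADDRESS"], ["COMPANY_ID", "REGION"])
```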

  18. Full Disjunction Computation (Diagram: L1, L2 and L3 pairwise connected by JC(L1,L2), JC(L1,L3) and JC(L2,L3)) • Goal: To compute the Full Disjunction by means of an SQL query • [Rajaraman, Ullman - PODS 1996]: There is a natural outerjoin sequence producing the full disjunction if and only if the set of relation schemes forms a connected, acyclic hypergraph. • But a Global Class with more than 2 local classes is a cyclic hypergraph. • Naive evaluation (current implementation) – Example n = 3: select * from (L1 outer join L2 on JC(L1,L2)) outer join L3 on (JC(L1,L3) OR JC(L2,L3)) • New proposed method: outerjoin pseudo-sequence – Example n = 3: select * from (L1 outer join L2 on JC(L1,L2)) outer join (L1 outer join L3 on JC(L1,L3)) on JC(L2,L3) • Implementation of methods proposed in the literature
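
In the special case used by the prototype, where every Join Condition is equality on the same attribute, the full disjunction of n local classes can be computed by grouping rows on that attribute. The sketch below shows this shortcut; it is not the naive evaluation or the outerjoin pseudo-sequence from the slide. It assumes that the attribute is unique within each local class and that rows with a missing value join with nothing. All names and data are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, object]


def full_disjunction_by_key(local_classes: Dict[str, List[Row]], key: str) -> List[Row]:
    """Full disjunction when every JC(Li,Lj) is Li.key = Lj.key.

    Rows sharing a key value are mutually join-consistent, so each distinct
    value yields exactly one fused row. Columns are qualified as 'Class.ATTR';
    a class that contributes no row simply has no columns in that fused row.
    """
    groups: Dict[object, Row] = {}
    unmatched: List[Row] = []
    for name, rows in local_classes.items():
        for row in rows:
            qualified = {f"{name}.{attr}": value for attr, value in row.items()}
            value = row.get(key)
            if value is None:
                unmatched.append(qualified)  # cannot join with anything
            else:
                groups.setdefault(value, {}).update(qualified)
    return list(groups.values()) + unmatched


# Three hypothetical local classes: company 1 is known to L1 and L3 but not to L2,
# a connection that a plain left-to-right outerjoin sequence could miss.
classes = {
    "L1": [{"COMPANY_ID": 1, "NAME": "Alfa"}, {"COMPANY_ID": 2, "NAME": "Beta"}],
    "L2": [{"COMPANY_ID": 2, "REGION": "VENETO"}, {"COMPANY_ID": 3, "REGION": "LAZIO"}],
    "L3": [{"COMPANY_ID": 1, "ADDRESS": "Via Roma 1"}, {"COMPANY_ID": 3, "ADDRESS": "Via Po 5"}],
}
fd3 = full_disjunction_by_key(classes, "COMPANY_ID")
```

With generic Join Conditions (arbitrary SQL or external functions) this grouping shortcut no longer applies, which is why the slide discusses outerjoin-based evaluation.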

  19. Object Fusion: Resolution Functions • Data coming from different SINodes may be inconsistent • Resolution functions: used to solve data conflicts on an attribute mapped into more than one SINode (instances of the same object coming from different SINodes have different values for local attributes mapped into the same global attribute) • No data conflict: Homogeneous Attribute • An example: precedence(L1.ADDRESS,L2.ADDRESS) • Application of the resolution functions
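
A resolution function such as `precedence` can be sketched as follows. The slide only names the function, so the semantics used here (the first non-null argument wins) is an assumption, as are the `resolve` helper and the `Class.ATTR` column naming.

```python
from typing import Dict, List, Optional

Row = Dict[str, object]


def precedence(*values: Optional[object]) -> Optional[object]:
    """Assumed semantics: return the first non-null value, in argument order."""
    for value in values:
        if value is not None:
            return value
    return None


def resolve(fused_row: Row, attribute: str, class_order: List[str]) -> Optional[object]:
    """Apply precedence to the copies of one global attribute in a fused row."""
    return precedence(*(fused_row.get(f"{name}.{attribute}") for name in class_order))


# A hypothetical fused row in which the two SINodes disagree on ADDRESS.
row = {"L1.ADDRESS": "Via Roma 1", "L2.ADDRESS": "Via Roma 1/A", "L1.REGION": None, "L2.REGION": "VENETO"}
address = resolve(row, "ADDRESS", ["L1", "L2"])  # precedence(L1.ADDRESS, L2.ADDRESS)
region = resolve(row, "REGION", ["L1", "L2"])
```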

  20. Query unfolding: Local Queries Computation • An EXPAtom is a Query Q on a Global Class G = { L1, L2, …, Ln }: Q = select <Q_select-list> from G where <Q_condition> • An FDAtom is a Local Query LQ_L on a Local Class L: LQ_L = select <Q_L_select-list> from L where <Q_L_condition> • Constraint Mapping - <Q_L_condition>: • constraints of <Q_condition> which can be solved in L are rewritten w.r.t. L • Residual Constraints - <Q_residual_condition>: • constraints not included in all local <Q_L_condition> • Local Select List - <Q_L_select-list>: attributes of the <select-list> of Q + attributes of the residual constraints + attributes of the Join Conditions
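
The Local Select List rule can be sketched as below. The mapping table `MT` (global attribute to local attribute per local class, `None` when unmapped) is invented for the example, and skipping attributes that a local class does not map is an assumption.

```python
from typing import Dict, List, Optional

# Hypothetical mapping table: MT[global attribute][local class] -> local attribute or None.
MT: Dict[str, Dict[str, Optional[str]]] = {
    "COMPANY_ID": {"SN1.Company": "COMPANY_ID", "SN2.Company": "COMPANY_ID"},
    "NAME": {"SN1.Company": "NAME", "SN2.Company": "NAME"},
    "ADDRESS": {"SN1.Company": "ADDRESS", "SN2.Company": "ADDRESS"},
    "CAPITAL_STOCK": {"SN1.Company": None, "SN2.Company": "CAPITAL_STOCK"},
}


def local_select_list(q_select: List[str], residual_attrs: List[str], jc_attrs: List[str],
                      local_class: str, mt: Dict[str, Dict[str, Optional[str]]]) -> List[str]:
    """<Q_L_select-list> = select-list of Q + residual-constraint attributes + JC attributes."""
    wanted = list(dict.fromkeys([*q_select, *residual_attrs, *jc_attrs]))  # ordered, no duplicates
    mapped = (mt.get(ga, {}).get(local_class) for ga in wanted)
    return [attr for attr in mapped if attr is not None]


sn1_list = local_select_list(["NAME", "ADDRESS"], ["CAPITAL_STOCK"], ["COMPANY_ID"], "SN1.Company", MT)
sn2_list = local_select_list(["NAME", "ADDRESS"], ["CAPITAL_STOCK"], ["COMPANY_ID"], "SN2.Company", MT)
```

The Join Condition attribute is fetched even though the user did not ask for it, because the fusion step needs it to compute the full disjunction.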

  21. Constraint mapping for Homogeneous Attributes • An atomic constraint (GA op value) is mapped onto the local class L as: (MTF[GA][L] op value) if MT[GA][L] is not null and the op operator is supported in L; true otherwise • An atomic constraint (GA1 op GA2) is mapped onto the local class L as: (MTF[GA1][L] op MTF[GA2][L]) if MT[GA1][L] and MT[GA2][L] are not null and the op operator is supported in L; true otherwise • The current implementation of the prototype assumes that each operator OP used in the global query is supported in every local class, i.e. a constraint including OP can be solved in the local class.
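
The mapping rule for atomic constraints of the form (GA op value) can be sketched as follows. For simplicity the sketch treats MTF[GA][L] as the local attribute name stored in MT[GA][L], and it assumes every operator is supported, as the prototype does. The mapping table and the constraint triples are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Constraint = Tuple[str, str, str]  # (global attribute, operator, SQL literal)

# Hypothetical mapping table: CAPITAL_STOCK is not mapped in SN1.Company.
MT: Dict[str, Dict[str, Optional[str]]] = {
    "REGION": {"SN1.Company": "REGION", "SN2.Company": "REGION"},
    "CAPITAL_STOCK": {"SN1.Company": None, "SN2.Company": "CAPITAL_STOCK"},
}


def map_constraint(constraint: Constraint, local_class: str) -> str:
    """(GA op value) becomes (MT[GA][L] op value) if GA is mapped in L, else 'true'."""
    ga, op, value = constraint
    local_attr = MT.get(ga, {}).get(local_class)
    return "true" if local_attr is None else f"{local_attr} {op} {value}"


def unfold_conditions(constraints: List[Constraint], local_classes: List[str]):
    """Return the local conditions per class and the residual constraints."""
    local = {L: [map_constraint(c, L) for c in constraints] for L in local_classes}
    # A constraint is residual if at least one local class could not evaluate it.
    residual = [c for i, c in enumerate(constraints)
                if any(local[L][i] == "true" for L in local_classes)]
    return local, residual


q_condition = [("CAPITAL_STOCK", ">", "50"), ("REGION", "LIKE", "'VENETO'")]
local_conditions, residual = unfold_conditions(q_condition, ["SN1.Company", "SN2.Company"])
```

A residual constraint has to be re-applied after fusion, because rows coming from a class that mapped it to `true` were never filtered on it.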

  22. Query unfolding example Global Class: Company = { SN1.Company, SN2.Company} Global Query scq2: SELECT NAME,COMPANY_ID,CAPITAL_STOCK, REGION,SUBCONTRACTOR,ADDRESS FROM company WHERE CAPITAL_STOCK > 50 AND REGION LIKE 'VENETO' AND SUBCONTRACTOR LIKE 'yes' Local queries FDAtom1 SELECT COMPANY_ID,NAME,REGION,ADDRESS,SUBCONTRACTOR FROM SN1.company WHERE (REGION like 'VENETO' and SUBCONTRACTOR like 'yes') FDAtom2 SELECT COMPANY_ID,NAME,REGION,ADDRESS,SUBCONTRACTOR FROM SN2.company WHERE (REGION like 'VENETO' and CAPITAL_STOCK > 50 and SUBCONTRACTOR like 'yes')

  23. END USER QUERY TOOL The Query Agent BROKERING AGENT QUERY AGENT PLAY MAKER Query Query EXPANDER Expanded Query: EXPQuery BAOntology ExpAtoms Librarian ExpAtoms Unfolding: FDQuery, FDAtoms, ResFunctions UNFOLDER SINodeAgent2 SEWASIE_DB SINodeAgent1

  24. END USER QUERY TOOL The Query Agent : EXECUTION • EXECUTION: For each FDAtom (Parallel Execution): • INPUT: FDAtom • MESSAGES: from QA to SINode Agent (FDAtoms sent, answers to FDAtoms returned) • OUTPUT: a table storing the FDAtom result in the SEWASIE_DB BROKERING AGENT QUERY AGENT PLAY MAKER EXECUTION EXPANDER BAOntology Librarian UNFOLDER SINodeAgent2 SEWASIE_DB SINodeAgent1
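
The parallel EXECUTION step can be sketched with a thread pool. Everything concrete here is a stand-in: `ask_sinode` replaces the actual message exchange between the Query Agent and a SINode Agent, `FAKE_SINODES` replaces the remote data, and a dictionary plays the role of the tables in the SEWASIE_DB.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Row = Dict[str, object]

# Hypothetical canned answers, one per SINode Agent.
FAKE_SINODES: Dict[str, List[Row]] = {
    "SINodeAgent1": [{"COMPANY_ID": 1, "NAME": "Alfa"}],
    "SINodeAgent2": [{"COMPANY_ID": 1, "NAME": "Alfa S.p.A."}, {"COMPANY_ID": 3, "NAME": "Gamma"}],
}


def ask_sinode(agent: str, fdatom_sql: str) -> List[Row]:
    """Stand-in for sending one FDAtom to a SINode Agent and waiting for its answer."""
    return FAKE_SINODES[agent]


def execute_fdatoms(fdatoms: Dict[str, Tuple[str, str]]) -> Dict[str, List[Row]]:
    """Run every FDAtom in parallel and store each answer under the FDAtom's name."""
    sewasie_db: Dict[str, List[Row]] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(fdatoms))) as pool:
        futures = {name: pool.submit(ask_sinode, agent, sql)
                   for name, (agent, sql) in fdatoms.items()}
        for name, future in futures.items():
            sewasie_db[name] = future.result()  # blocks until that FDAtom has been answered
    return sewasie_db


tables = execute_fdatoms({
    "FDAtom1": ("SINodeAgent1", "SELECT COMPANY_ID, NAME FROM company"),
    "FDAtom2": ("SINodeAgent2", "SELECT COMPANY_ID, NAME FROM company"),
})
```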

  25. END USER QUERY TOOL The Query Agent : FUSION BROKERING AGENT QUERY AGENT PLAY MAKER EXECUTION • FUSION: For each EXPAtom (Parallel Execution): • INPUT: FDAtoms, FDQuery, Resolution Functions • Execution of FDQuery (Full Disjunction of the FDAtoms) • Application of the Resolution Functions to the result of the previous action • OUTPUT: a view storing the EXPAtom result in the SEWASIE_DB EXPANDER FUSION BAOntology Librarian UNFOLDER SINodeAgent2 SEWASIE_DB SINodeAgent1

  26. FUSION: Detailed steps • An EXPAtom is a Query Q on a Global Class G = { L1, L2, …, Ln } Q = select <Q_select-list> from G where <Q_condition> • Local queries • For each local class L, local query over L : Q_L • Full Disjunction of the local query answers • Q_FD = FDG(Q_L1, …, Q_Ln) • Resolution Functions applied to Q_FD • Q_FD_RES • EXPAtom result = select <Q_select-list> • from Q_FD_RES • where <Q_residual-condition>
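
The detailed steps above can be composed into one small pipeline. The sketch makes several assumptions that are not in the slides: the Join Condition is equality on a single key, every conflict is resolved by a first-non-null precedence in the order the local classes are listed, and the sample answers are invented.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


def precedence(*values: Optional[object]) -> Optional[object]:
    """Assumed resolution function: the first non-null value wins."""
    return next((v for v in values if v is not None), None)


def fuse(local_answers: Dict[str, List[Row]], key: str,
         residual: Callable[[Row], bool], select_list: List[str]) -> List[Row]:
    # Full disjunction of the local query answers (equality JC on `key`): Q_FD.
    groups: Dict[object, Dict[str, Row]] = {}
    for name, rows in local_answers.items():
        for row in rows:
            groups.setdefault(row[key], {})[name] = row
    result = []
    for by_class in groups.values():
        # Resolution functions applied attribute by attribute: Q_FD_RES.
        attrs = {a for row in by_class.values() for a in row}
        fused = {a: precedence(*(by_class[n].get(a) for n in local_answers if n in by_class))
                 for a in attrs}
        # EXPAtom result: apply the residual condition, then project on the select list.
        if residual(fused):
            result.append({a: fused.get(a) for a in select_list})
    return result


# Hypothetical answers to the local queries Q_L1 and Q_L2.
answers = {
    "SN1.Company": [{"COMPANY_ID": 1, "NAME": "Alfa", "ADDRESS": "Via Roma 1"},
                    {"COMPANY_ID": 2, "NAME": "Beta", "ADDRESS": "Via Po 5"}],
    "SN2.Company": [{"COMPANY_ID": 1, "NAME": "Alfa", "ADDRESS": "Via Roma 1/A", "CAPITAL_STOCK": 80},
                    {"COMPANY_ID": 3, "NAME": "Gamma", "ADDRESS": "Via Adige 9", "CAPITAL_STOCK": 20}],
}


def big_capital(row: Row) -> bool:
    """Residual condition CAPITAL_STOCK > 50; an unknown capital stock does not qualify."""
    stock = row.get("CAPITAL_STOCK")
    return stock is not None and stock > 50


expatom_result = fuse(answers, "COMPANY_ID", big_capital, ["NAME", "ADDRESS"])
```

In this run only company 1 survives: its capital stock comes from SN2.Company while its address is taken from SN1.Company, which is listed first and therefore has precedence.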

  27. END USER QUERY TOOL The Query Agent : FINAL RESULT BROKERING AGENT QUERY AGENT PLAY MAKER EXECUTION • FINAL RESULT • INPUT : Output of the FUSION step • Execution of the Expanded Query • OUTPUT : Final Query result view stored in the SEWASIE_DB EXPANDER FUSION BAOntology Librarian UNFOLDER FINAL RESULT SINodeAgent2 SEWASIE_DB SINodeAgent1

  28. Query Management: main theoretical features • Technique for a GAV (Global-as-view) data integration system structured in two levels • At each level, the semantics of the schema (BA GVV and SINode GVV, respectively) is taken into account by a novel technique (query expansion). First algorithm of this type proved correct (i.e., sound and complete w.r.t. the semantics) • By virtue of the separation between query expansion and query rewriting and evaluation, query processing is polynomial time in data complexity (i.e., with respect to the size of the data at the sources) • The Object Fusion problem is addressed with a novel technique based on the combination of the full disjunction operation and the resolution functions

  29. IST-2001-34825 Technique for query answering in the context of more than one Brokering Agent Maurizio Lenzerini

  30. Brokering Agent Ontology Brokering Agent Ontology Brokering Agent Ontology Mapping Mapping Mapping SINode Global View SINode Global View SINode Global View Mapping Mapping Mapping Data Sources Data Sources Data Sources Problem: How to answer a query posed to a BA ? Query Mapping Mapping Mapping

  31. Peer-to-peer data integration • Query answering in the context of more than one Brokering Agent can be seen as the problem of answering queries in a peer-to-peer data integration system • Peer → Brokering Agent • P2P mapping → mapping between BAs • Peer data source → SINode • Local mapping → mapping between BA and SINode • One basic problem in P2P data integration is the semantics of P2P mappings

  32. Possible formalizations of P2P mappings

  33. The three main components - see also [Franconi&al ‘04]

  34. Our approach: Epistemic logic semantics

  35. Epistemic logic semantics

  36. Example of the epistemic formalization

  37. The difference between the two semantics

  38. FOL semantics: model 1

  39. FOL semantics: model 2

  40. Epistemic semantics

  41. Distributed query answering algorithm

  42. Current and future work • Algorithm already implemented • Future work: • Testing • Dealing with inconsistency • Dealing with preferences
