Infomaster: An Information Integration Tool • O. M. Duschka and M. R. Genesereth • Presentation by Cui Tao
Introduction • Huge amount of information online: • Distribution: not every query can be answered by the data in a single database • Fragmentation: horizontal, vertical • Heterogeneity • Notational heterogeneity: • Different access languages and protocols: parsing HTML, SQL, OQL, Z39.50 • Conceptual heterogeneity: • Semantic mismatches • Instability
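The two kinds of fragmentation can be illustrated with a small Python sketch. It assumes rows are plain dictionaries and that vertically fragmented sources share a key column; the column names (`id`, `make`, `price`) and the helper names are hypothetical and chosen only for illustration.

```python
def merge_horizontal(*fragments):
    """Horizontal fragmentation: each source holds some of the rows.
    Reassembling the relation is a union of the fragments."""
    rows = []
    for fragment in fragments:
        for row in fragment:
            if row not in rows:  # drop rows that appear in several sources
                rows.append(row)
    return rows


def merge_vertical(left, right, key="id"):
    """Vertical fragmentation: each source holds some of the columns.
    Reassembling the relation is a join on the shared key."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


# Hypothetical sample data.
paper_a = [{"id": 1, "make": "BMW"}]
paper_b = [{"id": 2, "make": "Audi"}]
prices = [{"id": 1, "price": 20000}]

all_ads = merge_horizontal(paper_a, paper_b)
priced_ads = merge_vertical(all_ads, prices)
```

Here `all_ads` combines the rows of both sources, while `priced_ads` keeps only the ads for which the second source supplies the missing column.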
Introduction • Intelligent agents • Search for and find desired information • Convert formats • Translate between different contexts • Etc. • Not feasible yet • Considerable research in ontologies and natural language understanding is still required
Introduction • Infomaster: an information integration tool • Provide integrated access • Manage evolving information sources • Add new information sources • Remove outdated information sources
Tested Application Areas • Newspaper classifieds • Provide a uniform search interface • Gather corresponding classifieds from all relevant newspapers • Product catalogs • Provide terminology translation • Campus databases
Descriptions of Relationships • Interface relations and site relations are described in terms of base relations • Interface relation vs. base relation:
Descriptions of Relationships • Site relation vs. base relation:
Query Processing • Example: BMWs built in 1996 that are for sale at a price below their average market value.
Reduction: Interface relations → Base relations • Simple: user's query → interface relations → base relations • Example rewritten query:
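The reduction step can be pictured as view unfolding: every interface atom in the user's query is replaced by its definition over base relations. The following Python sketch shows that idea under simplifying assumptions: one non-recursive definition per interface relation and a single level of unfolding. The relation names (`cars_for_sale`, `ad`) and their attributes are hypothetical and are not taken from the presentation.

```python
from itertools import count


def is_var(term):
    """Variables are strings starting with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()


# Hypothetical definitions:
# interface relation -> (head variables, body atoms over base relations).
DEFINITIONS = {
    "cars_for_sale": (
        ("Make", "Year", "Price"),
        [("ad", ("Id", "Make", "Year", "Price"))],
    ),
}


def unfold(atom, fresh):
    """Replace one interface atom by the body of its definition."""
    name, args = atom
    if name not in DEFINITIONS:
        return [atom]  # already a base relation
    head_vars, body = DEFINITIONS[name]
    binding = dict(zip(head_vars, args))
    suffix = next(fresh)

    def substitute(term):
        if not is_var(term):
            return term
        # Variables that occur only in the body are renamed apart.
        return binding.get(term, f"{term}_{suffix}")

    return [(rel, tuple(substitute(t) for t in terms)) for rel, terms in body]


def reduce_query(query):
    """Rewrite a conjunctive query so that it mentions only base relations."""
    fresh = count()
    return [base for atom in query for base in unfold(atom, fresh)]


# "BMWs built in 1996", with the price left as a variable.
reduced = reduce_query([("cars_for_sale", ("bmw", 1996, "Price"))])
```

After the call, `reduced` holds a single atom over the base relation `ad`, with the constants `bmw` and `1996` pushed into the matching argument positions.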
Abduction: Base relations → Site relations • Site relations are expressed in terms of base relations, but not vice versa • Query rewriting problem: answering queries using views • Abduction: uses a standard model elimination theorem prover
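The flavour of answering queries using views can be shown with a deliberately simplified Python sketch. It is not the model elimination procedure named above: it assumes every site relation is described by exactly one base relation and simply enumerates which combinations of sites could supply the base relations a query needs. The site name `bluebook_site` and the base relation names `ad` and `bluebook` are hypothetical.

```python
from itertools import product

# Hypothetical site descriptions:
# site relation -> the base relation it stores a fragment of.
SITE_DESCRIPTIONS = {
    "sjmn": "ad",
    "sfc": "ad",
    "bluebook_site": "bluebook",
}


def candidate_sites(base_relation):
    """All sites whose description mentions the given base relation."""
    return sorted(
        site for site, base in SITE_DESCRIPTIONS.items() if base == base_relation
    )


def query_plans(base_relations):
    """One plan per way of choosing a site for every base relation.
    The answer to the query is the union of the results of all plans."""
    choices = [candidate_sites(rel) for rel in base_relations]
    return [tuple(plan) for plan in product(*choices)]


plans = query_plans(["ad", "bluebook"])
```

With these descriptions, `plans` contains one plan that reads ads from `sfc` and one that reads them from `sjmn`, each combined with the hypothetical pricing site. A base relation that no site describes yields no plan at all.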
Abduction: Base relations → Site relations • Inputs: • The set of all descriptions of the site relations • A set of site relations • The rewritten user query after the reduction step
Abduction: Base relations → Site relations • Example query plans:
Optimization • Assume: all ads in sjmn are also in sfc
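A containment assumption of this kind lets the planner discard redundant plans. The Python sketch below assumes plans are tuples of site names of equal length and that the query is positive (monotone), so replacing a site by a superset site can only add answers. The representation, the helper names, and the site `bluebook_site` are hypothetical.

```python
# Known containments between sites, written as (subset, superset).
CONTAINED_IN = {("sjmn", "sfc")}


def subsumed(plan, other):
    """True if `other` returns every answer that `plan` returns:
    position by position it uses the same site or a superset site."""
    return (
        plan != other
        and len(plan) == len(other)
        and all(a == b or (a, b) in CONTAINED_IN for a, b in zip(plan, other))
    )


def prune(plans):
    """Keep only the plans that no other plan subsumes."""
    return [p for p in plans if not any(subsumed(p, q) for q in plans)]


plans = [("sfc", "bluebook_site"), ("sjmn", "bluebook_site")]
remaining = prune(plans)
```

Because every ad in sjmn also appears in sfc, the plan that queries sjmn cannot contribute new answers, so only the sfc plan remains in `remaining`.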
Conclusions • The first integration system to handle: • Arbitrary positive relational algebra user queries • Database descriptions • Efficient optimization using: • Integrity constraints • Local completeness information • Flexible use of query planning: • Expressive description language • Constraints • Background theories
Related Work • Information Manifold project and SIMS project: • Explore the use of description logics for describing information sources • Occam project: • Uses general AI planning techniques to generate information-gathering plans • TSIMMIS project: • Uses pattern-matching techniques to match user queries against predefined queries