Overview of Database Federation and IBM Garlic Project
Presented by Xiaofen He
Reference
• Data Integration through Database Federation, L. M. Haas, E. T. Lin, M. A. Roth
• Towards Heterogeneous Multimedia Information Systems: The Garlic Approach, IBM Almaden Research Center
Outline
• Approaches to data integration
• Database Federation in IBM DB2
• IBM Garlic Project
Various Approaches to Data Integration (1)
• Application-specific solutions
  • Always work
  • Expensive, fragile and hard to extend
• Application-integration frameworks
  • Protect applications from changes in the data sources
  • Do not address data integration issues
• Workflow frameworks
  • Limited support for comparing and manipulating data
Various Approaches to Data Integration (2)
• Digital libraries
  • Meta search engine
  • No combination of data
• Data warehousing
  • Powerful, high-level query language
  • May not be possible or cost effective, loss of functionality
• Database federation
  • Virtual data warehouse
  • Performance tradeoff (query rewrite & cost-based optimization)
Database Federation
• Basics of Database Federation
• DB2 styles of database federation
• Determining the style of database federation to use
Basics of Database Federation
• What is ‘database federation’ (DF)?
  • Also known as ‘mediation’
  • An architecture in which middleware, consisting of a relational database management system, provides uniform access to a number of heterogeneous data sources
Common Mediation Architecture
• Data Source
• Wrapper
• Mediator
Figure 1. Common Mediator Architecture
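The three layers named on this slide can be sketched in a few lines of Python. This is only an illustration of the general mediator pattern, not code from DB2 or Garlic: the `Wrapper` and `Mediator` classes, the `_source` field, and the two toy sources are all hypothetical names invented for the example.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


class Wrapper:
    """Hypothetical wrapper: maps one source's native records to common rows."""

    def __init__(self, name: str, fetch: Callable[[], Iterable[Row]]):
        self.name = name
        self._fetch = fetch  # source-specific access logic lives here

    def scan(self) -> Iterator[Row]:
        for row in self._fetch():
            # Tag each row with the wrapper it came from.
            yield {**row, "_source": self.name}


class Mediator:
    """Hypothetical mediator: one query entry point over all wrappers."""

    def __init__(self, wrappers: List[Wrapper]):
        self.wrappers = wrappers

    def query(self, predicate: Callable[[Row], bool]) -> List[Row]:
        results: List[Row] = []
        for wrapper in self.wrappers:
            results.extend(row for row in wrapper.scan() if predicate(row))
        return results


# Two toy data sources with different native shapes (invented data).
csv_like = [("Ann", 34), ("Bob", 51)]
doc_like = [{"full_name": "Cy", "years": 29}]

mediator = Mediator([
    Wrapper("csv", lambda: ({"name": n, "age": a} for n, a in csv_like)),
    Wrapper("docs", lambda: ({"name": d["full_name"], "age": d["years"]}
                             for d in doc_like)),
])

young = mediator.query(lambda r: r["age"] < 40)
print([r["name"] for r in young])  # ['Ann', 'Cy']
```

The caller only ever sees the mediator's uniform rows; each source's shape is confined to the function handed to its wrapper.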
Goals of IBM DF
• Transparency
• Support heterogeneity
• A high degree of function
• Extensibility
• Openness
• Autonomy of individual data sources
• Query optimization
DB2 architecture for DF
Figure 2. DB2 architecture for database federation
DB2 Styles of federation
• Scalar UDFs: Federating function
• Table UDFs: Federating data
• Wrappers: Federating function and data
Figure 3. Different styles of federation
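The difference between federating a function and federating data can be illustrated with Python's built-in `sqlite3` module standing in for the federated engine. This is a rough analogy, not DB2 syntax: `usd_to_eur`, `external_stock`, the tables, and the conversion rate are invented for the example. The standard `sqlite3` module has no table-function facility, so the table UDF style is approximated here by loading the generator's rows into a temporary table, whereas a real table function would be referenced directly in the FROM clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (name TEXT, price_usd REAL)")
conn.executemany("INSERT INTO parts VALUES (?, ?)",
                 [("bolt", 2.0), ("gear", 10.0)])


# Scalar UDF style: an external *function* becomes callable from SQL.
def usd_to_eur(amount):
    # Made-up rate; stands in for a call to some external service.
    return round(amount * 0.5, 2)


conn.create_function("usd_to_eur", 1, usd_to_eur)


# Table UDF style: external *data* is presented to SQL as rows.
def external_stock():
    # Stands in for reading a non-relational source (hypothetical data).
    yield ("bolt", 120)
    yield ("gear", 7)


# Approximation only: materialise the rows so the engine can join on them.
conn.execute("CREATE TEMP TABLE stock (name TEXT, qty INTEGER)")
conn.executemany("INSERT INTO stock VALUES (?, ?)", external_stock())

rows = conn.execute(
    "SELECT p.name, usd_to_eur(p.price_usd), s.qty "
    "FROM parts p JOIN stock s ON s.name = p.name ORDER BY p.name"
).fetchall()
print(rows)  # [('bolt', 1.0, 120), ('gear', 5.0, 7)]
```

A wrapper, the third style on the slide, would combine both roles: it exposes external data as tables and lets the engine delegate parts of the query to the source.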
Wrapper Architecture
• Multi-server integration
• Multi-dataset integration and multi-operation integration
• Optimization
• Transactional integration
Determining the style of DF to use
Figure 4. Determine the style of federation to use
IBM Garlic Project
• Introduction
• Overview
• Architecture
• Repositories and Databases
• The Garlic Data Model
• Queries in Garlic
• Interfaces and Applications
• Conclusion
Introduction
• Need
• Goal
• Object-Oriented Model
Garlic Overview
Figure 5. Garlic System Architecture (components shown: C++ Application, Query/Browser, Query Services & Runtime System, Metadata Repository, four Repository Wrappers, three Data Repositories, Complex Object Repository)
Garlic Overview
• Repositories
  • Repository type
  • Repository instance
  • Repository manager
• Databases
  • Global schema
  • Wrapper schemas (local schemas)
Garlic Data Model (1)
• ODMG-93 object model
  • Objects and values
  • Inheritance
• Object identity
  • Weak identity: unique, not necessarily immutable
• Legacy references
• Implementation-constrained references
Garlic Data Model (2)
• Extensions
  • Degree of support for alternative implementations of interfaces
  • Type system flexibility: conformity
  • Object-appropriate view definition facility
• Object-Centered Views
  • Enhance objects by adding or hiding some of their attributes/methods
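The idea of a view that adds or hides attributes of an existing object can be sketched in Python with attribute delegation. This is not Garlic's view definition language, which the slides do not show; `ObjectView`, `Employee`, and `age_in_2000` are hypothetical names used only to illustrate the concept.

```python
class ObjectView:
    """Hypothetical view over one object: hides some attributes, adds others."""

    def __init__(self, target, hidden=(), added=None):
        self._target = target
        self._hidden = frozenset(hidden)
        self._added = dict(added or {})

    def __getattr__(self, name):
        # Only called when normal lookup fails, so _target etc. resolve normally.
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self._hidden:
            raise AttributeError(f"{name!r} is hidden by this view")
        if name in self._added:
            # Added attributes are computed from the underlying object.
            return self._added[name](self._target)
        return getattr(self._target, name)


class Employee:
    def __init__(self, name, salary, birth_year):
        self.name = name
        self.salary = salary
        self.birth_year = birth_year


emp = Employee("Ann", 90000, 1990)
public = ObjectView(
    emp,
    hidden={"salary"},
    added={"age_in_2000": lambda e: 2000 - e.birth_year},
)

print(public.name, public.age_in_2000, hasattr(public, "salary"))
# Ann 10 False
```

The view holds a reference rather than a copy, so a later change to `emp.name` is visible through `public.name`; the view is centered on the object rather than on a snapshot of it.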
Queries in Garlic
• Query language
  • Object-oriented extension of SQL
  • Integrating approximate match query semantics with traditional exact match query semantics
• Query Processing
  • Decomposition
• Interesting Question
  • How to characterize the query power of a repository, in terms of the language subset that its wrapper is capable of processing directly
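One simple way to picture decomposition driven by a wrapper's query power is to let each wrapper advertise the operators it can evaluate, push those predicates down, and evaluate the remainder in the middleware. The sketch below assumes exactly that; it is not Garlic's actual mechanism, and `RepositoryWrapper`, `supported_ops`, `run_query`, and the sample rows are hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

Row = Dict[str, object]
Predicate = Tuple[str, str, object]  # (attribute, operator, constant)

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def matches(row: Row, pred: Predicate) -> bool:
    attr, op, const = pred
    return OPS[op](row[attr], const)


@dataclass
class RepositoryWrapper:
    """Hypothetical wrapper that advertises the operators it can evaluate."""
    rows: List[Row]
    supported_ops: Set[str]

    def execute(self, pushed: List[Predicate]) -> List[Row]:
        # Stands in for the repository's native search facility.
        return [r for r in self.rows if all(matches(r, p) for p in pushed)]


def run_query(wrapper: RepositoryWrapper,
              preds: List[Predicate]) -> Tuple[List[Predicate], List[Row]]:
    # Decomposition: push down what the wrapper understands, keep the rest.
    pushed = [p for p in preds if p[1] in wrapper.supported_ops]
    residual = [p for p in preds if p[1] not in wrapper.supported_ops]
    candidates = wrapper.execute(pushed)
    # The middleware compensates for predicates the repository cannot handle.
    result = [r for r in candidates if all(matches(r, p) for p in residual)]
    return pushed, result


images = RepositoryWrapper(
    rows=[
        {"id": 1, "kind": "photo", "width": 800},
        {"id": 2, "kind": "photo", "width": 200},
        {"id": 3, "kind": "map", "width": 900},
    ],
    supported_ops={"="},  # this toy repository can only do equality lookups
)

pushed, result = run_query(images, [("kind", "=", "photo"), ("width", ">", 500)])
print(pushed)                     # [('kind', '=', 'photo')]
print([r["id"] for r in result])  # [1]
```

A set of operator names is a deliberately crude description of query power; the question on the slide is precisely how to describe the supported language subset more faithfully than this.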
Interfaces and Applications
• C++ API
  • Compiled applications
  • Dynamic applications
• Query/Browser
  • A dynamic application
  • Moving back and forth between querying and browsing activities
Summary
• Database Federation
  • A powerful tool for integrating data
  • Future work
    • Improve the ease of use
    • Enhance the performance
• Garlic Project
  • New research in many dimensions