Why not use Federated approach for Database Management System (DBMS)? Yan Cui ITK478 Position paper
Crucial issues in enterprises • “…organizations merge or takeover since the existing systems have been designed for different corporate needs, the resulting enterprise will have to face information inconsistency, heterogeneity and incompatible overlap”. Wijegunaratne, Fernandez and Valtoudis in [1] • “…a large modern enterprise, it is also inevitable that …use different database systems to store and search their critical data. Competition, evolving technology, mergers, acquisitions, geographic distribution, and … decentralization of growth…” Haas and Lin in [2]
DBMS approaches • Compare two major approaches • Federated database system approach • Distributed database system approach • Compared on architecture/design, transparency, integration, autonomy, and other criteria.
Distributed DBMS • Definition of Distributed database (DDBS) and Distributed Database Management System (DDBMS) • Conversion between centralized and distributed databases • Distributed DBMS design
Distributed DBMS (cont) • Definition of Distributed database (DDBS) and Distributed Database Management System (DDBMS) • Distributed database – “a collection of multiple, logically interrelated databases distributed over a computer network” by M. Özsu and P. Valduriez in [3] • Distributed DBMS – defined as “the software system that permits the management of the DDBS and makes the distribution transparent to the users” by M. Özsu and P. Valduriez in [3].
Distributed DBMS (cont) • Conversion between centralized and distributed databases • Compared with a centralized database, a distributed DBMS offers “local autonomy, improved performance, improved reliability/availability, economics, expandability, and shareability” [3]. Fig. 1 - Central Database on a Network [3] Fig. 2 - DDBS Environment [3]
Distributed DBMS (cont) • Distributed DBMS design - defined by F. A. Baião, M. Mattoso and G. Zaverucha in [4]: “Distribution design involves making decisions on the fragmentation and placement of data across the sites of a computer network” • Fragmentation • Allocation
Distributed DBMS (cont) • Distributed DBMS design – Fragmentation • Defined as “clustering fragments the information accessed simultaneously by applications” [4]. • vertical fragmentation • horizontal fragmentation • mixed fragmentation
Distributed DBMS (cont) • Distributed DBMS design – Fragmentation • horizontal fragmentation - class instances are distributed across fragments, so each horizontal fragment of a class contains a subset of the whole class extension [4] • Primary (Round-robin, Hash-partition, and Range-partition) • Derived fragment Fig. 3 - Round-robin [5] Fig. 4 - Hash-partition [5] Fig. 5 - Range partition [5]
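The three primary horizontal schemes named on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the `students` rows, the `id` and `gpa` attributes, the range bounds, and the use of CRC32 as the hash function are all hypothetical and do not come from [4] or [5].

```python
import bisect
import zlib

# Hypothetical class extension: six student instances.
students = [
    {"id": 1, "gpa": 2.8}, {"id": 2, "gpa": 3.0}, {"id": 3, "gpa": 3.4},
    {"id": 4, "gpa": 3.5}, {"id": 5, "gpa": 3.9}, {"id": 6, "gpa": 2.5},
]

def round_robin(rows, n):
    """Assign the i-th instance to fragment i mod n."""
    fragments = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        fragments[i % n].append(row)
    return fragments

def hash_partition(rows, n, key):
    """Assign each instance by a stable hash of its key attribute."""
    fragments = [[] for _ in range(n)]
    for row in rows:
        bucket = zlib.crc32(str(row[key]).encode("utf-8")) % n
        fragments[bucket].append(row)
    return fragments

def range_partition(rows, key, bounds):
    """bounds is a sorted list of split points; values below bounds[0]
    go to fragment 0, values >= bounds[-1] go to the last fragment."""
    fragments = [[] for _ in range(len(bounds) + 1)]
    for row in rows:
        fragments[bisect.bisect_right(bounds, row[key])].append(row)
    return fragments

by_turn = round_robin(students, 3)
by_hash = hash_partition(students, 3, "id")
by_range = range_partition(students, "gpa", [3.0, 3.5])
```

In every scheme each instance lands in exactly one fragment, so the union of the fragments is the original class extension.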
Distributed DBMS (cont) • Distributed DBMS design – Fragmentation • horizontal fragmentation • Derived fragment Fig. 5 - Range partition [5]
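A derived fragment follows the fragmentation of a related class instead of a predicate on its own attributes. The sketch below assumes a hypothetical `enrollments` collection that references students through a `student_id` attribute; these names and the sample data are illustrative only and are not taken from [4] or [5].

```python
# Hypothetical owner fragments (students already split horizontally).
student_fragments = [
    [{"id": 1}, {"id": 2}],
    [{"id": 3}],
]

# Hypothetical member class whose instances reference a student.
enrollments = [
    {"student_id": 1, "course": "ITK478"},
    {"student_id": 3, "course": "ITK478"},
    {"student_id": 2, "course": "ITK353"},
    {"student_id": 1, "course": "ITK353"},
]

def derived_fragments(member_rows, owner_fragments, owner_key, member_key):
    """Place each member instance with the owner fragment it references."""
    result = []
    for fragment in owner_fragments:
        owner_ids = {row[owner_key] for row in fragment}
        result.append([m for m in member_rows if m[member_key] in owner_ids])
    return result

enrollment_fragments = derived_fragments(
    enrollments, student_fragments, "id", "student_id"
)
```

Keeping an enrollment in the same fragment as its student means the two can be joined without moving data between fragments.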
Distributed DBMS (cont) • Distributed DBMS design – Fragmentation • vertical fragmentation - attributes and methods are distributed across fragments, e.g. fragment 1 (name, GPA) and fragment 2 (address, bDate, picture) from the student class in Fig. 7 • mixed fragmentation – combination of vertical and horizontal fragmentation Fig. 7 – Vertical fragmentation [5] Fig. 8 – Mixed fragmentation [5]
Distributed DBMS (cont) • Distributed DBMS design – Allocation • as described by M. Özsu and P. Valduriez in [3], allocation distributes all resources/fragments across the nodes/sites of a computer network.
Federated DBMS • Definition • all data sources, drawn from heterogeneous DBMSs, different locations, and relevant/irrelevant and structured/unstructured data, are federated and linked together into a unified system by the DBMS, according to L.M. Haas, E.T. Lin and M.A. Roth in [6]. • Characteristics of federated DBMS • transparency, heterogeneity, a high degree of function, extensibility, openness, autonomy, and optimized performance [2,6].
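The student example from Fig. 7 can be sketched as follows. The attribute groups (name, GPA) and (address, bDate, picture) come from the slide; the `oid` identifier repeated in every fragment, the sample values, and the GPA predicate used for the mixed case are assumptions added for illustration.

```python
# Hypothetical student instances; "oid" is an assumed object identifier.
students = [
    {"oid": 1, "name": "Ann", "GPA": 3.7, "address": "Normal, IL",
     "bDate": "1985-02-01", "picture": "ann.jpg"},
    {"oid": 2, "name": "Bob", "GPA": 2.9, "address": "Peoria, IL",
     "bDate": "1986-07-15", "picture": "bob.jpg"},
]

GROUPS = [("name", "GPA"), ("address", "bDate", "picture")]

def vertical_fragment(rows, key, attribute_groups):
    """Project every instance onto each attribute group, keeping the key
    in every fragment so the instance can be reassembled later."""
    return [
        [{key: row[key], **{a: row[a] for a in group}} for row in rows]
        for group in attribute_groups
    ]

def reconstruct(fragments, key):
    """Rebuild full instances by merging fragment rows that share a key."""
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())

def mixed_fragment(rows, key, predicate, attribute_groups):
    """Horizontal selection first, then vertical fragmentation of the result."""
    selected = [row for row in rows if predicate(row)]
    return vertical_fragment(selected, key, attribute_groups)

vertical = vertical_fragment(students, "oid", GROUPS)
honors = mixed_fragment(students, "oid", lambda s: s["GPA"] >= 3.5, GROUPS)
```

Repeating the identifier in each vertical fragment is what makes the fragmentation lossless: `reconstruct(vertical, "oid")` yields the original instances again.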
Federated DBMS • DB2 architecture for database federation • user-defined function (UDF) (Scalar and Table UDFs) • Wrapper Fig. 9 – DB2 architecture of database federation [6]
Federated DBMS • DB2 architecture for database federation • UDF - takes input parameters and returns either a scalar result or a table of data. • Scalar UDF - takes an SQL statement as input and returns a scalar result. • Table UDF - the other kind of UDF; it produces a table as output for the SQL statement that references it.
Federated DBMS • DB2 architecture for database federation • Wrapper - described in [6] as a “powerful and flexible infrastructure for federation” because it integrates both scalar UDF functions and table UDF data
CONCLUSION/POSITION • the disadvantages of a distributed DBMS are complexity, cost, and the difficulty of maintaining data integration and database access [3]. • a federated database system provides transparency, autonomy, optimized performance, accessibility, and a standard query interface across multiple DBMSs • it is an efficient way to integrate multiple DBMSs when enterprises merge or use different DBMSs, and it provides efficient data sharing and processing throughout the enterprise.
References • [1] I. Wijegunaratne, G. Fernandez, J. Valtoudis. 2000. “A Federated Architecture for Enterprise Data Integration”. 2000 Australian Software Engineering Conference. Retrieved September 12, 2007 from http://portal.acm.org.proxy.lib.ilstu.edu:2048/citation.cfm?id=787253&coll=Portal&dl=GUIDE&CFID=5277637&CFTOKEN=95867344. • [2] L. Haas, E. Lin. 2002. “IBM Federated Database Technology”. IBM. Retrieved September 10, 2007 from http://www.ibm.com/developerworks/db2/library/techarticle/0203haas/0203haas.html. • [3] M. Özsu and P. Valduriez. 1999. Principles of Distributed Database Systems, 2nd edition (1st edition 1991). New Jersey, Prentice-Hall. • [4] F.A. Baião, M. Mattoso, G. Zaverucha. 1998. “Towards an Inductive Design of Distributed Object Oriented Databases”. Proceedings of the 3rd IFCIS International Conference on Cooperative Information Systems, p. 188-197, August 20-22. Retrieved September 28, 2007 from http://csdl.computer.org/dl/proceedings/coopis/1998/8380/00/83800188.pdf. • [5] F. Baião, M. Mattoso, G. Zaverucha. “An Algorithm for the Design of Distributed Object Databases”. PowerPoint. Retrieved September 14, 2007 from http://www-db.cs.wisc.edu/dbseminar/spring00/talks/fernanda_slides.pdf. • [6] L.M. Haas, E.T. Lin, M.A. Roth. 2002. “Data integration through database federation”. IBM Systems Journal, Volume 41, Issue 4. Retrieved October 1, 2007 from http://www.research.ibm.com/journal/sj/414/haas.pdf.
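The scalar/table distinction can be illustrated with Python's standard `sqlite3` module standing in for DB2; this is an analogy under assumptions, not DB2's UDF interface. `sqlite3` registers scalar functions with `create_function`; the table-returning case is modelled here as a plain Python function whose rows are loaded into a table, since this sketch does not register a table function with the engine. The names `to_celsius`, `readings_table`, and `readings` are hypothetical.

```python
import sqlite3

def to_celsius(temp_f):
    """Scalar function: one input value in, one value out."""
    return round((temp_f - 32) * 5 / 9, 1)

def readings_table():
    """Table-style function: returns a whole set of rows."""
    return [("north", 68.0), ("south", 86.0)]

conn = sqlite3.connect(":memory:")
# Make the scalar function callable from SQL under the name to_celsius.
conn.create_function("to_celsius", 1, to_celsius)

# Materialize the table function's output so SQL can query it.
conn.execute("CREATE TABLE readings (site TEXT, temp_f REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", readings_table())

rows = conn.execute(
    "SELECT site, to_celsius(temp_f) FROM readings ORDER BY site"
).fetchall()
```

The scalar function is applied once per row inside the query, while the table-style function supplies the rows being queried.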
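The wrapper idea can be sketched as a thin adapter per data source that exposes a common row interface to the federation layer. This is a minimal illustration under assumptions, not the DB2 wrapper architecture from [6]: the class names `SqliteWrapper` and `CsvWrapper`, the `scan` method, the `federated_join` helper, and the sample data are all hypothetical.

```python
import csv
import io
import sqlite3

class SqliteWrapper:
    """Exposes one table of a relational source as dict rows."""
    def __init__(self, conn, table):
        self.conn = conn
        self.table = table

    def scan(self):
        cursor = self.conn.execute(f"SELECT * FROM {self.table}")
        columns = [d[0] for d in cursor.description]
        for record in cursor:
            yield dict(zip(columns, record))

class CsvWrapper:
    """Exposes CSV text (a non-database source) as dict rows."""
    def __init__(self, text):
        self.text = text

    def scan(self):
        yield from csv.DictReader(io.StringIO(self.text))

def federated_join(left, right, left_key, right_key):
    """Hash join across two wrappers; keys are compared as strings
    because the CSV source yields text values."""
    index = {}
    for row in right.scan():
        index.setdefault(str(row[right_key]), []).append(row)
    for row in left.scan():
        for match in index.get(str(row[left_key]), []):
            yield {**row, **match}

# Hypothetical sources: a relational table and a CSV export.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Lin")])
ORDERS_CSV = "customer_id,total\n1,30\n1,12\n2,7\n"

joined = list(federated_join(
    SqliteWrapper(conn, "customers"), CsvWrapper(ORDERS_CSV), "id", "customer_id"
))
```

The caller of `federated_join` never sees whether a row came from the database or the CSV text, which is the transparency property the federated approach aims for; each source also stays under its own management, reflecting autonomy.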