Distributed Database Management Systems
Evolution of DDBMS • Distributed database management systems (DDBMS) • Interconnected computer systems • Data and processing functions reside on multiple sites • 1970s: Centralized DBMS • 1980s: Social and technical changes • Ad hoc capability required • Decentralized management structure common • 1990s: New forces • Internet and the World Wide Web used for data access and distribution • Data analysis through data mining and data warehousing
DDBMS Advantages • Data located near site with greatest demand • Faster data access • Faster data processing • Growth facilitation • Improved communications • Reduced operating costs • User-friendly interface • Less danger of single-point failure • Processor independence
DDBMS Disadvantages • Complexity of management and control • Security • Lack of standards • Increased storage requirements • Greater difficulty in managing data environment • Increased training costs
Distributed Processing Shares the database’s logical processing among two or more physically independent, networked sites
Distributed Database Stores a logically related database over two or more physically independent sites
Distributed Database vs. Distributed Processing • Distributed processing • Does not require a distributed database • May be based on a single database on a single computer • Distributed database • Requires distributed processing • Copies or parts of the database processing functions must be distributed to all data storage sites • Both • Require a network to connect components
Functions of DDBMS • Application/end user interface • Validation to analyze data requests • Transformation to determine request components • Query optimization to find the best access strategy • Mapping to determine the data location • I/O interface to read or write data • Formatting to prepare the data for presentation • Security to provide data privacy • Backup and recovery • DB Administration • Concurrency Control • Transaction Management
DDBMS Components • Computer workstations • Network hardware and software components • Communications media • Transaction processor (TP) • Also called application processor (AP) or transaction manager (TM) • Data processor (DP) • Also called data manager (DM)
Distributed Database Components Figure 10.5
DDBMS Protocols • Interface with network to transport data and commands between DPs and TPs • Synchronize data received from DPs and route to appropriate TPs • Ensure common database functions • Security • Concurrency control • Backup and recovery
Levels of Data and Process Distribution • Database systems can be classified based on process distribution and data distribution (Table 10.1)
Single-Site Processing, Single-Site Data (SPSD) • All processing on a single CPU or host computer • All data are stored on the host computer's disk • DBMS located on the host computer • DBMS accessed by dumb terminals • Typical of mainframe and minicomputer DBMSs • Typical of the first generation of single-user microcomputer databases
Multiple-Site Processing, Single-Site Data (MPSD) • Requires network file server • Applications accessed through LAN • Variation known as client/server architecture
Multiple-Site Processing, Multiple-Site Data (MPMD) • Fully distributed DDBMS with support for multiple DPs and TPs at multiple sites • Homogeneous • Integrate one type of centralized DBMS over the network • Heterogeneous • Integrate different types of centralized DBMSs over a network
Distributed DB Transparency • Allows end users to feel like the database's only user • Hides complexities of the distributed database • Transparency features • Distribution • Transaction • Failure • Performance • Heterogeneity
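Distribution transparency can be pictured as a catalog lookup that resolves a table name to the sites holding its fragments, so the end user never names a site. The following is a minimal sketch only: the catalog layout, the site names, and the `sites_for` helper are all hypothetical and are not taken from any particular DDBMS.

```python
# Hypothetical distributed catalog: table name -> list of fragments,
# each fragment recording where it is stored and which rows it holds.
CATALOG = {
    "CUSTOMER": [
        {"site": "site_A", "states": {"TN", "GA"}},
        {"site": "site_B", "states": {"FL"}},
    ]
}


def sites_for(table, state):
    """Return the sites that must be contacted for rows with this state."""
    return [frag["site"] for frag in CATALOG[table] if state in frag["states"]]


# The user asks for Florida customers without knowing where they live.
print(sites_for("CUSTOMER", "FL"))
```

The request names only the table and the selection value; the mapping to `site_B` happens inside the lookup, which is the part a real DDBMS would hide from the user.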
Distributed Concurrency Control • Multisite, multiple-process operations are more likely to create data inconsistencies and deadlocked transactions • Problem scenario • Transaction is committed by each local DP • One DP cannot commit the transaction’s result • Result is an inconsistent database
Two-Phase Commit Protocol • DO-UNDO-REDO protocol • Write-ahead protocol • Two kinds of nodes • Coordinator • Subordinates • Phases • Preparation • Coordinator sends message to all subordinates • Confirms all are ready to commit or abort • Final Commit • Ensures all subordinates have committed or aborted
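The two phases above can be sketched in a few lines of Python. This is an illustrative, in-memory model under simplifying assumptions (no network, no timeouts, no crash recovery); the `Subordinate` class, its `can_commit` flag, and the log strings are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Subordinate:
    """A data processor (DP) taking part in a distributed transaction."""
    name: str
    can_commit: bool = True  # hypothetical flag: would the local commit succeed?
    state: str = "INIT"
    log: list = field(default_factory=list)

    def prepare(self) -> bool:
        # Write-ahead: record the vote in the local log before replying.
        vote = "PREPARED" if self.can_commit else "NOT PREPARED"
        self.log.append(vote)
        self.state = vote
        return self.can_commit

    def finish(self, decision: str) -> None:
        # Log and apply the coordinator's decision (COMMIT or ABORT).
        self.log.append(decision)
        self.state = decision


def two_phase_commit(subordinates) -> str:
    """Play the coordinator role for one transaction."""
    # Phase 1 (preparation): collect a vote from every subordinate.
    votes = [s.prepare() for s in subordinates]
    # Phase 2 (final commit): commit only if every subordinate voted yes.
    decision = "COMMIT" if all(votes) else "ABORT"
    for s in subordinates:
        s.finish(decision)
    return decision


nodes = [Subordinate("A"), Subordinate("B", can_commit=False)]
print(two_phase_commit(nodes))  # one "no" vote forces a global abort
```

Because a single negative vote aborts the transaction at every node, no DP ends up committed while another has failed, which is the inconsistency the protocol is meant to prevent.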
Performance Transparency and Query Optimization • Objective: Minimize total cost associated with execution of request • Main costs • Access time • Communication • CPU time • Basis for query optimization algorithms • Optimum execution order • Sites accessed to minimize communication costs • Dynamic or static optimization • Statistically based vs. rule-based query optimization algorithms
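As a rough illustration of the cost objective above, the sketch below sums the three main costs for each candidate execution plan and picks the cheapest. The plan names and every cost figure are made up for the example; a real optimizer would derive them from statistics or rules.

```python
def total_cost(access, communication, cpu):
    """Total cost of a plan as the sum of the three main cost components."""
    return access + communication + cpu


def cheapest_plan(plans):
    """Return the name of the plan with the lowest total cost."""
    return min(plans, key=lambda name: total_cost(**plans[name]))


# Hypothetical plans for joining CUSTOMER (site A) with INVOICE (site B).
plans = {
    "ship CUSTOMER to site B": {"access": 4.0, "communication": 12.0, "cpu": 1.0},
    "ship INVOICE to site A": {"access": 4.0, "communication": 3.0, "cpu": 1.5},
}

print(cheapest_plan(plans))
```

In this made-up case the second plan wins because its communication cost is much lower, even though its CPU cost is slightly higher.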
Distributed Database Design • How to partition the database into fragments • Horizontal • Vertical • Mixed • Which fragments to replicate • Storage of data copies at multiple sites • Fully replicated, partially replicated, and unreplicated databases • Data allocation • Where to locate data • Centralized, partitioned, replicated
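The three fragmentation strategies can be illustrated on a small in-memory table. This is a sketch with invented sample rows and hypothetical helper names (`horizontal_fragment`, `vertical_fragment`); it shows only how rows and columns are split, not how fragments are stored or allocated.

```python
# Invented sample data standing in for a CUSTOMER table.
customers = [
    {"id": 1, "name": "Ann", "state": "TN", "balance": 120.0},
    {"id": 2, "name": "Bob", "state": "FL", "balance": 0.0},
    {"id": 3, "name": "Cy", "state": "TN", "balance": 75.5},
]


def horizontal_fragment(rows, predicate):
    """Keep whole rows that satisfy the predicate (a subset of tuples)."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, key, columns):
    """Keep selected columns; the key is repeated so rows can be rejoined."""
    return [{key: row[key], **{c: row[c] for c in columns}} for row in rows]


# Horizontal: all Tennessee customers, every column.
tn_rows = horizontal_fragment(customers, lambda r: r["state"] == "TN")

# Vertical: billing columns only, for every customer.
billing = vertical_fragment(customers, "id", ["balance"])

# Mixed: horizontal first, then vertical on the result.
tn_billing = vertical_fragment(tn_rows, "id", ["balance"])

print(tn_billing)
```

Repeating the key column in each vertical fragment is what allows the original rows to be reconstructed with a join.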
Client/Server Advantages Over DDBMS • Client/server less expensive • Client/server solutions allow use of microcomputer’s GUI • More people with PC skills than mainframe skills • PC is well established in workplace • Numerous data analysis and query tools exist • Considerable cost advantages to off-loading application development
Client/Server Disadvantages • Creates more complex environment with different platforms • Increased number of users and sites creates security problems • Training issues become more complex and expensive