Massively Distributed Database Systems - Distributed DBS
Spring 2014
Ki-Joune Li
http://isel.cs.pusan.ac.kr/~lik
Pusan National University
Pros and Cons
• Advantages
  • Reliability and Availability
  • Local Control
  • Incremental Growth
  • Communication Costs
  • Fast Response
• Disadvantages
  • Software Cost and Complexity
  • Processing Overhead
  • Data Integrity
  • Slow Response
3-layer model of databases
• External Layer: View Definitions
• Conceptual Layer: Conceptual Schema (Modeling)
• Physical Layer: Data Storage Format (Implementation - Systems)
Distributed Databases as Logical Layers
• Views from clients (three shown)
• Global Database: Global Conceptual Layer, ??, Global Physical Layer
• Local Databases (three shown), each with its own External Layer, Conceptual Layer, and Physical Layer
Issues
• Replication vs. Partitioning
• Distributed DBMS
• Transparency
• Query Optimization
• Transaction Management
Replication vs. Partitioning
• Replication
• Partitioning
  • Vertical vs. Horizontal
• Hybrid
Replication
• Replicate all or parts of databases to local DB
• Figure: Site 1 holds DB-1, DB-2, DB-3; one box labeled Site 2 holds DB-1, DB-2; another box labeled Site 2 holds DB-2, DB-3
How to manage replicated DBs?
• Issue 1 – Consistency
  • If an update occurs at one site, how is the integrity of the global database maintained?
• Issue 2 – How to duplicate
  • All or only some parts
  • Factors to consider
Replication – Management of Consistency
• Snapshot replication
  • Store all update logs at a central site from a given time
  • Periodically send the relevant logs to local sites
  • Each local site applies the update logs to its local DB
• Near Real-Time replication
  • When an update occurs, it triggers updates at other sites
• Pull Replication
  • Instead of a push protocol, each local site requests the update logs when necessary
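The three replication styles above differ mainly in who initiates the transfer of update logs and when. The following is a minimal in-memory sketch of that difference, not a real replication protocol. The class and function names (CentralSite, LocalSite, send_snapshot_logs, pull) are hypothetical and chosen only for illustration, and the sketch assumes a single central log with no failures or concurrent writers.

```python
from dataclasses import dataclass, field


@dataclass
class LocalSite:
    """Hypothetical local replica that tracks how much of the central log it has applied."""
    name: str
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of central log entries already applied

    def apply(self, entries):
        # Reflect the received update logs in the local DB.
        for key, value in entries:
            self.data[key] = value
        self.applied += len(entries)


@dataclass
class CentralSite:
    """Hypothetical central site that stores every update as an ordered log."""
    log: list = field(default_factory=list)
    subscribers: list = field(default_factory=list)  # sites replicated in near real time

    def update(self, key, value):
        self.log.append((key, value))
        # Near real-time replication: the update itself triggers updates elsewhere.
        for site in self.subscribers:
            site.apply(self.log[site.applied:])

    def logs_since(self, position):
        return self.log[position:]

    def send_snapshot_logs(self, sites):
        # Snapshot replication: meant to be called periodically by the central site.
        for site in sites:
            site.apply(self.logs_since(site.applied))


def pull(site, central):
    # Pull replication: the local site asks for the logs when it needs them.
    site.apply(central.logs_since(site.applied))


central = CentralSite()
pushed, snapshot, pulled = LocalSite("A"), LocalSite("B"), LocalSite("C")
central.subscribers.append(pushed)

central.update("x", 1)
central.update("x", 2)
# At this point only site A has seen the updates; B and C are stale.

central.send_snapshot_logs([snapshot])  # periodic push from the central site
pull(pulled, central)                   # request initiated by the local site
```

In this sketch, sites B and C stay out of date until the periodic send or the pull happens, which is the consistency trade-off the slide points at.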
Replication – Management of Consistency
• Exclusive ownership vs. Shared ownership
• Single update vs. Multiple update
• Synchronous update vs. Asynchronous update
• Simple snapshot vs. Multiple snapshot
Replication – How to replicate
• Fast Response
• Communication Overhead
• Security
• Query Optimization
Partitioning – Horizontal Partitioning
• Split a table into several subtables, each holding a subset of the rows
Partitioning – Horizontal Partitioning
• How to split a table?
  • Efficiency
  • Local Optimization
  • Communication Overhead
  • Security
• Dynamic reconfiguration of Partitioning
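Horizontal partitioning as described above can be sketched in a few lines. This is a minimal illustration that assumes a table is a list of row dictionaries and that a caller-supplied function decides which site each row belongs to. The function names (partition_horizontally, reconstruct) and the sample customers table are hypothetical and do not come from the slides.

```python
from collections import defaultdict


def partition_horizontally(rows, site_of):
    """Split a table (list of row dicts) into subtables, one per site."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[site_of(row)].append(row)
    return dict(fragments)


def reconstruct(fragments):
    """The original table is the union of all horizontal fragments."""
    return [row for site in sorted(fragments) for row in fragments[site]]


# Hypothetical sample table.
customers = [
    {"id": 1, "name": "Kim", "city": "Busan"},
    {"id": 2, "name": "Lee", "city": "Seoul"},
    {"id": 3, "name": "Park", "city": "Busan"},
]

# Partition by city so that each site stores the rows it queries most often.
by_city = partition_horizontally(customers, lambda row: row["city"])
```

The choice of site_of is where the slide's concerns show up: splitting by the attribute that local queries filter on favors local optimization and lowers communication overhead, while a different split may suit security requirements better.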
Partitioning – Vertical Partitioning
• Split a DB into several disjoint tables (split by columns)
• Shared Primary Keys – Join operations are inevitable
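The role of the shared primary key can be made concrete with a small sketch. It assumes a table is a list of row dictionaries and that the column groups are disjoint. The function names (partition_vertically, join_on_key) and the sample employees table are hypothetical, for illustration only.

```python
def partition_vertically(rows, key, column_groups):
    """Split rows into fragments with disjoint columns; every fragment keeps the key."""
    fragments = []
    for columns in column_groups:
        fragment = [{key: row[key], **{c: row[c] for c in columns}} for row in rows]
        fragments.append(fragment)
    return fragments


def join_on_key(fragments, key):
    """Rebuild full rows by joining the fragments on the shared primary key."""
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


# Hypothetical sample table.
employees = [
    {"id": 1, "name": "Kim", "salary": 5000},
    {"id": 2, "name": "Lee", "salary": 6000},
]

# One fragment holds names, the other salaries; both carry the primary key "id".
names, salaries = partition_vertically(employees, "id", [["name"], ["salary"]])
restored = join_on_key([names, salaries], "id")
```

Any query that needs both name and salary has to go through join_on_key, which is the join cost the slide calls inevitable.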
Distributed DBMS
• What a distributed DBMS should do
  • Management of Data Dictionary
  • Resolving Heterogeneity: Schema, Query Language (QL), DBMS
  • Keeping distributed DBs secure and consistent: Transaction Management (TM)
  • Transparency: single logical view to user
  • Dynamic load balancing
  • Query processing (Optimization)