320 likes | 479 Views
Distributed Database Management Systems. Reading. Textbook: Ch. 4. Design Issues. Placing of data and programs (DBMS and application) Network issues. Level of Sharing. No sharing Data sharing Data and program sharing. Heterogeneous environment!. Top-Down Design.
E N D
Reading • Textbook: Ch. 4 CSCE 824 - Spring 2011
Design Issues • Placing of data and programs (DBMS and application) • Network issues CSCE 824 - Spring 2011
Level of Sharing • No sharing • Data sharing • Data and program sharing Heterogeneous environment! CSCE 824 - Spring 2011
Top-Down Design • Global Conceptual schema distribution • Fragmentation • Replication • Allocation • Figure 3.2 CSCE 824 - Spring 2011
Correctness of Fragmentation • Completeness: FR={R1, …, Rn} • Reconstruction: R=Ri, RiR • Disjointness: • Horizontal: does not djRisuch that djRkwhere ki • Vertical: same as horizontal for non-primary key attributes 1&2: Lossless-join (normalization) CSCE 824 - Spring 2011
Data Directory • Global vs. local conceptual schemas • How to search? • Where to store? • Single vs. multiple copies? CSCE 824 - Spring 2011
Current Research • Allocation: new requirements, technology, etc. • Where to store the fragments? • Dynamic environment • Usage pattern • Application characteristics • Network changes • Security CSCE 824 - Spring 2011
Bottom-Up Approach • Multi-database systems • How to integrate them into 1 database? • Interoperability CSCE 824 - Spring 2011
Database Integration • Physical integration • Materialized database: data warehouses • Extract-transform-load (ETL) tools • Logical integration • Virtual (not materialized) integration • Enterprise Information Integration CSCE 824 - Spring 2011
Data Warehouses • On-line Analytical Processing (OLAP) applications: • Decision support systems • Trend analysis and forecasting • Complex queries, large databases • Materialized view maintanence CSCE 824 - Spring 2011
Logical Integration • No materialized global database • Virtual integration: data remains at the local (operational) databases • Global conceptual schema may not contain everything from local schemas • Autonomous and heterogeneous local systems CSCE 824 - Spring 2011
Bottom-Up Design • Global Conceptual Schema (GCS or mediated schema) • Defined first: local conceptual schemas (LCS) are mapped to GCS • Defined during the integration of the LCSs and develop the corresponding mappings from LCSs to the GCS CSCE 824 - Spring 2011
GCS Defined First • Local-as-view (LAV) systems • Each LCS is treated as a view over the GCS • Query results: constrained to the objects in the local DBs while the GCS definition may be richer • Potential incomplete answers • Global-as-view GCS is defined as a set of views over the LCSs • View definition defines how to derive elements of the GCS • Query results: constrained to the GCS while the local DBs might be richer CSCE 824 - Spring 2011
Design Tasks • Schema translation • Schema generation • Figure 4.3 CSCE 824 - Spring 2011
Intermediate Canonical Representation • Expressive to incorporate all concepts in the local databases • Simple, intuitive, practical, etc. • Example: E/R model, relational model, graph/tree models, etc. • Tools CSCE 824 - Spring 2011
Schema Generation • Schema matching: syntax and semantics • Integration of common schema elements • Schema mapping • See example 4.1, 4.2 CSCE 824 - Spring 2011
Schema Matching • Defined or discovered (e.g., web data) • Rules: • Correspondence between 2 elements • Predicate whether the correspondence holds or not • Similarity value between the 2 elements CSCE 824 - Spring 2011
Finding Correspondence • Difficult process due to schema heterogeneity • Can be automated? • Insufficient schema and instance information • Unavailability of schema documentation • Subjectivity of matching CSCE 824 - Spring 2011
Matching Algorithm Issues • Schema vs. instance matching • Concept match • Data instance: semantic inconsistencies • Element-level vs. structure-level mapping • Element name semantics • Multiple attribute mapping? • Matching cardinality • One-to-one, one-to-many, many-to-many CSCE 824 - Spring 2011
Semantic Schema Heterogeneity • Semantic: meaning, interpretation, and intended use of data • Synonyms, homonyms, hypernyms • Different ontologies • Imprecise wording CSCE 824 - Spring 2011
Structural Schema Heterogeneity • Type conflict: attribute vs. entity • Dependency conflict: mapping cardinality inconsistencies • Key conflict: different primary keys • Behavioral conflict: modeling assumptions, e.g., referential integrity, deletion, etc. Farkas CSCE 824 - Spring 2011 22
Schema Integration • Binary • N-ary CSCE 824 - Spring 2011
Schema Mapping • How the data from local databases can be mapped to GCS • Mapping creating • Mapping maintanence CSCE 824 - Spring 2011
Mapping Creation • Input: LCS, GCS, M (schema matches) • Output: Q={Q1, …, Qk} such that • DBGCS = Q(DBCLS) CSCE 824 - Spring 2011
Security Objectives • Confidentiality • Integrity • Availability CSCE 824 - Spring 2011
Question 1 • How distributed databases impact the security objectives? • Confidentiality in traditional vs. distributed DBs • Integrity in traditional vs. distributed DBs • Availability in traditional vs. distributed DBs CSCE 824 - Spring 2011
Integrity • Correctness criteria • Top-down design • Bottom-up design CSCE 824 - Spring 2011
Availability • What are the issues related to availability when dealing with • Top-down design • Bottom-up design CSCE 824 - Spring 2011
Confidentiality • (will be covered in 2nd part of semester but…) • Centralized vs. distributed security policy • Top-down design • Bottom-up design CSCE 824 - Spring 2011
Next Class Semantics-based Database Integration CSCE 824 - Spring 2011