V²LDB
Boris Gelman
Vice President Architecture
Information Services
VISA
bgelman@visa.com

V²LDB: The Concept
• V²LDB = Very Very Large Database:
  • New concept or change to VLDB concept?
• Data Structure:
  • Petabyte tables with hundreds of billions of rows
  • Complex table structures
  • Non-uniform physical data representation of petabyte tables
• Query:
  • Well-defined subsets (index and/or partition) on tables: small (~10,000) -> medium (~300,000) -> large (~1,000,000)
  • Undefined subsets: very large (~1,000,000,000) -> very very large (~100,000,000,000)
  • Complex joins
  • Complex group-bys and sorts
• Workload:
  • Multiple categories of queries running concurrently (transaction research, analytics, data mining)
  • Inserts and selects running concurrently against the same tables
  • 24×7 operation with very limited maintenance windows
  • SLAs are very strict

V²LDB: Problems
• Data Partitioning:
  • Smart partitioning: hash, expression, … -> hybrid multi-level partitioning
  • Smart partition manipulation: detach / attach partition online
• Query Execution:
  • Hash join on petabyte tables?
• Performance Tuning does not work:
  • Adaptive and buffer-pool aware query optimization?
  • System-category aware query optimization?
  • Optimizer efficiency?
• Backup/Restore does not work:
  • Data replication is not a substitute for backup: data corruption, application errors, human errors
  • Smart backup/restore related to smart data partitioning!

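One way to picture the "hybrid multi-level partitioning" bullet is a two-level scheme: an expression on a date column picks the first level, and a hash of a key picks the second. The sketch below is only an illustration under assumptions. The column choices (transaction date, account id), the bucket count, and every function name are hypothetical and do not come from the slides or from any particular database product.

```python
import hashlib
from datetime import date

HASH_BUCKETS = 8  # assumed second-level fan-out, chosen only for illustration


def expression_level(txn_date: date) -> str:
    """First level: expression partitioning on the transaction month."""
    return f"{txn_date.year:04d}_{txn_date.month:02d}"


def hash_level(account_id: str, buckets: int = HASH_BUCKETS) -> int:
    """Second level: stable hash of the key (sha256, so it does not vary per process)."""
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def partition_name(txn_date: date, account_id: str) -> str:
    """Combine both levels into one hypothetical partition name."""
    return f"p_{expression_level(txn_date)}_h{hash_level(account_id):02d}"


def partitions_for_month(year: int, month: int, buckets: int = HASH_BUCKETS) -> list[str]:
    """All sub-partitions of one month: a natural unit to detach, attach, or back up."""
    return [f"p_{year:04d}_{month:02d}_h{b:02d}" for b in range(buckets)]


# Usage: route one row, then list the unit that would be handled together.
row_partition = partition_name(date(2005, 3, 14), "acct-0001")
march_unit = partitions_for_month(2005, 3)
```

Because a month maps to a fixed, enumerable set of sub-partitions, the same grouping could serve both online detach/attach and partition-level backup, which is the link the last bullet of the slide points at.
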
V²LDB: Problems
• Database Federation:
  • Single database system cannot hold a combination of ODS (> 1 PB) and cross-functional multi-subject DW (> 200 TB); it is impractical
  • Data Abstraction Layer: federated tables partitioned across multiple database systems!
  • Federated Database is easier to maintain and back up, and availability is higher!
  • Federated Database Performance = Single Database System Performance!!!
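
The "Data Abstraction Layer" bullet can be read as a routing table that maps each slice of a federated table to the database system holding it. The following is a minimal sketch of that reading, not the design described in the talk. The class names, the system names `ods_a` and `ods_b`, and the choice of date ranges as the federation key are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Member:
    system: str  # hypothetical name of one member database system
    start: date  # first date held by this member (inclusive)
    end: date    # end of the range held by this member (exclusive)


class FederatedTable:
    """Routes rows and queries for one logical table to its member systems."""

    def __init__(self, name: str, members: list[Member]):
        self.name = name
        self.members = sorted(members, key=lambda m: m.start)

    def route_insert(self, txn_date: date) -> str:
        """Return the single system that should store a row with this date."""
        for m in self.members:
            if m.start <= txn_date < m.end:
                return m.system
        raise KeyError(f"no member of {self.name} covers {txn_date}")

    def route_query(self, lo: date, hi: date) -> list[str]:
        """Return only the systems whose range overlaps [lo, hi)."""
        return [m.system for m in self.members if m.start < hi and lo < m.end]


# Usage: a table split by year across two hypothetical systems.
txn = FederatedTable("transactions", [
    Member("ods_a", date(2004, 1, 1), date(2005, 1, 1)),
    Member("ods_b", date(2005, 1, 1), date(2006, 1, 1)),
])
insert_target = txn.route_insert(date(2005, 3, 14))
query_targets = txn.route_query(date(2004, 12, 1), date(2005, 2, 1))
```

A query confined to one member's range touches a single system, which is the case in which federated performance could plausibly match a single database system; queries that span members still need their partial results combined by the layer.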