HEAVEN: A Hierarchical Storage and Archive Environment for Multidimensional Array-DBMS
Bernd Reiner, reiner@forwiss.tu-muenchen.de
Array DBMS
• Multidimensional object (MDD)
  • set of multidimensional tiles
  • tile = subarray
  • tiles stored in relational DBMS BLOBs
  • multidimensional index (R+ tree)
• Access to subsets of MDDs via the index
• Multidimensional query language RasQL
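To illustrate why tiling allows access to subsets of an MDD, here is a minimal Python sketch. It assumes a regular tile grid and a hypothetical helper `tiles_for_query`. The slide names an R+ tree as the actual index, so this only shows the idea of mapping a query box to the tiles it touches, not RasDaMan's implementation.

```python
from itertools import product

def tiles_for_query(query_lo, query_hi, tile_shape):
    """Return grid coordinates of all tiles that intersect the query box.

    query_lo / query_hi are inclusive cell coordinates per dimension;
    tile_shape is the (assumed regular) tile extent per dimension.
    """
    ranges = [
        range(lo // size, hi // size + 1)
        for lo, hi, size in zip(query_lo, query_hi, tile_shape)
    ]
    return list(product(*ranges))

# A 2-D query over cells [150..260] x [40..90] with 100x100 tiles
# touches only two tiles; all other tiles stay where they are.
hits = tiles_for_query((150, 40), (260, 90), (100, 100))
print(hits)  # [(1, 0), (2, 0)]
```

Only the BLOBs of the returned tiles would have to be read, which is the property the rest of the deck extends to tape.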
Motivation
• Increasing amount of data (up to Petabyte)
• Hard disks too small/expensive to hold hundreds of Terabytes
• Typically data stored as files on Hierarchical Storage Management Systems (HSM-System, e.g. Tapes)
• DBMS only used for Metadata
• With the multidimensional array DBMS RasDaMan only subsets must be transferred instead of whole MDDs
Goal: include archived data in DBMS data access
System Architecture
[Diagram: a Client sends RasQL to the RasDaMan Server (multidimensional array DBMS), which contains a Tertiary Storage Manager and a File Storage Manager. Below it: an SQL DBMS (Oracle / DB2) on HDD, a Hierarchical Storage Management System (HSM) with cache, and offline storage. Storage levels: online, nearline, offline. Operations shown: export / import, migrate / stage.]
Optimization
• Minimization of tape access operations
  • Tiling, Object-Framing, Caching
• Minimization of media exchange operations
  • Clustering, ordered Query-Queue, “lazy eject”
• Minimization of positioning time
  • Clustering, ordered Query-Queue
• Parallelization
  • Inter, intra object parallelization
Publications: VLDB 2002, DEXA 2002, DEXA 2003
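The effect of an ordered query queue on media exchanges can be sketched as follows. This is an illustrative toy model, not the HEAVEN scheduler: the `Request` fields, tape labels and offsets are hypothetical, and "lazy eject" is modelled simply as keeping the last tape mounted so that later requests for the same medium need no exchange.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    tape: str    # hypothetical medium label
    offset: int  # hypothetical position on the tape
    name: str    # hypothetical super-tile identifier

def order_queue(requests, mounted=None):
    """Serve the already mounted tape first, then sort by tape and offset."""
    return sorted(requests, key=lambda r: (r.tape != mounted, r.tape, r.offset))

def count_media_exchanges(requests, mounted=None):
    """Count how often a different tape has to be loaded."""
    exchanges = 0
    for r in requests:
        if r.tape != mounted:
            exchanges += 1
            mounted = r.tape  # the tape stays mounted until another is needed
    return exchanges

queue = [
    Request("T2", 500, "st-7"),
    Request("T1", 900, "st-3"),
    Request("T2", 100, "st-5"),
    Request("T1", 200, "st-1"),
]
ordered = order_queue(queue, mounted="T1")
print(count_media_exchanges(queue, mounted="T1"))    # 4 in arrival order
print(count_media_exchanges(ordered, mounted="T1"))  # 1 with the ordered queue
```

Sorting by offset within a tape also means the drive only moves forward, which is the positioning-time argument in the third bullet.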
Export to Tertiary Media
• Super-Tile algorithm
• Preserves multidim. clustering on tape
[Diagram: individual tiles (Tile 1 to Tile 4) are combined by the Super-Tile algorithm and exported to magnetic tape as super-tiles ST-1 to ST-4.]
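The slide does not spell out the Super-Tile algorithm, so the following sketch swaps in a simple stand-in: tiles are grouped into fixed-size blocks of neighbouring tiles, and each block is written as one unit. The function `build_super_tiles` and the `block` parameter are assumptions made for illustration; the point is only that spatially adjacent tiles end up contiguous on the sequential medium.

```python
from collections import defaultdict

def build_super_tiles(tile_coords, block):
    """Group tile grid coordinates into super-tiles of `block` tiles per dimension."""
    groups = defaultdict(list)
    for coord in tile_coords:
        # Neighbouring tiles share the same coarse block key
        key = tuple(c // b for c, b in zip(coord, block))
        groups[key].append(coord)
    # Sorted block keys give the sequential write order on tape
    return [sorted(groups[k]) for k in sorted(groups)]

# A 4x4 grid of tiles grouped into 2x2 blocks yields four super-tiles.
tiles = [(x, y) for x in range(4) for y in range(4)]
super_tiles = build_super_tiles(tiles, block=(2, 2))
for i, st in enumerate(super_tiles, start=1):
    print(f"ST-{i}: {st}")
```

A query covering a small region then needs few super-tiles, and each one is read in a single sequential pass.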
Import from Tertiary Media
[Diagram: for a request from the RasDaMan viewer, the required super-tiles are computed and then imported from tertiary media.]
Object-Framing • Reducing tape access
Data Retrieval from HSM: random partitioning of data
[Plot: positioning time (sec.) and read-data time per super-tile no. (48 MByte each). Object: mpim4d (1.35 GByte). Drive: DLT4000, average access time 68 s.]
Data Retrieval from HSM: partitioning of data with Super-Tile clustering
[Plot: positioning time (sec.) and read-data time per super-tile no. (48 MByte each). Object: mpim4d (1.35 GByte). Drive: DLT4000, average access time 68 s.]
Clustering vs. Random order
[Plot: time (sec.) for clustered versus random order.]