380 likes | 475 Views
C-Store: Data Model and Data Organization. Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY Mar 6, 2009. Data Model: standard relational model. A database: A set of named relations (tables) A relation A set of named attributes (columns) Primary key
E N D
C-Store: Data Model and Data Organization Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY Mar 6, 2009
Data Model: standard relational model • A database: • A set of named relations (tables) • A relation • A set of named attributes (columns) • Primary key • A set of attributes whose values uniquely identify a row (tuple) in a relation. • Foreign key • References a primary key in another relation.
How to Implement a logical relation? • Given a logical relation R • Row Store • R has a direct physical correspondence. • Column Store (C-Store) • R may not have a direct physical correspondence. • Covered by a set of C-Store Projections.
C-Store Projection • Is anchored on a given logical table T. • Contains one or more attributes from T. • Can also contain any number of attributes from other tables. • Attributes from other tables must be referenced by a sequence of n:1 (i.e., foreign keys) relationships from T.
How to Construct a Projection? • Given a logical table T • Extract the attributes of interest from T. • Retain any duplicate rows • In standard Projection, duplicates are deleted. • Obtain the attributes of interest from other tables • Perform foreign-key joins.
Sort Key • Tuples in a projection are stored column-wise. • K attributes K columns • Each column is sorted on the same Sort Key. • Sort Key • Can be any column or columns in a projection
Horizontal Partition of a Projection • Each projection is horizontally partitioned into 1 or more segments. • SID > 0, Segment identifier • Value-based partitioning on the Sort Key of a projection • Each segment has a key range of the sort key. • The set of all key ranges partitions the whole space of sort key.
How to Re-construct a Complete Tuple? • To answer any SQL query • There must be a covering set of projections for every logical table T. • Every column in T is stored in at least one projection. • To re-construct a tuple, Join segments from different projections • Using Storage Keys and Join Indexes
Storage Keys • A segment, horizontal fragment of a projection • May involve several coulmns • Associate every data value of each column with a Storage Key, SK. • In the same segment, data values from different columns with matching SK belong to the same logical tuple. • A SK is simply the ID of a logical tuple.
Storage Keys in Read Store (RS) • SKs are numbered 1,2 ,3,... • SKs are not physically stored • But are inferred from a tuple’s physical position in a column. • Why SKs can be inferred? • All the columns in the same segment are sorted on the same Sort Key.
Storage Keys in Write Store (WS) • SKs are physically stored • SKs are represented as integers. • Each SK in WS is larger than the largest SK for any segment in RS.
join indexes: A Mapping Table • Suppose T1 and T2 are two projections that cover a logical table T. • For each segment of T1 , build a join index to T2 • is a Table with rows : (s: SID in T2, k: Storage Key in Segment s)
Ordering of Data in Each Column • Self-order • the column is ordered by values in itself. • Foreign-order • the column is ordered by values in other column in the same projection. • A projection may involve several columns • Example • Projection EMP1(name, age| age) • Column age is Self-order • Column name is foreign order
Column Type in Read Store(RS) • Type 1: Self-order, few distinct values • Type 2: Foreign-order, few distinct values • Type 3: Self-order, many distinct values • Type 4: Foreign-order, many distinct values
Compression of Columns in Type 1: Self-order, few distinct values • Represented by a sequence of triples (v, f, n) • v: a data value stored in the column • f : the position in the column where vfirst appears. • n: the number of times vappears in the column • Example: 4’s appear in positions 12-18, is recorded as (4, 12, 7) • One triple is required for each distinct value in the column.
Fast Search over Columns in Type 1 • Using Clustered B-Tree (i.e., Primary B-Tree) • Densepack the B-Tree • No on-line updates in RS • Using large disk blocks • 64-128K • The height of this index can be kept small (2 or less)
Compression of Columns in Type 2: Foreign-order, few distinct values • Represented by a sequence of tuples (v, b) • v: a data value stored in the column • b: a bitmap indicating the positions in which v is stored. • Use Run-Length Encoding to compress bitmaps.
Fast Search over Columns in Type 2 • To efficiently find the i-th value in a column, • Use Offset indexes • B-Trees that map positions in a column to the values contained in that column.
Compression of Columns in Type 3: Self-order, many distinct values • Basic Idea • Represent each value in the column as a delta from the previous value. • Use block (64-128K) as the unit of compression. • Like compressing an Inverted Index
Fast Search over Columns in Type 3 • Use densepacked B-Tree at the block-level • A block is viewed as a tuple.
Compression of Columns in Type 4: Foreign-order, many distinct values • Two choices at the moment • Do not compress • Use densepacked B-Tree • Open for further research
Write Store (WS) • WS is also a Column Store • To avoid writing two query optimizers • Implements the identical physical DBMS design as RS. • The Major Difference due to Updates in WS • Storage representation
How to Do Updates? • Update • Insert a new tuple • Modify an existing tuple • Delete an (existing) tuple • Update depends on first doing a query • Insert: need to check Primary Key Constraint etc. • Modify: need to first find the specified tuple • Delete: need to first find the specified tuple
Queries in C-Store • Use some known values of some columns as conditions to search values on other columns • Example table: • EMP1(name, age| age) • Example query: • use age values to query name values. • Tuples in C-Store are stored column-wise • We need to keep track of all column values of the same logical tuple • The Idea: Storage Key as the tuple ID.
How to Speed-up Queries in C-Store? • The Basic Idea: • Pre-sort every projection in some SORT KEY order • Keep several copies of the same projection, each copy is pre-sorted in different sort key order. • For a given query, use a copy of the projection whose order matches with the query conditions. • Therefore queries in C-Store usually equal to searches via sort key. • Note: Each Projection in WS has only one copy
Storage Key in WS • The Storage Key, SK, for each tuple is explicitly stored in each WS segment. • Columns in WS only keep a logical SORT KEY order via SKs. • Delay physical sorts on SORT KEY order when tuples are moved from WS to RS. • A unique SK is given to each insert of a logical tuplerin a table T. • SK is a serial number generated by system. • The SK of rmust be recorded in each projection that stores data for r.
Storage Representation of Columnsin WS • Every column in a Projection • Represented as a collection of (v, sk) pairs • v :a data value in the column • sk :the storage key (explicitly stored) • Build a B-Tree over each column • Use the second field of each pair, sk, as the KEY
Sort Keys of Each Projection in WS • Represented as a collection of (s, sk) pairs • s:a sort key value • sk :the storage key describing where s first appears. • Build a B-Tree over the (s, sk) pairs • Use the first field of each pair, s, as the KEY
Searches via Sort Key: Two Steps • 1st Step,use the B-Tree over the (s, sk) pairs • Search condition: known sort key values • Search result: corresponding storage keys • 2nd Step, use the B-Tree over the each column (i.e., the (v, sk) pairs) • Search condition: storage keys found in 1st Step. • Search result: data values in the column.
Partition of a Projection in WS • Partitioning a Projection in the same way as in RS • Equals to partitioning the space of sort key • 1:1 mapping between RS segments and WS ones. • A tuple is identified by a (sid, storage_key) pair in either RS or WS • sid: segment ID • Storage_key: the SK of the tuple
General Picture of WS • Usually is very small, can be fully stored in Memory • Data processing is fast in memory • Do not compress data, represent data directly • Each projection uses B-Tree indexing to maintain a logical sort-key order. • The B-Trees are secondary B-Tree.
References • Mike Stonebraker, Daniel Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Sam Madden, Elizabeth O'Neil, Pat O'Neil, Alex Rasin, Nga Tran and Stan Zdonik. C-Store: A Column Oriented DBMS , VLDB, 2005. (http://db.csail.mit.edu/projects/cstore/vldb.pdf) • VERTICA DATABASE TECHNICAL OVERVIEW WHITE PAPER. http://www.vertica.com/php/pdfgateway?file=VerticaArchitectureWhitePaper.pdf