Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications

Carnegie Mellon Univ.Dept. of Computer Science15-415/615 - DB Applications C. Faloutsos – A. Pavlo Lecture#25: Column Stores

Administrivia • HW7 Phase 1: Wed Nov 9th • HW7 Phase 2: Mon Nov 28th • HW8: Mon Dec 5th • Final Exam: Tue Dec 13th @ 5:30pm • Exam will be held in twolocations. We will send an email with your assigned room. CMU SCS 15-415/615

Today’s Class • Storage Models • System Architectures • Vectorization • Compression • Data Modification CMU SCS 15-415/615

Wikipedia Example CREATETABLE useracct ( userID INT PRIMARY KEY, userName VARCHAR UNIQUE, ⋮ ); CREATETABLE pages ( pageIDINT PRIMARY KEY, title VARCHAR UNIQUE, latest INT REFERENCES revisions (revID), ); CREATE TABLE revisions ( revID INT PRIMARY KEY, pageID INT REFERENCES pages (pageID), userIDINT REFERENCES useracct (userID), content TEXT, updated DATETIME );

OLTP • On-line Transaction Processing: • Short-lived txns. • Small footprint. • Repetitive operations. SELECT* FROMuseracct WHEREuserName= ? AND userPass = ? SELECTP.*, R.* FROM pages ASP INNERJOINrevisions ASR ONP.latest= R.revID WHEREP.pageID= ? UPDATEuseracct SET lastLogin = NOW(), hostname = ? WHEREuserID= ? INSERT INTO revisions VALUES (?,?…,?)

OLAP • On-line Analytical Processing: • Long running queries. • Complex joins. • Exploratory queries. SELECTCOUNT(U.lastLogin), EXTRACT(month FROMU.lastLogin) AS month FROMuseracctAS U WHERE U.hostnameLIKE ‘%.gov’ GROUP BY EXTRACT(month FROMU.lastLogin) CMU SCS 15-415/615

Data Storage Models • There are different ways to store tuples. • We have been assuming the n-ary storage model this entire semester. CMU SCS 15-415/615

n-ary Storage Model userID - userID userID userName userName - userName userPass - userPass userPass hostname hostname - hostname lastLogin - lastLogin lastLogin • The DBMS stores all attributes for a single tuple contiguously in a block. NSM Disk Page CMU SCS 15-415/615

n-ary Storage Model B+Tree SELECT* FROMuseracct WHEREuserName= ? AND userPass = ? - userID - userName userPass - - hostname - lastLogin userID userName userPass hostname lastLogin userID userName userPass hostname lastLogin INSERT INTO useracct VALUES (?,?,…?) NSM Disk Page userID userName userPass hostname lastLogin CMU SCS 15-415/615

n-ary Storage Model SELECTCOUNT(U.lastLogin), EXTRACT(month FROMU.lastLogin) AS month FROMuseracctAS U WHERE U.hostnameLIKE ‘%.gov’ GROUP BY EXTRACT(month FROMU.lastLogin) userID userID userName userName userPass userPass hostname hostname lastLogin lastLogin userID userName userPass hostname lastLogin userID userName userPass hostname lastLogin X NSM Disk Page CMU SCS 15-415/615

n-ary Storage Model • Advantages • Fast inserts, updates, and deletes. • Good for queries that need the entire tuple. • Disadvantages • Not good for scanning large portions of the table and/or a subset of the attributes. CMU SCS 15-415/615

Decomposition Storage Model - userID userID userID userName userName userName - userPass - userPass userPass hostname hostname - hostname - lastLogin lastLogin lastLogin • The DBMS stores a single attribute for all tuples contiguously in a block. hostname hostname hostname hostname hostname hostname hostname hostname hostname hostname DSM Disk Page hostname hostname hostname hostname hostname userID hostname hostname hostname hostname hostname lastLogin userName userPass CMU SCS 15-415/615

Decomposition Storage Model SELECTCOUNT(U.lastLogin), EXTRACT(month FROMU.lastLogin) AS month FROMuseracctAS U WHERE U.hostnameLIKE ‘%.gov’ GROUP BY EXTRACT(month FROMU.lastLogin) hostname hostname hostname hostname hostname hostname hostname hostname hostname hostname DSM Disk Page hostname hostname hostname hostname hostname hostname hostname hostname hostname hostname CMU SCS 15-415/615

Decomposition Storage Model • Advantages • Reduces the amount wasted I/O because the DBMS only reads the data that it needs. • Better query processing and data compression (more on this later). • Disadvantages • Slow for point queries, inserts, updates, and deletes because of tuple splitting/stitching. CMU SCS 15-415/615

History • 1970s: Cantor DBMS • 1980s: DSM Proposal • 1990s: SybaseIQ (in-memory only) • 2000s: Vertica, VectorWise, MonetDB • 2010s: Cloudera Impala, Amazon Redshift, “The Big Three” CMU SCS 15-415/615

System Architectures • Fractured Mirrors • Partition Attributes Across (PAX) • Pure Columnar Storage CMU SCS 15-415/615

Fractured Mirrors • Store a second copy of the database in a DSM layout that is automatically updated. • Examples: Oracle, IBM DB2 BLU NSM DSM CMU SCS 15-415/615

PAX • Data is still stored in NSM blocks, but each block is organized as mini columns. userID userID userID userID userName userName userName userName userPass userPass PAX Disk Page userPass userPass hostname hostname hostname hostname lastLogin lastLogin lastLogin lastLogin CMU SCS 15-415/615

Column Stores • Entire system is designed for columnar data. • Query Processing, Storage, Operator Algorithms, Indexing, etc. • Examples: Vertica, VectorWise, Paraccel, Cloudera Impala, Amazon Redshift CMU SCS 15-415/615

Query Processing Strategies • The DBMS needs to process queries differently when using columnar data. • We have already discussed the Iterator Model for processing tuples in the DBMS query operators. CMU SCS 15-415/615

Iterator Model • Each operator calls next() on their child operator to process tuples one at a time. next SELECT cname, amt FROM customer, account WHERE customer.acctno = account.acctno AND account.amt > 1000 next p cname, amt ⨝ next acctno=acctno s amt>1000 CUSTOMER ACCOUNT CMU SCS 15-415/615

Materialization Model • Each operator consumes its entire input and generates the full output all at once. SELECT cname, amt FROM customer, account WHERE customer.acctno = account.acctno AND account.amt > 1000 p cname, amt ⨝ acctno=acctno s amt>1000 CUSTOMER ACCOUNT CMU SCS 15-415/615

Observations • The Iterator Model is bad with a DSM because it requires the DBMS to stitch tuples back together each time. • The Materialization Model is a bad because the intermediate results may be larger than the amount of memory in the system. CMU SCS 15-415/615

Vectorized Model • Like the Iterator Model but each next() invocation returns a vector of tuples instead of a single tuple. • This vector does not have to contain the entire tuple, just the attributes that are needed for query processing. CMU SCS 15-415/615

Vectorized Model • Each operator calls next() on their child operator to process vectors. next SELECT cname, amt FROM customer, account WHERE customer.acctno = account.acctno AND account.amt > 1000 next cname, amt M next ⨝ acctno=acctno acctno, amt M next acctno amt amt>1000 s CUSTOMER ACCOUNT CMU SCS 15-415/615

Virtual IDs vs. Offsets • Need a way to stitch tuples back together. • Two approaches: • Fixed length offsets • Virtual ids embedded in columns userID userName userPass hostname lastLogin userID userName userPass hostname lastLogin 0 0 0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 Offsets Virtual Ids

Vectorized Model • Reduced interpretation overhead. • Better cache locality. • Compiler optimization opportunities. • AFAIK, VectorWise is still the only system that uses this model. Other systems use query compilation instead… CMU SCS 15-415/615

Compression Overview • Compress the database to reduce the amount of I/O needed to process queries. • DSM databases compress much better than NSM databases. • Storing similar data together is ideal for compression algorithms. CMU SCS 15-415/615

Naïve Compression • Use a general purpose algorithm to compress pages when they are stored on disk. • Example: 10KB page in memory, 4KB compressed page on disk. • Do we have to decompress the page when it is brought into memory? Why or why not? CMU SCS 15-415/615

Fixed-width Compression • Sacrifice some compression in exchange for having uniform-length values per attribute. Tuples are no longer aligned at offsets userID userID userName userName userPass userPass hostname hostname lastLogin lastLogin userID userName userPass hostname lastLogin 0 0 0 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 Original Data Variable-Length Compression Fixed-Length Compression 6 6 6 7 7 7 CMU SCS 15-415/615

Run-length Encoding • Compress runs of the same value into a compact triplet: • (value, startPosition, runLength) All tuples are sorted on this column. Reduces the # of triplets userID userID userID userID sex sex sex sex 0 0 0 0 M M (M,0,6) (M,0,3) (F,3,2) 1 M 1 (F,6,2) M 1 1 (M,5,3) M M 2 2 2 2 5 5 F 3 3 M 6 4 6 4 F M M 5 5 7 M 7 3 3 6 F 6 M Original Data Sorted Data Unsorted RLE Sorted RLE F 4 M 7 4 7 CMU SCS 15-415/615

Delta Encoding • Record the difference between successive values in the same column. time temp time temp time temp 12:00 99.5 12:00 99.5 12:00 99.5 12:01 99.4 +1 -1 (+1,7) -1 +1 0 0 12:02 99.5 0 0 0 12:03 99.6 +1 +1 +1 1 1 1 12:04 99.6 +1 +1 +1 2 2 2 +1 0 0 12:05 99.5 3 3 3 12:06 99.4 +1 -1 -1 4 4 4 +1 0 0 12:07 99.5 5 5 5 6 6 6 Delta Encoding Delta+RLE Original Data 7 7 7 CMU SCS 15-415/615

Bit-Vector Encoding • Store a separate bit-vector for each unique value for a particular attribute where an offset in the vector corresponds to a tuple. userID sex M → 111 0 0 111 userID sex F → 0 0 0 11 0 0 0 0 0 M 1 M 1 A ‘1’ means that the tuple at that offset has the bit-vector’s value M 2 2 3 3 F 4 4 F M 5 5 6 6 M Original Data Bit-Vector Compression 7 M 7 CMU SCS 15-415/615

Dictionary Compression • Replace frequent patterns with smaller integer codes. • Need to support fast encoding and decoding. • Need to also support range queries. CMU SCS 15-415/615

Dictionary Compression • Construct a separate table of the unique values for an attribute sorted by value. SELECT * FROM users WHERE name LIKE ‘Tru%’ SELECT * FROM users WHERE name BETWEEN70AND80 code userId name userId name value 101 Truman 101 70 Bush 10 102 Obama 102 50 Carter 20 103 Bush 103 10 Ford 30 0 0 104 Reagan 104 60 Nixon 40 1 1 105 Trump 105 80 Obama 50 2 2 106 Nixon 106 40 Reagan 60 3 3 107 Carter 107 20 Truman 70 4 4 108 Ford 108 30 Trump 80 5 5 Original Data Compressed Data 6 6 7 7

Dictionary Compression • A dictionary needs to support two operations: • Encode: For a given uncompressed value, convert it into its compressed form. • Decode: For a given compressed value, convert it back into its original form. • We need two data structures to support operations in both directions. CMU SCS 15-415/615

Summary • Some operator algorithms can operate directly on compressed data • Saves I/O without having to decompress! • Difficult to implement when the DBMS uses multiple compression schemes. • It’s generally good to wait as long as possible to materialize/decompress data when processing queries… CMU SCS 15-415/615

Bifurcated Architecture • All txns are executed on OLTP database. • Periodically migrate changes to OLAP database. OLTP OLAP Data Warehouse OLTP Extract Transform Load OLTP CMU SCS 15-415/615

Modifying a Column Store • Updating compressed data is expensive. • Updating sorted data is expensive. • The DBMS will store updates in an staging area and then apply them in batches. • Have to make sure that we execute queries on both the staging and main storage. CMU SCS 15-415/615

Delta Store • Stage updates in delta store and periodically apply them in batches to the main storage. • Examples: Vertica, SAP HANA Delta Main CMU SCS 15-415/615

HTAP • Hybrid Transaction-Analytical Processing • Single database instance that can handle both OLTP workloads and OLAP queries. • Row-store for OLTP • Column-store for OLAP • Examples: SAP HANA, MemSQL, HyPer, SpliceMachine, Peloton, Cloudera Kudu (???) CMU SCS 15-415/615

Conclusion • If you’re running OLAP queries, you need to be using a column store. • Don’t let anybody try to tell you otherwise. CMU SCS 15-415/615

Rest of the Semester • Mon Nov 28th– Column Stores • Wed Nov 30th – Data Warehousing + Mining • Mon Dec 5th– SpliceMachineGuest Speaker • Wed Dec 7th –Review + Systems Potpourri http://cmudb.io/f16-systems CMU SCS 15-415/615

Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications