Scaling up analytical queries with column-stores

Ioannis Alagiannis, Manos Athanassoulis, Anastasia Ailamaki
École Polytechnique Fédérale de Lausanne

Drinking from a data firehose
• Fast and high-quality data analysis for smart business decisions
• Data warehouses
• 1/3 of the database market ($$$)
• Column-stores are here to stay!
• Need for multiple concurrent users
• 100s to 1000s of queries*
Many concurrent queries + column-stores = ???
*"High-performance data warehousing", TDWI best practices report

Multiple concurrent queries
Example queries: pasta? steak? vegan? indian? "Find all restaurants with rating over 3.5 and close to East Village"
[Figure: a DBMS serving these queries on 8 cores that share memory (MEM) and disk (HDD)]
High contention for resources

response time throughput

Throughput (memory-resident workload)
[Figure: throughput on TPC-H (sf: 30), with the saturation point and the total number of HW contexts marked]
Concurrency can hurt performance

Experimental setup
• Column-stores
• System-A and System-B (commercial)
• System-C (open-source)
• Hardware
• Dual-socket Intel(R) Xeon(R) CPU E5-2660
• 2 sockets x 8 cores x 2 threads (32 HW contexts)
• 128 GB RAM, 1600 MHz DIMMs
• L1: 64 KB and L2: 256 KB (per core), L3: 20 MB (shared)

Workloads
• TPC-H
• Scale factor: 30 (32 GB on disk)
• Qtpch = {10 query templates}
• SSB (Star Schema Benchmark)
• Scale factor: 30 (18 GB on disk)
• Qssb = {all 13 query templates}
• Throughput experiments with 25 query instances
Memory-resident
Hot runs
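A throughput experiment of this kind can be sketched as a small client harness. The sketch below is only an illustration and not the setup used in the talk: `measure_throughput`, `fake_query`, and all parameter values are hypothetical, and `fake_query` sleeps instead of contacting a real database.

```python
import threading
import time


def measure_throughput(run_query, num_clients, queries_per_client):
    """Issue queries from num_clients concurrent threads.

    Returns (queries per second, list of per-query response times in seconds).
    """
    response_times = []
    lock = threading.Lock()

    def client():
        for i in range(queries_per_client):
            start = time.perf_counter()
            run_query(i)
            elapsed = time.perf_counter() - start
            with lock:  # protect the shared list of measurements
                response_times.append(elapsed)

    threads = [threading.Thread(target=client) for _ in range(num_clients)]
    wall_start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    wall = time.perf_counter() - wall_start
    return len(response_times) / wall, response_times


def fake_query(query_id):
    # Hypothetical stand-in for a real query: sleeps instead of running SQL.
    time.sleep(0.001)
```

Calling `measure_throughput(fake_query, num_clients, queries_per_client)` for increasing values of `num_clients` would give one throughput point and one set of response times per concurrency level.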

Experiment 1: How does increased concurrency affect response time?

Scaling up TPCH Q1 Linear increase in response time

Scaling up SSB Q3.1 Similar behavior in SSB

Experiment 2: What is the variability of query response time?
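One common way to put a number on response-time variability is the coefficient of variation (standard deviation divided by mean). The slides do not say which metric the authors used, so the function below is an assumed, illustrative choice, and the sample values in the usage note are made up.

```python
import statistics


def coefficient_of_variation(response_times):
    """Population standard deviation divided by the mean (dimensionless)."""
    mean = statistics.mean(response_times)
    return statistics.pstdev(response_times) / mean
```

For example, the made-up samples `[2, 4, 4, 4, 5, 5, 7, 9]` have mean 5 and population standard deviation 2, so the coefficient of variation is 0.4. A lower value means queries finish in more uniform time.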

Variability of System-A TPCH (64 clients) Groups of short, medium and long running queries

Variability of System-B TPCH (64 clients) Balanced resource allocation → lower variation

Variability of System-C TPCH (64 clients) System-C uses an admission control mechanism
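The general idea of admission control can be sketched with a counting semaphore that caps how many queries run at once while the rest wait. This is a generic illustration and not System-C's actual mechanism, which the slides do not describe; the class name `AdmissionController` and the `max_active` parameter are hypothetical.

```python
import threading
import time


class AdmissionController:
    """Lets at most max_active queries execute concurrently; others block."""

    def __init__(self, max_active):
        self._slots = threading.BoundedSemaphore(max_active)
        self._lock = threading.Lock()
        self.active = 0        # queries currently executing
        self.peak_active = 0   # highest concurrency observed so far

    def run(self, query):
        with self._slots:  # blocks while max_active queries are already running
            with self._lock:
                self.active += 1
                self.peak_active = max(self.peak_active, self.active)
            try:
                return query()
            finally:
                with self._lock:
                    self.active -= 1
```

With `max_active` set near the number of hardware contexts, extra clients queue outside the engine instead of competing for cores and caches, at the cost of waiting time before a query starts.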

Experiment 3: How does increasing concurrency affect throughput?

Throughput - TPCH
[Figure annotations: 48%, 32% drop, 35% drop]
Throughput decreases after the saturation point
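The saturation point and the drop after it can be read off a throughput curve as follows. This is a minimal sketch: the function names are hypothetical and the numbers in `example` are made up for illustration, not measurements from the talk.

```python
def saturation_point(throughput_by_clients):
    """Return (clients, throughput) at the peak of a {clients: queries/sec} mapping."""
    clients = max(throughput_by_clients, key=throughput_by_clients.get)
    return clients, throughput_by_clients[clients]


def drop_from_peak(throughput_by_clients):
    """Percentage drop from peak throughput to throughput at the highest client count."""
    _, peak = saturation_point(throughput_by_clients)
    last = throughput_by_clients[max(throughput_by_clients)]
    return 100.0 * (peak - last) / peak


# Made-up numbers for illustration only.
example = {1: 10.0, 8: 60.0, 32: 100.0, 64: 80.0, 256: 60.0}
```

Here `saturation_point(example)` returns `(32, 100.0)` and `drop_from_peak(example)` returns `40.0`, that is, a 40% drop from the peak at the highest concurrency level.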

Throughput - SSB
[Figure annotation: throughput plateaus]
Exploiting sharing → sustain peak performance

When concurrency in column-stores is increased:
• Response time increases linearly
• … with high variability
• After saturation, peak performance is not sustained
Except for System-B on SSB

Where do we go from here?
• QPipe, Datapath, CJoin, SharedDB, Blink
• Recycler (MonetDB), cooperative scans, CCM (cracking)
• Adaptive resource (re)allocation
• Work sharing techniques
• Contention-aware scheduling
Thank you!
