Cost-effective Hybrid Storages Flash Group, Cao Qingling
Motivation
• Get the most done with the least money!
Motivation
• SSD: high cost, low density, low reliability.
• Using SSD as a wholesale replacement for HDD is not recommended, especially when the data volume is very large.
• Using SSD as a cache in front of HDD bridges the gap between RAM and HDD, but the extra write traffic hurts SSD lifetime.
• Using SSD as a permanent store at the same level as HDD, holding selected special data.
Introduction
• Puts forward a hybrid HBase with SSD.
• Stores HBase's system components on SSD, which sits at the same level as HDD.
• A quantitative assessment shows the hybrid HBase performs 1.5-2x better.
HBase
• Column-oriented key-value store.
• Each region server has a write-ahead log (WAL).
• A write goes first to the WAL and then to the in-memory memstore.
• A region is a horizontal division of a table.
• A region can split as it grows.
• Data on disk is stored as log-structured merge (LSM) trees.
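A minimal sketch of the write path just described (append to the WAL, apply to the memstore, flush the memstore to an immutable sorted file when it fills). The class, method names, and WAL path are illustrative only, not HBase's actual API.

```python
import json

class MiniRegion:
    """Toy illustration of the HBase-style write path: WAL first, then memstore,
    flushed to immutable sorted files (the LSM idea). Not HBase's real API."""

    def __init__(self, wal_path, flush_threshold=4):
        self.wal = open(wal_path, "a")       # write-ahead log on durable storage
        self.memstore = {}                   # in-memory buffer of recent edits
        self.flushed_files = []              # immutable, sorted "store files"
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        # 1. Append the edit to the WAL so it survives a crash.
        self.wal.write(json.dumps({"key": key, "value": value}) + "\n")
        self.wal.flush()
        # 2. Apply it to the in-memory memstore.
        self.memstore[key] = value
        # 3. When the memstore is full, flush it as a sorted, immutable file.
        if len(self.memstore) >= self.flush_threshold:
            self.flushed_files.append(sorted(self.memstore.items()))
            self.memstore.clear()

    def get(self, key):
        # Read path: check the memstore first, then flushed files, newest first.
        if key in self.memstore:
            return self.memstore[key]
        for store in reversed(self.flushed_files):
            for k, v in store:
                if k == key:
                    return v
        return None

r = MiniRegion("/tmp/mini_wal.log")          # hypothetical WAL location
r.put("row1", "a"); r.put("row2", "b")
print(r.get("row1"))                         # -> "a"
```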
HBase System Components
Zookeeper:
• Clients contact it to locate the -ROOT- table.
• The master contacts it to learn which region servers are available.
• Region servers contact it through a heartbeat keep-alive mechanism.
• Zookeeper is I/O intensive.
Catalog Tables:
• The -ROOT- and .META. tables.
• Mostly read-intensive and not updated frequently.
HBase System Components
Write-ahead log (WAL):
• Every write is applied to the WAL first.
• Its size grows with: i) how often the WAL is committed; ii) the write rate; iii) the size of the key-value pairs.
Temporary Storage:
• Used when a region is split or merged.
• Accessed with sequential reads and writes.
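One way to picture the hybrid layout implied by these slides: route the I/O-intensive system components (Zookeeper data, the catalog tables, the WAL, temporary split/merge space) to an SSD mount while ordinary store files stay on HDD. The mount points and routing function below are hypothetical, not actual HBase or HDFS configuration.

```python
# Hypothetical mount points for the hybrid layout; not real HBase settings.
SSD_MOUNT = "/mnt/ssd"
HDD_MOUNT = "/mnt/hdd"

# System components the slides place on SSD: Zookeeper data, the -ROOT-/.META.
# catalog tables, the write-ahead log, and temporary space for region splits.
SSD_COMPONENTS = {"zookeeper", "catalog", "wal", "tmp"}

def placement(component: str) -> str:
    """Return the mount point a component's files should live on."""
    return SSD_MOUNT if component in SSD_COMPONENTS else HDD_MOUNT

for c in ("zookeeper", "wal", "storefile"):
    print(c, "->", placement(c))
# zookeeper -> /mnt/ssd, wal -> /mnt/ssd, storefile -> /mnt/hdd
```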
Assessment
• The HDD-to-SSD price ratio per GB is about 1:10.
• The SSD holds only about 1% of the database.
• The extra hardware cost is therefore roughly 10%, so the hybrid pays off as long as it gains more than 10% in performance.
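A back-of-the-envelope check of that claim, using assumed numbers that match the 1:10 price ratio and the 1% SSD fraction on the slide:

```python
# Cost check with assumed, normalized prices (HDD = 1, SSD = 10 per GB).
hdd_price_per_gb = 1.0
ssd_price_per_gb = 10.0
db_size_gb = 1000.0
ssd_fraction = 0.01              # 1% of the data lives on SSD

hdd_only_cost = db_size_gb * hdd_price_per_gb
hybrid_cost = (db_size_gb * (1 - ssd_fraction) * hdd_price_per_gb
               + db_size_gb * ssd_fraction * ssd_price_per_gb)

extra = (hybrid_cost - hdd_only_cost) / hdd_only_cost
print(f"extra storage cost: {extra:.0%}")   # ~9%, i.e. roughly 10%
```

With the 1.5-2x speedup reported above, the hybrid clears this roughly 10% break-even bar comfortably.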
Experimental Evaluation
• Hardware: Intel processor (4 cores, 4 threads at 3 GHz) with 8 GB RAM, a Western Digital 1 TB HDD, and a Kingston 128 GB SSD.
• Benchmark: Yahoo! Cloud Serving Benchmark (YCSB).
• Workloads: 1 million queries on a database of 60 million records; each record is 1 KB; 72 regions in total.
Introduction
• Approximate membership query (AMQ) data structures, e.g., the Bloom filter.
• Once the structure grows larger than RAM, performance decays.
• Quotient Filter: better data locality, sequential operations, supports deletion, can be dynamically resized, space-efficient.
• The Buffered Quotient Filter (BQF) and the Cascade Filter (CF) are designed for flash.
Quotient Filter
• A p-bit fingerprint f is split into an r-bit remainder and a q-bit quotient (p = q + r):
• f_r = f mod 2^r
• f_q = ⌊f / 2^r⌋
• The remainder is stored at its canonical slot: T[f_q] = f_r
• The fingerprint can be reconstructed: f = f_q · 2^r + f_r
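A tiny sketch of the quotienting step, assuming a 32-bit fingerprint split into q = 16 quotient bits and r = 16 remainder bits (the split is illustrative):

```python
Q_BITS, R_BITS = 16, 16            # assumed split of a 32-bit fingerprint

def quotient_remainder(f: int):
    """Split a fingerprint into quotient (high q bits) and remainder (low r bits)."""
    f_r = f % (1 << R_BITS)        # f_r = f mod 2^r
    f_q = f >> R_BITS              # f_q = floor(f / 2^r)
    return f_q, f_r

f = 0xBEEFCAFE
f_q, f_r = quotient_remainder(f)
assert f == f_q * (1 << R_BITS) + f_r    # f = f_q * 2^r + f_r
print(hex(f_q), hex(f_r))                # 0xbeef 0xcafe
```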
Quotient Filter
• Three metadata bits per slot i:
• is_occupied: set if some stored fingerprint has quotient i, i.e., slot i is the canonical slot of a run.
• is_shifted: set if the remainder stored in slot i is not in its canonical slot.
• is_continuation: set if the remainder in slot i belongs to the same run as the one in slot i-1.
• [Figure: logical runs and their physical storage]
Quotient Filter
• Lookup: check whether f is in the QF:
• step 1: compute f_q and f_r; if is_occupied[f_q] is not set, f is absent.
• step 2: walk backwards to the beginning of the cluster containing slot f_q.
• step 3: scan forward to the start of f_q's run.
• step 4: search that run for f_r.
• Insert f: locate its run the same way, then shift later remainders to make room.
• Delete f: remove its remainder and shift later remainders back.
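A minimal sketch of the four-step lookup over the three metadata bits, on a hand-built 8-slot toy table (remainders 0x3 and 0x7 for quotient 1, 0x5 for quotient 2, 0x9 for quotient 4). Insert, delete, and index wraparound are omitted.

```python
# Parallel arrays for an 8-slot toy quotient filter (4-bit remainders).
is_occupied     = [0, 1, 1, 0, 1, 0, 0, 0]   # some fingerprint has this quotient
is_continuation = [0, 0, 1, 0, 0, 0, 0, 0]   # slot continues the previous slot's run
is_shifted      = [0, 0, 1, 1, 0, 0, 0, 0]   # remainder is not in its canonical slot
remainders      = [0, 0x3, 0x7, 0x5, 0x9, 0, 0, 0]

def may_contain(f_q: int, f_r: int) -> bool:
    """True if fingerprint (f_q, f_r) may be in the filter (no false negatives)."""
    if not is_occupied[f_q]:
        return False                          # step 1: no run exists for this quotient
    b = f_q
    while is_shifted[b]:                      # step 2: walk back to the cluster start
        b -= 1
    s = b
    while b != f_q:                           # step 3: advance run by run to f_q's run
        s += 1
        while is_continuation[s]:
            s += 1                            # skip to the start of the next run
        b += 1
        while not is_occupied[b]:
            b += 1                            # next canonical slot that owns a run
    while True:                               # step 4: scan f_q's run for f_r
        if remainders[s] == f_r:
            return True
        s += 1
        if not is_continuation[s]:
            return False

print(may_contain(2, 0x5), may_contain(1, 0x7), may_contain(2, 0xA), may_contain(5, 0x1))
# True True False False
```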
Quotient Filters on Flash
• Buffered Quotient Filter (BQF): one QF serves as an in-RAM buffer, another lives on SSD. Optimized for lookup performance.
• Cascade Filter (CF): optimized for insertion; offers a trade-off between lookup and insertion cost.
Quotient Filters on Flash
• Cascade Filter: based on the cache-oblivious lookahead array (COLA).
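A rough sketch of the COLA-style level structure behind the Cascade Filter: a small in-memory filter absorbs inserts and, when full, is merged sequentially into exponentially larger on-flash levels; lookups must probe every level, which is the trade-off noted above. Python sets stand in for the per-level quotient filters, and the sizes and merge policy are illustrative.

```python
class CascadeSketch:
    """COLA-style levels: level i holds up to base * 2**i fingerprints.
    Sets stand in for quotient filters; real levels live on flash and are
    merged with sequential I/O."""

    def __init__(self, base=4, num_levels=4):
        self.base = base
        self.levels = [set() for _ in range(num_levels)]   # levels[0] is in RAM

    def insert(self, fingerprint: int):
        self.levels[0].add(fingerprint)
        # When a level overflows, merge it into the next (larger) level.
        for i in range(len(self.levels) - 1):
            if len(self.levels[i]) > self.base * (2 ** i):
                self.levels[i + 1] |= self.levels[i]       # sequential merge downwards
                self.levels[i].clear()

    def may_contain(self, fingerprint: int) -> bool:
        # Lookups probe every level, hence the lookup/insert trade-off.
        return any(fingerprint in level for level in self.levels)

cf = CascadeSketch()
for x in range(40):
    cf.insert(x)
print(cf.may_contain(7), cf.may_contain(999))   # True False
```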
Conclusions
• The Bloom filter is widely used in key-value storage.
• Change the way of thinking.
• Draw inspiration from traditional database algorithms.
• Design the hybrid system to match the application.
Bloom Filter
• Initial state: all bits in the array are 0.
• Insert x: set the bits at the positions given by the hash functions H_1(x), ..., H_k(x); a lookup checks the same bits.
• Cannot be expanded, does not support deletion, and has poor data locality.
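A minimal Bloom filter sketch illustrating these points: m bits and k probes per key, with the probe positions derived here from a single SHA-256 digest (a common simplification); the parameters are illustrative.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter sketch: m bits, k hash probes per key."""

    def __init__(self, m_bits=1024, k=3):
        self.m = m_bits
        self.k = k
        self.bits = bytearray(m_bits)          # one byte per bit, for clarity

    def _positions(self, key: str):
        # Derive k probe positions from slices of one hash digest.
        digest = hashlib.sha256(key.encode()).digest()
        for i in range(self.k):
            chunk = int.from_bytes(digest[4 * i:4 * i + 4], "big")
            yield chunk % self.m

    def add(self, key: str):
        for pos in self._positions(key):
            self.bits[pos] = 1                 # poor locality: probes are scattered

    def may_contain(self, key: str) -> bool:
        # No false negatives; false positives appear as more bits get set.
        return all(self.bits[pos] for pos in self._positions(key))

bf = BloomFilter()
bf.add("row-42")
print(bf.may_contain("row-42"), bf.may_contain("row-43"))   # True, (almost surely) False
```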