Memory-efficient Data Management Policy for Flash-based Key-Value Store Wang Jiangtao 2013-4-12
Outline • Introduction • Related work • Two works • BloomStore [MSST2012] • TBF [ICDE2013] • Summary
Key-Value Store • KV store efficiently supports simple operations: Key lookup & KV pair insertion • Online Multi-player Gaming • Data deduplication • Internet services
Overview of Key-Value Store • A KV store system should provide high access throughput (> 10,000 key lookups/sec) • It often replaces traditional relational DBs thanks to its simplicity, superior scalability, and performance • Popular management (index + storage) solution for large volumes of records, typically implemented through an index structure mapping Key -> Value
Challenge • To meet the high throughput demand, the performance of both index access and KV pair (data) access is critical • Index access: locate the KV pair associated with a given "key" • KV pair access: get/put the actual KV pair • Available memory space limits the maximum number of stored KV pairs • An in-RAM index structure can only address the index-access performance demand
DRAM must be Used Efficiently • Assume 1 TB of data and 4 bytes of DRAM per key-value pair: • 32 B per pair (data deduplication) => 125 GB index • 168 B per pair (tweet) => 24 GB index • 1 KB per pair (small image) => 4 GB index • [Figure: index size (GB) vs. per key-value-pair size (bytes)]
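As a quick sanity check of the numbers above, here is a minimal sketch (in Python, not from the slides) that reproduces the figure's arithmetic under its flat 4-bytes-of-DRAM-per-pair assumption:

```python
# Reproduce the slide's index-size arithmetic: DRAM needed to index
# 1 TB of data at 4 bytes of DRAM per key-value pair, for three
# representative pair sizes (decimal units, as in the slide).
TOTAL_DATA = 10**12          # 1 TB of stored key-value data
DRAM_PER_PAIR = 4            # bytes of DRAM per index entry

for pair_size, workload in [(32, "data deduplication"),
                            (168, "tweet"),
                            (1024, "small image")]:
    num_pairs = TOTAL_DATA // pair_size
    index_gb = num_pairs * DRAM_PER_PAIR / 10**9
    print(f"{workload:20s} ({pair_size:4d} B/pair): {index_gb:6.1f} GB DRAM")
# -> 125.0 GB, 23.8 GB, 3.9 GB (the slide rounds these to 125/24/4)
```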
Existing Approaches to Speed up Index & KV pair Accesses • Maintain the index structure in RAM to map each key to its KV pair on SSD • RAM size cannot scale up linearly with flash size • Keep the minimum index structure in RAM, while storing the rest of the index structure on SSD • The on-flash index structure must be designed carefully • Space is precious • Random writes are slow and bad for flash life (wear-out)
Outline • Introduction • Related work • Two works • BloomStore [MSST2012] • TBF [ICDE2013] • Summary
Bloom Filter • A Bloom filter represents a set with an m-bit array and answers whether an element belongs to the set. Initially, every bit of the m-bit array is set to 0. The Bloom filter uses k mutually independent hash functions, each mapping every element of the set into the range {1, …, m}. For any element x, the position h_i(x) produced by the i-th hash function is set to 1 (1 ≤ i ≤ k); if a position is set to 1 multiple times, only the first time has any effect. • False positive rate: roughly (1 − e^(−kn/m))^k • Parameter selection: number of hash functions k, bit-array size m, number of elements n • Choosing k ≈ (m/n) ln 2 minimizes the false positive rate (see the sketch below)
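A minimal Bloom filter sketch in Python matching the description above; the class name, the double-hashing trick for deriving k positions, and the one-byte-per-bit array are illustrative choices, not details from the papers. Later sketches in these notes reuse this class.

```python
import hashlib

class BloomFilter:
    """m-bit array + k hash functions, as described above."""

    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = bytearray(m)   # one byte per bit, for clarity

    def _positions(self, key):
        # Derive k positions from one SHA-256 digest via double hashing
        # (an illustrative stand-in for k independent hash functions).
        d = hashlib.sha256(key.encode()).digest()
        h1 = int.from_bytes(d[:8], "little")
        h2 = int.from_bytes(d[8:16], "little")
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def insert(self, key):
        for p in self._positions(key):
            self.bits[p] = 1       # re-setting a set bit has no effect

    def may_contain(self, key):
        # True may be a false positive; False is always correct.
        return all(self.bits[p] for p in self._positions(key))

bf = BloomFilter(m=1024, k=3)
bf.insert("key-42")
assert bf.may_contain("key-42")
```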
FlashStore [VLDB2010] • Flash as a cache • Components • Write buffer • Read cache • Recency bit vector • Disk-presence Bloom filter • Hash table index • Cons • 6 bytes of RAM per key-value pair
SkimpyStash [SIGMOD2011] • Components • Write buffer • Hash table with a Bloom filter, using linked lists • Each bucket holds a pointer to the beginning of a linked list on flash • The linked lists are stored on flash; each pair has a pointer to earlier keys in the log • Cons • Multiple flash page reads per key lookup • High garbage collection cost
Outline • Introduction • Related work • Two works • BloomStore [MSST2012] • TBF [ICDE2013] • Summary
Introduction • Key lookup throughput is the bottleneck for data-intensive applications • Keeping a large hash table in RAM is costly • Moving the index structure to secondary storage (SSD) brings • Expensive random writes • High garbage collection cost • A bigger storage footprint
BloomStore • BloomStore design • Extremely low amortized RAM overhead • High key lookup/insertion throughput • Components • KV pair write buffer with an active Bloom filter (the write buffer fills one flash page) • Bloom filter chain (spanning many flash pages) • Key-range partition (one flash "block") • [Figure: BloomStore architecture]
KV Store Operations • Key lookup • Check the active Bloom filter first • Then query the Bloom filter chain • Lookup cost
Parallel lookup • Key lookup • Read the entire BF chain • Bit-wise AND the resultant rows • High read throughput (see the sketch below) • [Figure: probing h1(e_i) across all Bloom filters in parallel; a bit-wise AND of the probed rows shows where e_i is found]
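A sketch of the parallel-lookup idea, reusing the BloomFilter class from earlier; the chain layout and function name are assumptions, not BloomStore's actual code. Because every filter in the chain shares the same m and k, the same probe positions can be checked in all filters and the per-filter results combined with a bit-wise AND.

```python
def chain_lookup(chain, key):
    """Return the indices of filters in the BF chain that may hold `key`."""
    positions = chain[0]._positions(key)   # same probe positions for all filters
    candidates = []
    for idx, bf in enumerate(chain):
        # Bit-wise AND of the k probed bits: all must be set to match.
        if all(bf.bits[p] for p in positions):
            candidates.append(idx)         # only these flash pages are read
    return candidates
```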
KV Store Operations • KV pair insertion • KV pair update: append a new key-value pair • KV pair deletion: insert a null value for the key (see the sketch below)
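A minimal sketch of these append-only semantics, with an in-memory list standing in for the on-flash log; the helper names are illustrative. The newest version of a key wins, and the null value recorded by a delete makes later lookups miss.

```python
log = []                           # (key, value) pairs, appended in order

def put(key, value):
    log.append((key, value))       # insert/update = append a new version

def delete(key):
    log.append((key, None))        # delete = append a null value

def get(key):
    for k, v in reversed(log):     # scan newest-first: latest version wins
        if k == key:
            return v               # None means the key was deleted
    return None

put("a", 1); put("a", 2); delete("a")
assert get("a") is None
```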
Experimental Evaluation • Experiment setup • 1 TB SSD (PCIe) / 32 GB (SATA) • Workload
Experimental Evaluation • Effectiveness of the prefilter • RAM overhead per KV pair is 1.2 bytes • Linux workload • Vx workload
Experimental Evaluation • Lookup throughput • Linux workload • H = 96 (BF chain length) • m = 128 (the size of a BF) • Vx workload • H = 96 (BF chain length) • m = 64 (the size of a BF) • A prefilter
Motivation • Using flash as an extension cache is cost-effective • The desired size of the RAM cache is too large • The caching policy should be memory-efficient • The replacement algorithm should achieve performance comparable to existing policies • The caching policy should be agnostic to the organization of data on the SSD
Defects of the existing policy • Recency-based caching algorithms • Clock or LRU • Access both the data structure and the index
System view • DRAM buffer • An in-memory data structure to maintain access information (BF) • No special index to locate a key-value pair • Key-value store • Provides an iterator operation to traverse it • Write through • [Figure: BF key-value cache prototype architecture]
Bloom Filter with deletion (BFD) • BFD: removing a key from the SSD • A Bloom filter that supports deletion • Delete resets the bits at the hash positions of a subset of the hash functions (see the sketch below) • [Figure: deleting X1 clears a subset of its bits]
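A sketch of the BFD delete, reusing the earlier BloomFilter class; the subset size and the random choice of which positions to clear are assumptions. Clearing bits removes the key, but can also turn other keys that share those bits into false negatives, which is exactly the weakness noted on the next slide.

```python
import random

def bfd_delete(bf, key, subset_size=1):
    """Delete `key` by resetting the bits of a subset of its hash positions."""
    positions = bf._positions(key)
    for p in random.sample(positions, subset_size):
        bf.bits[p] = 0    # other keys hashing here become false negatives
```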
Bloom Filter with deletion (BFD) • Flow chart • Tracks recency information • Cons • False positives pollute the cache • False negatives hurt the hit ratio
Two Bloom sub-Filters (TBF) • Flow chart • Drops many elements in bulk by flipping the filters periodically (see the sketch below) • Cons • Rarely-accessed objects are kept, polluting the cache • Longer traversal length per eviction
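A sketch of the two-sub-filter scheme, again built on the earlier BloomFilter class; class and method names are illustrative. Accesses are recorded in the current sub-filter, recency queries consult both, and each periodic flip discards the older generation, dropping many stale entries in bulk instead of deleting them one by one.

```python
class TBF:
    def __init__(self, m, k):
        self.current = BloomFilter(m, k)
        self.previous = BloomFilter(m, k)

    def record_access(self, key):
        self.current.insert(key)

    def recently_accessed(self, key):
        # An object is "recent" if either generation remembers it.
        return (self.current.may_contain(key)
                or self.previous.may_contain(key))

    def flip(self):
        # Age the current generation and start a fresh one: everything
        # seen only in the old `previous` filter is dropped in bulk.
        self.previous = self.current
        self.current = BloomFilter(self.previous.m, self.previous.k)
```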
Traversal cost • Key-value store traversal • Objects can be left unmarked on insertion or marked on insertion • Marking on insertion produces longer stretches of marked objects • False positives also lengthen the traversal (see the sketch below)
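A sketch of how such an eviction traversal might look, using the TBF sketch above; the function, the key list, and the cursor handling are all illustrative assumptions. Each eviction walks the store's iterator past recently-accessed (marked) objects, so long marked stretches, and false positives that spuriously mark objects, directly lengthen the walk.

```python
def evict_one(store_keys, tbf, cursor):
    """Walk from `cursor`, skip marked objects, evict the first unmarked one."""
    n = len(store_keys)
    for step in range(n):
        key = store_keys[(cursor + step) % n]
        if not tbf.recently_accessed(key):
            return key, (cursor + step + 1) % n   # victim, new cursor
    # Everything looked recent (e.g. due to false positives): fall back.
    return store_keys[cursor], (cursor + 1) % n
```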
Evaluation • Experiment setup • Two 1 TB 7200 RPM SATA disks in RAID-0 • 80 GB Fusion-io ioDrive (PCIe x4) • A mixture of 95% read operations and 5% updates • Key-value pairs: 200 million (256 B each) • Bloom filter • 4 bits per marked object • One byte per object in TBF • Hash functions: 3
Outline • Introduction • Related work • Two works • BloomStore [MSST2012] • TBF [ICDE2013] • Summary
Summary • KV stores are particularly suitable for some special applications • Flash will improve the performance of KV stores due to its faster access • Some index structures need to be redesigned to minimize RAM usage • Don't just treat flash as a disk replacement