MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter Hashing Bin Fan, David G. Andersen, Michael Kaminsky Presenter: Son Nguyen
Memcached internals • LRU caching using a chaining Hashtable and a doubly linked list
Goals • Reduce space overhead (bytes/key) • Improve throughput (queries/sec) • Target read-intensive workload with small objects • Result: 3X throughput, 30% more objects
Doubly-linked-list’s problems • At least two pointers per item -> expensive • Both read and write change the list’s structure -> need locking between threads (no concurrency)
Solution: CLOCK-based LRU • Approximate LRU • Multiple readers/single writer • Circular queue instead of linked list -> less space overhead
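The CLOCK idea above can be sketched in a few lines. This is a minimal single-threaded illustration, not MemC3's actual code: the class name `ClockCache`, the `index` dict (standing in for the hash table), and the choice to set the recency bit on a fresh write are all assumptions made for the example.

```python
class ClockCache:
    """Approximate LRU: one recency bit per slot of a circular buffer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.keys = [None] * capacity     # circular buffer of keys
        self.values = [None] * capacity
        self.recent = [0] * capacity      # 1 = touched since the hand last passed
        self.index = {}                   # key -> slot (stand-in for the hash table)
        self.hand = 0                     # next slot the clock hand will inspect

    def read(self, key):
        slot = self.index.get(key)
        if slot is None:
            return None
        self.recent[slot] = 1             # a read only sets a bit, no relinking
        return self.values[slot]

    def write(self, key, value):
        slot = self.index.get(key)
        if slot is None:
            slot = self._evict()
            self.index[key] = slot
            self.keys[slot] = key
        self.values[slot] = value
        self.recent[slot] = 1             # assumption: new items start as "recent"

    def _evict(self):
        # Sweep the hand forward, clearing bits, until a slot with bit 0 is found.
        while True:
            slot = self.hand
            self.hand = (self.hand + 1) % self.capacity
            if self.keys[slot] is None:
                return slot               # unused slot, nothing to evict
            if self.recent[slot] == 0:
                del self.index[self.keys[slot]]
                return slot
            self.recent[slot] = 0         # second chance: clear the bit and move on
```

Compared with the doubly linked list, each item costs one bit instead of two pointers, and a read never changes the structure, which is what makes multiple readers with a single writer practical.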
CLOCK example (figure) • Buffer state shown originally, then after Read(kd), Write(kf, vf), and Write(kg, vg)
Chaining Hashtable’s problems • Use linked list -> costly space overhead for pointers • Pointer dereference is slow (no advantage from CPU cache) • Read is not constant time (due to possibly long list)
Solution: Cuckoo Hashing • Use 2 hashtables • Each bucket has exactly 4 slots (fits in a CPU cache line) • Each (key, value) object can therefore reside in one of 2 x 4 = 8 possible slots
Cuckoo Hashing (figure) • (ka, va) maps to two candidate buckets, HASH1(ka) and HASH2(ka)
Cuckoo Hashing • Read: at most 8 slot checks (constant, fast) • Write(ka, va): • Find an empty slot among the 8 candidate slots of ka • If all are full, randomly kick some (kb, vb) out and put (ka, va) in its place • Now find an empty slot for (kb, vb) • Repeat up to 500 times or until an empty slot is found • If still none is found, expand the table
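The read and write steps above can be sketched as follows. This is a rough single-threaded illustration under stated assumptions: the class name `CuckooTable`, the use of Python's built-in `hash` with two salts as HASH1 and HASH2, and doubling as the expansion policy are all choices made for the example, not details taken from the slides.

```python
import random

SLOTS_PER_BUCKET = 4   # from the slide
MAX_KICKS = 500        # from the slide


class CuckooTable:
    def __init__(self, num_buckets=8, seed=0):
        self.rng = random.Random(seed)
        self._allocate(num_buckets)

    def _allocate(self, num_buckets):
        self.num_buckets = num_buckets
        # Two tables; each bucket holds SLOTS_PER_BUCKET (key, value) entries.
        self.tables = [
            [[None] * SLOTS_PER_BUCKET for _ in range(num_buckets)]
            for _ in range(2)
        ]

    def _candidates(self, key):
        # Hypothetical HASH1 / HASH2: the built-in hash with two different salts.
        b1 = self.tables[0][hash(("h1", key)) % self.num_buckets]
        b2 = self.tables[1][hash(("h2", key)) % self.num_buckets]
        return b1, b2

    def read(self, key):
        # At most 2 buckets x 4 slots = 8 slot checks.
        for bucket in self._candidates(key):
            for entry in bucket:
                if entry is not None and entry[0] == key:
                    return entry[1]
        return None

    def write(self, key, value):
        # Overwrite in place if the key is already stored.
        for bucket in self._candidates(key):
            for i, entry in enumerate(bucket):
                if entry is not None and entry[0] == key:
                    bucket[i] = (key, value)
                    return
        item = (key, value)
        for _ in range(MAX_KICKS):
            candidates = self._candidates(item[0])
            for bucket in candidates:
                for i, entry in enumerate(bucket):
                    if entry is None:
                        bucket[i] = item
                        return
            # All 8 slots are full: kick out a random victim and retry with it.
            victim_bucket = self.rng.choice(candidates)
            i = self.rng.randrange(SLOTS_PER_BUCKET)
            victim_bucket[i], item = item, victim_bucket[i]
        self._expand(item)

    def _expand(self, homeless):
        # Assumed policy: double the table and reinsert everything.
        entries = [e for t in self.tables for b in t for e in b if e is not None]
        entries.append(homeless)
        self._allocate(self.num_buckets * 2)
        for k, v in entries:
            self.write(k, v)
```

Note that while a victim is held in `item` it is in no bucket at all, so a concurrent reader would miss it; the sketch makes no attempt to be thread-safe.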
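Cuckoo Hashing (figure) • Insert a: candidate buckets HASH1(ka) and HASH2(ka) are full, so (ka, va) kicks out b
Cuckoo Hashing (figure) • Insert b: candidate buckets HASH1(kb) and HASH2(kb) are full, so (kb, vb) kicks out c
Cuckoo Hashing (figure) • Insert c: (kc, vc) finds an empty slot in HASH1(kc) or HASH2(kc) • Done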
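Cuckoo Hashing • Problem: after (kb, vb) is kicked out and before it is reinserted, it is in no slot, so a reader looking for kb gets a false cache miss • Solution: compute the kick-out path (Cuckoo path) first, then move items backward, starting from the empty slot at the end of the path • Before: (b,c,Null) -> (a,c,Null) -> (a,b,Null) -> (a,b,c) • Fixed: (b,c,Null) -> (b,c,c) -> (b,b,c) -> (a,b,c) • In the fixed order an item may briefly appear twice, but it is never missing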
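The "Fixed" sequence above can be reproduced with a toy model. This sketch flattens the table to three single slots, as in the slide, and gives each key an explicit list of allowed slots. The function names, the `candidates` mapping, and the breadth-first path search are assumptions for illustration only; the slides do not say how the path is searched.

```python
from collections import deque


def find_cuckoo_path(slots, key, candidates):
    """Return slot indices [s0, ..., sn] such that sn is empty and the occupant
    of each s_i is allowed to move to s_(i+1); None if no such path exists.
    Breadth-first search is used here only to keep the example deterministic."""
    queue = deque([s] for s in candidates[key])
    seen = set(candidates[key])
    while queue:
        path = queue.popleft()
        last = path[-1]
        if slots[last] is None:
            return path
        for nxt in candidates[slots[last]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def insert_with_path(slots, key, candidates):
    """Insert key by moving items backward along the path, starting at the
    empty slot. Returns every intermediate table state, or None on failure."""
    path = find_cuckoo_path(slots, key, candidates)
    if path is None:
        return None
    states = [tuple(slots)]
    # Walk the path from its end: copy each item into the slot after it.
    for dst, src in zip(reversed(path), reversed(path[:-1])):
        slots[dst] = slots[src]
        states.append(tuple(slots))
    slots[path[0]] = key          # finally place the new key at the path's head
    states.append(tuple(slots))
    return states


# Toy setup matching the slide: a may only use slot 0, so a chain is forced.
candidates = {"a": [0], "b": [0, 1], "c": [1, 2]}
slots = ["b", "c", None]
states = insert_with_path(slots, "a", candidates)
# states == [('b','c',None), ('b','c','c'), ('b','b','c'), ('a','b','c')]
```

In every intermediate state both b and c are still present somewhere, so a reader scanning their candidate slots cannot see a false miss caused by the move itself.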
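Cuckoo path (figure) • Insert a: starting from HASH1(ka) and HASH2(ka), find a path of displacements that ends at an empty slot
Cuckoo path backward insert (figure) • Insert a: move c into the empty slot, then b into c's old slot, then place (ka, va) in b's old slot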
Cuckoo’s advantages • Concurrency: multiple readers/single writer • Read optimized (entries fit in CPU cache) • Still O(1) amortized time for write • 30% less space overhead • 95% table occupancy
Evaluation • 68% throughput improvement in the all-hit case • 235% in the all-miss case
Evaluation 3x throughput on “real” workload
Discussion • Write is slower than chaining Hashtable • Chaining Hashtable: 14.38 million keys/sec • Cuckoo: 7 million keys/sec • Idea: finding cuckoo path in parallel • Benchmark doesn’t show much improvement • Can we make it write-concurrent?