An Evaluation of Using Deduplication in Swappers Weiyan Wang, Chen Zeng
Motivation • Deduplication detects duplicate pages in storage • NetApp, Data Domain: billion $ business • We explore another direction: use deduplication in swappers • Our experimental results indicate that using deduplication in swappers is beneficial
What is a swapper? • A mechanism to expand usable address spaces • Swap out: move a page from memory to the swap area • Swap in: move a page from the swap area back to memory • The swap area is on disk [Diagram: page P1 moves from memory to the swap area; its PTE is updated to pte']
Why is deduplication useful? • Writes to disk are slow • Disk accesses are much slower than memory accesses! • When duplicate pages exist: • Do we really need to swap out all of them? • If a duplicate page already appears in the swap area, we can save one I/O. [Diagram: pages P1, P2, P3, with P1 duplicated]
Architecture • Swap out a page: compute its checksum, then look it up in the dedup cache • Hit (YES): skip the pageout • Miss (NO): do the pageout and add the checksum to the dedup cache
Computing Checksum • SHA-1 checksum (160 bits) • Collision probability of one in 2^80 • Only use the first 32 bits (one in 2^16) • Related to the implementation of the dedup cache • Only store the checksum • We assume two pages are identical if their checksums are equal • Trade consistency for performance
Dedup Cache • Implemented as a radix tree • Maps a checksum -> dedup_entry_t • A trie with O(|key|) lookup and update overhead • Already well implemented in the kernel • Keys in the radix tree are 32 bits • We therefore keep only the first 32 bits of a checksum as the key
Entries in the Dedup Cache • The index of a page in the swap area • The number of duplicate pages with a given checksum • A lock for consistency typedef struct { swp_entry_t base; atomic_t count; spinlock_t lock; } dedup_entry_t;
Changes to the Linux Kernel • Swap cache: maps a swp_entry_t to a page • Avoids repeated swap-ins • Happens when a swapped-out page is shared by multiple processes • Example: • Processes A and B share page P • P is swapped out; the PTEs in A and B are updated • A wants to access P, so P is swapped in • B then wants to access P; the swap cache lets B reuse the already-swapped-in page instead of reading it from disk again
Will the dedup cache grow infinitely? • A swap counter for each swp_entry_t • Counts references in memory • counter++ when: • one more PTE contains the swp_entry_t • it is in the swap cache • it is in the dedup cache • counter-- when a page is swapped in • remove the swp_entry_t from the dedup cache and the swap cache when counter == 2
Reference Counters [Diagram: swap-area entries with reference counts (4) and (2); processes A and B, the swap cache, and the dedup cache each hold references]
Changes to the Swap Cache • The swap cache maintains the mapping between a swap entry and a page • We change that to a mapping between a swap entry and a list of pages with the same contents • Why do we need a list?
Possible Inconsistency • Swap out page P1 to swap entry E1 • Swap out page P2, a duplicate of P1 • The mapping E1 -> P2 cannot be added to the swap cache • Swap in P1: the mapping E1 -> P1 is deleted • Swap in P2: oops! [Swap cache: E1 -> P1]
Our Solution • Swap out page P1 to swap entry E1 • Swap out page P2, a duplicate of P1 • The mapping E1 -> P2 is added to the list • Swap in P1: only P1 is deleted from the list • Swap in P2: delete E1 -> P2 [Swap cache: E1 -> P1, then E1 -> P1,P2, then E1 -> P2]
Experimental Evaluation • We ran our experiments on VMware with Linux 2.6.26 • Our test program sequentially accesses an array • Each element is 4 KB (one page) • We vary the percentage of duplicate pages in the array
All Pages are Duplicates • Deduplication significantly reduces the access time
No Duplicate Pages • However, deduplication also incurs a significant overhead
Overheads in Deduplication • Major overheads: • Calculating checksums: 35 us • Whenever a page is swapped in or out, we always calculate its checksum • Maintaining the reference counters • Explicitly acquiring locks imposes significant overhead: an average of 65 us in our experiments
Conclusion • Deduplication is a double-edged sword in swappers • When many duplicate pages are present, deduplication reduces the access time by orders of magnitude • When few duplicate pages are present, the overhead is non-negligible