Difference Engine: Harnessing Memory Redundancy in Virtual Machines by Diwaker Gupta et al. presented by Jonathan Berkhahn
Motivation • Virtualization has improved and spread over the past decade • Servers often run at 5-10% of CPU capacity • High capacity is needed for peak workloads • Fault isolation for certain services • Certain services run best on particular configurations • Solution: virtual machines
Problem • CPUs are well suited to multiplexing, but main memory is not • Upgrading memory is not an ideal option • Expensive • Limited by slots on the motherboard • Limited by the ability to support higher-capacity modules • Consumes significant power, and therefore produces significant heat • Further exacerbated by the current trend toward many-core systems
How do we fix this memory bottleneck for virtual machines?
Difference Engine • Implemented as an extension to the Xen VMM • Sub-page granularity page sharing • In-memory page compression • Reduces the memory footprint by up to 90% for homogeneous workloads and up to 65% for heterogeneous workloads
Outline • Related Work • Difference Engine algorithms • Implementation • Evaluation
Page Sharing • Transparent page sharing • Requires guest OS modification • Content-based • VMWare ESX
Delta Encoding • Manber • Rabin fingerprints • Inefficient • Broder • Combined Rabin fingerprints and sampling • Both focused on identifying similar files, but not encoding the differences
Memory Compression • Douglis et al. • Sprite OS • Double-edged sword: compression helped some workloads and hurt others • Wilson et al. • Argued the earlier mixed results were due to slow hardware • Developed algorithms that exploit virtual memory structure
Outline • Related Work • Difference Engine algorithms • Implementation • Evaluation
Page Sharing • Content-based • Hash pages and index them by hash value • A hash collision indicates a potential match • Compare byte-by-byte to ensure the pages are identical • Reclaim one page and update the virtual memory mappings • Shared pages are marked read-only; writes cause a page fault trapped by the VMM
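The content-based sharing step can be sketched as follows. This is an illustrative Python model, not the paper's Xen implementation; the function name `share_pages` and the use of SHA-1 are assumptions for the sketch.

```python
import hashlib

PAGE_SIZE = 4096  # page size used in the paper's evaluation

def share_pages(pages):
    """Deduplicate identical pages: index pages by a hash of their
    contents; on a hash match, confirm with a full byte-by-byte
    comparison before sharing a single canonical copy."""
    index = {}       # content hash -> index into `canonical`
    canonical = []   # unique page contents actually kept in memory
    mapping = []     # per input page: which canonical copy backs it
    for page in pages:
        h = hashlib.sha1(page).digest()
        if h in index:
            cand = index[h]
            # A hash collision only *suggests* a match; verify bytes.
            if canonical[cand] == page:
                mapping.append(cand)  # reclaim this page, share the copy
                continue
        index[h] = len(canonical)
        canonical.append(page)
        mapping.append(len(canonical) - 1)
    return canonical, mapping
```

In the real system the shared copy is marked read-only, so a guest write faults into the VMM, which then gives the writer a private copy.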
Patching • Sharing of similar pages • Identify similar pages, store differences as a "patch" • Compresses multiple pages down to single reference copy and a collection of patches
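Patching stores a similar page as a set of byte-range differences against a reference copy. A minimal sketch of that idea (the paper uses an xdelta-style encoder; the run-based format here is an assumption for illustration):

```python
def make_patch(ref, page):
    """Record (offset, replacement-bytes) runs where `page` differs
    from the reference page of the same size."""
    patch = []
    i, n = 0, len(page)
    while i < n:
        if page[i] != ref[i]:
            j = i
            while j < n and page[j] != ref[j]:
                j += 1
            patch.append((i, page[i:j]))  # one run of differing bytes
            i = j
        else:
            i += 1
    return patch

def apply_patch(ref, patch):
    """Reconstruct the original page from the reference plus patch."""
    buf = bytearray(ref)
    for off, data in patch:
        buf[off:off + len(data)] = data
    return bytes(buf)
```

For nearly identical pages the patch is tiny compared to the 4 KB page, which is where the sub-page sharing savings come from.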
Compression • Compression of live pages in main memory • Useful only for high compression ratios • VMM traps requests for compressed pages
Paging Machine Memory • Last resort • Copy pages to disk • Extremely expensive operation • Leaves policy decisions to end user
Caveat Both patching and compression are only useful for infrequently accessed pages. So, how do we determine "infrequent"?
Clock • Not-Recently-Used policy • Periodically checks whether each page has been referenced and/or modified • C1 - Recently Modified • C2 - Recently Referenced (but not modified) • C3 - Not Recently Accessed • C4 - Not Accessed for a While
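A minimal sketch of how the R/M bits and idle time might map onto the four states. The exact bit-to-state mapping and the `LONG_IDLE` threshold (measured here in clock scans) are assumptions; the paper's implementation reads and clears these bits in Xen's shadow page tables.

```python
LONG_IDLE = 8  # assumed: scans a page must sit idle before reaching C4

def classify(referenced, modified, idle_scans):
    """Map a page's referenced/modified bits and idle time to the
    recency states used to pick candidates for patching and
    compression/paging (the colder states)."""
    if modified:
        return "C1"   # recently modified
    if referenced:
        return "C2"   # recently referenced, but not modified
    if idle_scans < LONG_IDLE:
        return "C3"   # not recently accessed
    return "C4"       # not accessed for a while
```

Patching and compression are applied only to the cold states, answering the "how do we determine infrequent?" question from the caveat above.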
Outline • Related Work • Difference Engine algorithms • Implementation • Evaluation
Implementation • Modification to the Xen VMM • Roughly 14,500 lines of code, plus 20,000 for ports of existing patching and compression algorithms • Shadow page tables • Difference Engine relies on modifying the shadow page and P2M tables • Pages mapped by Dom-0 are ignored • Complications: real mode and I/O support
Complications • Real mode • Guests boot with paging disabled, as on bare metal, but Difference Engine requires paging to be enabled within the guest OS • I/O • The Xen hypervisor emulates I/O hardware with a Dom-0 process, ioemu, which directly maps guest pages • Conflicts with the policy of not acting on Dom-0 pages • Workaround: unmap ioemu's mappings of VM pages every 10 seconds
Clock • NRU policy • Tracked by Referenced and Modified bits on each page • Modified Xen's shadow page tables to set bits when creating mappings • C1 - C4
Page Sharing • Hash table kept in the Xen heap • Memory limitations - 12 MB heap • Hash table only holds entries for 1/5 of memory at a time • 1.76 MB hash table • Covers all of memory in 5 passes
Detecting Similar Pages • Hash Similarity Detector (2,1) • Hash similarity table cleared after all pages have been considered • Only building the patch and replacing the page requires a lock • A concurrent write may result in a differently sized patch, but the result will still be correct
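HashSimilarityDetector(2,1) indexes each page by two independent hashes of small fixed-position blocks; a page that matches a stored page on either key is a candidate for patching against it. The block size and offsets below are assumptions for illustration; the paper tuned these parameters empirically.

```python
import hashlib

BLOCK = 64
OFFSETS = (0, 2048)  # assumed block positions within the 4 KB page

def similarity_keys(page):
    """HashSimilarityDetector(2,1): two one-block hashes per page."""
    return tuple(hashlib.sha1(page[o:o + BLOCK]).digest() for o in OFFSETS)

def find_candidate(table, page):
    """Return a stored page matching either key, or None."""
    for key in similarity_keys(page):
        if key in table:
            return table[key]
    return None

def insert(table, page):
    """Index a reference page under both of its similarity keys."""
    for key in similarity_keys(page):
        table.setdefault(key, page)
```

Pages that agree on even one small block are often near-duplicates, so a cheap partial hash finds good patch references without comparing whole pages.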
Compression & Disk Paging • Antagonistic relationship with patching • Compressed/Disk pages can't be patched • Delayed until all pages have been checked for similarity and the page has not been accessed for a while (C4) • Disk paging done by daemon running in Dom-0
Outline • Related Work • Difference Engine algorithms • Implementation • Evaluation
Evaluation • Experiments run on dual-processor, dual-core 2.33 GHz Intel Xeon, 4 KB page size • Tested each operation individually for overhead
Conclusion • Main memory is a primary bottleneck for VMs • Significant memory savings can be achieved from: • Sharing identical pages • Patching similar pages • In-memory page compression • Implemented DE and showed memory savings of as much as 90% • Saved memory can be used to run more VMs