Memory Buddies: Exploiting Page Sharing for Smart Colocation in Virtualized Data Centers
Written by: Timothy Wood, Gabriel Tarasuk-Levin, Prashant Shenoy, Peter Desnoyers, Emmanuel Cecchet, Mark D. Corner
Presenter: Yinon Avraham
Advisor: Assoc. Prof. Danny Raz
Technion - Israel Institute of Technology
Agenda Location, location, location… (Placement)
• Introduction
• Background and System Overview
• Memory Fingerprinting
• Sharing-aware Colocation
• Hotspot Mitigation
• Implementation
• Experimental Evaluation
• Conclusions
1 Introduction
• Data centers increasingly employ virtualized architectures: applications run inside virtual servers, which reside on physical servers
• Hypervisor – responsible for allocating physical resources to the VMs on a physical server
• Modern hypervisors use Content-Based Page Sharing (CBPS) to reduce the memory footprint of resident VMs
• The problem: select which VMs should be colocated on each physical host within the data center so that page sharing is maximized
• The suggested solution: Memory Buddies – a system for intelligent VM colocation within a data center that aggressively exploits page sharing benefits
2.1 Background
• The use of CBPS in a hypervisor: VM1 has M1 unique pages, VM2 has M2 unique pages, and S pages are common across VM1 and VM2
• Page sharing reduces the memory footprint to: M1 + M2 − S
(Diagram: VM1 and VM2 on Physical Server 1, with the S shared pages stored only once)
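The footprint reduction above can be sketched in a few lines. This is a hypothetical model, not the hypervisor's actual mechanism: page contents are represented as hashable values, and identical pages are counted once.

```python
def shared_footprint(vm1_pages, vm2_pages):
    """Return (footprint, shared): footprint = M1 + M2 - S under CBPS."""
    p1, p2 = set(vm1_pages), set(vm2_pages)
    shared = len(p1 & p2)                  # S: pages common to both VMs
    footprint = len(p1) + len(p2) - shared # each shared page stored once
    return footprint, shared
```

For example, VMs with pages {a, b, c} and {b, c, d} occupy 4 page frames instead of 6, since b and c are stored once.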
2.1 Background (cont.)
• Problem formulation: the VM colocation problem is one where each VM is colocated with a set of other "similar" VMs having the most redundant pages
• Several instantiations of the smart colocation problem arise during:
• Initial placement
• Server consolidation
• Offline planning
(Diagram: VM1–VM3 with shared pages S, to be placed on Physical Server 1 or Physical Server 2)
2.2 System Overview
• Memory Buddies detects sharing potential between virtual machines and then uses the hypervisor's low-level sharing mechanisms to realize these benefits
• Nucleus – runs on each server and computes:
• a memory fingerprint for each VM
• an aggregate memory fingerprint – the union over the VMs on the server
• Control Plane – runs on a distinguished control server; responsible for virtual machine placement and hotspot mitigation
3 Memory Fingerprinting – 3.1 Fingerprint Generation
• Memory Buddies uses Hsieh's SuperFastHash algorithm to generate a 32-bit hash for each 4KB page
• The set of unique page hashes for a VM's pages is gathered in sorted order to form the raw memory fingerprint
• Such a fingerprint can be compared against another VM's or server's fingerprint to indicate the potential memory sharing between them
• Cons:
• These fingerprints are large – 1MB for each 1GB of VM address space
• They must be sorted in order to be compared efficiently
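A minimal sketch of raw fingerprint generation, under one stated substitution: the paper uses Hsieh's SuperFastHash, while here a truncated SHA-1 stands in to produce the 32-bit per-page hashes.

```python
import hashlib

PAGE_SIZE = 4096  # 4KB pages, as in the paper

def raw_fingerprint(memory: bytes):
    """Hash each 4KB page and return the sorted list of unique 32-bit
    hashes -- the 'raw' memory fingerprint. Illustrative only: a
    truncated SHA-1 stands in for Hsieh's SuperFastHash."""
    hashes = set()
    for off in range(0, len(memory), PAGE_SIZE):
        page = memory[off:off + PAGE_SIZE]
        h32 = int.from_bytes(hashlib.sha1(page).digest()[:4], "big")
        hashes.add(h32)          # duplicates collapse into the set
    return sorted(hashes)        # sorted for efficient comparison
```

Note that a 3-page memory image with two identical pages yields a fingerprint of only two hashes, which is exactly what makes fingerprint intersection a sharing estimate.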
3 Memory Fingerprinting – 3.2 Succinct Fingerprints
• Bloom filter – a lossy representation of a set of keys, which can test a value for membership in that set with configurable accuracy. Consists of:
• an m-bit vector
• k hash functions h1, h2, …, hk; inserting an element a sets the k bits at positions h1(a), …, hk(a) to 1
• Does the other way always hold? No – a membership test can return a false positive, since all k bits may have been set by other elements
• The probability pe of such errors depends on m, k, and n (the number of elements stored); under the usual independence assumption, pe ≈ (1 − e^(−kn/m))^k
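The structure on this slide can be sketched directly. This is a generic minimal Bloom filter, not Memory Buddies' implementation; the k hash functions are derived from salted SHA-1 digests, an assumption of this sketch.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: an m-bit vector and k hash functions.
    Membership tests may yield false positives, never false negatives."""

    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = [0] * m

    def _positions(self, item):
        # Derive k bit positions from salted hashes (illustrative choice)
        for i in range(self.k):
            d = hashlib.sha1(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(d[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # All k bits set => "probably in the set" (possible false positive)
        return all(self.bits[pos] for pos in self._positions(item))
```

Every added element is always reported present; an absent element is reported present with probability roughly (1 − e^(−kn/m))^k.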
3 Memory Fingerprinting – 3.2 Succinct Fingerprints (cont.)
• In practice, neither method produces a perfectly accurate prediction because:
• fingerprints are snapshots in time, while memory contents change
• there is a difference between what could be shared and what the hypervisor actually identifies as sharable
3 Memory Fingerprinting – 3.3 Fingerprint Comparison
• Page sharing potential = intersection of the two fingerprints (taking the filter errors into consideration)
• Let z1, z2 be the number of zeros in the two Bloom filters, z12 the number of zeros in their bitwise OR (the filter of the union), m the size of the filter vectors, and k the number of hash functions
• The number of elements behind a filter with z zeros is estimated as n(z) ≈ ln(z/m) / (k · ln(1 − 1/m)), so the intersection is estimated as |A ∩ B| ≈ n(z1) + n(z2) − n(z12)
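The zero-count estimator above can be written out directly. A sketch under the stated assumptions: both filters have the same size m and hash count k, and z12 counts positions that are zero in both filters (the zeros of the bitwise OR).

```python
import math

def estimate_intersection(bf1_bits, bf2_bits, k):
    """Estimate |A ∩ B| from two same-sized Bloom filter bit vectors.
    Uses the standard zero-count estimator:
      n(z) ≈ ln(z/m) / (k · ln(1 - 1/m))
      |A ∩ B| ≈ n(z1) + n(z2) - n(z12)
    where z12 counts zeros in the bitwise OR (the union's filter)."""
    m = len(bf1_bits)
    z1 = bf1_bits.count(0)
    z2 = bf2_bits.count(0)
    z12 = sum(1 for a, b in zip(bf1_bits, bf2_bits) if a == 0 and b == 0)
    n = lambda z: math.log(z / m) / (k * math.log(1 - 1 / m))
    return n(z1) + n(z2) - n(z12)
```

A sanity check on the estimator: comparing a filter with itself gives z12 = z1 = z2, so the estimate collapses to n(z1), the estimated size of the set itself.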
3 Memory Fingerprinting – Conclusion
• Benefits of the succinct fingerprints (vs. hash lists):
• much smaller
• save communication bandwidth
• much faster to compare
• require no sorting before comparison
4 Sharing-aware Colocation
• Each nucleus computes and transmits fingerprints to the control plane
• The control plane thus has a global view of the entire data center
• The control plane uses this knowledge to maximize page sharing potential
• Three types of placement decisions are supported:
• Initial Placement
• Server Consolidation
• Offline Planning
4 Sharing-aware Colocation – 4.1 Initial Placement
• The goal: deploy the new VM while allowing more VMs to be hosted on a given number of servers (i.e., achieve the greatest amount of sharing)
• The algorithm:
• Place the new VM on a staging host and collect usage information (memory, CPU, network, disk)
• Determine the set of feasible hosts (those with sufficient resources)
• Estimate the sharing potential of the new VM on each feasible host
• Choose the host with the maximum sharing potential
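The selection steps above can be sketched as follows. All names and the data layout are illustrative assumptions: fingerprints are modeled as plain sets of page hashes, and each host is represented by its spare memory (in pages) plus its aggregate fingerprint.

```python
# Hypothetical sketch of the initial-placement decision: among hosts
# with sufficient spare resources, pick the one whose resident VMs
# promise the most page sharing with the new VM.

def place_vm(vm_fp, vm_mem, hosts):
    """vm_fp: new VM's fingerprint (set of page hashes);
    vm_mem: its memory demand in pages;
    hosts: {name: (spare_pages, aggregate_fingerprint)}.
    Returns the chosen host name, or None if no host is feasible."""
    best_host, best_sharing = None, -1
    for name, (spare_mem, host_fp) in hosts.items():
        sharing = len(vm_fp & host_fp)   # estimated shareable pages
        # Feasibility: shared pages free frames, so they add capacity
        if spare_mem + sharing >= vm_mem and sharing > best_sharing:
            best_host, best_sharing = name, sharing
    return best_host
```

In a real system the feasibility check would also cover CPU, network, and disk, per the usage information collected on the staging host.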
4 Sharing-aware Colocation – 4.2 Server Consolidation
• The goal: pack VMs onto servers so as to reduce the aggregate memory footprint and maximize the number of VMs that can be housed in the data center (save energy, reduce server wear)
• The algorithm phases:
• Identify servers to consolidate (mean memory usage under a threshold)
• Determine target hosts (start with the largest VM; same as initial placement)
• Migrate VMs to targets (live migration; limit the number of concurrent migrations)
4 Sharing-aware Colocation – 4.2 Server Consolidation (cont.)
(Diagram: VMs 1–4 consolidated from Servers 1–3 onto fewer hosts, subject to each server's capacity)
4 Sharing-aware Colocation – 4.3 Offline Planning Tool for Smart VM Colocation
• The goal: answer the question "what if?"
• Input:
• data centers and their resource capacities
• resource utilization statistics
• memory fingerprints
• Output:
• VM placements that match each VM to a host
• total memory consumption
• expected rate of sharing
• This problem is analogous to a bin packing problem where the resource constraints define the size of each bin. Memory Buddies uses a dynamic programming technique to solve it.
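To illustrate the problem shape only: the paper solves this with dynamic programming, but a greedy, first-fit-style pass is enough to show how sharing makes the "bins" effectively larger. Everything below (the data layout, the capacity model in pages, the tie-breaking) is a simplifying assumption of this sketch.

```python
# Simplified what-if planner: a sharing-aware greedy bin-packing pass.
# Each host is a 'bin' whose footprint is the union of its VMs'
# fingerprints, so overlapping VMs cost less than their sum.

def plan(vms, host_capacity, num_hosts):
    """vms: list of (name, fingerprint_set); capacity is in pages.
    Returns {host_index: [vm_names]}; raises if some VM cannot fit."""
    hosts = [{"fp": set(), "vms": []} for _ in range(num_hosts)]
    # Place larger VMs first, as in classic first-fit-decreasing
    for name, fp in sorted(vms, key=lambda v: -len(v[1])):
        best, best_share = None, -1
        for i, h in enumerate(hosts):
            if len(h["fp"] | fp) <= host_capacity:   # fits after sharing
                share = len(h["fp"] & fp)
                if share > best_share:               # prefer most overlap
                    best, best_share = i, share
        if best is None:
            raise ValueError(f"no host can fit VM {name}")
        hosts[best]["fp"] |= fp
        hosts[best]["vms"].append(name)
    return {i: h["vms"] for i, h in enumerate(hosts)}
```

With capacity 4, VMs {1,2,3} and {2,3,4} fit together on one host (union of size 4), while a disjoint VM {5,6,7} is forced onto a second host.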
5 Hotspot Mitigation
• Goal: resolve memory pressure caused by changes in VM behavior; mitigate the effect by re-balancing the load over the hosts
• Memory hotspot causes:
• increasing demand for memory by one or more VMs (application and/or OS)
• loss of page sharing
• Monitoring:
• level of swap activity (from the VM's OS)
• number of shared pages (from the hypervisor)
• Solution:
• detect the hotspot (swap activity rises, shared-page count decreases)
• resolve it by re-distributing the VMs – use the Initial Placement algorithm, starting with the VM that provides the highest absolute gain in sharing
• if there are no feasible destinations, bring a standby server to life
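The detection rule on this slide (rising swap activity combined with falling shared-page counts) can be sketched as a simple predicate. The thresholds and the sample layout are illustrative assumptions, not values from the paper.

```python
# Hypothetical hotspot detector: compare the oldest and newest samples
# of (swap_rate, shared_pages) and flag a hotspot when swapping has
# risen while sharing has fallen. Thresholds are assumed, not measured.

def is_hotspot(samples, swap_rise=1.5, share_drop=0.8):
    """samples: chronological list of (swap_rate, shared_pages) pairs."""
    if len(samples) < 2:
        return False
    (swap_old, shared_old), (swap_new, shared_new) = samples[0], samples[-1]
    swapping_more = swap_new >= swap_old * swap_rise   # swap activity up
    sharing_less = shared_new <= shared_old * share_drop  # sharing down
    return swapping_more and sharing_less
```

In Memory Buddies the response to a detected hotspot is the re-placement step above, not a local fix: the affected VMs are migrated via the initial-placement algorithm.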
6 Implementation
• Virtualization layer: VMware ESX
• supports VM migration
• supports page sharing, but its sharing statistics are unavailable, so the nucleus is deployed inside each VM (as a memory-tracing kernel module) rather than as part of the hypervisor
• Testbed: a cluster of P4 2.4GHz servers connected over gigabit Ethernet
• Memory Tracer: memory analysis tool (Linux, Windows, Mac OS X) that generates a 32-bit hash for each page in memory; the resulting fingerprints are sent to the control plane every few minutes
• Control Plane: Java-based server; communicates with the VMware Virtual Infrastructure management console via its WS-API to gather VM information and statistics, and to initiate migrations
7 Experimental Evaluation – Memory Trace Analysis
• Result: 37% of the pages can be shared with one or more systems
7 Experimental Evaluation – Case Study: Internet Data Center
• Result: the effective capacity of the data center increased by 16%
7 Experimental Evaluation – Hotspot Mitigation
• Result: a hotspot was detected and mitigated by determining a different host with higher sharing potential
7 Experimental Evaluation – Case Study: Desktop Virtualization
• Result: Memory Buddies can be used offline to compute memory sharing and answer "what if" questions when planning for desktop virtualization
7 Experimental Evaluation – Fingerprint Efficiency and Accuracy
• Result: employing Bloom filters in large data centers can reduce sharing estimation time by an order of magnitude and reduce network overheads by over 90%, while still maintaining a high degree of accuracy
7 Experimental Evaluation – Sub-page Sharing
• Break each page into n chunks and map each chunk to a 32-bit hash
• Result: the detected sharing increases
• (Measured on the sharing between two 64-bit Ubuntu Linux VMs, each with 2GB of RAM)
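The chunking step above can be sketched briefly. Same substitution as before: a truncated SHA-1 stands in for the 32-bit hash the slide mentions.

```python
import hashlib

def subpage_hashes(page: bytes, n: int):
    """Split a page into n equal chunks and hash each one, so that
    partial overlap between pages becomes detectable. Illustrative:
    a truncated SHA-1 stands in for the slide's 32-bit hash."""
    chunk = len(page) // n
    return [
        int.from_bytes(hashlib.sha1(page[i * chunk:(i + 1) * chunk])
                       .digest()[:4], "big")
        for i in range(n)
    ]
```

Two pages that differ only in their last quarter produce identical hashes for the first three chunks, sharing that whole-page hashing would miss entirely.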
8 Conclusions
• Modern hypervisors use CBPS to reduce the footprint of resident VMs (by ~33%)
• This technology can be used to intelligently colocate VMs on hosts, decreasing the total footprint and hence the TCO (experimental results show a 16% increase in the effective capacity of the data center)
• Memory Buddies suggests a solution for 3 types of placement decisions:
• Initial Placement
• Server Consolidation
• Offline Planning
• …as well as a hotspot mitigation technique