This study explores leveraging data deduplication to enhance live virtual machine migration for improved performance and reduced downtime. The research implements a novel approach and evaluates its impact on migration metrics, showcasing significant enhancements in data transfer efficiency and migration time.
Exploiting Data Deduplication to Accelerate Live Virtual Machine Migration
Xiang Zhang 1,2, Zhigang Huo 1, Jie Ma 1, Dan Meng 1
1. National Research Center for Intelligent Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences
2. Graduate University of Chinese Academy of Sciences
Email: zhangxiang@ncic.ac.cn
Outline • Introduction • Design • Implementation • Evaluation • Conclusion & Future Work
Live Migration • Definition • Migrating the OS and Apps as a whole to another physical machine without rebooting the VM • Advantages • Load Balance • Service Consolidation • Fault Tolerance • ... • Shared storage is usually deployed • Only the VCPU context and memory image are migrated
Pre-Copy • Pre-Copy is the default choice in Xen • In the first phase, the initial memory pages are copied • In the second phase, several rounds of incremental synchronization are employed • In the last phase, the VM is suspended and the remaining memory image and VCPU context are copied (see the sketch below) • Pre-Copy is reliable
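A minimal sketch of this three-phase pre-copy loop in C. All helper functions and the round/threshold constants are hypothetical stand-ins for illustration, not Xen's actual xc_domain_save() internals:

/* Hypothetical helpers -- placeholders, not Xen's real API. */
void send_all_pages(int vm);        /* phase 1: full memory image        */
int  count_dirty_pages(int vm);     /* pages dirtied since the last round */
void send_dirty_pages(int vm);      /* resend the dirtied pages          */
void suspend_vm(int vm);
void send_vcpu_context(int vm);

#define MAX_ROUNDS      30          /* assumed caps, for illustration */
#define DIRTY_THRESHOLD 50

void precopy_migrate(int vm)
{
    /* First phase: copy the entire initial memory image. */
    send_all_pages(vm);

    /* Second phase: iterative incremental synchronization. */
    for (int round = 0; round < MAX_ROUNDS; round++) {
        if (count_dirty_pages(vm) < DIRTY_THRESHOLD)
            break;                  /* writable working set is small enough */
        send_dirty_pages(vm);
    }

    /* Last phase: suspend the VM, copy the remainder plus the VCPU
     * context; this interval is the downtime. */
    suspend_vm(vm);
    send_dirty_pages(vm);
    send_vcpu_context(vm);
}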
Motivation of Research • Performance metrics of migration • Total Data Transferred • Total Migration Time • Downtime • Necessity of improving migration performance • Apps suffer performance degradation for a shorter time • Fewer migration opportunities are missed • Shorter downtime for latency-sensitive Apps
Outline • Introduction • Design • Implementation • Evaluation • Conclusion & Future Work
Analyzing Migration Data Regularities • During the first phase • Zero pages are in the majority for lightweight workloads • At least 25% of non-zero pages are identical to, or above 80% similar to, other pages • The ratio of identical and similar pages to reference pages is at least 8:1 • During the last two phases • Few zero pages • At least 50% of pages are above 80% similar to their old versions • Conclusion • Too much redundant data is transferred during migration • Migration with Data Deduplication (MDD)
How to Find Identical and Similar Pages (1) • HashSimilarityDetector(k, s, c) [21] • Hashes k × s blocks on the page and groups them into k groups of s hashes each • For each hash fingerprint, c candidate reference pages are stored • Configuration: HashSimilarityDetector(2, 1, 1), SuperFastHash over 64-byte blocks (sketched below)
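A hedged sketch of the fingerprinting step. FNV-1a stands in for SuperFastHash, and the positions of the two hashed 64-byte blocks (first and last block of the page) are an assumption, since the slide does not fix them:

#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE  4096
#define BLOCK_SIZE 64

/* FNV-1a, standing in for SuperFastHash in this sketch. */
static uint32_t block_hash(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* HashSimilarityDetector(2, 1, 1): k*s = 2 blocks are hashed, forming
 * k = 2 groups of s = 1 hash each; every fingerprint indexes a table
 * slot holding c = 1 candidate reference page. */
void page_fingerprints(const uint8_t page[PAGE_SIZE], uint32_t fp[2])
{
    fp[0] = block_hash(page, BLOCK_SIZE);
    fp[1] = block_hash(page + PAGE_SIZE - BLOCK_SIZE, BLOCK_SIZE);
}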
How to Find Identical and Similar Pages (2) • Similarity is transitive • P_trans ≈ P_old and P_hash ≈ P_trans, so P_hash ≈ P_old • No need to cache all the transferred pages • Only the privileged domain on the source needs to maintain the hash table • Reference pages are transferred and can be found by their frame numbers on the destination
How to Find Identical and Similar Pages (3) • Indexing by hash fingerprints alone may cause data inconsistency • [Diagram: on the source, FPHash still maps fingerprint b1 to frame Px after its content changes from Px-old to Px-new, so source and destination can pair Py with different versions of the reference page]
How to Find Identical and Similar Pages (4) • Double-Hash to eliminate the data inconsistency • [Diagram: a second table, FNHash, indexes entries by frame number, so re-transferring Px as Px-new refreshes the stale FPHash entry before Py is matched against it]
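One way the double-hash scheme can be realized; a hedged sketch with illustrative names, fields, and table sizes, not the paper's exact data structures:

#include <stdint.h>

struct ref_entry {
    uint64_t frame;      /* frame number locating the page on the destination */
    uint64_t version;    /* incremented each time this frame is re-sent       */
};

#define FP_BUCKETS (1 << 16)
#define FN_BUCKETS (1 << 20)                  /* assumed sizes, for illustration */

static struct ref_entry fp_table[FP_BUCKETS]; /* FPHash: fingerprint -> candidate */
static uint64_t fn_version[FN_BUCKETS];       /* FNHash: frame -> latest version  */

/* Called whenever a frame is (re-)transferred. */
void record_transfer(uint64_t frame, uint32_t fp)
{
    fn_version[frame % FN_BUCKETS]++;
    fp_table[fp % FP_BUCKETS] = (struct ref_entry){
        .frame   = frame,
        .version = fn_version[frame % FN_BUCKETS],
    };
}

/* Returns the frame of a usable reference page, or UINT64_MAX when the
 * candidate is stale (its frame was re-sent after the entry was made),
 * which is exactly the inconsistency the double hash eliminates. */
uint64_t lookup_reference(uint32_t fp)
{
    struct ref_entry e = fp_table[fp % FP_BUCKETS];
    if (e.version == 0 || e.version != fn_version[e.frame % FN_BUCKETS])
        return UINT64_MAX;
    return e.frame;
}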
Data Deduplication during Migration • On the source • P_parity = P_trans ⊕ P_ref • P_parity is encoded with RLE, then migrated • On the destination • Decode to recover P_parity • P_trans = P_parity ⊕ P_ref • Advantages • P_parity contains less information than P_trans • Reflects the exact differing data at the bit level • Contains many runs of continuous zeros, so even RLE can compress it effectively • RLE is one of the fastest encoding algorithms (see the sketch below)
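A hedged sketch of the source-side encoding step. The token format (zero-run marker vs. literal marker) is an assumption, chosen only to show why the XOR parity page compresses well under RLE; the destination reverses it and then XORs with P_ref:

#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE 4096

/* Encode (page XOR ref) with a byte-level RLE over zero runs.
 * out must hold up to 2 * PAGE_SIZE bytes in the worst case. */
size_t xor_rle_encode(const uint8_t *page, const uint8_t *ref, uint8_t *out)
{
    uint8_t parity[PAGE_SIZE];
    for (size_t i = 0; i < PAGE_SIZE; i++)
        parity[i] = page[i] ^ ref[i];      /* P_parity = P_trans XOR P_ref */

    size_t o = 0;
    for (size_t i = 0; i < PAGE_SIZE; ) {
        if (parity[i] == 0) {              /* collapse a run of zero bytes */
            size_t run = 0;
            while (i < PAGE_SIZE && parity[i] == 0 && run < 255) {
                run++;
                i++;
            }
            out[o++] = 0x00;               /* zero-run marker   */
            out[o++] = (uint8_t)run;       /* run length, 1..255 */
        } else {
            out[o++] = 0x01;               /* literal marker */
            out[o++] = parity[i++];
        }
    }
    return o;                              /* encoded length in bytes */
}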
Outline • Introduction • Design • Implementation • Evaluation • Conclusion & Future Work
Implementation • Data deduplication is performed in parallel by multiple threads • Hash tables are maintained with an LRU policy • memcmp() was extended to reduce the overhead of detecting zero pages (sketched below)
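The extended memcmp() itself is not shown; a minimal sketch of the underlying idea, comparing 64-bit words with an early exit instead of byte-wise checks (assumes the page buffer is 8-byte aligned):

#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096

bool is_zero_page(const void *page)
{
    const uint64_t *p = (const uint64_t *)page;
    for (size_t i = 0; i < PAGE_SIZE / sizeof(uint64_t); i++)
        if (p[i] != 0)
            return false;   /* early exit on the first non-zero word */
    return true;
}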
Outline • Introduction • Design • Implementation • Evaluation • Conclusion & Future Work
Experimental Setup • Experiment platform • Cluster composed of six identical servers • One storage server, iSCSI protocol, isolated Gigabit Ethernet • Two servers act as the source and destination of migration • Three servers act as workload clients • Server configuration • Two Intel Xeon E5520 quad-core CPUs, 2.2GHz • 8GB DDR RAM • Gigabit LAN • Xen-3.3.1 and modified Linux-2.6.18.8 • The migrated VM is configured with one VCPU and 1GB RAM • Migration shares the same network with the workloads • Workloads • Compilation, VOD, static web server, dynamic web server
Total Data Transferred • Transferred data is reduced by 56.60% on average • The number of transferred pages is reduced by 48.73% on average (Banking) • The compression ratio is 49.27% on average (Banking)
Total Migration Time and Downtime • MDD decreases total migration time and downtime by 34.93% and 26.16% on average, respectively • Less data is transferred • The number of migration rounds is not reduced
CPU Resource Required • The extra CPU resource MDD requires is 47.21% of one CPU • [Chart: Average CPU Utilization Ratio of Migration (%)]
Influence on Apps • Run Apache in the migrated VM, and migrate it in normal and adaptive modes respectively • The more limited the network bandwidth, the more essential data deduplication becomes • [Chart: Benefits of MDD in Different Migration Modes (%)]
Outline • Introduction • Design • Implementation • Evaluation • Conclusion & Future Work
Conclusion & Future Work • Conclusion • Studied the characteristics of run-time memory image data during migration • Presented the design and implementation of MDD • MDD reduces total data transferred, total migration time and downtime by 56.60%, 34.93% and 26.16% respectively, and reduces the influence of migration on Apps • Future work • Extend MDD to live whole-system migration in wide-area environments
Thank You! Any Questions?
Related Work • Reducing transferred data • Post-Copy [7][12] • Self-Ballooning [7] • Trace and replay [13] • Adaptive compression [8] • Improving network bandwidth • InfiniBand RDMA [14]
Backup: Ecommerce & Support workloads