This research paper discusses the background and motives behind live migration (LM) in cloud computing and data centers. It introduces Virt-LM Benchmark as a solution to evaluate and compare LM performance among different hardware and software platforms. The paper outlines the goal, criteria, metrics, and workloads of the benchmark, emphasizing its significance in making informed decisions, designing better LM strategies, and choosing the right platform.
Live Migration (LM) Benchmark Research • College of Computer Science, Zhejiang University, China
Outline • Background and Motives • Virt-LM Benchmark Overview • Further Issues and Possible Solutions • Conclusion • Our Possible Work under the Cloud WG
Significance of Live Migration • Concept: • Migration: move a VM between different physical machines • Live: without disconnecting clients or applications (invisible to them) • Relation to Cloud Computing and Data Centers: • Cloud infrastructures and data centers have to use their large-scale hardware resources efficiently. • Virtualization technology provides two approaches: • Server Consolidation • Live Migration • Roles in a Data Center: • Flexibly remap hardware among VMs • Balance workload • Save energy • Enhance service availability and fault tolerance
Motives of the LM Benchmark • Scale and frequency lead to a significant LM cost (TC): • S (Scale): How many servers? • Google: an estimated 200,000 to 500,000 servers across 36 data centers in 2008 • MS: added 10,000 servers per month in 2008 • Facebook: more than 30,000 servers in its data center in 2008 • F (Frequency): How often does it happen? • Load balancing • Online maintenance and proactive fault tolerance • Power management • C (Cost of Live Migration): • Hardware and network bandwidth: save and transfer VM state • Workload performance: share hardware • Service availability: downtime
Motives of the LM Benchmark • An LM benchmark is needed. • An LM benchmark helps make the right decisions to reduce cost • Design better LM strategies • Choose a better platform • Evaluation of a data center should include its LM performance • VMware released VMmark 2.0 for multi-server performance in December 2010 • Existing evaluation methodologies have their limitations. • VMmark 2.x • Dedicated to VMware’s platforms • A macro benchmark: no specific metrics for LM performance • Existing research on LM • ([Vee09 Hines], [HPDC09 Liu], [Cluster09 Jin], [IWVT08 Liu], [NSDI05 Clark], …) • All dedicated to designing LM strategies • No unified metrics or workloads, so results are not comparable to each other. • Some critical issues are not mentioned. • A formal, qualified LM benchmark is still lacking
Goal and Criteria • Goal of the Virt-LM Benchmark: • Compare LM performance among different hardware and software platforms, especially in data center scenarios • Design Criteria: • Metric • Sufficient • Observable • Concise • Workload • Typical • Scalable • Scoring methodology • Impartial • Stability • Produce repeatable results • Compatibility • Usability [Figure: the same workloads run on each platform, producing per-platform metric results]
System Under Test • System Under Test (SUT): • Evaluation target • Hardware and software platform • Including its VMM and the LM strategies it uses [Figure: the same workloads run on each SUT, producing per-SUT metric results]
Metrics • Metrics Sufficiency: • Cost: • migration overhead • amount of migrated data (burden on network) • QoS: • downtime • total migration time • migration overhead • Metrics and Measurement: • Downtime • Def: how long the VM is suspended • Measure: ping • Total migration time • Def: how long an LM lasts • Measure: timing the LM command • Amount of migrated data • Def: how much data is transferred • Measure: data transferred on its exclusive TCP port • Migration overhead • Def: how much LM impairs the performance of the workload • Measure: percentage decline in the workload’s score
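The two derived metrics above can be computed from raw observations. The following is a minimal sketch, not the Virt-LM implementation: the function names are hypothetical, downtime is assumed to be the largest gap between consecutive ping replies minus the normal probe interval, and the workload score is assumed to be higher-is-better.

```python
from typing import Sequence


def downtime_from_ping(reply_times: Sequence[float], interval: float) -> float:
    """Estimate downtime (seconds) from the timestamps of received ping replies.

    Assumption: the VM was suspended during the largest gap between two
    consecutive replies; the normal probe interval is subtracted from it.
    """
    if len(reply_times) < 2:
        return 0.0
    gaps = [later - earlier for earlier, later in zip(reply_times, reply_times[1:])]
    return max(0.0, max(gaps) - interval)


def migration_overhead(baseline_score: float, migrated_score: float) -> float:
    """Percentage decline of the workload score caused by the migration.

    Assumption: a higher workload score means better performance.
    """
    return 100.0 * (baseline_score - migrated_score) / baseline_score
```

For example, replies at 0.0, 0.1, 0.2, 0.9 and 1.0 seconds with a 0.1 second probe interval give an estimated downtime of about 0.6 seconds. The resolution of this estimate is limited by the probe interval.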
Workloads • Representative of real scenarios • Where: • Data centers • When: • Load balancing • Power management • Service enhancement and fault tolerance [Figure: migrating a VM (service, OS) from a platform (HW and VMM) that hosts several VMs]
Workloads • During a live migration, • The VM could run different services • Mail Server • Application Server • File Server • Web Server • Database Server • Standby Server • Other VMs exist on the same platform • Heavy during load balancing • Light during power management • Random during service enhancement and fault tolerance • Migration can happen at any moment (Migration Points)
Workload Implementation • Internal workload types • Mail Server: SPECmail2008 • App Server: SPECjAppServer2004 • File Server: Dbench • Web Server: SPECweb2005 • Database Server: Sysbench • Standby Server: Idle VM • External workload types • Heavy: more VMs to fully utilize the machine • Increase the number of VMs until workload performance is undermined • Light: single VM on the platform [Figure: the migrated VM runs the internal workload; the other VMs on the platform (HW and VMM) form the external workload]
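The "heavy" external workload rule above (add VMs until workload performance is undermined) can be sketched as a simple search. This is an illustrative sketch only: `find_heavy_vm_count`, the 5% tolerance, and the choice to return the last VM count before degradation are all assumptions, and `score_with_vms` stands for whatever measurement runs the workload with a given number of VMs.

```python
from typing import Callable


def find_heavy_vm_count(score_with_vms: Callable[[int], float],
                        tolerance: float = 0.05,
                        max_vms: int = 64) -> int:
    """Return the largest VM count whose score stays within `tolerance`
    of the single-VM score (assumed higher-is-better)."""
    baseline = score_with_vms(1)
    count = 1
    while count < max_vms:
        # Stop as soon as adding one more VM undermines performance.
        if score_with_vms(count + 1) < baseline * (1.0 - tolerance):
            break
        count += 1
    return count
```

With a measured score that stays at 100 up to four VMs and then drops by 10 per extra VM, the sketch settles on four VMs.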
Migration Points Problem • During the run of a workload • LM happens at a random time • Performance varies at different points [Figure: example workload, 483.xalancbmk of SPEC CPU2006] • How can a workload’s performance variation be fully represented? • Test as many migration points as possible, spread across the whole run of the workload
Migration Points Problem • Problem • Too many points prolong the test significantly • Solution • More sample results in each run • Only a few runs • Implementation • Divide a workload’s runtime into many time sectors • Each time sector is longer than the total migration time • Migrate at the start point of each sector [Figure: migration points in the first, second, and third run]
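The sector scheme can be sketched as a schedule generator. This is a hedged illustration, not the Virt-LM code: the function name, the `slack` factor that makes each sector longer than one migration, and the idea of shifting each run by a fraction of a sector so that the runs interleave are assumptions.

```python
from typing import List


def migration_points(runtime: float, migration_time: float,
                     runs: int = 3, slack: float = 1.5) -> List[List[float]]:
    """Return, for each run, the times at which a migration is started."""
    if slack <= 1.0:
        raise ValueError("each sector must be longer than the total migration time")
    sector = migration_time * slack
    schedule = []
    for run in range(runs):
        # Assumption: each run is shifted so that its points fall between
        # the points of the other runs.
        offset = run * sector / runs
        points = []
        index = 0
        while offset + index * sector < runtime:
            points.append(offset + index * sector)
            index += 1
        schedule.append(points)
    return schedule
```

For a 100 second workload, a 20 second migration and two runs, this yields points at 0, 30, 60, 90 in the first run and 15, 45, 75 in the second, so seven migration points are sampled in only two runs.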
Scoring Method • Goal: compute an overall score • For each metric i, compute a final score Si • Normalize each result (Pij) using the reference system’s result (Rij) • Sum up the results of all workloads: • Si of the reference system is always 1000: • A lower score indicates higher performance • Open Problem: merging the 4 metrics’ Si • Different properties, different variation • Simply adding them up is not appropriate • Current implementation in Virt-LM: the final result consists of 4 scores
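The slide's formulas are not preserved in this transcript. The sketch below shows one normalization that satisfies the stated properties (results are normalized by the reference system, combined over all workloads, the reference system scores 1000, and lower is better). Averaging the ratios is an assumption, and `metric_score` is a hypothetical name.

```python
from typing import Sequence


def metric_score(results: Sequence[float], reference: Sequence[float]) -> float:
    """Score S_i for one metric i.

    results[j]   -- P_ij, the measured result for workload j
    reference[j] -- R_ij, the reference system's result for workload j
    Assumption: S_i = 1000 * mean_j(P_ij / R_ij).
    """
    if not results or len(results) != len(reference):
        raise ValueError("results and reference must be non-empty and equally long")
    ratios = [p / r for p, r in zip(results, reference)]
    return 1000.0 * sum(ratios) / len(ratios)
```

Under this assumption a platform whose downtime is half the reference on every workload scores 500 for that metric, and the four per-metric scores are still reported separately.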
Other Criteria • Usability • Easy to configure • VM images provided • Workloads pre-installed • Easy to run • Automatically managed after launch • Compatibility • Successful on Xen and KVM • Scalable workload: fully utilize the hardware • Heavy enough macro workload • Live migration lasts a long time • (Multiple live migration) • More than one VM is migrated concurrently
Benchmark Components • Logical components • System Under Test • Migration Target Platform • VM Image Storage • Management Agent • Benchmark components • Workload VM images • Distributed on VM Image Storage • Running Scripts • Installed on Management Agent
Internal Running Process • For every class of workload • Initialize the environment • Run the workload • Migrate the VM at different migration points • Fetch the metric results • Collect all results and compute an overall score • The Management Agent automatically controls the whole process
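The control loop run by the Management Agent can be outlined as follows. This is only a structural sketch: `run_benchmark` and its three callables are hypothetical placeholders for the real scripts that set up the environment, choose migration points, and run, migrate and measure a workload.

```python
from typing import Callable, Dict, Iterable, List

Metrics = Dict[str, float]


def run_benchmark(workloads: Iterable[str],
                  initialize: Callable[[str], None],
                  migration_points: Callable[[str], List[float]],
                  migrate_and_measure: Callable[[str, float], Metrics]
                  ) -> Dict[str, List[Metrics]]:
    """Collect one metrics record per (workload, migration point)."""
    results: Dict[str, List[Metrics]] = {}
    for workload in workloads:
        initialize(workload)  # prepare the environment for this workload class
        results[workload] = [
            migrate_and_measure(workload, point)  # run, migrate, fetch metrics
            for point in migration_points(workload)
        ]
    return results
```

The collected records can then be passed to whatever scoring step produces the final per-metric scores.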
Experiments on Xen and KVM • Experiment Setup • SUT-XEN • VMM: Xen 3.3 on Linux 2.6.27 • Hardware: Dell OptiPlex 755, 2.4 GHz Intel Core 2 Quad Q6600, 2 GB memory, SATA disk, 100 Mbit network • SUT-KVM • VMM: KVM-84 on Linux 2.6.27 • Hardware: same as SUT-XEN • VM • Linux 2.6.27, 512 MB memory, one core • Workload • Internal: SPECjvm2008, CPU/memory-intensive workloads • External: Light (single VM) • Migration Points: spread across the whole run
Experiments on Xen and KVM • Analysis • SUT-KVM compresses the data intensively • Less migrated data and less total time • More overhead
Experiments on Xen and KVM • Analysis • SUT-XEN strictly controls the downtime • Less downtime • More migrated data, due to more rounds of pre-copy to decrease downtime
Experiments on Xen and KVM • Analysis • Conclusion • SUT-XEN has less downtime and overhead, • but consumes more network bandwidth
1. Workload Complexity • The total test takes a long time • When workloads have too many combinations • (I) Internal workload types: • Mail Server, App Server, File Server, Web Server, Database Server, Standby Server • (E) External workload types: • Heavy, Light • (P) Number of migration points: • Considerable, due to the long runtime of each workload • (M) Multiple migration • Total time = Runtime * N, where N is the number of workload combinations • N = I * E * P (* M)
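A short worked example of the formula Total time = Runtime * N with N = I * E * P (* M). The function name and the sample numbers (10 migration points, half an hour per test) are illustrative assumptions; only the six internal and two external workload types come from the slides.

```python
from typing import Tuple


def total_test_time(runtime_per_test: float, internal: int, external: int,
                    points: int, multiple: int = 1) -> Tuple[int, float]:
    """Return (N, total time) for N = I * E * P * M combinations."""
    combinations = internal * external * points * multiple
    return combinations, combinations * runtime_per_test


# 6 internal types, 2 external types, 10 migration points (assumed),
# 0.5 hours per test (assumed): 120 combinations, 60 hours in total.
n, hours = total_test_time(0.5, internal=6, external=2, points=10)
```

Even these modest assumed numbers give a multi-day test, which is why reducing any one factor matters.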
Possible Solutions • Speed up for migration points: • (Virt-LM’s current implementation) • More samples in a run • Use time-insensitive workloads • Micro operations: CPU, memory, I/O… • Different memory read/write intensities • Advantages: • Eliminates the “Migration Points” dimension • Fewer internal workloads • Runtime of each workload is shortened • Disadvantage: • Differs from real scenarios • Hybrid • Test time-insensitive micro workloads • Analyze and predict the results of typical workloads • Redefine an average workload
2. Multiple/Concurrent Live Migration • Problem: Define overall metrics • Representative of the platform’s maximum performance • Other concerns: • When average results decrease noticeably • When the system can no longer cope • Possible solutions • Maximum sum of metrics • Define different thresholds [Figure: several VMs migrated concurrently from one platform (HW and VMM); thresholds on the number of concurrent migrations: average decreases noticeably, sum decreases noticeably, system can no longer cope, maximum sum]
3. Other Issues • Overall score computation • Virt-LM produces 4 scores as the final result • Definition of external workloads • Current implementation is simple • Repeatability • More experiments are needed to examine it • Migration points are not precisely arranged • Compatibility • Should be compatible with other VMMs besides Xen and KVM • Usability • Should be easier to configure and run
Current Work • Investigation of recent work on LM • Summarize the critical problems • Migration points • Workload complexity • Scoring methods • Multiple live migration • Present some possible solutions • Implement a benchmark prototype: Virt-LM • More details in “Virt-LM: A Benchmark for Live Migration of Virtual Machine” (ICPE 2011)
Future Work • Improve and complete Virt-LM • Implement and test other solutions • Workload complexity • Multiple live migration • Overall score computation • Others • Test and compare their effectiveness and choose the best one
Possible Work • Relation to the cloud benchmark • Enough migration cost in the workload • Although the cost may not be a metric, we have to ensure that the workload causes enough migration cost. • How fast can a cloud reallocate resources? • If reallocation is implemented by live migration, this depends on two factors: • 1. How many migrations: determined by resource management and reallocation strategies • 2. How fast each migration is: determined by live migration efficiency and cost • Possible future work under the cloud benchmark • We may work on how to ensure that the workload produces enough live migration cost
Possible Work • We hope to cooperate with other members, perhaps by joining a sub-project related to live migration. • We hope to contribute to the design of the Cloud Benchmark.
Team Members • Prof. Dr. Qinming He • hqm@zju.edu.cn • Kejiang Ye • Representative of the SPEC Research Group • yekejiang@zju.edu.cn • Assoc. Prof. Dr. Deshi Ye • yedeshi@zju.edu.cn • Jianhai Chen • Chenjh919@zju.edu.cn • Dawei Huang • tossboyhdw@zju.edu.cn • …….
Virtualization Performance • Virtualization in Cloud Computing System • IEEE Cloud’2011, IEEE/ACM GreenCom’2010 • Performance Evaluation & Benchmark of VM • ACM/SPEC ICPE’2011, IWVT’2008 (ISCA Workshop), EUC’2008 • Performance Optimization of VM • ACM HPDC’2010, IEEE HPCC’2010, IEEE ISPA’2009 • Performance Modeling of VM • IEEE HPCC’2010, IFIP NPC’2010 • Performance Testing Toolkit for VM • IEEE ChinaGrid’2010
Publications
[1] Live Migration of Multiple Virtual Machines with Resource Reservation in Cloud Computing Environments (IEEE Cloud’2011, accepted)
[2] Virt-LM: A Benchmark for Live Migration of Virtual Machine (ACM/SPEC ICPE’2011)
[3] Virtual Machine Based Energy-Efficient Data Center Architecture for Cloud Computing: A Performance Perspective (IEEE/ACM GreenCom’2010)
[4] Analyzing and Modeling the Performance in Xen-based Virtual Cluster Environment (IEEE HPCC’2010)
[5] Two Optimization Mechanisms to Improve the Isolation Property of Server Consolidation in Virtualized Multi-core Server (IEEE HPCC’2010)
[6] Evaluate the Performance and Scalability of Image Deployment in Virtual Data Center (IFIP NPC’2010)
[7] vTestkit: A Performance Benchmarking Framework for Virtualization Environments (IEEE ChinaGrid’2010)
[8] Improving Host Swapping Using Adaptive Prefetching and Paging Notifier (ACM HPDC’2010)
[9] Load Balancing in Server Consolidation (IEEE ISPA’2009)
[10] A Framework to Evaluate and Predict Performances in Virtual Machines Environment (IEEE EUC’2008)
[11] Performance Measuring and Comparing of Virtual Machine Monitors (IWVT’2008, ISCA Workshop)