Profiling and Modeling Resource Usage of Virtualized Applications
Timothy Wood¹, Lucy Cherkasova², Kivanc Ozonat², and Prashant Shenoy¹
¹ University of Massachusetts, Amherst  ² HP Labs, Palo Alto
Virtualized Data Centers
• Benefits
  • Lower hardware and energy costs through server consolidation
  • Capacity on demand, agile and dynamic IT
• Challenges
  • Apps are characterized by a collection of resource usage traces gathered in the native environment
  • Virtualization overheads
  • Effects of consolidating multiple VMs onto one host
• Important for capacity planning and efficient server consolidation
Application Virtualization Overhead
• Many research papers measure virtualization overhead but do not predict it in a general way:
  • A particular hardware platform
  • A particular app/benchmark, e.g., netperf, SPEC or SPECweb, disk benchmarks
  • Max throughput/latency/performance is X% worse
  • Showing a Y% increase in CPU resources
• How do we translate these measurements into "what is the virtualization overhead for a given application"?
New performance models are needed
Predicting Resource Requirements
• Most overhead is caused by I/O
  • Network and disk activity
• Xen I/O model has two components: the guest VM and Domain-0, which handles I/O on behalf of the VM
• Must predict the CPU needs of:
  1. The virtual machine running the application
  2. Domain-0 performing I/O on behalf of the app VM
Requires several prediction models based on multiple resources
Problem Definition
[Figure: a native application trace of CPU, network, and disk usage over time must be mapped to a virtualized application trace with two CPU components: Dom0 CPU and VM CPU]
Why Bother?
• More accurate cost/benefit analysis
• Capacity planning and VM placement
  • Impossible to pre-test some critical services
• Hypervisor comparisons
  • Different platforms or versions
[Chart: CPU utilization of App 1 and App 2 on the native system vs. as VM 1, VM 2, and Dom 0 when virtualized]
Our Approach
• Automated, robust model generation
  • Run a benchmark set on both native and virtual platforms
    • Performs a range of I/O- and CPU-intensive tasks
  • Gather resource traces
  • Build a model of the native --> virtual relationship
    • Use linear regression techniques
  • The model is specific to the platform, but not to applications
• Automate all the steps in the process
Can apply this general model to any application's traces to predict its requirements
[Figure: native system usage profile --> model --> virtual system usage profile]
Microbenchmark Suite
• Focus on CPU-intensive and different types of I/O-intensive client-server apps
• Benchmark activities:
  • Network-intensive: download and upload files
  • Disk-intensive: read and write files
  • CPU-intensive
• Need to break correlations between resources (see the sketch below)
  • e.g., high correlation between packets/sec and CPU time
• Simplicity of implementation
  • Based on httperf, Apache JMeter, Apache Web Server, and PHP
Microbenchmarks are easy to run in a traditional data center environment
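A minimal sketch of how such a suite can break correlations between metrics: vary the network, disk, and CPU intensity of the microbenchmarks independently, so that no single native metric always rises and falls with another in the training data. The rates, field names, and duration below are illustrative assumptions, not the authors' configuration.

```python
# Sketch only (not the paper's code): enumerate microbenchmark configurations
# that vary each resource dimension independently, so the training data does
# not contain strong correlations between, e.g., packet rate and CPU time.
from itertools import product

NET_RATES_REQS  = [0, 100, 500, 1000, 2000]   # request rates for network-intensive ops
DISK_RATES_OPS  = [0, 50, 200, 500]           # rates for disk read/write ops
CPU_LOAD_LEVELS = [0, 25, 50, 75]             # intensity of a CPU-bound PHP script

def benchmark_plan():
    """Yield one benchmark configuration per combination of intensities."""
    for net, disk, cpu in product(NET_RATES_REQS, DISK_RATES_OPS, CPU_LOAD_LEVELS):
        yield {"net_rate": net, "disk_rate": disk, "cpu_load": cpu,
               "duration_sec": 120}

if __name__ == "__main__":
    plan = list(benchmark_plan())
    print(f"{len(plan)} benchmark configurations to run on both platforms")
```

Each configuration is run on both the native and the virtualized platform, and the resulting resource traces become one batch of training data.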
Model Generation
Collect the native and virtual traces for every benchmark run, then solve one set of equations per model.
Model Dom-0: for each measurement interval j,
  U_dom0_cpu(j) = c0 + c1·M1(j) + c2·M2(j) + ... + c11·M11(j)
Model VM: for each measurement interval j,
  U_vm_cpu(j) = d0 + d1·M1(j) + d2·M2(j) + ... + d11·M11(j)
where M1 ... M11 are the resource usage metrics observed on the native system, and the coefficients c_i and d_i are found by regression.
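A minimal sketch of solving these two sets of equations with ordinary least squares. The metric matrix corresponds to the 11 native metrics described on the next slide (3 CPU, 4 network, 4 disk); exact array layout and names are assumptions.

```python
# Sketch: fit the linear model U = c0 + c1*M1 + ... + c11*M11 by least squares.
import numpy as np

def fit_model(native_metrics: np.ndarray, virtual_cpu: np.ndarray) -> np.ndarray:
    """native_metrics: (n_intervals, 11) matrix of M1..M11 per interval.
    virtual_cpu: (n_intervals,) observed CPU utilization (Dom-0 or VM).
    Returns the 12 coefficients [c0, c1, ..., c11]."""
    n = native_metrics.shape[0]
    X = np.hstack([np.ones((n, 1)), native_metrics])   # prepend intercept column
    coeffs, *_ = np.linalg.lstsq(X, virtual_cpu, rcond=None)
    return coeffs

def predict(coeffs: np.ndarray, native_metrics: np.ndarray) -> np.ndarray:
    """Apply a fitted model to a (possibly new) native trace."""
    n = native_metrics.shape[0]
    X = np.hstack([np.ones((n, 1)), native_metrics])
    return X @ coeffs

# Usage: fit one model for Dom-0 CPU and one for the VM's CPU, e.g.
#   dom0_model = fit_model(M_native, U_dom0)
#   vm_model   = fit_model(M_native, U_vm)
```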
Building Robust Models
• Outliers can considerably impact regression models
  • Must use robust regression techniques to eliminate outliers
  • Creates a model that minimizes absolute error
• Not all metrics are equally significant
  • Start with 11 metrics: 3 CPU, 4 network, and 4 disk
  • Use stepwise regression to find the most significant metrics (see the sketch below)
• Evaluate the outcome of microbenchmark runs and eliminate erroneous and corrupted data
A correct data set is a prerequisite for building an accurate model
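A minimal sketch of the two ideas on this slide, under my own simplifying assumptions (the paper's robust stepwise regression is more elaborate): greedy forward stepwise selection of metrics, plus a simple robustness step that trims high-residual outlier intervals before the final fit.

```python
# Sketch: stepwise metric selection and outlier trimming for a linear model.
import numpy as np

def _lstsq_error(X, y):
    """Fit with an intercept and return (mean absolute error, coefficients)."""
    Xi = np.hstack([np.ones((X.shape[0], 1)), X])
    coeffs, *_ = np.linalg.lstsq(Xi, y, rcond=None)
    return np.abs(Xi @ coeffs - y).mean(), coeffs

def stepwise_select(metrics: np.ndarray, target: np.ndarray, max_metrics: int = 11):
    """Add one metric at a time, keeping the one that lowers the error most."""
    chosen, best_err = [], np.inf
    remaining = list(range(metrics.shape[1]))
    while remaining and len(chosen) < max_metrics:
        err, best_i = min((_lstsq_error(metrics[:, chosen + [i]], target)[0], i)
                          for i in remaining)
        if err >= best_err:          # no further improvement: stop adding metrics
            break
        best_err = err
        chosen.append(best_i)
        remaining.remove(best_i)
    return chosen, best_err

def robust_fit(metrics, target, cols, trim_frac=0.05):
    """Drop the trim_frac of intervals with the largest residuals, then refit."""
    X = metrics[:, cols]
    _, coeffs = _lstsq_error(X, target)
    Xi = np.hstack([np.ones((X.shape[0], 1)), X])
    resid = np.abs(Xi @ coeffs - target)
    keep = resid <= np.quantile(resid, 1.0 - trim_frac)
    return _lstsq_error(X[keep], target[keep])[1]
```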
Performance Evaluation: Testbed Details
• Two hardware platforms
  • HP ProLiant DL385: 2-way AMD Opteron, 2.6 GHz, 64-bit
  • HP ProLiant DL580: 4-way Intel Xeon, 1.6 GHz, 32-bit
• Two applications:
  • RUBiS (auction site, modeled after eBay)
  • TPC-W (e-commerce site, modeled after Amazon.com)
• Monitoring
  • Native: sysstat
  • Virtual: xenmon and xentop
  • Measurements taken at 30-second intervals
Questions
• Why this set of metrics?
• Why these benchmarks?
• Why this process of model creation?
• How accurate is the model?
Importance of Modeling I/O
• Is it necessary to look at resources other than just total CPU?
• How accurate is such a simplified model for predicting the CPU requirements of a VM?
[Chart comparing a simplified CPU-only model against the full multi-resource model; annotations: 65% and 5%]
Definitely need multiple resources!
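One way to answer this question, sketched under the same least-squares setup as before: fit a model that uses only the CPU metrics and compare its error against the full model that also sees the network and disk metrics. The column indices for the CPU metrics are assumptions.

```python
# Sketch: compare a CPU-only model with the full multi-resource model.
import numpy as np

def mean_abs_error(X, y):
    Xi = np.hstack([np.ones((X.shape[0], 1)), X])
    coeffs, *_ = np.linalg.lstsq(Xi, y, rcond=None)
    return np.abs(Xi @ coeffs - y).mean()

def compare_models(metrics: np.ndarray, vm_cpu: np.ndarray, cpu_cols=(0, 1, 2)):
    """metrics: all 11 native metrics; cpu_cols: which columns are CPU metrics."""
    cpu_only_err = mean_abs_error(metrics[:, list(cpu_cols)], vm_cpu)
    full_err = mean_abs_error(metrics, vm_cpu)
    print(f"CPU-only model error: {cpu_only_err:.1f}%, full model error: {full_err:.1f}%")
```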
Benchmark Coverage
• Why these benchmarks?
Using only a subset of the benchmarks leads to a model with poor accuracy
Automated Benchmark Error Detection
• Some benchmarks run incorrectly
  • Rates set too high
  • Background activity
• Remove benchmarks with abnormally high error rates (see the sketch below)
Automatically remove bad benchmarks without eliminating useful data
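A minimal sketch of one such filter, not the paper's exact rule: flag benchmark runs whose client-reported error rate is abnormally high relative to the rest of the suite, and drop only those runs. The threshold rule is an assumption.

```python
# Sketch: keep only benchmark runs with a normal error rate.
import numpy as np

def filter_bad_benchmarks(error_rates: np.ndarray, max_sigma: float = 3.0):
    """error_rates: fraction of failed requests per benchmark run.
    Returns a boolean mask of runs to keep."""
    mean, std = error_rates.mean(), error_rates.std()
    if std == 0:
        return np.ones_like(error_rates, dtype=bool)
    return error_rates <= mean + max_sigma * std

# Usage: keep = filter_bad_benchmarks(rates); metrics, targets = metrics[keep], targets[keep]
```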
Model Accuracy
• Intel hardware platform
• Train the model using the simple benchmarks
• Apply it to the RUBiS web application
90% of Dom0 predictions are within 4% error; 90% of VM predictions are within 11% error
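For reference, a short sketch of the accuracy metric quoted on this slide: the error value below which 90% of the prediction intervals fall, given the predicted and measured CPU traces for the test application.

```python
# Sketch: 90th-percentile absolute prediction error for a CPU trace.
import numpy as np

def error_at_percentile(predicted: np.ndarray, measured: np.ndarray, pct: float = 90.0):
    """Absolute CPU-utilization error (in percentage points) below which
    pct% of the prediction intervals fall."""
    errors = np.abs(predicted - measured)
    return np.percentile(errors, pct)

# e.g., error_at_percentile(dom0_pred, dom0_measured) is about 4 on the Intel platform
```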
Second Hardware Platform
• AMD, 64-bit dual CPU, 2.6 GHz
Produces different model parameters; predictions are just as accurate
Different Platforms' Virtualization Overhead
• Predicting the virtualization overhead for different hardware platforms requires building a separate model for each
[Chart: predicted CPU overhead on the two platforms, annotated 1.7 x nat_CPU and 1.4 x nat_CPU]
Different platforms exhibit different amounts of CPU overhead
Summary
• The proposed approach builds a model for each hardware and virtualization platform.
• It enables comparison of application resource requirements across different hardware platforms.
• An interesting additional application: it helps assess and compare the "performance" overhead of different virtualization software releases.
Future Work
• Refine the set of microbenchmarks and related measurements (what is a practical minimal set?)
• Repeat the experiments on the VMware platform
• Linear models: are they enough?
  • Create multiple models for resources with different overheads at different rates
• Evaluation of virtual device capacity
• Define composition rules for estimating the resource requirements of collocated virtualized applications