CPU Ready Time in VMware ESX Server

Bill Shelden, bill.shelden@PERFMAN.com

Abstract
VMware ESX Server produces a performance metric called CPU ready time. It measures the time a virtual CPU in a virtual machine running under ESX Server is ready to be dispatched but is not dispatched. CPU ready times from a real ESX Server system and from a number of published benchmarks are examined and found to be too high to be explained solely by contention among virtual CPUs for the physical CPUs in the server running ESX Server. Some reasons for virtual CPUs accumulating CPU ready time when physical CPUs are available are examined. One such reason, which has been extensively discussed, is co-scheduling, which applies to SMP virtual machines. In multiprocessor servers an additional factor affects CPU ready time: a virtual CPU that has been scheduled on a particular physical CPU is given preference to run on the same physical CPU again. In this case the ESX Server scheduler may choose to let a few cycles on a physical CPU stay idle rather than move a ready virtual CPU to another physical CPU. A model of co-scheduling and CPU preference is discussed, and the model's predicted CPU ready times are compared to the real data and to the benchmark data.

Topics
• Investigate % CPU Ready Time on an internal PERFMAN virtual server called LPPerfTest, running on host devclusterhost2
  • Is the measured CPU Ready Time reasonable?
  • Compare it to a simple model
• Discuss % CPU Ready Time and its causes
• Discuss a more robust model of an ESX Server system
  • Apply to uniprocessor benchmarks
  • Apply to mixed UNI and SMP benchmarks
  • Apply to devclusterhost2
• Conclusions

A PERFMAN Internal ESX Host: devclusterhost2
• 8 physical CPUs
• Running ESX Server 3.5
• 12 virtual machines, 19 total virtual CPUs

Virtual machine    vCPUs
LPPerfTest         4
MITest             2
DevPortalSQL       2
Win2008Ent         2
DevPortalTest2     2
DevSrvSMPT         1
DevNas1            1
DevPortalTest      1
DevWiki            1
Win64Test          1
VirtualCenter2     1
ISMqa2             1
Total              19

Spike in % CPU Ready Time at 6 AM on devclusterhost2
About 1800 seconds of CPU Ready Time

LPPerfTest at 6 AM is an anomaly
The PERFMAN rule of thumb (ROT) is % CPU Ready Time < 5% for a VM

How busy is the server devclusterhost2?
Utilization of 8 cores is about 30% at 6 AM

% CPU Ready Time in VMware ESX Server
• References:
  • VMware ESX Server 3 Ready Time Observations
  • Co-scheduling SMP VMs in VMware ESX Server
  • VMware vSphere 4: The CPU Scheduler in VMware ESX 4
• CPU ready time is the time a virtual machine must wait in a ready-to-run state before it can be scheduled on a CPU.
• It is expressed as a percentage of the measurement interval
  • E.g. a VM with % CPU Ready Time of 5% in a 3600-second interval waits in a ready-to-run state for 0.05 x 3600 = 180 seconds.
• It makes sense to view it in the context of the VM’s CPU service time
  • CPU Ready Time / CPU Busy Time
  • Call this CPU Ready Time per CPU Busy Time
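The conversion in the example above, and the ratio of ready time to busy time, can be sketched in a few lines of Python. The function names are illustrative only and are not part of any VMware or PERFMAN tool.

```python
def ready_seconds(pct_ready, interval_s):
    """Seconds spent ready-to-run, given % CPU Ready Time and the interval length."""
    return pct_ready * interval_s / 100.0


def ready_per_busy(ready_s, busy_s):
    """CPU Ready Time per CPU Busy Time (a dimensionless ratio)."""
    if busy_s <= 0:
        raise ValueError("busy time must be positive")
    return ready_s / busy_s


# The example from the slide: 5% ready in a 3600-second interval.
print(ready_seconds(5, 3600))  # 180.0
```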

CPU Ready Time per CPU Busy Time for devclusterhost2’s Virtual Machines

Causes of CPU Ready Time in VMware ESX Server
• Physical CPUs are unavailable
• Co-scheduling of SMP virtual machines
• CPU preference in multiprocessor servers
• Other reasons
  • Overall server utilization
  • Load correlation
  • Number of virtual machines
  • Number of virtual CPUs in the VMs

Co-scheduling
• Proportional-share based algorithm
  • VM priority is based on used CPU as a fraction of entitled CPU
  • Smaller means higher priority
• Strict co-scheduling in ESX Server 2.x (2003)
  • Cumulative skew value for each vCPU
  • Progress is running or idling
  • Skew increases if the vCPU is not making progress
  • An idle vCPU does not accumulate skew
  • Once skew exceeds a threshold, all sibling vCPUs must be co-started
• Relaxed co-scheduling in ESX Server 3.x (2006)
  • Only those sibling vCPUs that are skewed must be co-started
• Further relaxed co-scheduling in ESX Server 4
• Physical CPUs may be available while VMs are in a ready-to-run state.
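The skew bookkeeping described above can be illustrated with a toy sketch. This is not VMware's scheduler: the state names, the 1 ms tick, and the 3 ms threshold are assumptions chosen for illustration, and real skew accounting is more involved.

```python
def accumulate_skew(skew_ms, states, tick_ms):
    """Add tick_ms of skew to every vCPU that is ready but not running.

    Running and idle vCPUs count as making progress, so their skew is unchanged.
    """
    return [s + tick_ms if state == "ready" else s
            for s, state in zip(skew_ms, states)]


def must_co_start(skew_ms, threshold_ms, relaxed):
    """Indices of the sibling vCPUs that have to be started together."""
    skewed = [i for i, s in enumerate(skew_ms) if s > threshold_ms]
    if not skewed:
        return []                      # no co-start constraint yet
    if relaxed:
        return skewed                  # relaxed: only the skewed vCPUs
    return list(range(len(skew_ms)))   # strict: every sibling vCPU


# Hypothetical 4-vCPU VM: vCPU 3 sits ready for four 1 ms ticks.
skew = [0, 0, 0, 0]
for _ in range(4):
    skew = accumulate_skew(skew, ["running", "running", "idle", "ready"], 1)

print(must_co_start(skew, threshold_ms=3, relaxed=False))  # [0, 1, 2, 3]
print(must_co_start(skew, threshold_ms=3, relaxed=True))   # [3]
```

Under the strict policy the VM needs four free physical CPUs at once, which is one way ready time can accumulate while some physical CPUs sit idle.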

CPU Preference
• Multiprocessor systems
• A vCPU that has been scheduled on a particular CPU will be given preference to run on the same CPU again
• Performance advantages of finding data in the CPU cache.

Summary observations about devclusterhost2
• In a 1 hour interval:
  • VMs in a server running ESX Server 3.5 are experiencing
    • 1800 seconds of CPU ready time (50% of 3600 secs)
    • 8640 seconds of CPU service time (30% of 8 pCPUs)
  • LPPerfTest, one of the virtual machines, is experiencing
    • 1069 seconds of CPU ready time
    • 4800 seconds of CPU service time
    • % CPU Ready Time for LPPerfTest = 29.7% > 5% ROT
• It looks like LPPerfTest is the cause/victim of the problem because of the spike in its utilization at 6 AM.
• The server has 8 physical CPUs and is running at about 30% busy
• Is this reasonable?
  • I did not think so. Let’s investigate by modeling.
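The figures in this summary follow from simple arithmetic, reproduced here as a short Python check. The variable names are illustrative; the numbers are the ones quoted on the slide.

```python
INTERVAL_S = 3600          # one-hour measurement interval
PCPUS = 8                  # physical CPUs in devclusterhost2
HOST_UTILIZATION = 0.30    # about 30% busy at 6 AM

host_busy_s = HOST_UTILIZATION * PCPUS * INTERVAL_S   # about 8640 s
host_ready_s = 1800                                   # all VMs together
lp_ready_s, lp_busy_s = 1069, 4800                    # LPPerfTest only

lp_pct_ready = 100.0 * lp_ready_s / INTERVAL_S        # about 29.7 %
lp_share_of_ready = lp_ready_s / host_ready_s         # LPPerfTest's share of host ready time
lp_share_of_busy = lp_busy_s / host_busy_s            # LPPerfTest's share of host busy time

print(f"host busy seconds:       {host_busy_s:.0f}")
print(f"LPPerfTest % ready:      {lp_pct_ready:.1f}")
print(f"LPPerfTest ready share:  {lp_share_of_ready:.0%}")
print(f"LPPerfTest busy share:   {lp_share_of_busy:.0%}")
```

LPPerfTest accounts for more than half of both the ready time and the busy time on the host, which is why it stands out as the cause or victim of the spike.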

First Model
• Build a simulation model of a system with
  • 19 customers (19 vCPUs)
  • Contending for N physical CPUs where N = 8, 7, 6, ...
  • Providing about 8640 seconds of CPU service in a 3600-second interval
• Examine the CPU queue times predicted by the model and compare to devclusterhost2 at 6 AM
• The model used is the Machine Repair Model, which has a well-known analytic solution
• A simulation model was used.

Machine Repair Model
[Diagram: machines cycle between the shop floor and the repair center]
• The shop floor: delay center with WS = mean service time = mean time to failure
• The repair center: WS = mean service time = mean time to repair
• No. of servers = no. of repairmen = 2
• Population = no. of machines = 12
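The presentation used a simulation of the Machine Repair Model. As an alternative illustration, the sketch below uses the model's analytic steady-state solution (a finite-source M/M/c queue), which assumes exponentially distributed service and think times. Solving for the think time that yields 2.4 busy CPUs on average (8640 busy seconds in 3600 seconds) is an assumption made here to mimic the devclusterhost2 load; the function names are illustrative.

```python
def machine_repair(n_machines, n_servers, service_time, think_time):
    """Return (mean busy servers, mean number waiting) in steady state."""
    weights = [1.0]  # unnormalized probability of k customers at the CPUs
    for k in range(n_machines):
        arrival = (n_machines - k) / think_time          # customers still thinking
        departure = min(k + 1, n_servers) / service_time
        weights.append(weights[-1] * arrival / departure)
    total = sum(weights)
    probs = [w / total for w in weights]
    busy = sum(min(n, n_servers) * p for n, p in enumerate(probs))
    queued = sum(max(n - n_servers, 0) * p for n, p in enumerate(probs))
    return busy, queued


def think_time_for_busy(n_machines, n_servers, target_busy, service_time=1.0):
    """Bisect (geometrically) for the think time that gives target_busy servers busy."""
    lo, hi = 1e-3, 1e3
    for _step in range(100):
        mid = (lo * hi) ** 0.5
        busy, _queued = machine_repair(n_machines, n_servers, service_time, mid)
        if busy > target_busy:
            lo = mid   # too busy: customers must think longer
        else:
            hi = mid
    return (lo * hi) ** 0.5


INTERVAL_S = 3600
TARGET_BUSY = 8640 / INTERVAL_S   # 2.4 pCPUs busy on average
for pcpus in range(8, 2, -1):
    z = think_time_for_busy(19, pcpus, TARGET_BUSY)
    busy, queued = machine_repair(19, pcpus, 1.0, z)
    # Mean number waiting times the interval gives total ready seconds.
    print(f"{pcpus} pCPUs: utilization {busy / pcpus:.0%}, "
          f"predicted ready time {queued * INTERVAL_S:.0f} s")
```

The predicted ready seconds for each pCPU count can then be compared with the 1800 seconds measured on devclusterhost2.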

Devclusterhost2 is behaving more like a 3 or 4 pCPU server running at 60-80% Busy

Use a more realistic Model for a VMware ESX Host
• Model characteristics
  • Number of server physical CPUs
  • Number of virtual machines
  • Virtual CPUs in each VM
  • Server utilization of each VM
  • Population in each VM (number of processes)
• VM dispatching
  • Co-scheduling for SMP VMs
  • CPU preference on servers with multiple physical CPUs
• Apply to two benchmarks described in the paper
  • Uniprocessor benchmarks
  • Benchmarks on a 4-CPU server with a mix of uniprocessor and SMP virtual machines
• Apply to devclusterhost2

VMware ESX Host Model
[Diagram: server with 4 pCPUs; a CPU node; for each of VM1 and VM2, a Delay Center, an Allocate vCPU node, and a Release vCPU node]
• 2 job classes
  • One for each VM
  • Pop = WinMPL
  • Target Util/Tput
• Allocate/Release node for each VM
  • Number of tokens = number of vCPUs

Summary of Benchmarks

Uniprocessor Benchmarks (ESX Server 3.0)
• Run on a server with a single physical CPU
  • No co-scheduling
  • No CPU preference
• CPU Burner program set to consume 15% of a single physical CPU
• 6 virtual machines, each with one virtual CPU
• Test started with a single CPU burner in one virtual machine
  • The other five VMs are idle
• Every 10 minutes, another CPU burner program was started in another virtual machine
• In the last 10 minutes, 6 VMs are each running one copy of the CPU burner program

Uni Benchmark and Model Results

Benchmarks on 4-CPU Server (ESX Server 3.0)
• Server has 4 physical CPUs
• 10 minute (600 second) runs
• Run 6 instances of the CPU burner program with each instance set to consume 50% of a single CPU
  • 6 x 50% of 1 CPU = 300% of 1 CPU
  • 300% of 1 CPU = 3 x 600 secs = 1800 secs
  • Utilization of 4 CPUs = 75%
• Eight runs with combinations of VMs under ESX Server 3.0
  • 6 UP
  • 5 UP 1 SMP
  • 4 UP 2 SMP
  • 3 UP 3 SMP
  • 2 UP 4 SMP
  • 1 UP 4 SMP 1 2-Burner
  • 4 SMP 2 2-Burners
  • 3 SMP 3 2-Burners
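The load arithmetic for these runs can be written out as a small Python helper. The function name is illustrative; the inputs are the values given on the slide.

```python
def offered_load(n_burners, demand_per_burner, n_pcpus, run_s):
    """Return (CPU seconds requested, utilization of the whole server)."""
    busy_s = n_burners * demand_per_burner * run_s
    return busy_s, busy_s / (n_pcpus * run_s)


# 6 burners at 50% of one CPU each, on 4 pCPUs, for a 600-second run.
busy_s, utilization = offered_load(6, 0.50, 4, 600)
print(busy_s, utilization)  # 1800.0 0.75
```

Every one of the eight runs offers this same load, so differences in ready time between runs reflect the mix of virtual machines rather than the amount of work.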

4-CPU Server benchmark results from paper

4-CPU Server Benchmarks
Model only contention for Physical CPUs

Model Contention for pCPUs plus Co-scheduling of SMP VMs

Model pCPUs + Co-Scheduling + CPU Preference

Summary of 4-CPU Server Models

devclusterhost2 Model (ESX Server 3.5)

Comparing Devclusterhost2
June 10, 2010 and September 2, 2010

Conclusions
• % CPU Ready Time can be problematic in SMP VMs
• It can be caused by co-scheduling and CPU preference
• To limit CPU ready time consider:
  • Reducing the number of VMs
  • Reducing the load on the server
  • Reducing the number of virtual CPUs in VMs
• Consider showing it as a fraction of CPU Busy Time
  • ROT: CPU Ready / CPU Busy < 0.2 for each VM
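Both rules of thumb can be applied to a VM's counters with a small check like the one below. The function and field names are illustrative; the thresholds are the two ROT values quoted in this presentation, and the sample figures are LPPerfTest's at 6 AM.

```python
READY_PER_BUSY_ROT = 0.2   # CPU Ready / CPU Busy should stay below this
PCT_READY_ROT = 5.0        # % CPU Ready Time should stay below this


def check_vm(name, ready_s, busy_s, interval_s):
    """Compare one VM's ready time against both rules of thumb."""
    pct_ready = 100.0 * ready_s / interval_s
    # Treat any ready time with zero busy time as exceeding the ratio.
    if busy_s > 0:
        ratio = ready_s / busy_s
    else:
        ratio = float("inf") if ready_s > 0 else 0.0
    return {
        "vm": name,
        "pct_ready": pct_ready,
        "ready_per_busy": ratio,
        "exceeds_pct_rot": pct_ready >= PCT_READY_ROT,
        "exceeds_ratio_rot": ratio >= READY_PER_BUSY_ROT,
    }


result = check_vm("LPPerfTest", ready_s=1069, busy_s=4800, interval_s=3600)
print(result["exceeds_pct_rot"], result["exceeds_ratio_rot"])  # True True
```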
