Friendly Virtual Machines • Zhang, Bestavros et al., Boston Univ. • ACM/USENIX VEE 2005 • CSE 598c • April 17, 2006 • Bhuvan Urgaonkar
Problem Setting • Growing trend of hosting applications at third-party platforms • Two challenges • Isolation, security to co-located applications • Efficient and fair resource allocation • Virtualization seen as a promising approach for isolation • What about resource allocation?
Challenge - Resource Allocation in Hosting Environments • Traditional solutions • Over-provisioning => wasteful • Fair schedulers in the OS, dynamic provisioning, admission control • Complex • Deprive the application of the chance to meaningfully adapt its behavior to match available resources • Go against the famous end-to-end argument developed in the networking community
End-to-end Argument • Clark et al. • Functionality should be pushed to the higher layer whenever possible • IP network implements packet forwarding, leaving congestion control to end systems • When applied to hosting platforms • Let the applications decide how many resources they need
How do VMs make the end-to-end idea realizable? • In a traditional hosting system, applications would have to be modified • Always undesirable, often impossible • In a virtualized hosting system • VMM is like OS, guest OS is like application • Guest OS modification not so unacceptable • E.g., Xen, Denali • Main idea: It is possible to achieve good efficiency and fairness using “friendly” virtual machines
Outline • Motivation • Approach • Implementation • Evaluation • Conclusions
Friendly Virtual Machine • Not malicious • Dynamically adapts its resource needs to system conditions • Inspiration: AIMD congestion control in TCP • Gradually increase resource requirements, back-off when resource contention increases • How a TCP researcher would approach the resource management problem in data centers
System Goals • Efficiency • Resources should not be overloaded • E.g., Heavy paging during overload => low throughput • Resources should not be unnecessarily underutilized • Fairness • Each VM is allocated a proportional share of the bottleneck resource for that VM
Overload Detection • Unlike TCP, there are multiple resources to consider • CPU, virtual memory, network bandwidth • Resource utilization metrics not reliable • E.g., CPU util may be high but the bottleneck may be the memory sub-system • Use application-centric metrics like response time or throughput
Overload Detection • Virtual Clock Time (VCT) • Real time interval between consecutive virtual clock cycles • Bottleneck resource • The resource that is the first to trigger a significant increase in VCT • Bottleneck-equivalence classes • Detection: Measure the ratio of current VCT to minimum VCT observed • Compare with a threshold (2)
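The detection rule on this slide (ratio of current VCT to the minimum VCT observed, compared against a threshold of 2) can be sketched in a few lines. This is only an illustrative sketch, not the paper's implementation: the class name `VctOverloadDetector`, the `observe` method, and the sample values are all hypothetical, and VCT samples are assumed to arrive as plain floats in seconds.

```python
class VctOverloadDetector:
    """Flags overload when current VCT / minimum observed VCT exceeds a threshold.

    Hypothetical helper for illustration; names are not from the paper.
    """

    def __init__(self, threshold=2.0):
        self.threshold = threshold  # the slide mentions a threshold of 2
        self.min_vct = None         # smallest VCT seen so far (the uncontended baseline)

    def observe(self, vct):
        """Record one VCT sample (real seconds per virtual clock tick).

        Returns True if the sample indicates overload.
        """
        if vct <= 0:
            raise ValueError("VCT must be positive")
        if self.min_vct is None or vct < self.min_vct:
            self.min_vct = vct
        return vct / self.min_vct > self.threshold


detector = VctOverloadDetector(threshold=2.0)
samples = [0.010, 0.011, 0.012, 0.019, 0.025]  # made-up VCT samples
flags = [detector.observe(s) for s in samples]
```

With these made-up samples only the last one (2.5 times the minimum) crosses the threshold. Note that the detector never needs to know which resource is the bottleneck, which is the point of using VCT rather than per-resource utilization.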
Adaptation Mechanisms • Control number of processes/threads • In practice, suspending running processes may not be a good idea • Alternatives • Suspend less important (e.g., younger) processes • Don’t allow new processes instead of suspending existing ones • Rate control by forcing VM to sleep • Follow an AIMD style adaptation that converges to fair/efficient allocation • Paper presents a control-theoretic model to prove convergence/stability properties
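To make the AIMD-style adaptation concrete, here is a minimal simulation sketch under strong simplifying assumptions: a single shared resource, all VMs seeing the same overload signal at the same time, and a generic "allocation" number standing in for the thread count or rate limit. The function names and the parameters `alpha`, `beta`, and `floor` are hypothetical; the paper's actual control parameters are not given in these slides.

```python
def aimd_step(allocation, overloaded, alpha=1.0, beta=0.5, floor=1.0):
    """One AIMD update: add alpha when uncontended, multiply by beta on overload.

    alpha, beta, and floor are illustrative values, not taken from the paper.
    """
    if overloaded:
        return max(floor, allocation * beta)
    return allocation + alpha


def simulate(initial, capacity, steps):
    """Run several VMs against one shared resource with synchronous feedback."""
    allocs = list(initial)
    for _ in range(steps):
        # Assumption: every VM detects overload whenever total demand exceeds capacity.
        overloaded = sum(allocs) > capacity
        allocs = [aimd_step(a, overloaded) for a in allocs]
    return allocs


# Two VMs starting from very unequal allocations.
final = simulate([2.0, 40.0], capacity=50.0, steps=200)
gap = abs(final[0] - final[1])
```

In this toy setting, additive increase leaves the gap between the two VMs unchanged while each multiplicative decrease halves it, so the allocations drift toward equal shares. This is the same intuition behind fairness in TCP congestion control, not a substitute for the paper's control-theoretic analysis.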
Salient Features • Underlying system requirements • Schedulers should be unbiased like round-robin, unlike multi-level feedback • VMM should implement resource policing to enforce AIMD behavior • Various adaptation strategies can co-exist • Think TCP-reno, TCP-tahoe, … • Suggestion: VMM could provide incentives for friendly behavior
Discussion • Is this system practical? • Rate of adaptation • Would it be fast enough for hosted applications? • Applications need resources soon after overload starts • How would the system behave with biased schedulers? • Can the adaptation mechanism be extended to handle different levels of importance? • This system might punish an application precisely when it is crucial for it to service its workload • E.g., An e-commerce app during Thanksgiving • Global knowledge can be crucial for efficiency • E.g., LRU page replacement • Security, isolation • To me it seems this would be as secure as a system with a more heavy-weight VMM
Implementation • User-mode Linux • Implement adaptation of number of processes and rate control • 500 lines of code
Outline • Motivation • Approach • Implementation • Evaluation • Conclusions
Memory-intensive benchmark - Performance metrics vs # VMs • Linux suspends processes arbitrarily when excessive thrashing occurs; their system spreads the punishment evenly
Benchmark - Performance metrics vs # threads/VM (2 VMs) • Graceful degradation
How not to do Evaluation • No confidence intervals! • Observations for light loads are meaningless • Pick someone your own size • Of course, their system is better than vanilla UML, so what? • Should have compared with a system that implements fair schedulers
Apache - 4 VMs • Graceful degradation
Evolution of VCT w/ UML • Unfairness at high load
Evolution of VCT w/ FVM • Fair CPU allocation
Tput per VM w/ UML • Unfairness at high load • Unfair CPU allocations due to different paging treatment and process suspension
Per VM Tput w/ FVM • Fair behavior
Conclusions • Distributed, application-driven resource allocation • (+) Cool idea • (-) Needs more research to be convincing • Experimental evaluation not satisfactory