Resource Management in Virtualization-based Data Centers Bhuvan Urgaonkar Computer Systems Laboratory Pennsylvania State University
Data Center • Cluster of compute and storage servers connected by high-speed network • Rent out resources in return for revenue • Internet applications, Scientific applications, … • Revenue scheme expressed using SLAs
Resource Management in Data Centers • Goal: Meet application SLAs • Easy solution: Over-provision resources • Over-provisioning can be very wasteful • Energy, management, failures, … • Data center would like to maximize revenue! • Dynamic capacity provisioning: match resource allocations to varying workloads • Challenges: • Determining changing resource needs of applications • Effective sharing of resources among applications • E.g., server consolidation can reduce cost • Automating resource management
Motivation for Virtualized Hosting in Data Centers • Key idea: Design data center using virtualization • Virtual machine monitor (VMM) and virtual machine (VM) • VMM: a software layer that runs on a server and allows multiple OS/applications to co-exist • Each OS/application is given the illusion of its own “virtual” machine that it has to itself • Why is this good? • Consolidation of diverse OS/apps possible • Migration made easier • Small VMM code base => improved security • Not a new idea, but existing solutions are inadequate • Goal: Devise efficient resource management solutions for a virtualization-based data center
The Xen Virtual Machine Monitor • VMM = hypervisor • VM = domain • Para-virtualization • Special domain called Dom0 [Figure: Dom0, Dom1, and Dom2 running on the Xen hypervisor over hardware; labels: Apache Web server, Mysql database, Windows’, Linux’]
Outline • Introduction and Motivation • Resource Management in a Xen-based Data Center • Resource Accounting • Resource Allocation and Scheduling • Performance Optimizations for Xen • Other Research • Concluding Remarks
Xen-based Data Center • Each application component runs within a Xen domain [Figure: an online book-store and an online game server hosted on Physical machine #1 and Physical machine #2; each machine runs Dom0, Dom1, and Dom2 on a Xen hypervisor over hardware; labels: Mysql database, Apache, Quake 1, Mysql, Quake 2, Windows’, Linux’]
Resource Usage Accounting • Need for accurate resource accounting • Estimate future needs • Relate performance and resource consumption • Charge applications for resource usage • Accounting in Xen-based hosting • Statistics for each DomU can be gathered by hypervisor • E.g., number of bytes sent by a DomU • Hidden activity: CPU activity performed by Dom0 • Similar to activity done by a kernel for a process • Techniques to de-multiplex Dom0’s activity across DomUs • How much work does Dom0 have to do for each DomU?
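One possible way to de-multiplex Dom0's CPU activity is sketched below. It is only an illustration, not the technique used in this work: it assumes the hypervisor can count the I/O events Dom0 handles on behalf of each DomU, and it charges Dom0's CPU time to DomUs in proportion to those counts. The function names and the proportional rule are hypothetical.

```python
def apportion_dom0_cpu(dom0_cpu_ms, io_events):
    """Split Dom0's measured CPU time across DomUs in proportion to the
    number of I/O events Dom0 handled for each DomU (assumed heuristic)."""
    total = sum(io_events.values())
    if total == 0:
        # No I/O was observed, so nothing is attributed to any DomU.
        return {dom: 0.0 for dom in io_events}
    return {dom: dom0_cpu_ms * n / total for dom, n in io_events.items()}


def charged_cpu(direct_cpu_ms, dom0_cpu_ms, io_events):
    """CPU charged to each DomU: its own usage plus its share of Dom0's."""
    hidden = apportion_dom0_cpu(dom0_cpu_ms, io_events)
    doms = set(direct_cpu_ms) | set(hidden)
    return {d: direct_cpu_ms.get(d, 0.0) + hidden.get(d, 0.0) for d in doms}


if __name__ == "__main__":
    events = {"dom1": 30, "dom2": 10}      # I/O events handled by Dom0
    direct = {"dom1": 200.0, "dom2": 50.0}  # CPU each DomU used itself
    print(charged_cpu(direct, 100.0, events))
```

A proportional split treats every I/O event as equally expensive; a finer model could weight events by type or by bytes transferred.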
Resource Allocation • Multi-time scale resource allocation • Server assignment: coarse time-scale • Scheduling: fine time-scale • Placement • Like a knapsack problem • What time-scale? • Migration versus replication
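To make the knapsack-like flavor of placement concrete, here is a minimal sketch using the standard first-fit-decreasing packing heuristic. This is a swapped-in textbook heuristic, not the placement algorithm of this work; it assumes a single resource dimension (CPU percent), identical servers, and hypothetical domain names and demands.

```python
def first_fit_decreasing(demands, capacity):
    """Place each domain on the first server with enough spare capacity,
    taking the largest demands first. Returns a list of servers, each
    given as the list of domain names placed on it."""
    servers = []  # each entry: [remaining_capacity, [domain names]]
    ordered = sorted(demands.items(), key=lambda kv: kv[1], reverse=True)
    for name, need in ordered:
        if need > capacity:
            raise ValueError(f"{name} needs {need}, more than one server holds")
        for server in servers:
            if server[0] >= need:
                server[0] -= need
                server[1].append(name)
                break
        else:
            # No existing server fits this domain, so open a new one.
            servers.append([capacity - need, [name]])
    return [names for _, names in servers]


if __name__ == "__main__":
    cpu_percent = {"apache": 50, "mysql": 70, "quake1": 30, "quake2": 40}
    print(first_fit_decreasing(cpu_percent, capacity=100))
```

Real placement would also have to account for memory, network, and the cost of migrating or replicating a domain, which this sketch ignores.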
Intelligent Scheduling of Distributed Applications • Motivation: Co-scheduling of parallel applications • Schedule distributed communicating components together [Figure: Physical machine #1 and Physical machine #2]
Intelligent Scheduling of Distributed Applications • Motivation: Co-scheduling of parallel applications • Schedule distributed communicating components together [Figure: Physical machine #1 and Physical machine #2; caption: Message waits until the yellow app gets the CPU]
Intelligent Scheduling of Distributed Applications • Motivation: Co-scheduling of parallel applications • Schedule distributed communicating components together [Figure: Physical machine #1 and Physical machine #2; caption: Message can be received immediately if the yellow app gets the CPU]
Co-ordinated Scheduling of Communicating Domains • Idea #1: Preferentially schedule a DomU when it receives data • Modify Xen CPU scheduler to give higher preference to receiving DomU • Important: Also need to ensure that Dom0 gets to run to take care of I/O • Scheduler should partition the CPU allocation for a DomU into those for Dom0 and DomU appropriately
Co-ordinated Scheduling of Communicating Domains • Idea #2: Try to schedule a sender DomU when it is expected to receive the response • The application knows best, but modifying applications is undesirable • Let the hypervisor learn from past behavior • E.g., query responses might be returning in 1-2 seconds • Idea #3: Anticipatory CPU scheduling • If a domain has sent/received data, it may be likely to do that again • E.g., queries may be issued in bursts • Trade-off between domain context switch and how much extra time you let a sender DomU continue
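As an illustration of how a hypervisor might learn from past behavior for Idea #2, the sketch below keeps an exponentially weighted moving average of observed request-to-response delays and uses it to guess when a sender DomU should next be runnable. The class name, the smoothing factor `alpha`, and the choice of a moving average are all assumptions made for this example, not details from the slides.

```python
class ResponsePredictor:
    """Tracks a smoothed request-to-response delay for one sender DomU."""

    def __init__(self, alpha=0.25):
        self.alpha = alpha    # weight given to the newest observation
        self.estimate = None  # smoothed delay in seconds, None until first sample

    def observe(self, delay):
        """Fold one measured delay (seconds) into the running estimate."""
        if self.estimate is None:
            self.estimate = delay
        else:
            self.estimate = (1 - self.alpha) * self.estimate + self.alpha * delay
        return self.estimate

    def expected_arrival(self, send_time):
        """Predicted time at which the response to a request sent at
        send_time will arrive, or None if nothing has been observed yet."""
        if self.estimate is None:
            return None
        return send_time + self.estimate


if __name__ == "__main__":
    p = ResponsePredictor(alpha=0.5)
    for d in (1.0, 2.0):
        p.observe(d)
    print(p.expected_arrival(10.0))
```

A scheduler could raise the sender's priority shortly before the predicted arrival time; how early to do so is the same context-switch trade-off noted under Idea #3.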
Multi-processor Scheduling • Idea: Dom0 should be scheduled together with a DomU doing I/O • Utilize the multiple CPUs to “co-schedule” a communicating DomU with Dom0 • Ensure domains that communicate a lot do not starve others • Relaxed fairness: 50% CPU over intervals > 1 second • Approach: Decay the CPU priority of communicating DomUs to ensure relaxed fairness is not violated
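The decay approach could look roughly like the sketch below. It assumes a scheduler that grants communicating DomUs a priority boost and checks their CPU share once per accounting window of at least one second; the class, the boost values, and the halving rule are hypothetical choices for illustration only.

```python
class CommBoost:
    """Priority boost for a communicating DomU, decayed whenever the
    domain exceeds its fair CPU share over an accounting window."""

    def __init__(self, fair_share=0.5, window=1.0, decay=0.5, max_boost=8.0):
        self.fair_share = fair_share  # allowed CPU fraction per window
        self.window = window          # accounting window in seconds
        self.decay = decay            # multiplier applied on a violation
        self.max_boost = max_boost
        self.boost = max_boost
        self.used = 0.0               # CPU seconds consumed in this window
        self.elapsed = 0.0            # wall-clock seconds in this window

    def account(self, ran_for, wall_time):
        """Record that the domain ran for ran_for seconds during a period
        of wall_time seconds, and return the boost to use next."""
        self.used += ran_for
        self.elapsed += wall_time
        if self.elapsed >= self.window:
            share = self.used / self.elapsed
            if share > self.fair_share:
                self.boost *= self.decay     # over its share: decay priority
            else:
                self.boost = self.max_boost  # within its share: restore
            self.used = 0.0
            self.elapsed = 0.0
        return self.boost


if __name__ == "__main__":
    b = CommBoost()
    print(b.account(0.4, 0.5), b.account(0.4, 0.5), b.account(0.2, 1.0))
```

Because the share is only checked once per window, a domain can briefly exceed 50% inside a window, which is the sense in which fairness is relaxed.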
Performance Optimizations for Xen • Switching between native & virtual hosting • Dynamic merging and splitting of domains • Overbooking of memory • Improved migration techniques • Coalesce network packets directed to the same physical server
Optimizing Network Communication [Figure: two physical machines, each running Dom0, Dom1, and Dom2 on a Xen hypervisor over hardware; labels: Mysql database, Apache, Quake 1, Mysql, Quake 2, Windows’, Linux’]
Optimizing Network Communication • (-) Increased CPU processing for coalescing and splitting packets • (+) Reduced interrupt processing at receiver [Figure: the two Xen hosts with Dom0, Dom1, and Dom2]
Optimizing Network Communication • What kinds of packets can be coalesced? • TCP ACKs? Other packets? • Would it make sense to do anticipatory packet scheduling at the sender? [Figure: the two Xen hosts with Dom0, Dom1, and Dom2]
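The coalescing idea can be sketched as follows. This is an illustrative assumption about how it might work, not the design from this work: pending packets are grouped by destination physical server, and each payload is framed with a hypothetical 6-byte header (2-byte destination domain id, 4-byte length) so that the receiving Dom0 can split the batch again.

```python
import struct
from collections import defaultdict

HEADER = "!HI"  # assumed framing: domain id (2 bytes), payload length (4 bytes)


def coalesce(packets):
    """packets: iterable of (dest_host, dest_domain, payload_bytes).
    Returns {dest_host: blob}, one combined blob per physical server."""
    batches = defaultdict(bytearray)
    for host, dom, payload in packets:
        batches[host] += struct.pack(HEADER, dom, len(payload)) + payload
    return {host: bytes(buf) for host, buf in batches.items()}


def split(blob):
    """Inverse of coalesce for one host: list of (dest_domain, payload)."""
    out = []
    offset = 0
    header_size = struct.calcsize(HEADER)
    while offset < len(blob):
        dom, length = struct.unpack_from(HEADER, blob, offset)
        offset += header_size
        out.append((dom, blob[offset:offset + length]))
        offset += length
    return out


if __name__ == "__main__":
    pending = [("pm2", 1, b"GET"), ("pm2", 2, b"ack"), ("pm1", 1, b"x")]
    blobs = coalesce(pending)
    print(split(blobs["pm2"]))
```

The extra `pack` and `unpack_from` work corresponds to the (-) cost listed above, while the receiver handling one blob instead of several packets corresponds to the (+) benefit.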
Provisioning a Directional Antenna-based Network • Directional antennas • Longer reach • Less interference => Increased capacity
Provisioning a Directional Antenna-based Network • Theoretical results • User-centric version • Fair bandwidth allocation • Optimal algorithm based on dynamic programming • Provider-centric version • Maximize revenue • NP-hard, 2-approximation algorithm • Ongoing work • Heuristics to incorporate mobility • Evaluation through simulation • Implementation … maybe
Concluding Remarks • Resource mgmt. in virtualized environments • Provisioning wireless networks • Energy optimization in sensor networks • Distributed systems, Operating systems • Combination of analysis, algorithm design and experimentation with prototypes • Acknowledgements: • Faculty: Anand, Piotr, Wang-Chien • Students: Amitayu, Arjun, Ross, Shiva, Sriram