Network Aware Resource Allocation in Distributed Clouds
Contribution
• Develops efficient resource allocation algorithms for distributed clouds
• The 2-approximation algorithm developed for optimal Data Center (DC) selection is efficient in practice
• Develops a heuristic for partitioning the requested resources among the chosen DCs and racks
• Minimizes the distance (latency) between the selected DCs
• Simulations show that this approach yields significant gains
Introduction
• Resource allocation is a key function of cloud management and automation
• Resource allocation algorithms strongly affect application performance
• They also affect how efficiently DCs can accommodate requests
• User requests require the allocation of Virtual Machines (VMs)
• To satisfy these requests, the resource allocator maintains an updated list of the resources available at each DC, current allocations, and future requirements
Introduction
• User requests specify the number of VMs and the communication links required between them
• The automation software's objective is to choose DCs and racks so that overall resource usage is minimized and performance is optimized
• These two goals are complementary
• Both usually lead to attempts to allocate all requested resources onto a single rack, which is not always possible
• Thus, for best results, resource allocation algorithms must be capable of handling many scenarios
Introduction
• Fragmentation of user requests reduces performance
• The fragmentation problem is difficult to solve
This paper focuses on the resource allocation problem in distributed cloud systems spread geographically over a WAN.
Target metric: latency
System Architecture – Distributed Cloud
• Requests should be handled by DCs close to them, which helps improve performance
• Racks consist of blade servers, each containing many cores
• Communication between blade servers within the same rack happens via the top-of-rack (TOR) switch
• Two different racks communicate via an aggregator switch
• DC networks are designed under the assumption of locality of communication
System Architecture – Distributed Cloud
• As the distance between machines increases, the available bandwidth decreases
• Bandwidth therefore depends on which physical machines the Virtual Machines (VMs) are assigned to
• The overall efficiency of a DC also depends on this assignment
• So does the number of requests the DC can service
System Architecture – Cloud Management and Automation S/W
• Prior knowledge about communication links may not be available
• The automation S/W has to assign resources based on worst-case conditions and then re-optimize
• Other conditions may also need to be satisfied, e.g., limits on the number of VMs per DC (for fault tolerance)
• The automation S/W computes a mapping of user requests to physical machines
System Architecture – Cloud Management and Automation S/W
• The output of the cloud automation software is a mapping of VMs to physical resources
• The software interacts with the Network Management System (NMS) and the local Cloud Management System (CMS)
• The cloud optimization software has two functions:
  • Track resource usage
  • Optimize the assignment of user requests
• Assignment of user requests consists of identifying DCs and machines
• Goal: reduce inter-DC and intra-DC traffic
System Architecture – Cloud Management and Automation S/W
Assignment of DCs is done in 4 steps:
1. DC Selection
  • Identify candidate DCs based on user constraints and availability
  • Identify the subset of DCs that minimizes latency
2. Partitioning Across DCs
  • Partition the VMs so as to minimize inter-DC traffic while adhering to the given constraints
3. Rack, Blade, and Processor Selection
  • Identify physical computational resources within each DC
  • Goal: identify machines with low inter-rack traffic
4. VM Placement
  • Assign individual VMs to physical resources
  • Minimize inter-rack traffic
System Architecture – Data Center Selection
• Select DCs that:
  • Meet all specifications and constraints
  • Optimize network resource usage
  • Maximize application performance
• Use an algorithm that selects a subset of DCs with the fewest hops between them
• Handle additional constraints, such as a maximum or minimum number of VMs per DC
System Architecture – Data Center Selection
• The DC selection problem is a sub-graph selection problem
• Given G = (V, E, w, l):
  • V – data centers
  • E – paths between DCs
  • w – number of available VMs at each DC (vertex weight)
  • l – length of each path (edge length)
• Note:
  • If there is a constraint on the maximum number of VMs per DC, w takes that value instead
  • If there is a constraint on the minimum number of VMs per DC, DCs with fewer VMs are omitted
System Architecture – Data Center Selection
• Let s be the number of VMs requested
• Problem: find a sub-graph of G whose total weight is at least s and whose diameter is minimized
• The diameter is the length of the longest edge, i.e., the largest distance between any two selected DCs
• This problem is NP-hard
System Architecture – Data Center Selection
• FindMinStar finds a star topology centered at a vertex v, adding the DCs closest to v until the requested weight is reached
• FindMinGraph tries every vertex as the center and keeps the star with the smallest diameter
• The diameter of the output sub-graph is at most 2x the diameter of the optimal sub-graph
System Architecture – Data Center Selection Running Time
• FindMinStar sorts the DCs by distance from the center: O(n log n), where n is the number of DCs
• Computing the diameter of a candidate sub-graph takes O(n²)
• FindMinGraph runs FindMinStar plus a diameter computation once per candidate center: O(FindMinGraph) = n · O(n²) = O(n³)
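A minimal Python sketch of the star-based selection, reconstructed from the slides above. The names find_min_star and find_min_graph mirror the algorithm names in the slides, but the data layout (weight as a dict of available VMs per DC, dist as a nested dict of path lengths) and the diameter helper are our assumptions.

import itertools

# weight[v] = available VMs at DC v; dist[u][v] = path length between u and v.

def find_min_star(center, nodes, weight, dist, s):
    # Grow a star around `center`, adding the closest DCs first,
    # until the selected DCs hold at least s VMs.
    chosen, total = [center], weight[center]
    for v in sorted((u for u in nodes if u != center),
                    key=lambda u: dist[center][u]):          # O(n log n)
        if total >= s:
            break
        chosen.append(v)
        total += weight[v]
    return chosen if total >= s else None

def diameter(subset, dist):
    # Longest pairwise distance among the selected DCs: O(n^2).
    return max((dist[u][v] for u, v in itertools.combinations(subset, 2)),
               default=0)

def find_min_graph(nodes, weight, dist, s):
    # Try every DC as the star center and keep the subset with the
    # smallest diameter: n iterations of O(n^2) work = O(n^3) overall.
    best, best_diam = None, float("inf")
    for center in nodes:
        star = find_min_star(center, nodes, weight, dist, s)
        if star is None:
            continue
        d = diameter(star, dist)
        if d < best_diam:
            best, best_diam = star, d
    return best

The 2-approximation guarantee follows because some vertex of the optimal sub-graph is tried as a center: every DC the greedy star picks around that center lies within the optimal diameter of it, so any two selected DCs are at most twice that far apart.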
System Architecture – Machine Selection within DC
• Goal: find machines that reduce inter-rack traffic
• The DC topology is a tree:
  • Root – core switch
  • Children – top-level switches
  • Leaves – racks
• Given the tree representation of the DC (T) and the total number of VMs to be placed (s):
• Find the sub-tree of minimum height whose weight is at least s
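A small sketch of the minimum-height sub-tree search. The tree encoding (a Node class with child sub-trees at switches and free VM slots at the leaf racks) is our assumption for illustration.

class Node:
    def __init__(self, children=None, capacity=0):
        self.children = children or []   # internal switch: child sub-trees
        self.capacity = capacity         # leaf (rack): available VM slots

def find_min_height_subtree(root, s):
    # Return (height, node) of the lowest sub-tree holding at least s VMs,
    # or None if the whole DC cannot accommodate the request.
    best = None

    def visit(node):
        nonlocal best
        if not node.children:                     # a rack
            height, cap = 0, node.capacity
        else:                                     # a switch
            sub = [visit(c) for c in node.children]
            height = 1 + max(h for h, _ in sub)
            cap = sum(c for _, c in sub)
        if cap >= s and (best is None or height < best[0]):
            best = (height, node)
        return height, cap

    visit(root)
    return best

# Example: a core switch over two racks with 30 and 50 free slots.
dc = Node([Node(capacity=30), Node(capacity=50)])
print(find_min_height_subtree(dc, 40))   # a single rack (height 0) suffices
print(find_min_height_subtree(dc, 70))   # needs the whole sub-tree (height 1)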
System Architecture – Virtual Machine Placement
• Heuristic algorithms are required for assigning individual VMs to DCs and to CPUs within DCs
• The problem is a variant of the graph partitioning and k-cut problems
• The user request is represented as a graph G = (V, E):
  • Nodes represent the VMs to be placed
  • Edges represent the connections between them
• Goal: partition G into disjoint sets c1, c2, …, cm such that the communication across partitions is minimized
• If traffic between a pair of VMs is asymmetric, take the average as the edge weight
System Architecture – Virtual Machine Placement
• Algorithms 4 and 5 give a heuristic solution to the partition problem (see the sketch below)
• The partition is refined using the Kernighan–Lin heuristic
• Runtime: O(n² log n)
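A minimal sketch of the partitioning step using NetworkX's built-in Kernighan–Lin bisection, applied recursively to obtain k parts. The recursive-bisection strategy and the omission of per-DC capacity constraints are our simplifications, not the paper's Algorithms 4 and 5.

import networkx as nx
from networkx.algorithms.community import kernighan_lin_bisection

def partition_vms(G, k):
    # Edge weights are the (averaged) traffic between VM pairs; the total
    # weight of edges crossing partitions approximates the inter-DC traffic.
    parts = [set(G.nodes)]
    while len(parts) < k:
        largest = max(parts, key=len)             # split the biggest part
        parts.remove(largest)
        a, b = kernighan_lin_bisection(G.subgraph(largest), weight="weight")
        parts.extend([set(a), set(b)])
    return parts

# Example: six VMs with heavy traffic inside {0,1,2} and {3,4,5}.
G = nx.Graph()
G.add_weighted_edges_from([(0, 1, 5), (1, 2, 5), (0, 2, 5),
                           (3, 4, 5), (4, 5, 5), (3, 5, 5), (2, 3, 1)])
print(partition_vms(G, 2))                        # expect the two triangles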
Simulation Results
• Results are compared against a random approach and a greedy algorithm
• The random approach selects a random DC and places as many VMs as possible in it
• The greedy approach selects the DC with the most available VMs
• To measure performance:
  • A random topology is created
  • Random user requests are generated
  • The maximum distance between any two VMs of a request is measured
Simulation Results
• DC locations are selected uniformly at random within a 1000x1000 grid
• The distance between two DCs is the Euclidean distance between their points
• Five distributed cloud scenarios are studied: 100, 75, 50, 25, and 10 DCs
• However, the average number of machines in the cloud is the same in every scenario (see the sketch below)
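A short sketch of how such a topology might be generated; the uniform sampling and the use of math.dist are our reconstruction of the setup described above.

import math
import random

def make_topology(num_dcs, grid=1000, seed=None):
    # DC locations drawn uniformly from a grid x grid area, with
    # Euclidean inter-DC distances.
    rng = random.Random(seed)
    points = [(rng.uniform(0, grid), rng.uniform(0, grid))
              for _ in range(num_dcs)]
    dist = [[math.dist(p, q) for q in points] for p in points]
    return points, dist

points, dist = make_topology(100)   # e.g., the 100-DC scenario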
Simulation Results – Experiment I
• Measures the diameter of the placement for a single request of 1000 VMs
• The approximation algorithm performs 79% better than the baselines
• Note: the diameter decreases as the number of DCs decreases
Simulation Results – Experiment II
• Studies cloud systems under a series of user requests
• Two experiments:
  • Large requests: 100 requests for 50–100 VMs each, with request sizes uniformly distributed
  • Small requests: 500 requests for 10–20 VMs each
• Note: in both experiments, the average total number of VMs requested is the same
Simulation Results – Experiment II
• Greedy performs better than random by 32.6% and 66.5% in the two experiments
• The approximation algorithm performs better than greedy by 83.4% and 86.4%
• Why do larger requests require a higher diameter? Larger requests are more likely to exceed the capacity of nearby DCs, forcing the placement to span more distant DCs
Simulation Results – Experiment III
• Studies the performance of the cloud system when additional constraints are given
• Uses the same requests as the previous experiment
• Resilience is defined as the ratio of the total number of VMs in a request to the maximum number of VMs at any one DC
• Each request must therefore be placed across at least "resilience" many DCs (see the example below)
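A tiny worked example of the resilience constraint; the numbers are ours, chosen for illustration.

# Resilience r = total VMs in the request / max VMs at any one DC.
# E.g., a request of 100 VMs with r = 4 allows at most 25 VMs per DC,
# so it must be spread over at least 4 DCs.
total_vms, resilience = 100, 4
max_vms_per_dc = total_vms // resilience   # 25
min_num_dcs = resilience                   # at least 4 DCs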
Simulation Results – Experiment III
• Larger requests have a larger diameter
• As resilience increases, the diameter increases
• What is different about these results?
Simulation Results – Experiment III
• Evaluates the performance of the heuristic algorithm
• Given the communication requirements and the available capacity of the DCs, the algorithm computes a placement of VMs that aims to minimize inter-DC traffic
• The heuristic is compared with greedy and random algorithms:
  • Random assigns a random DC to each VM
  • Greedy selects DCs in decreasing order of availability and, when selecting VMs, chooses the VMs with the maximum total traffic first
Simulation Results – Experiment III
• The experiment assigns a request of 100 VMs to DCs
• The bandwidth between VM pairs is fixed randomly between 0 and 1 Mbps
• Inter-DC traffic was studied for assignments of these VMs to k DCs (k = 2, …, 8)
• The available resources at each DC were between 100/k and 200/k VMs
• Hence, 100 VMs were assigned to a set of DCs with a total capacity of 100–200 VMs (as sketched below)
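A sketch of this setup, under our assumptions about how the random values are drawn.

import random

def make_experiment(num_vms=100, k=4, seed=None):
    # 100 VMs with pairwise traffic uniform in [0, 1] Mbps, assigned to
    # k DCs whose free capacities are uniform in [100/k, 200/k] VMs.
    rng = random.Random(seed)
    traffic = {(i, j): rng.uniform(0.0, 1.0)
               for i in range(num_vms) for j in range(i + 1, num_vms)}
    capacities = [rng.uniform(100 / k, 200 / k) for _ in range(k)]
    return traffic, capacities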
Simulation Results – Experiment III
• For all algorithms, inter-DC traffic increases as the number of DCs increases. Why? Splitting the request across more DCs separates more communicating VM pairs
• The greedy algorithm performs better than random by 10.2%
• The heuristic algorithm performs better than greedy by 4.6%
Simulation Results – Experiment III
• When the DCs had no excess capacity, inter-DC traffic for the heuristic algorithm was 28.2% higher than in the excess-capacity case
• The heuristic algorithm still performed better than the other two algorithms, by 4.8%
• Greedy and random performed similarly
Simulation Results – Experiment IV
• Studies the effect of VM traffic on inter-DC traffic
• The percentage of links with traffic is varied between 20% and 100%, and the resulting inter-DC traffic is measured
• The DCs have no excess capacity in these experiments
• Result: inter-DC traffic grows linearly with the percentage of links carrying traffic, for all algorithms
Conclusions
• The main contribution is the development of algorithms for network-aware resource allocation of VMs in distributed cloud systems
• Efficient algorithms are needed because inter-DC traffic can be very expensive
• A 2-approximation algorithm is provided for the selection of DCs
• The same algorithm can also be used for rack selection within a DC, but using prior knowledge of the network topology within the DC gives better results
• A heuristic algorithm maps VMs to resources within each DC
Related Work
• Graph partitioning problems
  • k-cut problem
  • Maximum sub-graph problem
• Assigning VMs inside DCs is studied in "Improving the scalability of data center networks with traffic-aware virtual machine placement"