Evaluating the Cost-Benefit of Using Cloud Computing to Extend the Capacity of Clusters Presenter: Xiaoyu Sun
Cluster Computing • Users have to know the cluster very well • Administrative privileges
What is Cloud Computing? Cloud computing provides computation, software, data access, and storage resources without requiring cloud users to know the location and other details of the computing infrastructure.
Characteristics of Cloud Computing • Empowerment • Users control resources themselves rather than through a centralized IT service • Agility • Users can re-provision technological infrastructure resources • Application Programming Interface • Cost • Device and Location Independence • Users can access systems through a web browser regardless of their location or device • Virtualization • Servers and storage devices can be shared and utilization increased • Reliability and Scalability • Performance • Monitored through web services used as the system interface • Security • Providers can devote resources to solving security issues that many customers cannot afford to address • Maintenance • Applications do not need to be installed on each user's computer and can be accessed from different places
Purpose • Describe a system that enables an organization to augment its computing infrastructure by allocating resources from a Cloud provider. • Provide various scheduling strategies that aim to minimize the cost of using resources from the Cloud provider. • Evaluate the proposed strategies against several performance metrics: average weighted response time, job slowdown, number of deadline violations, number of jobs rejected, and the money spent on using the Cloud.
Cloud Computing • Figure 1: The resource provisioning scenario [figure labels: strategy sets, site scheduler, Cloud scheduler, scheduling strategy, redirection strategy]
Backfilling Policies • Conservative • Each request is scheduled when it arrives in the system, and requests may jump ahead in the queue if they do not delay the execution of other requests. • Aggressive • Only the request at the head of the waiting queue, called the pivot, is granted a reservation. Other requests may move ahead in the queue if they do not delay the pivot. • Selective • Requests are given reservations once they have waited long enough in the queue. "Long enough" is determined by the request's expansion factor: Xfactor = (wait time + run time) / run time (1) • The threshold is given by the average slowdown of previously completed requests.
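The selective policy's test can be sketched in a few lines of Python. This is only an illustration of Eq. (1) and the threshold rule; the function names, the strict ">" comparison, and the behaviour with an empty history are assumptions, not details taken from the slides.

```python
def xfactor(wait_time: float, run_time: float) -> float:
    """Expansion factor from Eq. (1): (wait time + run time) / run time."""
    if run_time <= 0:
        raise ValueError("run_time must be positive")
    return (wait_time + run_time) / run_time


def deserves_reservation(wait_time: float, run_time: float,
                         completed_slowdowns: list) -> bool:
    """Return True once the request's Xfactor exceeds the threshold.

    The threshold is the average slowdown of previously completed requests.
    Assumption: with no completed requests there is no threshold yet, so
    no reservation is granted.
    """
    if not completed_slowdowns:
        return False
    threshold = sum(completed_slowdowns) / len(completed_slowdowns)
    return xfactor(wait_time, run_time) > threshold
```

For example, a request that has waited 30 time units and needs 10 has an Xfactor of 4.0, so it would receive a reservation if the completed requests averaged a slowdown of 2.5.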
Strategy Sets • Naïve: • Both the site and Cloud schedulers use conservative backfilling to schedule requests • The redirection algorithm is executed at the arrival of each job at the site • A request is redirected to the Cloud provider when it cannot start immediately on the local cluster
Strategy Sets • Shortest Queue: • Aggressive backfilling • Requests are scheduled in First-Come-First-Served (FCFS) order • The redirection algorithm is executed at the arrival or completion of each job at the site • Compute the ratio of the number of VMs required by requests to the number of VMs available • Redirect the request if the Cloud provider's ratio is smaller
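A minimal sketch of the Shortest Queue comparison, assuming the ratio is computed once for the site and once for the Cloud provider and the request goes to whichever side is less loaded. The function names and the handling of zero available VMs are illustrative assumptions.

```python
def demand_ratio(vms_required: int, vms_available: int) -> float:
    """Ratio of VMs required by queued requests to VMs available."""
    if vms_available <= 0:
        # Assumption: a side with no available VMs counts as fully loaded.
        return float("inf")
    return vms_required / vms_available


def should_redirect(site_required: int, site_available: int,
                    cloud_required: int, cloud_available: int) -> bool:
    """Redirect to the Cloud only if its ratio is strictly smaller than the site's."""
    cloud = demand_ratio(cloud_required, cloud_available)
    site = demand_ratio(site_required, site_available)
    return cloud < site
```

With a site at 120 of 144 VMs demanded and a Cloud at 10 of 100, the Cloud's ratio is smaller and the request would be redirected.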
Strategy Sets • Weighted Queue: • Aggressive backfilling • Requests are scheduled in First-Come-First-Served (FCFS) order • The number of VMs that can be borrowed from the Cloud provider is the number of VMs required by requests minus the number of VMs in use
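The Weighted Queue rule is a single subtraction. The sketch below follows the slide's wording literally; the function name and the clamp at zero (so that a negative difference means nothing is borrowed) are assumptions.

```python
def borrowable_vms(vms_required: int, vms_in_use: int) -> int:
    """VMs that may be borrowed from the Cloud: VMs required minus VMs in use.

    Assumption: the result is clamped at zero, since a negative number of
    borrowed VMs has no meaning.
    """
    return max(0, vms_required - vms_in_use)
```

For instance, if queued requests need 50 VMs and 20 are in use, up to 30 VMs could be borrowed.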
Strategy Sets • Selective: • Selective backfilling • Compute the ratio of the number of VMs required by requests to the number of VMs available • When a request's Xfactor exceeds the threshold, the scheduler makes a reservation at the place that provides the earliest start time.
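The reservation step of the Selective strategy set could look roughly like the following. The two candidate start times are assumed to be supplied by the site and Cloud schedulers; the function name, the string labels, and the tie-break in favour of the site are all illustrative assumptions.

```python
from typing import Optional


def choose_placement(xfactor: float, threshold: float,
                     site_start: float, cloud_start: float) -> Optional[str]:
    """Pick where to reserve, or None if the request has not waited long enough.

    A reservation is made only when the Xfactor exceeds the threshold, and
    it goes to the place offering the earliest start time.
    Assumption: ties are resolved in favour of the local site.
    """
    if xfactor <= threshold:
        return None
    return "site" if site_start <= cloud_start else "cloud"
```

A request with Xfactor 3.0 against a threshold of 2.0 would be reserved on the Cloud if the Cloud can start it at time 40 and the site only at time 100.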
Experiments • Simulation of two-month-long periods • SDSC Blue Horizon machine with 144 nodes • Number of VMs • Price of a virtual machine per hour • Amazon EC2's small instance: US $0.10 • Network and storage are not considered • Values are averages of 5 simulation runs
Performance Metrics • Average Weighted Response Time (AWRT) of site k: • Tk: the set of requests submitted to site k • pj: the runtime of request j • mj: the number of VMs required by request j • ctj: the completion time of request j • stj: the submission time of request j
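The AWRT formula itself did not survive in the slide text. The sketch below assumes the usual definition that matches the symbols listed above: each request's response time (ctj - stj) is weighted by its resource consumption (pj * mj), and the weighted sum is divided by the total weight. The `Request` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    runtime: float          # p_j
    vms: int                # m_j
    submit_time: float      # st_j
    completion_time: float  # ct_j


def awrt(requests: list) -> float:
    """Average weighted response time over the requests of one site.

    Assumed form: sum(p_j * m_j * (ct_j - st_j)) / sum(p_j * m_j).
    """
    weights = [r.runtime * r.vms for r in requests]
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("no resource consumption to weight by")
    weighted_sum = sum(w * (r.completion_time - r.submit_time)
                       for w, r in zip(weights, requests))
    return weighted_sum / total_weight
```

Weighting by pj * mj means a large, long request influences the average more than a small, short one.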
Performance Metrics • Performance Improvement Cost of a strategy set st: • Amount spent: the money spent running virtual machines on the Cloud provider • AWRTbase: the AWRT achieved by a base strategy (FCFS with aggressive backfilling) that schedules requests using only the site's resources • AWRTst: the AWRT reached by the strategy set st when Cloud resources are also used
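The formula for the performance improvement cost is missing from the slide text. As a rough sketch only, the code below assumes the simplest reading of the three terms: money spent per unit of AWRT improvement over the base strategy. The original formula may include further weighting that is not recoverable here, and the function name is hypothetical.

```python
def performance_improvement_cost(amount_spent: float,
                                 awrt_base: float,
                                 awrt_st: float) -> float:
    """Money spent per unit of AWRT improvement (assumed form).

    Assumed form: amount_spent / (AWRT_base - AWRT_st).
    The cost is undefined when the strategy set does not improve on the base.
    """
    improvement = awrt_base - awrt_st
    if improvement <= 0:
        raise ValueError("strategy set does not improve AWRT over the base")
    return amount_spent / improvement
```

Under this reading, a lower value means the strategy set buys its response-time improvement more cheaply.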
Performance Improvement Cost • Lublin99's model is used to generate different workloads: • Umed: the mean number of virtual machines required by a request is set to log2(m) - Umed, where m is the maximum number of virtual machines allowed in the system; Umed ranges from 1.5 to 3.5. • Barr: the inter-arrival time of requests at rush hours; Barr ranges from 0.45 to 0.55. • PB: the proportion p of the first gamma in Lublin99's model is given by p = pa * nodes + PB; PB ranges from 0.5 to 1.0.
Performance Improvement Cost • These three graphs show the site's utilization under the base aggressive backfilling strategy without Cloud resources • The larger the value of Umed, the smaller the requests • The larger the value of PB, the shorter the duration of the requests
Performance Improvement Cost • [Figure panels: requests' size, requests' arrival time, requests' duration]
Deadline Constrained Applications • Users may have stringent requirements on when the virtual machines are needed • Deadline constrained requests have: • Ready time • Duration • Deadline • The cost of using Cloud resources to meet requests' deadlines and to decrease the number of deadline violations and request rejections is evaluated
Deadline Aware Strategies • Conservative • Both the local site and the Cloud schedule requests using conservative backfilling • A request is placed where it achieves the best start time • If rejections are allowed and the deadline cannot be met, the request is rejected • Aggressive • Both the local site and the Cloud use aggressive backfilling to schedule requests • Requests are ordered Earliest Deadline First • If a request's deadline would be broken on the local cluster, the Cloud provider is tried • If rejections are allowed and the deadline would still be broken, the request is rejected
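The decision sequence of the aggressive deadline-aware strategy can be sketched as below. The estimated finish times are assumed to come from the two schedulers; the function name, the string labels, and the fallback of keeping the request on the site when rejections are not allowed are assumptions, since the slides do not say what happens in that case.

```python
def place_deadline_request(site_finish: float, cloud_finish: float,
                           deadline: float, allow_rejection: bool) -> str:
    """Decide where a deadline-constrained request goes.

    Order of checks: local cluster first, then the Cloud provider, then
    rejection if it is allowed.
    Assumption: when neither side meets the deadline and rejections are
    not allowed, the request stays on the site and violates its deadline.
    """
    if site_finish <= deadline:
        return "site"
    if cloud_finish <= deadline:
        return "cloud"
    return "reject" if allow_rejection else "site"


def order_by_deadline(queue: list) -> list:
    """Earliest Deadline First ordering of (request_id, deadline) pairs."""
    return sorted(queue, key=lambda item: item[1])
```

For example, a request with deadline 100 that would finish at 120 locally but at 90 on the Cloud is sent to the Cloud.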
Cost of Reducing Deadline Violations • The non-violation cost is given by: • Where: • Amount_spentst: the amount spent on Cloud resources • Violbase: the number of deadline violations under the base strategy set (aggressive backfilling with Earliest Deadline First ordering) • Violst: the number of deadline violations under the evaluated strategy set
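The non-violation cost formula is not present in the slide text. The sketch below assumes the reading suggested by the three terms: the amount spent divided by the number of violations avoided relative to the base strategy set. The function name is hypothetical and the numbers in the example are invented for illustration.

```python
def non_violation_cost(amount_spent: float, viol_base: int, viol_st: int) -> float:
    """Money spent per deadline violation avoided (assumed form).

    Assumed form: Amount_spent_st / (Viol_base - Viol_st).
    Undefined when the evaluated strategy set avoids no violations.
    """
    avoided = viol_base - viol_st
    if avoided <= 0:
        raise ValueError("strategy set avoids no deadline violations")
    return amount_spent / avoided
```

With illustrative values, spending 3000.0 to cut violations from 120 to 20 would cost 30.0 per avoided violation.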
Deadline calculation • The deadline calculation is given by: • Where: • stj: request j's submission time • ctj: request j's completion time • taj: the difference between the request's completion and submission times • sf: a stringency factor that indicates how urgent the deadlines are
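The deadline formula is missing from the slide text. As a minimal sketch, the code below assumes the deadline is the submission time plus the turnaround time scaled by the stringency factor; any special cases in the original formula are not recoverable from the slides and are not modelled. The function name is hypothetical.

```python
def deadline(submit_time: float, completion_time: float, sf: float) -> float:
    """Deadline of a request under stringency factor sf (assumed form).

    Assumed form: d_j = st_j + ta_j * sf, with ta_j = ct_j - st_j.
    A smaller sf gives a tighter deadline.
    """
    turnaround = completion_time - submit_time  # ta_j
    if turnaround < 0:
        raise ValueError("completion time precedes submission time")
    return submit_time + turnaround * sf
```

Calling `deadline(100.0, 400.0, sf)` for each stringency factor used in the experiments shows how the same request gets a tighter or looser deadline as sf changes.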
Cost of Reducing Deadline Violations • [Figure panels: sf = 1.7, sf = 0.9, sf = 1.3]
Cost of Reducing Deadline Violations • [Figure panels: tight deadlines, normal deadlines, relaxed deadlines]
Conclusions • Different strategy sets can yield different ratios of performance improvement to money spent • The Naïve strategy set has a higher performance improvement cost • The Selective strategy set provides a good ratio of money spent to job slowdown improvement • A Cloud provider can be used to meet job deadlines • Less than $3,000 was spent to keep the number of rejections close to zero