Distributed Systems Meet Economics: Pricing In The Cloud
Authors: Hongyi Wang, Qingfeng Jing, Rishan Chen, Bingsheng He, Zhengping He, Lidong Zhou
Presenter: Sajala Rajendran
Abstract
• The pricing scheme in cloud computing is the bridge that decouples users from cloud providers
• The relationship between cloud computing and pricing has brought a significant change to system design and optimization
• Studies were conducted on Amazon EC2 and on a local cloud computing testbed
Introduction
• Pay-as-you-go model:
  • Cloud providers set a pricing scheme for their users
  • Users utilize the cloud at a very low cost
  • Providers make a profit
• Variety of applications: storage backup, e-commerce, high-performance computing
• "Two-party" computation with pricing as the bridge
• Pricing depends on two factors:
  • System design and optimization
  • Fairness and competitive pricing
Contd…
• Pricing-induced interplay between systems and economics:
  • Cost as an explicit and measurable system metric
  • Pricing fairness
  • Evolving system dynamics
  • Cost of failures
• Experiments conducted on Amazon EC2 and Spring (the local cloud testbed) yield the following results:
  • Optimizing for cost is hard for users
  • Pricing unfairness
  • Different system configurations significantly impact cost and profit
  • Failure occurrences
Background on Pricing
• Pricing
  • Pay-as-you-go model
  • Pricing fairness
    • Personal
    • Social
  • Competition
Pay-as-you-go Model
• Pricing helps to shape how systems are used
• Amazon charges $0.095 per virtual machine hour
• Many pricing schemes exist, and several alternatives have been proposed
• E.g., Gurmeet Singh and Carl Kesselman suggested dynamic pricing based on resource consumption
Workloads
• Postmark
  • I/O-intensive benchmark
  • Measures transaction rates for a workload approximating an Internet email server
  • For the experiment: file size is 5 GB and the number of transactions is 1000
• PARSEC (Princeton Application Repository for Shared-Memory Computers)
  • Benchmark suite for chip multiprocessors
  • Composed of multithreaded programs
  • 9 applications and 3 kernels
  • Blackscholes: high-performance computing
  • Dedup: storage archival
  • For the experiment: 184 MB of input data for Dedup and 10 million options for Blackscholes
• Hadoop
  • Hadoop 0.20.0 for large-scale data processing
  • WordCount and StreamSort
  • For the experiment: the input data set is 16 GB
Methodologies
• Amazon EC2
  • Charged according to Amazon's pricing scheme
  • \( \mathrm{Cost}_{\mathrm{user}} = \mathrm{Price} \times t \)
    • \( t \): total running time of the task (hours)
    • \( \mathrm{Price} \): price per virtual machine hour
  • Excludes storage and data transfer costs
• Spring system
  • Provides virtual machines to users
  • Consists of two modules: the VMM (virtual machine monitor) and the Auditor
  • Provider profit = payment from users − total provider expense
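The two accounting formulas on this slide can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the price is the $0.095 per virtual machine hour quoted earlier in the slides, and the two-hour running time is an assumed example value.

```python
def user_cost(price_per_vm_hour: float, running_hours: float) -> float:
    """Cost_user = Price x t, excluding storage and data transfer costs."""
    if price_per_vm_hour < 0 or running_hours < 0:
        raise ValueError("price and running time must be non-negative")
    return price_per_vm_hour * running_hours


def provider_profit(payment_from_users: float, total_provider_expense: float) -> float:
    """Provider profit = payment from users - total provider expense."""
    return payment_from_users - total_provider_expense


if __name__ == "__main__":
    price = 0.095  # dollars per VM hour, as quoted in the slides
    hours = 2.0    # assumed running time, for illustration only
    print(f"user cost: ${user_cost(price, hours):.3f}")
```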
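Hamilton's Estimations
• Total cost of fully burdened power consumption:
  \[ \mathrm{Cost}_{\mathrm{full}} = p \times P_{\mathrm{raw}} \times \mathrm{PUE} \]
  • \( p \): electricity price (dollars/kWh)
  • \( P_{\mathrm{raw}} \): total energy consumption of servers and routers
  • \( \mathrm{PUE} \): power usage effectiveness of the data center
• Total provider cost:
  \[ \mathrm{Cost}_{\mathrm{provider}} = (\mathrm{Cost}_{\mathrm{full}} + \mathrm{Cost}_{\mathrm{amortized}}) \times \mathrm{Scale} \]
  \[ \mathrm{Scale} = \frac{\text{Estimated total cost}}{\mathrm{Cost}_{\mathrm{full}} + \mathrm{Cost}_{\mathrm{amortized}}} \]
• Amortized cost:
  \[ \mathrm{Cost}_{\mathrm{amortized}} = C_{\mathrm{amortizedUnit}} \times t_{\mathrm{server}} \]
  • \( C_{\mathrm{amortizedUnit}} \): amortized cost per hour per server
  • \( t_{\mathrm{server}} \): elapsed time on the server (hours)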
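Estimation of \( P_{\mathrm{raw}} \)
• For a server, the energy consumption is calculated from resource utilization:
  \[ P_{\mathrm{server}} = P_{\mathrm{idle}} + u_{\mathrm{cpu}} \times c_0 + u_{\mathrm{io}} \times c_1 \]
  • \( u_{\mathrm{cpu}} \): CPU utilization (%)
  • \( u_{\mathrm{io}} \): I/O bandwidth (MB/sec)
  • \( c_0 \) and \( c_1 \): coefficients of the model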
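The linear model above is straightforward to evaluate once the coefficients are known. The slides do not give values for the idle power or the coefficients, so the numbers below are placeholders chosen only to show the arithmetic, and the function name is hypothetical.

```python
def server_power(p_idle: float, u_cpu: float, u_io: float,
                 c0: float, c1: float) -> float:
    """P_server = P_idle + u_cpu * c0 + u_io * c1.

    u_cpu is CPU utilization in percent, u_io is I/O bandwidth in MB/sec.
    """
    return p_idle + u_cpu * c0 + u_io * c1


if __name__ == "__main__":
    # All five inputs are made-up illustration values, not measurements.
    estimate = server_power(p_idle=100.0, u_cpu=50.0, u_io=20.0, c0=1.0, c1=0.5)
    print(f"estimated server power: {estimate}")
```

In practice the coefficients would be fitted against power-meter readings for a given machine; the slides only state that a power meter was used.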
Experiment Setup – Amazon EC2
• Two virtual machine types: Small and Medium instances
Experiment Setup – Spring
• VirtualBox is used
• The host OS is Windows Server 2003 and the guest OS is Fedora 10
• An eight-core machine for evaluating single-machine benchmarks
• A cluster of 32 four-core machines for evaluating Hadoop
• A power meter measures the power consumption of a server
• Total dollar cost is calculated from Hamilton's estimations for a data center of 50,000 servers (PUE = 1.7, Scale = 2.24, energy price = $0.07/kWh, \( C_{\mathrm{amortizedUnit}} \) = $0.08/hr)
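As a rough worked example, the parameters listed on this slide can be plugged into Hamilton's estimation. The function names are hypothetical, and the energy figure (0.25 kWh over one server hour) is an assumed input rather than a measurement from the study.

```python
PUE = 1.7                  # from the slide
SCALE = 2.24               # from the slide
ENERGY_PRICE = 0.07        # dollars per kWh, from the slide
AMORTIZED_PER_HOUR = 0.08  # dollars per server hour, from the slide


def cost_full(energy_kwh: float) -> float:
    """Fully burdened power cost: p * P_raw * PUE."""
    return ENERGY_PRICE * energy_kwh * PUE


def cost_amortized(server_hours: float) -> float:
    """Amortized server cost: C_amortizedUnit * t_server."""
    return AMORTIZED_PER_HOUR * server_hours


def total_provider_cost(energy_kwh: float, server_hours: float) -> float:
    """(Cost_full + Cost_amortized) * Scale."""
    return (cost_full(energy_kwh) + cost_amortized(server_hours)) * SCALE


if __name__ == "__main__":
    # Assumed workload: one server hour that consumes 0.25 kWh.
    print(f"provider cost: ${total_provider_cost(0.25, 1.0):.5f}")
```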
Contd..
• An Intel 80 GB X25-M SSD replaces a SATA hard drive; the amortized cost of the machine with the SSD is adjusted to $0.09/hr
• Metrics reported: system throughput (number of tasks finished per hour), user costs, and provider profits
• Efficiency of the provider's investment:
  \[ \mathrm{ROI} = \frac{\mathrm{Profit}}{\mathrm{Cost}_{\mathrm{provider}}} \times 100\% \]
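The ROI metric can be sketched as follows, taking profit to be user payments minus provider cost as defined on the Methodologies slide. The function name and the sample figures are illustrative assumptions.

```python
def roi_percent(payment_from_users: float, provider_cost: float) -> float:
    """ROI = Profit / Cost_provider * 100, with Profit = payment - cost."""
    if provider_cost <= 0:
        raise ValueError("provider cost must be positive")
    profit = payment_from_users - provider_cost
    return profit / provider_cost * 100.0


if __name__ == "__main__":
    # Made-up figures: the provider spends $1.00 and collects $3.00 or $0.50.
    print(roi_percent(3.0, 1.0))  # payments exceed cost, so ROI is positive
    print(roi_percent(0.5, 1.0))  # payments below cost, so ROI is negative
```

A negative ROI simply means that user payments did not cover the provider's cost for that configuration.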
User Optimization on EC2
• Choosing a suitable instance type is important for both performance and cost
Provider Optimization on Spring
• Based on varying the number of concurrent VMs from one to four on the same physical machine
Observations
• Consolidation reduces power consumption (\( P_{\mathrm{raw}} \)) by 150% and 21% for Blackscholes and Postmark, respectively
• The decrease in power cost, together with the increase in user cost, raises the provider's profit significantly
• ROI increases to 180% on Postmark and 340% on Blackscholes
• A suitable consolidation strategy is necessary
• Drawback: system throughput degrades by up to 64%
Multi-machine Benchmarks on Hadoop
• The provider's profit increases by about 135% and 118% on ROI for WordCount and StreamSort, respectively
• System throughput degrades, with reductions of 12% and 350%
Pricing Fairness
• Personal fairness
Social Fairness
• Coefficient of variation:
  \[ cv = \frac{\mathrm{stdev}}{\mathrm{mean}} \times 100\% \]
• Maximum difference:
  \[ \text{Maximum difference} = \frac{Hi - Lo}{Lo} \times 100\% \]
• Variations across different runs on the same instances in Amazon EC2
• Each single-machine benchmark is run ten times
• As more VMs are consolidated onto the same physical machine, users need to pay more money
• Postmark incurs 40% more cost than its best case
• Figure: cost of running Postmark ten times on three different instances on EC2
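Both variation metrics can be computed directly from a list of per-run costs. The sketch below is illustrative only: the function names are hypothetical, the sample costs are invented, and it uses the sample standard deviation (`statistics.stdev`) because the slides do not say which variant was used.

```python
import statistics


def coefficient_of_variation(costs: list[float]) -> float:
    """cv = stdev / mean * 100 (sample standard deviation, an assumption)."""
    return statistics.stdev(costs) / statistics.mean(costs) * 100.0


def maximum_difference(costs: list[float]) -> float:
    """(Hi - Lo) / Lo * 100, where Hi and Lo are the highest and lowest cost."""
    hi, lo = max(costs), min(costs)
    return (hi - lo) / lo * 100.0


if __name__ == "__main__":
    # Invented costs (dollars) for repeated runs of the same benchmark.
    run_costs = [10.0, 12.0, 14.0]
    print(f"cv: {coefficient_of_variation(run_costs):.2f}%")
    print(f"maximum difference: {maximum_difference(run_costs):.2f}%")
```

Under the second metric, a run that costs 40% more than the best case corresponds to a maximum difference of 40%.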
Different Hardware Configurations
• Elapsed times of Postmark are 180 seconds on the SSD and 400 seconds on the hard disk
• The SSD reduces the user's cost by 120% and decreases the provider's ROI from 40% to -44%
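One way to relate the elapsed times to user cost is to apply Cost = Price x t to both runs. This is a sketch under assumptions: the slides do not state the price charged on Spring, so the $0.095 per VM hour figure from the pay-as-you-go slide is borrowed here, and the function names are hypothetical. The relative difference does not depend on the price chosen.

```python
PRICE_PER_VM_HOUR = 0.095  # assumed price, borrowed from the pay-as-you-go slide


def run_cost(elapsed_seconds: float, price_per_vm_hour: float = PRICE_PER_VM_HOUR) -> float:
    """Cost = Price * t, with t converted from seconds to hours."""
    return price_per_vm_hour * elapsed_seconds / 3600.0


def extra_cost_percent(slow_seconds: float, fast_seconds: float) -> float:
    """How much more the slower run costs, relative to the faster run."""
    slow, fast = run_cost(slow_seconds), run_cost(fast_seconds)
    return (slow - fast) / fast * 100.0


if __name__ == "__main__":
    # Elapsed times from the slide: 400 s on hard disk, 180 s on SSD.
    print(f"hard disk vs SSD: {extra_cost_percent(400.0, 180.0):.1f}% more")
```

With these inputs the hard-disk run comes out roughly 122% more expensive than the SSD run, which is in line with the 120% figure on the slide if that figure is measured relative to the SSD cost.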
Failures
• Executing Hadoop in Spring was successful, but on Amazon EC2 it resulted in one exception with the message "Address already in use"
• Transient failures also occur: running StreamSort using Hadoop on eight VMs in Spring resulted in an eightfold increase in the total elapsed time
• All of these could lead to higher user costs
Conclusion
• Cloud computing bridges distributed systems and economics through a pricing scheme that connects providers with users
• Experiments conducted on Amazon EC2 and Spring show that cost variations on both result in social unfairness under the current pricing scheme
• The setting that achieves minimum cost differs from the one that achieves the best performance
• Providers need to fine-tune their pricing structure to balance their profits against the users' costs