1 / 63

Comparison of Cloud Providers

Comparison of Cloud Providers. Presented by Mi Wang. Motivation. Internet-based cloud computing has gained tremendous momentum. A growing number of companies provide public cloud computing services. How to compare their profermance ?

nanji
Download Presentation

Comparison of Cloud Providers

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Comparison of Cloud Providers Presented by Mi Wang

  2. Motivation • Internet-based cloud computing has gained tremendous momentum. A growing number of companies provide public cloud computing services. • How to compare their profermance? • Help customer choose a provider that best fits its performance and cost needs. • Help cloud provider know the right direction for improvements

  3. A Benchmark for Clouds • Introduction • The goal of benchmarking a software system is to evaluate its average performance under a particular workload • TPC benchmarks are widely used today in evaluating the performance of computer systems. • Transaction Processing Performance Council (TPC) is a non-profit organization to define transaction processing and database benchmarks . • However, they are not sufficient for analyzing novel cloud services

  4. Requirements of a Cloud Benchmark • Features and Metrics • The main advantages of cloud computing are scalability, pay-per-use and fault-tolerance • A benchmark for the cloud should test the these features and provide appropriate metrics for them.

  5. Requirements of a Cloud Benchmark • Architectures • Clouds may have different service architectures • A cloud benchmark should be general enough to cover the different architectural variants

  6. Problems of TPC-W • The TPC-W benchmark specifies an online bookstore that consists of 14 web interactions allowing to browse, search, display, update and order the products of the store. • The main measured parameter is WIPS, the number of web interactions per second(WIPS) that the system can handle. • TPC-W measure the cost by the ratio of total cost to maximum WIPS

  7. Problems of TPC-W • TPC-W is designed for transactional database systems. Cloud systems may not offer strong consistency constraints it requires. • WIPS is not for adaptable and scalable systems. Ideal clouds would compensate increasing load by adding new processing units. It is not possible to report the maximum WIPS.

  8. Problems of TPC-W • Metric for cost is not applicable for clouds. Different price-plans and the lot-size prevent to calculate a single $/WIPS number. • TPC-W does not reflect technical evolution of web applications • TPC-W lacks adequate metrics for measuring the features of cloud systems like scalability, pay-per use and fault-tolerance

  9. Ideas for a new benchmark • Features • Should analyze the ability of a dynamic system to adapt to a changing load in terms of scalability and costs • Should run in different locations • Should comprise web interactions that resemble the access patterns of Web 2.0 like applications. Multimedia content should also be included.

  10. Ideas for a new benchmark • Configurations: A new benchmark can choose between three different levels of consistency • Low: All web interactions use only the BASE (Basically Available, Soft-State, Eventually Consistent) guarantees • Medium: The web interactions use a mix of consistency guarantees ranging from BASE to ACID. • High: All web interactions use only the ACID guarantees.

  11. Ideas for a new benchmark • Metrics: Scalability • Ideally, clouds should scale linearly and infinitely with a constant cost per WI • Increasing the issued WIPS over time and continuously counting the WI in a given response time. Measure the deviation between issued WIPS and answered WI

  12. Ideas for a new benchmark • Metrics: Scalability Correlation Coefficient R2 R2(0, 1), R2 = 1 indicates perfect linear scaling fi+4 fi+3 fi+2 fi+1 yi+4 yi+3 yi+2 OR Non-linear regression function f(x) = xb b(0, 1), b = 1 indicates perfect linear scaling f(x) = xb yi+1 fi yi

  13. Ideas for a new benchmark • Metrics: Cost • Measure the cost in dollars per WIPS • Price plans might cause variations of the $/WIPS • Measure average and standard deviation of cost per WIPS Plans are fully utilized

  14. Ideas for a new benchmark • Metrics: Fault tolerance • Failure is defined as a certain percentage of the resources used for the application is shut down • Clouds would be able to replace these resources automatically • Measure the ratio between WIPS in RT and Issued WIPS Failure

  15. Goals of CloudCmp • Provide performance and cost information about various cloud providers • Help a provider identify its under-performing services compared to its competitors. • Provide a fair comparison • Characterizing all providers using the same set of workloads and metrics • Skip specialized services that only a few providers offer

  16. Goals of CloudCmp • Reduce measurement overhead and monetary costs • periodically measure each provider at different times of day across all its locations • Comply with cloud providers’ use policies • Cover a representative set of cloud providers

  17. Method: Select Provider • Amazon AWS • Microsoft Azure • Google AppEngine • Rackspace CloudServers

  18. Method: Identify core functionality • Elastic compute cluster • The cluster includes a variable number of virtual instances that run application code. • Persistent storage • The storage service keeps the state and data of an application and can be accessed by application instances through API calls. • Intra-cloud network • The intra-cloud network connects application instances with each other and with shared services. • Wide-area network. • The content of an application is delivered to end users through the wide-area network from multiple data centers (DCs) at different geographical locations.

  19. Method: Identify core functionality • Services offered by the providers

  20. Method: Choose Performance Metrics • Elastic compute cluster • Provides virtual instances that host and run a customer’s application code. • Is charged per usage: • IaaS: Time of an instance remains allocated • PaaS: CPU cycles consumes • Elastic: can dynamically scale up and down the number of instances

  21. Method: Choose Performance Metrics • Metrics to compare Elastic compute cluster • Benchmark finishing time • how long the instance takes to complete the benchmark tasks • Scaling latency • time taken by a provider to allocate a new instance after a customer requests it • Cost per benchmark • cost to complete each benchmark task

  22. Method: Choose Performance Metrics • Persistent Storage • Three common types: table, blob, and queue • Tablestorage is designed to store structural data like conventional databases • Blob storage is designed to store unstructured blobs such as binary objects • Queuestorage implements a global message queue to pass messages between different instances • Two pricing models: • Based on CPU cycles consumed by an operation • Fixed cost per operation

  23. Method: Choose Performance Metrics • Metrics to compare Persistent Storage • Operation response time • Time for a storage operation to finish • Time to consistency • time between when data is written to the storage service and when all reads for the data return consistent and valid results • Cost per operation

  24. Method: Choose Performance Metrics • Intra-cloud Network • Connects a customer’s instances among themselves and with the shared services offered by a cloud • None of the providers charge for traffic within their data centers. Inter-datacenter traffic is charged based on the amount of data • Metrics to compare Intra-cloud Network • Path capacity: TCP throughput • Latency

  25. Method: Choose Performance Metrics • Wide-area Network • The collection of network paths between a cloud’s data centers and external hosts on the Internet • Metrics to compare Wide-area Network • optimal wide-area network latency • The minimum latency between testers’ nodes and any data center owned by a provider

  26. Implementation: Computation Metrics • Benchmark tasks • A modified set of Java-based benchmark tasks from SPECjvm2008 that satisfies constraints of all the providers • Metrics: Benchmark finishing time • Run the benchmark tasks on each of the virtual instance types provided by the clouds, and measure their finishing time • Run instances of the same benchmark task in multiple threads to test multi-threading performance

  27. Implementation: Computation Metrics • Metrics: Cost per benchmark • Multiply the published per hour price by finishing time for those who charge by time • Using billing API for those who charge by CPU cycle. • Metrics: Scaling latency • Repeatedly request new instances and record the requesttime and available time • Divide the latency into two segments to locate the performance bottleneck • Provisioning latency: Request time to powered-on time • Booting latency: Powered-on time to available time

  28. Implementation: Storage Metrics • Benchmark tasks • Use Java-based client to test API to get, put or query data from the service • Non-Java-based clients are also tested • Mimic streaming workload to avoid the potential impact of memory or disk bottlenecks at the client’s side

  29. Implementation: Storage Metrics • Metrics: Response time • The time from when the client instance begins the operation to when the last byte reaches the client • Metrics: Throughput • The maximum rate that a client instance obtains from the storage service

  30. Implementation: Storage Metrics • Metrics: Time to Consistency • Write an object to a storage service, then repeatedly read the object and measure how long it takes before correct result is returned • Metrics: Cost per operation • Via billing API

  31. Implementation: Network Metrics • Metrics: Intra-cloud Network Throughput and Latency • Allocate a pair of instances in the same or different data centers, run standard tools such as iperfand ping between the two instances • Metrics: Optimal Wide-area Network Latency • Run an instance in each data center owned by the provider and ping these instances from over 200 nodes on PlanetLab(a group of computers available as a testbed for computer networking and distributed systems research). Record the smallest RTT • For AppEngine: collect the IP addresses of the instance from each of the PlanetLab nodes. Ping all of these IP addresses from each of the PlanetLabnodes

  32. Results • Anonymizethe identities of the providers in our results, and refer to them as C1 to C4 (But it is easy to see that: C1 – AWS, C2 – Rackspace, C3 – AppEngine, C4 – Microsoft Azure) • Test all instance types offered by C2 and C4, and the general-purpose instances from C1 • Refers to instance types as provider.i, i denotes the tier of service • Compare instances from both Linux and Windows for experiments depend on the type of OS. Test Linux instances for others

  33. Results • Cloud instances tested

  34. Results: Elastic Compute Cluster • Metrics: Benchmark finishing time

  35. Results: Elastic Compute Cluster • Price-comparable instances offered by different providers have widely different CPU and memory performance. • The instance types appear to be constructed in different ways. • For C1, the high-end instances may have faster CPUs • For C4, all instances might share the same type of physical CPU • C2 may be lightly loaded during the test • The disk I/O intensive task exhibits high variation on some C1 and C4 instances, probably due to interference from other colocated instances

  36. Results: Elastic Compute Cluster • Metrics: Performance at Cost

  37. Results: Elastic Compute Cluster • For single-threaded tests, the smallest instances of most providers are the most cost-effective • For multi-threaded tests, the high-end instances are not more cost-effective than the low-end ones. • The prices of high-end instances are proportional to the number of CPU cores • Bounded by memory bus and I/O bandwidth • For parallel applications it might be more cost-effective to use more low-end instances

  38. Results: Elastic Compute Cluster • Metrics: Scaling Latency

  39. Results: Elastic Compute Cluster • Metrics: Scaling Latency • All cloud providers can allocate new instances quickly with the average scaling latency below 10 minutes • Windows instances appear to take longer time to create than Linux ones • For C1, Windows ones have larger booting latency, possibly due to slower CPUs • For C2, provisioning latency of the Windows instances is much larger. It is likely that C2 may have different infrastructures to provision Linux and Windows instances.

  40. Results: Persistent Storage • Table Storage • Test the performance of three operations: get, put, and query • Each operation runs against two pre-defined data tables: a small one with 1K entries, and a large one with 100K entries • Repeat each operation several hundred times • C2 is not tested because it does not provide a table service

  41. Results: Persistent Storage • Table Storage: Response Time • The three services perform similarly for both get and put operations • For the query operation, C1 appears to have a better indexing strategy • None of the services show noticeable performance degradation in multiple concurrent operations

  42. Results: Persistent Storage • Table Storage: Time to Consistency • 40% of the get operations in C1 see inconsistency when triggered right after a put, Other providers exhibit no such inconsistency • C1 does provide an API option to request strong consistency but disables it by default.

  43. Results: Persistent Storage • Table Storage: Cost per operation • Both C1 and C3 charge lower cost for get/put than query • C4 charges the same across operations and can improve its charging model by accounting for the complexity of the operation.

  44. Results: Persistent Storage • Blob Storage: Response time

  45. Results: Persistent Storage • Blob Storage: Response time • Blobs of different sizes may stress different bottlenecks. The latency for small blobs can be dominated by one-off costs whereas that for large blobs can be determined by service throughput, network bandwidth, or client-side contention. • C2’s store may be tuned for read heavy workload.

  46. Results: Persistent Storage • Blob Storage: Response time of multiple concurrent operations • C1 and C4’s blob service throughput is well-tuned for multiple concurrent operations

  47. Results: Persistent Storage • Blob Storage: Maximum Throughput • C1 and C2’s blob service throughput is close to their intra-datacenter network bandwidth • C4’s blob service throughput of a large instancealso corresponds to TCP throughput inside its datacenter, and may not be constrained by instance itself.

  48. Results: Persistent Storage • Blob Storage: Cost per operation • The charging models are similar for all three providers and are based on the number of operations and the size of the blob. • No differences

  49. Results: Persistent Storage • Queue Storage: Response time • Message size is 50 Byte

  50. Results: Persistent Storage • Queue Storage • No significant performance degradation is found in sending up to 32 concurrent messages • Response time of the queue service is on the same order of magnitude as that of the table and blob services. • Both services charge similarly–1 cent per 10K operations

More Related