CLOUD BENCHMARK SUITE - GROUP H • Project Mentor: Dr. I-Ling Yen, Panfeng Xue • Submitted by: ABHISHEK PRABHAKAR RAORANE (axr124730), NEERAJAKSHA GANGARAJU (nxg121330), HAOXUAM GUO (hxg122030), ZEEL BIPIN SHAH (zxs120830)
YCSB Summary • The YCSB benchmark is designed to provide tools for apples-to-apples comparison of different serving data stores. • One contribution of the benchmark is an extensible workload generator. • It benchmarked the performance of four cloud serving systems. • Different types of workloads are used as parameters for benchmarking.
YCSB • What is a benchmark suite? • What is YCSB? "Yahoo! Cloud Serving Benchmark"
Goals of YCSB • The "Yahoo! Cloud Serving Benchmark" (YCSB) framework facilitates performance comparisons of the new generation of cloud data serving systems.
Cloud Data Serving System • Cassandra • HBase • Yahoo!’s PNUTS • Simple sharded MySQL
Cloud Data Serving System • Cassandra - Apache Cassandra is an open source distributed database management system. • HBase - an open source, non-relational, distributed database modeled after Google's BigTable. • Yahoo!'s PNUTS - a massively parallel and geographically distributed database system for Yahoo!'s web applications. • Simple sharded MySQL - assigns dedicated resources to each MySQL database "chunk"/partition.
Cloud Serving System Characteristics • Scale-out • Elasticity • High availability
Classification of Systems and Tradeoffs • Read performance versus write performance. • Latency versus durability. • Synchronous versus asynchronous replication. • Data partitioning.
BENCHMARK TIERS • Tier 1 - Performance - For constant hardware, increase offered throughput until saturation. - Measure the resulting latency/throughput curve. - Driven by the workload generator.
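The Tier 1 methodology above can be sketched in a few lines. This is a hypothetical illustration, not YCSB code: the system under test is stood in for by an M/M/1-style queueing model (service rate `mu`), whereas in a real experiment each point would come from an actual benchmark run.

```python
# Tier 1 sketch: raise offered throughput until saturation and record
# the latency/throughput curve. `mu` and the latency cap are invented.

def simulated_latency_ms(offered_ops_per_s, mu=10000.0):
    """Mean latency of an M/M/1 queue, 1/(mu - lambda), in ms."""
    if offered_ops_per_s >= mu:
        return float("inf")  # saturated: the queue grows without bound
    return 1000.0 / (mu - offered_ops_per_s)

def sweep(step=1000, mu=10000.0, latency_cap_ms=5.0):
    """Increase offered throughput until latency exceeds the cap."""
    curve = []
    offered = step
    while True:
        lat = simulated_latency_ms(offered, mu)
        if lat > latency_cap_ms:
            break  # saturation point reached; stop the sweep
        curve.append((offered, lat))
        offered += step
    return curve

for ops, lat in sweep():
    print(f"{ops:6d} ops/s -> {lat:.3f} ms")
```

The knee of the resulting curve marks the saturation throughput that Tier 1 reports.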
BENCHMARK TIERS - cont. • Tier 2 - Scalability - Scale-up - Increase hardware, data size and workload proportionally. Measure latency; it should remain constant. - Elastic speedup - Run the workload against N servers; while the workload is running, add an (N+1)th server and measure the time series of latencies (latency should drop after the server is added).
BENCHMARK TIERS - cont. • Tier 3 - Availability - A cloud database must remain highly available despite failures. - The availability tier measures the impact of failures on the system.
BENCHMARK TIERS - cont. • Tier 4 - Replication - Performance cost or benefit - Availability cost or benefit - Freshness - Wide-area performance
Benchmark Tools • Architecture
Workloads • Workload - a particular combination of workload parameters, defining one workload - Defines read/write mix, request distribution, record size, … - Two ways to define workloads: • Adjust parameters of an existing workload. • Define a new kind of workload. • Experiment - running a particular workload on a particular hardware setup to produce a single graph for 1 or N systems - Example - vary throughput and measure latency while running a workload against PNUTS and HBase
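The first way of defining a workload, adjusting parameters, is done through a parameter file. A hypothetical example in the style of YCSB's CoreWorkload (the read/update mix and counts below are illustrative, not taken from the slides):

```properties
# Hypothetical YCSB parameter file for an update-heavy workload
# (50/50 read/update mix, zipfian request distribution)
workload=com.yahoo.ycsb.workloads.CoreWorkload
recordcount=1000000
operationcount=1000000
readproportion=0.5
updateproportion=0.5
scanproportion=0
insertproportion=0
requestdistribution=zipfian
```

The second way, a new kind of workload, means writing a new workload class against the extensible generator instead of just changing these proportions.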
Results • Workload A—Update Heavy
Results-Cont. • Workload B—Read Heavy
Results-Cont. • Workload E—Short Ranges
Other Workloads • Workload C - Similar to Workload B - PNUTS and sharded MySQL achieved the lowest latency and highest throughput for read operations. • Workload D - Similar to Workload B - PNUTS and sharded MySQL were the most efficient for reads.
CONCLUSIONS • The YCSB benchmark is designed to provide tools for apples-to-apples comparison of different serving data stores. • One contribution of the benchmark is an extensible workload generator. • It benchmarked the performance of four cloud serving systems.
Bibliography • Cassandra - A Decentralized Structured Storage System, by Avinash Lakshman and Prashant Malik. • http://hadoop.apache.org/hbase/ • http://wiki.apache.org/hadoop/Hbase • PNUTS: Yahoo!'s Hosted Data Serving Platform, by Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver and Ramana Yerneni, Yahoo! Research.
BRIEF SUMMARY • CloudStone: Web 2.0, implementation • CloudFlex: load balancer, controller, implementation
REFERENCES • Cloudstone: Multi-Platform, Multi-Language Benchmark and Measurement Tools for Web 2.0, by Will Sobel, Shanti Subramanyam, Akara Sucharitakul, Jimmy Nguyen, Hubert Wong, Sheetal Patil, Armando Fox, David Patterson, UC Berkeley and *Sun Microsystems • Cloudstone: http://radlab.cs.berkeley.edu/ • Tim O'Reilly. What is Web 2.0? http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html, Sept. 2005 • Daniel Menascé. Load Testing of Web Sites. IEEE Internet Computing 6(4), July/August 2002 • CloudFlex: Seamless Scaling of Enterprise Applications into the Cloud, by Yousuk Seung, Terry Lam, Li Erran Li, Thomas Woo, INFOCOM 2011 Proceedings, IEEE, DOI: 10.1109/INFCOM.2011.5935022, 2011, pp. 211-215 • C. Kopparapu. Load Balancing Servers, Firewalls, and Caches. Wiley, 2002. • AWS. Amazon Elastic Load Balancing. http://aws.amazon.com/elasticloadbalancing
CLOUDSTONE-Multi-Platform, Multi-Language Benchmark and Measurement Tools for Web 2.0
INTRODUCTION • What is Cloudstone? • A benchmark for clouds designed to support Web 2.0 type applications. • For web server software stacks • Web 2.0 • Its demands • Web 2.0 workloads • One-to-many vs. many-to-many • User-contributed content • Richer user experience
CLOUDSTONE • Goal: capture "typical" Web 2.0 functionality in a datacenter or cloud computing environment • caching and database tuning • testing and data collection • Non-goal: argue for any one development stack over another • CLOUDSTONE toolkit • Web 2.0 kit includes PHP, Java EE and Rails stacks • Used to run large experiments in a cloud environment • Application-specific distributed workload generator and data collector - Faban
CLOUDSTONE • Olio: Rails and PHP versions of a social-events app • Events, users, comments, tag clouds, AJAX • Representative of "sound practices" on each stack • Faban: Markov-based workload generator • Per-operation latency SLAs • Time-varying workloads, distributed operation • Instrumentation: meet the 90th or 99th percentile for all per-op SLAs over all 5-minute windows
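The pass/fail instrumentation described in the last bullet can be sketched as follows. This is a hypothetical illustration of the check, not Faban code; the percentile definition (nearest-rank) and the sample numbers are my own choices.

```python
# Sketch of the SLA check: within each 5-minute window, each operation
# type must meet its 90th- and 99th-percentile latency SLA.
import math

def percentile(samples, p):
    """Nearest-rank percentile of a non-empty list (p in [0, 100])."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100.0 * len(ordered)))
    return ordered[rank - 1]

def window_meets_sla(latencies_ms, sla_90_ms, sla_99_ms):
    """True iff both percentile SLAs hold for one window of samples."""
    return (percentile(latencies_ms, 90) <= sla_90_ms and
            percentile(latencies_ms, 99) <= sla_99_ms)

# one illustrative 5-minute window of per-op latencies (ms)
ok = window_meets_sla([12, 15, 14, 18, 40], sla_90_ms=50, sla_99_ms=100)
print("window meets SLA:", ok)
```

A run passes only if every window of every operation type returns True; a single violating window fails the whole run.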
CHALLENGES • Challenges of Web 2.0 benchmarking • Database tuning • Stack-independent techniques • Stack-specific techniques • Deploying additional web & application servers - Ruby on Rails implementation - PHP implementation
CHALLENGES • Caching - What is cached? - Where are cached objects stored? • Metric: concurrent users per dollar per month - Max # of concurrent users per dollar per month, while meeting the 90% & 99% SLA response times - Reported at 1, 12, and 36 months - captures cap-ex vs. op-ex depreciation, long-term contracts, etc. - Report log10(size of user base) - avoids an unfair advantage for very large sites
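The cost metric above works out as a simple ratio. A worked example, with illustrative figures (the user count and instance count are invented; the $0.80/instance-hour rate is the EC2 price quoted later in these slides):

```python
# Worked example of the Cloudstone metric: maximum concurrent users
# supported while meeting the SLAs, per dollar per month.
import math

def users_per_dollar(max_users, hourly_rate, instances, hours=24 * 30):
    """Concurrent users per dollar of monthly infrastructure cost."""
    monthly_cost = hourly_rate * instances * hours
    return max_users / monthly_cost

# e.g. 500 concurrent users on 2 instances at $0.80/instance-hour
score = users_per_dollar(500, 0.80, 2)
print(f"{score:.3f} users per dollar per month")
print(f"log10 user base: {math.log10(500):.2f}")
```

Reporting the score at 1, 12, and 36 months just changes `hours`, which is how long-term contracts and depreciation enter the comparison.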
Hardware Platforms • EC2: a single "extra-large compute instance" on EC2: • A 64-bit, x86 architecture platform with 15 GB RAM • 4 virtual cores with 2 EC2 Compute Units each (Amazon describes 1 Compute Unit as "the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or Xeon processor") • 1.7 TB of local storage. • N2: a Sun Niagara 2 enterprise server configured with • 32 GB RAM, 8 UltraSPARC T2 cores at 1.4 GHz with 8 hardware threads per core • 50 GB of local storage.
Hardware Platforms • 1 EC2 "compute unit" ~ 1 GHz Opteron • M1.XLarge: 15 GB RAM, 8 CU (4 cores) • C1.XLarge: 7 GB RAM, 20 CU (8 faster cores) • Both cost $0.80 per instance-hour
IMPLEMENTATION • Number of users vs. number of server processes with all components shown on an EC2 “extra large” instance.
IMPLEMENTATION Number of users vs. number of server processes
CloudFlex - Seamless Scaling of Enterprise Applications into the Cloud • CloudFlex: transparently taps cloud resources to serve application requests that exceed the capacity of the internal infrastructure. • A feedback control system with two key interacting components: • Load balancer • Controller
DESIGN CHALLENGES • Choke points • A choke point is a dynamic performance bottleneck that can shift from one place to another as the system scales out. • Cloud resource responsiveness • The delay between the request for and the acquisition of a cloud resource • Load balancing
CloudFlex Algorithm • System setup • Tier1 & Tier2 Architecture • Application performance metric • System state modelling
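A minimal sketch of the overflow idea behind CloudFlex: a controller watches the offered load, serves what the internal tier can handle, and acquires enough cloud instances to cover the rest. This is my own threshold-style illustration, not the paper's algorithm; the real CloudFlex uses feedback on modeled system state, and the capacities below are invented.

```python
# Hypothetical overflow controller: internal tier first, cloud for the rest.
import math

class Controller:
    def __init__(self, internal_capacity, cloud_instance_capacity):
        self.internal_capacity = internal_capacity          # req/s internal tier handles
        self.cloud_instance_capacity = cloud_instance_capacity  # req/s per cloud instance
        self.cloud_instances = 0

    def step(self, offered_load):
        """One control interval: return (internal_share, cloud_instances)."""
        overflow = max(0, offered_load - self.internal_capacity)
        # round up: a partially loaded instance is still a whole instance
        self.cloud_instances = math.ceil(overflow / self.cloud_instance_capacity)
        internal_share = min(offered_load, self.internal_capacity)
        return internal_share, self.cloud_instances

ctl = Controller(internal_capacity=1000, cloud_instance_capacity=250)
for load in (800, 1200, 1600, 900):
    print(load, "->", ctl.step(load))
```

Acquisition delay (the responsiveness challenge above) is why the real controller must predict load rather than react to it as this sketch does.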
Implementation • Experimental setup - Cloudstone - Virtualize the servers with Linux-based VMware Workstation and set up an Amazon Virtual Private Cloud (VPC) • Maximum allowable load - To determine the maximum allowable load for both internal and cloud physical servers, the Faban load generator gradually increases the workload
Implementation • Plot the CDF of service response time versus different workloads for internal and cloud workers. • Convergence of cloud users.
CONCLUSION • CloudStone is realized on Amazon Elastic Compute Cloud and Sun's Niagara 2 enterprise server; the work discusses the challenges of comparing platforms or software stacks and how Cloudstone can help quantify the differences. • CloudFlex enables enterprises to seamlessly use both internal and cloud resources for handling application requests. • It addresses issues with system choke points, cloud resource responsiveness, and load balancing.
Introduction • Does cloud computing have value for HPC? • How to evaluate? - By running benchmarks and scientific applications on public clouds. - NPB, HPL, CSFV - Public clouds: Amazon EC2, GoGrid, IBM cloud • Pros and cons - Pros: cheap, no upfront investment, easy to customize and resize - Cons: extra VM overhead, poor network capability, lower efficiency • In this project, I will study the performance of the HPL benchmark running on clusters with the Hadoop framework.
Introduction contd. • References: • [1] Q. He, S. Zhou, B. Kobler, D. Duffy, and T. McGlynn, "Case study for running HPC applications in public clouds," ACM International Symposium on High Performance Distributed Computing, Chicago, Illinois, 2010. • [2] J. Napper et al., "Can cloud computing reach the Top500?", 2009. • [3] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. H. Katz, A. Konwinski, G. Lee, D. A. Patterson, A. Rabkin, and M. Zaharia. Above the Clouds: A Berkeley View of Cloud Computing. Technical report, UC Berkeley, 2009. • [4] E. Walker, "Benchmarking Amazon EC2 for high-performance scientific computing", 2008. • [5] http://www.netlib.org/benchmark/hpl/index.html • [6] http://hpl-calculator.sourceforge.net/
Before the test • Compared with supercomputers dedicated to HPC, public clouds have these characteristics: - Currently, the main users of public clouds are web applications and services. These do not need large amounts of numerical computation, especially linear algebra, and they do not require extremely fast interconnects; their workloads consist primarily of database queries. - As a result, cloud vendors currently do not optimize their intra-cluster networks. - Public clouds use virtual machine technology, which may add overhead. - Easy to customize and reallocate; no upfront investment; great flexibility. • Different public cloud platforms in service: Amazon EC2, GoGrid, IBM
Results and Analysis • The single-node instance tests show little difference between a dedicated supercomputer and the public clouds. • This means the virtual machine adds little overhead for HPC.
Results and Analysis contd. • Moving to multi-node instances, network performance has a great influence on the results. • For HPL on IBM and GoGrid, the achieved fraction of Rpeak (the theoretical peak) decreases roughly exponentially; there is a severe loss in performance as the number of nodes increases.
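The Rpeak% metric above is just achieved HPL performance (Rmax) divided by the theoretical peak. A worked example with invented numbers (the node specs and Rmax values are illustrative, not the measured EC2/GoGrid/IBM results):

```python
# Worked example of the Rpeak% metric: Rmax / Rpeak, where
# Rpeak = nodes * cores_per_node * clock_GHz * flops_per_cycle (GFLOPS).

def rpeak_gflops(nodes, cores_per_node=8, clock_ghz=2.0, flops_per_cycle=4):
    """Theoretical peak of a homogeneous cluster, in GFLOPS."""
    return nodes * cores_per_node * clock_ghz * flops_per_cycle

def efficiency(rmax_gflops, nodes, **kw):
    """Fraction of theoretical peak actually achieved by HPL."""
    return rmax_gflops / rpeak_gflops(nodes, **kw)

# a scaling pattern like the one described: efficiency falls as nodes grow,
# because the slow interconnect dominates the multi-node HPL runs
for nodes, rmax in [(1, 45.0), (2, 70.0), (4, 95.0), (8, 110.0)]:
    print(f"{nodes} nodes: {100 * efficiency(rmax, nodes):.1f}% of Rpeak")
```

On a supercomputer with InfiniBand the percentages would stay nearly flat; the exponential-looking decline is the signature of the network bottleneck.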
Results and Analysis contd. • For the full-size scientific application CSFV, which is MPI-based: • The best public cloud result is GoGrid, which has the fastest network of the three, but it still cannot compete with the supercomputer's InfiniBand interconnect.
Conclusions • Virtual machines have no significant impact on HPC benchmarks/applications. • The low intra-cluster network speed means public clouds cannot compete with dedicated supercomputers using InfiniBand. • Whether cloud vendors treat HPC users as a group of potential customers is up to the vendors themselves. • These tests have the limitation that they all ran on a relatively small number of nodes. • More tests will be run on other cloud platforms, e.g., NASA's Nebula. • New scientific clouds such as the DoE Magellan cloud will use InfiniBand, and Penguin Computing's HPC-as-a-Service will use either GigE or InfiniBand; that said, there is a promising trend for HPC in the cloud.
Other Results and Conclusions • From Napper, "Can cloud computing reach the Top500?": • Running HPL on Amazon's EC2, the results show that performance does not scale, due to network I/O limitations. • GFLOP/$ can be a better metric for choosing among different cloud vendors.
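The GFLOP/$ metric mentioned above is sustained performance divided by the cost of the run. A sketch with illustrative numbers (the Rmax, instance count, and price are invented, not figures from Napper's paper):

```python
# Worked example of GFLOP/$ as a cloud-vendor comparison metric:
# sustained HPL performance per dollar spent on the run.

def gflops_per_dollar(rmax_gflops, instances, hourly_rate, hours):
    """Sustained GFLOPS divided by the total cost of the run."""
    return rmax_gflops / (instances * hourly_rate * hours)

# e.g. 100 GFLOPS sustained on 8 instances at $0.80/hour for 1 hour
print(f"{gflops_per_dollar(100.0, 8, 0.80, 1.0):.3f} GFLOPS per dollar")
```

Unlike raw Rmax, this metric penalizes a vendor whose poor interconnect forces you to rent many under-utilized instances, which is exactly the scaling failure the EC2 results showed.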