Benchmarking Cloud Serving Systems with YCSB • Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, Russell Sears • Yahoo! Research • Presenter: Duncan
Benchmarking Cloud Serving Systems with YCSB • Benchmarking vs Testing • Any difference? • My opinion • Benchmarking: performance • Testing: usability testing, security testing, performance testing, etc.
Motivation • Many new cloud systems for data storage and management • MongoDB, MySQL, Asterix, etc. • Each makes tradeoffs • E.g. appending updates to a sequential disk log • Good for writes, bad for reads • Synchronous replication • Replicas stay up to date, but write latency is high • How to choose? • Use a benchmark to model your scenario!
Evaluate Performance = ? • Latency • Users don't want to wait! • Throughput • Want to serve more requests! • Inherent tradeoff between latency and throughput • More requests => more resource contention => higher latency
Which system is better? • "Typically application designers must decide on an acceptable latency, and provision enough servers to achieve the desired throughput" • The better system achieves the desired latency and throughput with fewer servers • E.g. desired latency: 0.1 sec at 100 requests/sec • MongoDB: 10 servers • AsterixDB: 15 servers => MongoDB is the better choice for this workload
What else to evaluate? • Cloud platform • Scalability • Good scalability => performance grows in proportion to the number of servers • Elasticity • Good elasticity => adding servers while running improves performance with little disruption
A Short Summary • Evaluating performance = evaluating latency, throughput, scalability, and elasticity • A better system = fewer machines to achieve the same performance goal
YCSB • Data generator • Workload generator • YCSB client • Interface to communicate with DB
YCSB Data Generator • A table with F fields and N records • Each field => a random string • E.g. F = 10 fields of 100 bytes each => 1,000-byte records
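A minimal sketch (in Java, since YCSB itself is a Java program) of how such a record could be produced: F fields per record, each filled with a random string. This is an illustration of the idea only, not YCSB's actual generator code; the class and method names are made up.

import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class RecordGenerator {
    private static final Random RNG = new Random();

    // Build one record: fields "field0".."field{F-1}", each bytesPerField characters long.
    static Map<String, String> makeRecord(int fieldCount, int bytesPerField) {
        Map<String, String> record = new HashMap<>();
        for (int f = 0; f < fieldCount; f++) {
            record.put("field" + f, randomString(bytesPerField));
        }
        return record;
    }

    static String randomString(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + RNG.nextInt(26)));   // random lowercase letters
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 10 fields x 100 bytes = 1,000-byte record, as in the example above
        Map<String, String> r = makeRecord(10, 100);
        int bytes = r.values().stream().mapToInt(String::length).sum();
        System.out.println(r.size() + " fields, " + bytes + " bytes total");
    }
}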
Workload Generator • Basic operations • Insert, update, read, scan • No joins, aggregates, etc. • Able to control the distributions of: • Which operation to perform • E.g. 0.95 read, 0.05 update, 0 scan => read-heavy workload • Which record to read or write • Uniform • Zipfian: some records are extremely popular • Latest: recent records are more popular
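A minimal sketch of the two choices the workload generator makes for each operation: which operation to run (drawn from the configured mix) and which record to touch (here via a Zipfian sampler built from a precomputed CDF). The names and the sampling algorithm are illustrative assumptions, not the actual YCSB generator classes, which use a more scalable Zipfian algorithm.

import java.util.Random;

public class WorkloadSketch {
    private final Random rng = new Random();
    private final double readProp, updateProp;   // remaining probability mass goes to scan
    private final double[] zipfCdf;

    WorkloadSketch(double readProp, double updateProp, int recordCount, double theta) {
        this.readProp = readProp;
        this.updateProp = updateProp;
        this.zipfCdf = buildZipfCdf(recordCount, theta);
    }

    // Choose the next operation according to the configured mix.
    String nextOperation() {
        double r = rng.nextDouble();
        if (r < readProp) return "READ";                 // e.g. 0.95 => read-heavy
        if (r < readProp + updateProp) return "UPDATE";
        return "SCAN";
    }

    // Pick the index of the record to touch: binary search in the Zipfian CDF.
    int nextKey() {
        double r = rng.nextDouble();
        int lo = 0, hi = zipfCdf.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (zipfCdf[mid] < r) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    private static double[] buildZipfCdf(int n, double theta) {
        double sum = 0;
        for (int i = 1; i <= n; i++) sum += 1.0 / Math.pow(i, theta);
        double[] cdf = new double[n];
        double acc = 0;
        for (int i = 1; i <= n; i++) {
            acc += (1.0 / Math.pow(i, theta)) / sum;
            cdf[i - 1] = acc;
        }
        return cdf;
    }

    public static void main(String[] args) {
        WorkloadSketch w = new WorkloadSketch(0.95, 0.05, 1000, 0.99);
        System.out.println(w.nextOperation() + " on record " + w.nextKey());
    }
}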
YCSB Client • A script that runs the benchmark • Workload parameter files • You can change the parameters (operation mix, request distribution, record count, etc.) • A Java program • DB interface layer • You can implement the interface for your own DB system
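The DB interface layer boils down to a handful of calls the client can issue and time. The sketch below shows the shape of such a binding; the class name and method signatures here are illustrative assumptions, not the exact YCSB API, which differs across versions.

import java.util.*;

// Rough sketch of a DB interface binding (names and signatures are illustrative).
public abstract class DBBinding {
    public void init() {}      // open connections, read workload properties
    public void cleanup() {}   // close connections

    // Each call returns 0 on success, non-zero on error.
    public abstract int read(String table, String key,
                             Set<String> fields, Map<String, String> result);
    public abstract int insert(String table, String key, Map<String, String> values);
    public abstract int update(String table, String key, Map<String, String> values);
    public abstract int scan(String table, String startKey, int count,
                             Set<String> fields, List<Map<String, String>> result);
}

A binding for a new data store implements these operations; the client then drives them according to the workload parameter file and records the latency of each call.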
Experiments • Experiment setup: • 6 servers • YCSB client on a separate server • Cassandra, HBase, MySQL, PNUTS • Workloads: update-heavy, read-heavy, read-only, read-latest, and short-range scan
Future Work • Availability • Impact of failures on system performance • Replication • Impact on performance as replication is increased
4 criteria • The authors' 4 criteria for a good benchmark: • Relevance to applications • Portability • Not just for one system! • Scalability • Not just for small systems and small data! • Simplicity
Reference • Benchmarking Cloud Serving Systems with YCSB, Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, Russell Sears, SoCC '10 • BG: A Benchmark to Evaluate Interactive Social Networking Actions, Sumita Barahmand, Shahram Ghandeharizadeh, CIDR '13 • http://en.wikipedia.org/wiki/Software_testing • http://en.wikipedia.org/wiki/Benchmark_(computing)
Thank You! • Questions?
Why a new benchmark? • Most cloud systems do not have a SQL interface => hard to implement complex queries • Existing benchmarks target specific applications • TPC-W for e-commerce • TPC-C for applications that manage, sell, or distribute products/services