
Advanced Systems Lab


Presentation Transcript


  1. Advanced Systems Lab G. Alonso, D. Kossmann Systems Group http://www.systems.ethz.ch

  2. Reading • Read Chapters 4, 5, and 6 in the textbook

  3. Throughput and Response Time

  4. Understanding Performance
  • Response Time
    • critical path analysis in a task dependency graph
    • "partition" expensive tasks into smaller tasks
  • Throughput
    • queueing network model analysis
    • "replicate" resources at the bottleneck

  5. Response Times  (plot: response time in msecs vs. #servers)

  6. Why are response times long?
  • Because operations take long
    • cannot travel faster than light
    • delays even in "single-user" mode
    • possibly, "parallelize" long-running operations
      • "intra-request parallelism"
  • Because there is a bottleneck
    • contention of concurrent requests on a resource
    • requests wait in a queue before the resource is available
    • add resources to parallelize requests at the bottleneck
      • "inter-request parallelism"

  7. Critical Path
  • Directed graph of tasks and dependencies
  • response time = max { length of path }
  • assumptions: no resource contention, no pipelining, ...
  • Which tasks would you try to optimize here?
  (diagram: Start, A (3ms), B (1ms), C (9ms), End)
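
A minimal sketch of critical-path analysis as a longest-path computation over a task DAG. The task durations are the slide's; the exact dependency structure (A and B in parallel, both feeding C) is an assumption made for illustration.

```python
from functools import cache

# Critical path = longest duration-weighted path through the task dependency DAG.
# Durations from the slide; the edge structure is assumed for illustration.
duration = {"Start": 0, "A": 3, "B": 1, "C": 9, "End": 0}
successors = {"Start": ["A", "B"], "A": ["C"], "B": ["C"], "C": ["End"], "End": []}

@cache
def longest_path(node: str) -> int:
    """Length of the longest path from `node` to a sink, in ms."""
    succ = successors[node]
    return duration[node] + (max(longest_path(s) for s in succ) if succ else 0)

print(longest_path("Start"))  # 12 ms: Start -> A -> C -> End is the critical path
```

Under this assumed structure, only tasks on the critical path (A and C) are worth optimizing; shortening B does not change the response time.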

  8. Queueing Network Models
  • Graph of resources and flow of requests
  • Bottleneck = resource that defines the throughput of the whole system
  • (analysis techniques described later in the course)
  (diagram: Start, Server A (3 req/ms), Server C (20 req/ms), Server B (5 req/ms), End)
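
A minimal sketch of the bottleneck rule for such a network, under the assumption that every request visits every server: sustainable system throughput is capped by the slowest resource. The service rates are the slide's; the serial topology is an assumption.

```python
# If every request flows through every server (assumed topology), the system
# throughput is bounded by the resource with the lowest service rate.
service_rate = {"Server A": 3.0, "Server B": 5.0, "Server C": 20.0}  # req/ms

bottleneck = min(service_rate, key=service_rate.get)
print(bottleneck, service_rate[bottleneck])  # Server A caps the system at 3 req/ms
```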

  9. Forms of Parallelism
  • Inter-request parallelism
    • several requests handled at the same time
    • principle: replicate resources
    • e.g., ATMs
  • (Independent) intra-request parallelism
    • principle: divide & conquer
    • e.g., print pieces of a document on several printers
  • Pipelining
    • each "item" is processed by several resources
    • process "items" at different resources in parallel
    • can lead to both inter- & intra-request parallelism

  10. Inter-request Parallelism  (diagram: several requests handled in parallel, each producing its own response)

  11. Intra-request Parallelism  (diagram: Req 1 is split into Req 1.1–1.3, processed in parallel, and the results Res 1.1–1.3 are merged into Response 1)

  12. Pipelining (Intra-request)  (diagram: Req 1 is split, its pieces flow through a pipeline of resources, and the results are merged into Response 1)

  13. Speed-up
  • Metric for intra-request parallelization
  • Goal: test the ability of the SUT to reduce response time
    • measure response time with 1 resource
    • measure response time with N resources
    • SpeedUp(N) = RT(1) / RT(N)
  • Ideal: SpeedUp(N) is a linear function
  • Can you imagine super-linear speed-ups?
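
A minimal sketch of a speed-up experiment following the definition above. `run_request(n_resources)` is a hypothetical hook into the SUT; taking the median of repeated runs is one reasonable measurement choice, not something the slide prescribes.

```python
import time

def measure_rt(run_request, n_resources: int, repetitions: int = 30) -> float:
    """Median response time of `run_request` with `n_resources` (hypothetical SUT hook)."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_request(n_resources)
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]

def speedup(run_request, n: int) -> float:
    """SpeedUp(N) = RT(1) / RT(N); ideally ~N, sub-linear in practice (slide 19)."""
    return measure_rt(run_request, 1) / measure_rt(run_request, n)
```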

  14. Speed Up  (plot: speed-up vs. #servers)

  15. Scale-up
  • Test how the SUT scales with the size of the problem
    • measure response time with 1 server, unit problem
    • measure response time with N servers, N units of the problem
    • ScaleUp(N) = RT(1) / RT(N)
  • Ideal: ScaleUp(N) is a constant function
  • Can you imagine super scale-up?

  16. Scale Up Exp.: Response Time  (plot: response time in msecs vs. #servers)

  17. Scale Out
  • Test how the SUT behaves with increasing load
    • measure throughput: 1 server, 1 user
    • measure throughput: N servers, N users
    • ScaleOut(N) = Tput(1) / Tput(N)
  • Ideal: scale-out should behave like scale-up
  • (often the terms are used interchangeably, but it is worthwhile to notice the differences)
  • Scale-out and -down in cloud computing
    • the ability of a system to adapt to changes in load
    • often measured in $ (or at least involving cost)
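
A minimal sketch of the scale-up and scale-out ratios as defined on slides 15 and 17. `run_problem` and `measure_tput` are hypothetical hooks into the SUT; only the ratio definitions come from the slides.

```python
def scaleup(run_problem, n: int) -> float:
    """ScaleUp(N) = RT(1 server, 1 unit) / RT(N servers, N units); ideally ~1.0.
    `run_problem(servers, units)` is a hypothetical hook returning the response time."""
    return run_problem(1, 1) / run_problem(n, n)

def scaleout(measure_tput, n: int) -> float:
    """ScaleOut(N) = Tput(1 server, 1 user) / Tput(N servers, N users), as on slide 17.
    `measure_tput(servers, users)` is a hypothetical hook returning requests/second."""
    return measure_tput(1, 1) / measure_tput(n, n)
```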

  18. Why is speed-up sub-linear?  (diagram: the split/merge picture from slide 11)

  19. Why is speed-up sub-linear?
  • Cost of the "split" and "merge" operations
    • those can be expensive operations
    • try to parallelize them, too
  • Interference: servers need to synchronize
    • e.g., CPUs access data from the same disk at the same time
    • shared-nothing architecture
  • Skew: work not "split" into equal-sized chunks
    • e.g., some pieces much bigger than others
    • keep statistics and plan better

  20. Summary
  • Improve response times by "partitioning"
    • divide & conquer approach
    • works well in many systems
  • Improve throughput by relaxing the "bottleneck"
    • add resources at the bottleneck
  • Fundamental limitations to scalability
    • resource contention (e.g., lock conflicts in a DB)
    • skew and poor load balancing
  • Special kinds of experiments for scalability
    • speed-up and scale-up experiments

  21. Metrics and workloads

  22. Metrics and Workloads
  • Defining more terms
    • Workload
    • Parameters
    • ...
  • Example benchmarks
    • TPC-H, etc.
  • Learn more metrics and traps

  23. Ingredients of an Experiment (rev.)
  • System(s) Under Test
    • The (real) systems we would like to explore
  • Workload(s) = user model
    • Typical behavior of users / clients of the system
  • Parameters
    • The "it depends" part of the answer to a performance question
    • System parameters vs. workload parameters
  • Test database(s)
    • For database workloads
  • Metrics
    • Defining what "better" means: speed, cost, availability, ...

  24. System under Test
  • Characterized by its API (services)
    • set of functions with parameters and result types
  • Characterized by a set of parameters
    • Hardware characteristics
      • E.g., network bandwidth, number of cores, ...
    • Software characteristics
      • E.g., consistency level for a database system
  • Observable outcomes
    • Dropped requests, latency, system utilization, ...
    • (results of requests / API calls)

  25. Workload
  • A sequence of requests (i.e., API/service calls)
    • Including parameter settings of the calls
    • Possibly, correlation between requests (e.g., sessions)
    • Possibly, requests from different geographic locations
  • Workload generators
    • Simulate a client which issues a sequence of requests
    • Specify a "thinking time" or arrival rate of requests
    • Specify a distribution for the parameter settings of requests
  • Open vs. closed system
    • Number of "active" requests is constant or bounded
    • Closed system = fixed #clients, each client has 0 or 1 pending requests
    • Warning: often you model a closed system without knowing it!

  26. Closed system
  • Load comes from a limited set of clients
  • Clients wait for the response before sending the next request
  • Load is self-adjusting
  • System tends towards stability
  • Example: database with local clients

  27. Open system
  • Load comes from a potentially unlimited set of clients
  • Load is not limited by clients waiting
  • Load is not self-adjusting (load keeps coming even if the SUT stops)
  • Tests the system's stability
  • Example: web server
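
A minimal sketch contrasting the two workload-generator styles from slides 25-27. `send_request` is a hypothetical blocking call into the SUT; the Poisson arrival process for the open system is a common modeling choice, not something the slides mandate.

```python
import random
import time

def closed_client(send_request, think_time_s: float, duration_s: float) -> int:
    """Closed-system client: at most one pending request; it waits for the
    response, 'thinks', then issues the next request (load is self-adjusting)."""
    done, deadline = 0, time.time() + duration_s
    while time.time() < deadline:
        send_request()              # hypothetical blocking SUT call
        time.sleep(think_time_s)    # thinking time before the next request
        done += 1
    return done

def open_arrivals(rate_per_s: float, duration_s: float):
    """Open-system generator: yields request arrival times from a Poisson
    process, independent of whether the SUT keeps up with the load."""
    t = 0.0
    while t < duration_s:
        t += random.expovariate(rate_per_s)
        yield t
```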

  28. Parameters
  • Many system and workload parameters
    • E.g., size of cache, locality of requests, ...
  • Challenge is to find the ones that matter
    • understanding the system + common sense
  • Compute the standard deviation of the metric(s) when varying a parameter
    • if low, the parameter is not significant
    • if high, the parameter is significant
  • Important are parameters which generate "cross-over points" between System A and System B when varied
  • Careful about correlations: vary combinations of parameters
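
A minimal sketch of the variance-based significance check described above, assuming `measure_metric(value)` is a hypothetical hook that runs the experiment at one parameter setting and returns the metric of interest.

```python
import statistics

def parameter_significance(measure_metric, values) -> float:
    """Sweep one parameter and return the standard deviation of the metric
    across the sweep; a low value suggests the parameter is not significant."""
    results = [measure_metric(v) for v in values]
    return statistics.stdev(results)

# e.g. parameter_significance(run_with_cache_mb, [64, 256, 1024])  # hypothetical hook
```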

  29. Test Database
  • Many systems involve "state"
    • Behavior depends on the state of the database
    • E.g., long response times for big databases
  • Database is a "workload parameter"
    • But a very complex one
    • And with complex implications
  • Critical decisions
    • Distribution of values in the database
    • Size of the database (performance when generating the DB)
  • Ref.: J. Gray et al.: SIGMOD 1994.

  30. Popular Distributions
  • Uniform
    • Choose a range of values
    • Each value of the range is chosen with the same probability
  • Zipf (self-similarity)
    • Frequency of a value is inversely proportional to its rank
    • F(V[1]) ~ 2 x F(V[2]) ~ 4 x F(V[4]) ...
    • Skew can be controlled by a parameter z
    • Default: z = 1; uniform: z = 0; high z corresponds to high skew
  • Independent vs. correlated
    • In reality, the values of 2 (or more) dimensions are correlated
    • E.g., people who are good in math are good in physics
    • E.g., a car which is good in speed is bad in price
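
A minimal sketch of a Zipf-style value generator with the skew parameter z described on the slide (z = 0 degenerates to uniform). Sampling via precomputed cumulative weights is an implementation choice, not part of the slides.

```python
import bisect
import itertools
import random

def zipf_generator(n_values: int, z: float = 1.0, seed: int = 0):
    """Yield ranks 1..n_values with P(rank r) proportional to 1 / r**z.
    z = 0 gives a uniform distribution; larger z means more skew."""
    rng = random.Random(seed)
    weights = [1.0 / (r ** z) for r in range(1, n_values + 1)]
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]
    while True:
        u = rng.random() * total
        yield bisect.bisect_left(cumulative, u) + 1   # rank, 1-based

gen = zipf_generator(1000, z=1.0)
sample = [next(gen) for _ in range(10_000)]   # rank 1 appears ~2x as often as rank 2
```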

  31. Multi-dimensional Distributions  (figure: independent, correlated, and anti-correlated two-dimensional data sets)
  • Ref.: Börzsönyi et al.: "The Skyline Operator", ICDE 2001.

  32. Metrics
  • Performance; e.g.,
    • Throughput (successful requests per second)
    • Bandwidth (bits per second)
    • Latency / response time
  • Cost; e.g.,
    • Cost per request
    • Investment
    • Fixed cost
  • Availability; e.g.,
    • Yearly downtime of a single client vs. the whole system
    • % dropped requests (or packets)

  33. Metrics
  • How to aggregate millions of measurements
    • classic: median + standard deviation
    • Why is the median better than the average?
    • Why is the standard deviation so important?
  • Percentiles (quantiles)
    • V = Xth percentile if X% of measurements are < V
    • Max ~ 100th percentile; Min ~ 0th percentile
    • Median ~ 50th percentile
    • Percentiles are a good fit for Service Level Agreements
  • Mode: most frequent (most probable) value
    • When is the mode the best metric? (Give an example)
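
A minimal sketch of these aggregates over a list of response-time measurements, using the slide's percentile definition in a nearest-rank style; library quantile functions differ slightly in how they interpolate. The latency values are made up for illustration.

```python
import statistics

def percentile(samples, p: float) -> float:
    """Nearest-rank percentile: the value V such that roughly p% of the
    samples are below V (matching the slide's definition)."""
    ordered = sorted(samples)
    rank = min(len(ordered) - 1, round(p / 100.0 * (len(ordered) - 1)))
    return ordered[rank]

latencies_ms = [12, 15, 11, 90, 14, 13, 16, 250, 12, 15]   # made-up measurements
print(statistics.median(latencies_ms))   # 14.5: robust against the two outliers
print(statistics.mean(latencies_ms))     # 44.8: dragged up by the outliers
print(statistics.stdev(latencies_ms))    # large: the tail dominates the spread
print(percentile(latencies_ms, 99))      # 250: tail latency, e.g. for an SLA
```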

  34. Percentile Example  (plot)

  35. Amazon Example (~2004)
  • Amazon lost about 1% of shopping baskets
    • Acceptable, because the incremental cost of IT infrastructure to secure all shopping baskets was much higher than 1% of the revenue
  • Some day, somebody discovered that they lost the *largest* 1% of the shopping baskets
    • Not okay, because those are the premium customers and they never come back
    • Results in much more than 1% of the revenue
  • Be careful with correlations within results!!!

  36. Where does all this come from?
  • Real workloads
    • Use traces from an existing (production) system
    • Use real databases from the production system
  • Synthetic workloads
    • Use a standard benchmark
    • Invent something yourself
  • Tradeoffs
    • A real workload is always relevant
    • A synthetic workload is good to study "corner cases"
    • It makes it possible to vary all "workload parameters"
  • If possible, use both!

  37. Benchmarks
  • Specify the whole experiment except the SUT
    • Sometimes specify settings of "system parameters"
    • E.g., configure the DBMS to run at isolation level 3
  • Designed for "is System A better than B" questions
    • Report only one or two numbers as metrics
    • Use a complex formula to compute these numbers
    • Zero or one workload parameters only
    • Standardization and notaries to publish results
  • Misused by research and industry
    • Implement only a subset
    • Invent new metrics and workload parameters
    • Violation of "system parameter" settings and fine print

  38. Benchmarks: Good, Bad, and Ugly
  • Good
    • Help define a field: give engineers a goal
    • Great for marketing and sales people
    • Even if misused, a great tool for research and teaching
  • Bad
    • Benchmark wars are not productive
    • Misleading results – huge damage if irrelevant
  • Ugly
    • Expensive to be compliant (legal fine print)
    • Irreproducible results due to complex configurations
    • Vendors have complex license agreements (DeWitt clause)
    • A single-number result favors the "elephants"
    • Difficult to demonstrate advantages in the "niche"

  39. Benchmarks
  • Conjecture: "Benchmarks are a series of tests in order to obtain prearranged results not available on competitive systems." (S. Kelly-Bootle)
  • Corollary: "I only trust statistics that I have invented myself." (folklore)

  40. Example Benchmarks
  • CPU
    • E.g., "g++", Ackermann, SPECint
  • Databases (www.tpc.org)
    • E.g., TPC-C, TPC-E, TPC-H, TPC-W, ...
  • Parallel systems
    • NAS Parallel Benchmark, Splash-2
  • Other
    • E.g., CloudStone, LinearRoad
  • Microbenchmarks
    • E.g., LMBench

  41. SPECint
  • Goal: study the CPU speed of different hardware
  • SPEC = Standard Performance Evaluation Corporation
    • www.spec.org
  • Long history of CPU benchmarks
    • First version: CPU92
    • Current version: SPECint2006
  • SPECint2006 involves 12 tests (all in C/C++)
    • perlbench, gcc, bzip2, ..., xalancbmk
  • Metrics
    • Compare the running time to a "reference machine"
    • E.g., 2000 secs vs. 8000 secs for gcc gives a score of 4
    • Overall score = geometric mean of all 12 scores
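
A minimal sketch of the scoring scheme described on the slide: each test's score is reference time divided by measured time, and the overall score is the geometric mean of the per-test scores. Only the gcc ratio (8000 s vs. 2000 s) follows the slide; the other times are illustrative assumptions.

```python
import math

# Per-test score = reference-machine time / measured time (slide 41).
reference_s = {"gcc": 8000.0, "bzip2": 9600.0, "perlbench": 9800.0}   # assumed values
measured_s  = {"gcc": 2000.0, "bzip2": 2400.0, "perlbench": 2500.0}   # assumed values

scores = [reference_s[t] / measured_s[t] for t in reference_s]
overall = math.prod(scores) ** (1.0 / len(scores))   # geometric mean of the scores
print(round(overall, 2))   # ~4.0 for this made-up machine
```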

  42. SPECint Results • Visit: http://www.spec.org/cpu2006/results/cint2006.html

  43. TPC-H Benchmark
  • Goal: evaluate DBMS + hardware for OLAP
    • Find the "fastest" system for a given DB size
    • Find the best "speed / $" system for a given size
    • See to which DB sizes the systems scale
  • TPC-H models a company
    • Orders, Lineitems, Customers, Products, Regions, ...
  • TPC-H specifies the following components
    • Dbgen: DB generator with different scaling factors
      • The scaling factor of the DB is the only workload parameter
    • Mix of 22 queries and 2 update functions
    • Execution instructions and metrics

  44. TPC-H Fine Print
  • Physical design
    • E.g., you must not vertically partition the DB
    • (many results violate that, i.e., all column stores)
  • Execution rules
    • Specify exactly how to execute queries and updates
    • Specify exactly which SQL variants are allowed
  • Results
    • Specifies exactly how to compute metrics and how to publish results
  • The specification is about 150 pages long (!)

  45. TPC-H Results • Visit: http://www.tpc.org/tpch/results/tpch_price_perf_results.asp

  46. Microbenchmarks
  • Goal: "understand the full behavior of a system"
    • Not good for "System A vs. System B" decisions
    • Good for component tests and unit tests
  • Design principles
    • Many small and simple experiments, many workload parameters
    • Report all results (rather than one big number)
    • Each experiment tests a different feature (service)
      • E.g., table scan, index scan, join for a DB
      • E.g., specific function calls, representative parameter settings
    • Isolate this feature as much as possible
  • Design requires knowledge of the internals of the SUT
  • Designed for a specific study; the benchmark is not reusable
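
A minimal sketch of a microbenchmark harness in the spirit of these principles: isolate one operation, repeat it many times, and report the distribution instead of a single aggregated number. The timed operation is a stand-in for a real SUT feature.

```python
import statistics
import time

def microbenchmark(operation, repetitions: int = 1000) -> dict:
    """Time one isolated operation many times and report the distribution
    (median, stdev, tail latency) rather than a single number."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1e3)   # milliseconds
    samples.sort()
    return {
        "median_ms": statistics.median(samples),
        "stdev_ms": statistics.stdev(samples),
        "p99_ms": samples[round(0.99 * (len(samples) - 1))],
    }

# e.g. microbenchmark(lambda: sorted(range(10_000)))  # stand-in for a table scan, etc.
```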

  47. How to improve performance?
  • Find the bottleneck
  • Throw additional resources at the bottleneck
  • Find the new bottleneck
  • Throw additional resources at the bottleneck
  • ...
