
Measuring Performance



1. Measuring Performance
• How should the performance of a parallel computation be measured?
• Traditional measures like MIPS and MFLOPS really don't cut it
• New ways to measure parallel performance are needed:
  • Speedup
  • Efficiency

2. Speedup
• Speedup is the most often used measure of parallel performance
• If
  • Ts is the best possible serial time
  • Tn is the time taken by a parallel algorithm on n processors
• then the speedup is S(n) = Ts / Tn
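A minimal sketch of this definition in Python (the function name and timings are illustrative, not from the slides):

    def speedup(t_serial, t_parallel):
        """Speedup S(n) = Ts / Tn: best serial time divided by the
        time taken by the parallel algorithm on n processors."""
        return t_serial / t_parallel

    # e.g. a 60 s serial run that takes 10 s on 8 processors
    print(speedup(60.0, 10.0))   # 6.0 (the ideal on 8 processors would be 8)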

3. Read Between the Lines
• Exactly what is meant by Ts (the time taken to run the fastest serial algorithm on one processor)?
  • One processor of the parallel computer?
  • The fastest serial machine available?
  • A parallel algorithm run on a single processor?
• Is the serial algorithm the best one?
• To keep things fair, Ts should be the best possible time in the serial world

4. Speedup'
• A slightly different definition of speedup also exists:
  • the time taken by the parallel algorithm on one processor, divided by the time taken by the parallel algorithm on N processors
• However, this can be misleading, since many parallel algorithms contain extra operations to accommodate the parallelism (e.g. the communication)
• The result is that the one-processor time is inflated, which exaggerates the speedup
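An illustration of why the two definitions differ, with made-up timings: the parallel algorithm pays its communication overhead even on one processor, so dividing by that inflated time flatters the result.

    t_best_serial = 100.0   # best serial algorithm on one processor (Ts)
    t_par_1proc   = 130.0   # parallel algorithm run on one processor
    t_par_16proc  = 10.0    # parallel algorithm on 16 processors

    print(t_best_serial / t_par_16proc)   # speedup vs. Ts:  10.0
    print(t_par_1proc / t_par_16proc)     # Speedup':        13.0 (exaggerated)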

5. Factors That Limit Speedup
• Software overhead
  • Even with a completely equivalent algorithm, software overhead arises in the concurrent implementation
• Load balancing
  • Speedup is generally limited by the speed of the slowest node, so an important consideration is to ensure that each node performs the same amount of work
• Communication overhead
  • Assuming that communication and calculation cannot be overlapped, any time spent communicating data between processors directly degrades the speedup (a toy timing model is sketched below)
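The toy model below (an assumed formula, not from the slides) makes the communication point concrete: non-overlapped communication adds a term to the parallel time that adding processors cannot shrink.

    def speedup_with_comm(t_comp, n, steps, t_comm_per_step):
        """Toy model: computation divides evenly across n processors;
        communication is not overlapped and adds a fixed cost per step."""
        t_parallel = t_comp / n + steps * t_comm_per_step
        return t_comp / t_parallel

    # Even modest communication drags the speedup below the ideal n.
    print(speedup_with_comm(100.0, 10, 4, 0.5))   # ~8.3 rather than 10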

6. Linear Speedup
• Whichever definition is used, the ideal is linear speedup:
  • a speedup of N using N processors
• In practice, however, the speedup falls short of its ideal value of N
• Superlinear speedup results when:
  • unfair values are used for Ts
  • there are differences in the nature of the hardware used (e.g. the combined caches of N processors hold more of the problem)

7. Speedup Curves
[Figure: speedup vs. number of processors, showing superlinear, linear, and typical speedup curves]

8. Efficiency
• Speedup does not measure how efficiently the processors are being used
  • Is it worth using 100 processors to get a speedup of 2?
• Efficiency is defined as the ratio of the speedup to the number of processors required to achieve it: E(n) = S(n) / n
• The efficiency is bounded from above by 1
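In code, using the slide's own question (the values come from the slide; the function itself is an illustrative sketch):

    def efficiency(speedup, n):
        """Efficiency E(n) = S(n) / n; 1.0 means perfect utilization."""
        return speedup / n

    # 100 processors for a speedup of 2:
    print(efficiency(2, 100))   # 0.02 -- 98% of the machine's capacity is wasted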

9. Example
[Figure: worked example; content not recoverable from the transcript]

10. Speedup Curve
[Figure: speedup curve; content not recoverable from the transcript]

11. Amdahl's Law
• A parallel computation has two types of operations:
  • those which must be executed in serial
  • those which can be executed in parallel
• Amdahl's law states that the speedup of a parallel algorithm is effectively limited by the fraction of the computation that must be performed sequentially

12. Amdahl's Law
• Let the time taken to do the serial calculations be some fraction σ of the total time (0 < σ ≤ 1)
• The parallelizable portion is 1 - σ of the total
• Assuming linear speedup on the parallelizable portion:
  • Tserial = σ T1
  • Tparallel = (1 - σ) T1 / N
• By substitution:
  S(N) = T1 / (σ T1 + (1 - σ) T1 / N) = N / (σ N + 1 - σ) = 1 / (σ + (1 - σ) / N)
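The closed form is easy to check numerically (a direct transcription of the formula above; the loop values are illustrative):

    def amdahl_speedup(sigma, n):
        """Amdahl's law: S(N) = 1 / (sigma + (1 - sigma) / N),
        where sigma is the serial fraction of the work."""
        return 1.0 / (sigma + (1.0 - sigma) / n)

    # As N grows, the speedup approaches the ceiling 1 / sigma.
    for n in (10, 100, 1000):
        print(n, amdahl_speedup(0.2, n))   # 3.57..., 4.80..., 4.97... -> 5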

13. Consequences of Amdahl's Law
• Say we have a program containing 100 operations, each of which takes 1 time unit
• Suppose σ = 0.2; using 80 processors:
  • Speedup = 100 / (20 + 80/80) = 100 / 21 ≈ 4.76 < 5
• A speedup of at most 5 is possible no matter how many processors are available
• So why bother with parallel computing? Just wait for a faster processor
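Verifying the slide's arithmetic directly:

    # 100 operations, serial fraction sigma = 0.2:
    # 20 serial operations plus 80 parallelizable ones on 80 processors.
    ops, sigma, n = 100, 0.2, 80
    t_parallel = sigma * ops + (1 - sigma) * ops / n   # 20 + 1 = 21
    print(ops / t_parallel)   # 4.76..., i.e. less than 5
    print(1 / sigma)          # the hard ceiling: 5.0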

14. Avoiding Amdahl
• There are several ways to avoid the limits of Amdahl's law:
  • concentrate on parallel algorithms with small serial components
• Amdahl's law is also not complete, in that it does not take into account problem size
  • in practice the serial fraction often shrinks as the problem grows (see the sketch below)
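A sketch of the problem-size effect (an assumed model, not from the slides; this is the idea behind Gustafson's scaled speedup): if the serial part is a fixed cost while the parallel work grows with the problem, the effective serial fraction shrinks and the achievable speedup rises.

    # Fixed serial cost, growing parallel work, 80 processors (assumed values).
    t_serial_fixed, n = 20.0, 80
    for parallel_work in (80.0, 800.0, 8000.0):
        total = t_serial_fixed + parallel_work
        sigma = t_serial_fixed / total          # shrinks as the problem grows
        print(round(sigma, 4), round(1 / (sigma + (1 - sigma) / n), 1))
    # sigma: 0.2 -> 0.0244 -> 0.0025; speedup: 4.8 -> 27.3 -> 66.8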

15. Classifying Parallel Programs
• Parallel programs can be placed into broad categories based on expected speedups (rough models are sketched below):
  • Trivially parallel: assumes complete parallelism, with no overhead due to communication
  • Divide and conquer: roughly N / log N speedup
  • Communication-bound parallelism
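Rough speedup curves for the three categories (the models below are assumed illustrative forms, not from the slides):

    import math

    # Trivially parallel ~ N; divide and conquer ~ N / log2(N);
    # a communication-bound program flattens once communication dominates
    # (here a toy model n / (1 + 0.1 * n), which saturates near 10).
    for n in (2, 16, 128, 1024):
        print(n, n, round(n / math.log2(n), 1), round(n / (1 + 0.1 * n), 1))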
