
Should We Dump Flop/s?

David H Bailey, Lawrence Berkeley National Laboratory, USA. This talk is available at: http://crd.lbl.gov/~dhbailey/dhbtalks/flops.pdf


Presentation Transcript


  1. Should We Dump Flop/s? David H Bailey, Lawrence Berkeley National Laboratory, USA. This talk is available at: http://crd.lbl.gov/~dhbailey/dhbtalks/flops.pdf

  2. Using Flop/s As A Metric for Performance

     Advantages:
     • Its usage is traditional and well understood in the HPC community -- data is available for several decades of progress.
     • The flop count for a given algorithm or application is fairly well defined, although care has to be taken to avoid abuse -- i.e., we should base the flop count on the best practical serial algorithm.

     Disadvantages:
     • A focus on flop/s at the expense of other system parameters can lead to system designs that are poorly balanced for real workloads.
     • Using a measured flop count (i.e., one obtained from a hardware performance monitor) may lead to perverse outcomes, such as inefficient algorithms that exhibit artificially high flop/s rates.
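The difference between a nominal flop count and a measured one can be sketched with a little arithmetic. The following is a minimal illustration, not taken from the talk: the `Run` class, the two runs, and all of their numbers are hypothetical, chosen only to show how a wasteful algorithm can post a higher measured flop/s rate while being slower on the same problem.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One timed run of a program (all values here are hypothetical)."""
    name: str
    measured_flops: float  # flops reported by a hardware performance monitor
    seconds: float         # wall-clock time for the run


def measured_rate(run: Run) -> float:
    """Flop/s based on the flops the hardware actually executed."""
    return run.measured_flops / run.seconds


def nominal_rate(run: Run, nominal_flops: float) -> float:
    """Flop/s based on a fixed flop count for the problem, e.g. that of
    the best practical serial algorithm."""
    return nominal_flops / run.seconds


# Assumed flop count of the best practical serial algorithm for this problem.
NOMINAL_FLOPS = 1.0e9

efficient = Run("efficient", measured_flops=1.0e9, seconds=2.0)
wasteful = Run("wasteful", measured_flops=1.0e10, seconds=10.0)

for run in (efficient, wasteful):
    print(run.name,
          f"measured: {measured_rate(run):.2e} flop/s,",
          f"nominal: {nominal_rate(run, NOMINAL_FLOPS):.2e} flop/s")
```

With these made-up numbers the wasteful run reports twice the measured flop/s of the efficient run, yet takes five times as long. Dividing the same nominal count by each run time ranks the two runs by time to solution.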

  3. Using Mop/s as a Performance Metric

     Advantages:
     • A focus on memory operations per second in comparing systems may result in systems better suited for many real-world scientific computations.

     Disadvantages:
     • There is NO objective system-independent way to assess the mop count for a given algorithm or architecture.
     • A focus on mop/s at the expense of other system parameters can lead to system designs that are poorly balanced for real workloads.
     • Using measured memory operation counts (i.e., those obtained from a hardware performance monitor) may lead to perverse outcomes, such as grossly cache-inefficient algorithms that exhibit artificially high mop/s rates.
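A toy model can show how a measured memory operation count rewards cache-inefficient code. This is a sketch under loud assumptions, not a method from the talk: a "mop" is taken here to mean a cache-line fetch from main memory, the cache is a small fully associative LRU cache, and `LINE_WORDS`, `CACHE_LINES`, and the matrix size are invented for illustration.

```python
from collections import OrderedDict

LINE_WORDS = 8    # words per cache line (assumed)
CACHE_LINES = 16  # lines held by the toy LRU cache (assumed)


def line_fetches(addresses):
    """Count cache-line fetches for a sequence of word addresses,
    using a fully associative LRU cache of CACHE_LINES lines."""
    cache = OrderedDict()
    fetches = 0
    for addr in addresses:
        line = addr // LINE_WORDS
        if line in cache:
            cache.move_to_end(line)        # mark as most recently used
        else:
            fetches += 1                   # miss: fetch the line from memory
            cache[line] = True
            if len(cache) > CACHE_LINES:
                cache.popitem(last=False)  # evict the least recently used line
    return fetches


ROWS, COLS = 64, 64  # a row-major 64 x 64 matrix of words (assumed)

# Both traversals touch every element exactly once.
row_major = [r * COLS + c for r in range(ROWS) for c in range(COLS)]
col_major = [r * COLS + c for c in range(COLS) for r in range(ROWS)]

unit_stride_fetches = line_fetches(row_major)
large_stride_fetches = line_fetches(col_major)
print(unit_stride_fetches, large_stride_fetches)  # 512 4096
```

Both traversals do the same useful work, but in this model the column-order traversal is credited with eight times as many memory operations. A metric built on the measured count would therefore score the worse traversal higher unless its run time grew by at least the same factor.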

  4. How Do We Define Mop Count for a Given Application?
     • The mop count is inextricably tied to the architecture.
     • Mop count can vary by a factor of 100 depending on how much cache is available.
     • Unit-stride, constant-stride and random-stride data are handled very differently from system to system.
     • Naive schemes to count mops for a given algorithm or implementation (e.g., number of flops performed × 3) reduce to using an inflated flop count as the metric.
     • One possibility: Using Erich Strohmaier’s APEX-map as the basis for the mop count -- it measures the distribution of the distance of one memory operation to the next.
     • But using APEX-map to perform these measurements is very expensive, and the resulting figure is highly one-dimensional.
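The dependence of the mop count on cache size can be illustrated with a small simulation. This is a hedged sketch, not APEX-map and not anything from the talk: it assumes a "mop" is an access that misses a fully associative LRU cache with one-word lines, and the function names, trace, and cache sizes are hypothetical.

```python
from collections import OrderedDict


def count_memory_ops(addresses, cache_words):
    """Count accesses that miss a fully associative LRU cache holding
    cache_words one-word lines (a deliberately simple cache model)."""
    cache = OrderedDict()
    misses = 0
    for addr in addresses:
        if addr in cache:
            cache.move_to_end(addr)        # hit: mark as most recently used
        else:
            misses += 1                    # miss: goes to main memory
            cache[addr] = True
            if len(cache) > cache_words:
                cache.popitem(last=False)  # evict the least recently used word
    return misses


def repeated_sweep(n_words, passes):
    """Address trace of an algorithm that sweeps an array several times."""
    return [i for _ in range(passes) for i in range(n_words)]


trace = repeated_sweep(n_words=256, passes=10)

small_cache_mops = count_memory_ops(trace, cache_words=64)
large_cache_mops = count_memory_ops(trace, cache_words=256)
print(small_cache_mops, large_cache_mops)  # 2560 256
```

The same address trace yields a tenfold difference in counted memory operations between the two cache sizes, so in this model the count is a property of the algorithm and the machine together. The factor here follows from the number of passes chosen and says nothing about real systems.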

  5. Bottom Line: Don’t Dump Flop/s
     • There is NO intrinsic memory operation count for a given algorithm or architecture.
     • Mop/s, if anything, has significantly more potential for abuse than flop/s.
     • Perhaps in the future someone can devise an architecture-independent metric to assess the “work done” in a large scientific application.
     • Until then, flop/s is the best we have.
