Performance Measurement
Performance Analysis Paper and pencil. Don’t need a working computer program or even a computer.
Some Uses Of Performance Analysis • determine practicality of algorithm • predict run time on large instance • compare 2 algorithms that have different asymptotic complexity, e.g., O(n) and O(n^2)
Limitations of Analysis Doesn’t account for constant factors, but a constant factor may dominate: compare 1000n vs n^2 when we are interested only in n < 1000.
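To make the constant-factor point concrete, here is a minimal sketch that compares the two step counts from the slide, 1000n and n^2. The function names are illustrative and not part of the original deck.

```cpp
#include <cstdint>

// Hypothetical step counts for the two algorithms named on the slide.
std::uint64_t stepsLinear(std::uint64_t n)    { return 1000 * n; }  // 1000n
std::uint64_t stepsQuadratic(std::uint64_t n) { return n * n; }     // n^2

// True when the O(n^2) algorithm performs fewer steps than the O(n) one.
bool quadraticIsCheaper(std::uint64_t n) {
    return stepsQuadratic(n) < stepsLinear(n);
}
```

For every n with 0 < n < 1000, quadraticIsCheaper(n) is true. The two counts are equal at n = 1000, and the linear algorithm wins only beyond that point.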
Limitations of Analysis Modern computers have a hierarchical memory organization with different access time for memory at different levels of the hierarchy.
Memory Hierarchy (figure; C = access time in clock cycles)
• R (registers, next to the ALU): 8-32, 1C
• L1: 32KB, 2C
• L2: ~512KB, 10C
• MAIN: ~512MB, 100C
Limitations of Analysis Our analysis doesn’t account for this difference in memory access times. Programs that do more work may take less time than those that do less work.
Cache-aware and Cache-oblivious algorithms* • Cache-aware algorithms try to minimize “cache misses” • Example: Matrix multiplication • Cache-oblivious algorithms try to take advantage of cache without knowledge of hardware details
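The matrix multiplication example can be sketched as follows. Both functions compute the same product, but the second reorders the loops so the innermost loop walks memory with stride 1. This is a simple locality-friendly variant, not a full blocked (cache-aware) algorithm. The row-major layout and the names Matrix, multiplyIJK and multiplyIKJ are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// n x n matrix stored row-major: element (i, j) lives at index i * n + j.
using Matrix = std::vector<double>;

// Textbook ijk order: the inner loop walks down a column of b (stride n).
void multiplyIJK(const Matrix& a, const Matrix& b, Matrix& c, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

// ikj order: the inner loop walks along rows of b and c (stride 1).
void multiplyIKJ(const Matrix& a, const Matrix& b, Matrix& c, std::size_t n) {
    for (std::size_t i = 0; i < n * n; ++i)
        c[i] = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j)
                c[i * n + j] += aik * b[k * n + j];
        }
}
```

Both versions perform the same n^3 multiplications, so the analysis cannot tell them apart. Whether the ikj order is faster, and by how much, depends on the machine and compiler, so it has to be measured.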
Performance Measurement/Benchmarking Measure actual time on an actual computer. What do we need?
Performance Measurement Needs • programming language, e.g., C++/Java • working program, e.g., Insertion Sort • computer • compiler and options to use, e.g., gcc -O2
Performance Measurement Needs • timing mechanism: clock • data to use for measurement: worst-case data, best-case data, average-case data
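For insertion sort, the three kinds of data could be generated as in the sketch below. The function names and the fixed seed parameter are assumptions for illustration. A single random permutation is only one sample, so an average-case measurement would average over many of them.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Best case for insertion sort: already in ascending order.
std::vector<int> bestCaseData(int n) {
    std::vector<int> a(n);
    std::iota(a.begin(), a.end(), 0);  // 0, 1, ..., n-1
    return a;
}

// Worst case for insertion sort: descending order.
std::vector<int> worstCaseData(int n) {
    std::vector<int> a = bestCaseData(n);
    std::reverse(a.begin(), a.end());  // n-1, ..., 1, 0
    return a;
}

// One sample for average-case timing: a random permutation.
// A caller-supplied seed makes each generated instance reproducible.
std::vector<int> randomData(int n, unsigned seed) {
    std::vector<int> a = bestCaseData(n);
    std::mt19937 gen(seed);
    std::shuffle(a.begin(), a.end(), gen);
    return a;
}
```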
Timing In C++
    #include <ctime>
    double clocksPerMillis = double(CLOCKS_PER_SEC) / 1000; // clock ticks per millisecond
    clock_t startTime = clock();
    // code to be timed comes here
    double elapsedMillis = (clock() - startTime) / clocksPerMillis; // elapsed time in milliseconds
Shortcoming Clock accuracy: assume the clock is accurate only to within 100 ticks. Repeat the work many times to bring the total time to >= 1000 ticks.
More Accurate Timing
    clock_t startTime = clock();
    long numberOfRepetitions = 0; // must start at zero
    do {
        numberOfRepetitions++;
        doSomething();
    } while (clock() - startTime < 1000);
    double elapsedMillis = (clock() - startTime) / clocksPerMillis;
    double timeForCode = elapsedMillis / numberOfRepetitions;
Bad Way To Time
    // every short interval is read with up to 100 ticks of error, and the errors add up
    do {
        counter++;
        startTime = clock();
        doSomething();
        elapsedTime += clock() - startTime;
    } while (elapsedTime < 1000);
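The repetition loop can be wrapped in a reusable helper, sketched below. The name timePerCallMillis and the minTicks parameter are assumptions, not part of the original slides. The 1000-tick default follows the slide's assumption that the clock is accurate to within 100 ticks. Unlike the slide, the helper reuses the reading that ended the loop as the final reading, which saves one call to clock().

```cpp
#include <ctime>

// Estimated CPU time, in milliseconds, of one call to work().
// work() is repeated until at least minTicks clock ticks have elapsed.
template <typename Work>
double timePerCallMillis(Work work, std::clock_t minTicks = 1000) {
    const double clocksPerMillis = double(CLOCKS_PER_SEC) / 1000;
    long numberOfRepetitions = 0;
    const std::clock_t startTime = std::clock();
    std::clock_t elapsedTicks;
    do {
        ++numberOfRepetitions;
        work();
        elapsedTicks = std::clock() - startTime;
    } while (elapsedTicks < minTicks);
    return (elapsedTicks / clocksPerMillis) / numberOfRepetitions;
}
```

A call might look like `double ms = timePerCallMillis([&] { doSomething(); });`, where doSomething() is the placeholder from the slide.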
Accuracy Now accuracy is 10%. The first reading may be just about to change to startTime + 100, and the second reading (the final value of clock()) may have just changed to finishTime, so finishTime - startTime is too large by up to 100 ticks.
Accuracy The first reading may have just changed to startTime, and the second reading may be about to change to finishTime + 100, so finishTime - startTime is too small by up to 100 ticks.
Accuracy Examining the remaining cases, we get trueElapsedTime = finishTime - startTime ± 100 ticks. To ensure 10% accuracy, require elapsedTime = finishTime - startTime >= 1000 ticks.
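The 10% figure comes from dividing the worst-case reading error by the smallest measured interval the loop accepts, using the slides' assumed clock accuracy of 100 ticks:

```latex
\[
\frac{\lvert \text{trueElapsedTime} - (\text{finishTime} - \text{startTime}) \rvert}
     {\text{finishTime} - \text{startTime}}
\le \frac{100}{1000} = 10\%
\quad \text{whenever } \text{finishTime} - \text{startTime} \ge 1000 \text{ ticks}.
\]
```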
Timing in Java • Same semantics, slightly different syntax
    long startTime = System.nanoTime();
    // code to be measured
    System.out.println("Elapsed time: " + (System.nanoTime() - startTime));
• Or use microbenchmarking frameworks such as JMH
What Went Wrong?
    clock_t startTime = clock();
    long numberOfRepetitions = 0;
    do {
        numberOfRepetitions++;
        insertionSort(a, n); // a[] is sorted after the first call, so later calls sort already-sorted data
    } while (clock() - startTime < 1000);
    double elapsedMillis = (clock() - startTime) / clocksPerMillis;
    double timeForCode = elapsedMillis / numberOfRepetitions;
The Fix
    clock_t startTime = clock();
    long numberOfRepetitions = 0;
    do {
        numberOfRepetitions++;
        // put code to initialize a[] here
        insertionSort(a, n);
    } while (clock() - startTime < 1000);
    double elapsedMillis = (clock() - startTime) / clocksPerMillis;
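A complete version of the fix might look like the sketch below, which re-initializes a[] to worst-case (descending) data before every repetition. The insertionSort body and the name worstCaseSortMillis are assumptions, since the slides do not show them. The measured time also includes the re-initialization itself.

```cpp
#include <ctime>
#include <vector>

// Standard insertion sort on a[0..n-1] (assumed implementation).
void insertionSort(std::vector<int>& a, int n) {
    for (int i = 1; i < n; ++i) {
        int t = a[i];
        int j = i - 1;
        for (; j >= 0 && t < a[j]; --j)
            a[j + 1] = a[j];  // shift larger elements one position right
        a[j + 1] = t;
    }
}

// Estimated worst-case CPU time (ms) of one insertionSort call on n elements.
// The result includes the cost of re-initializing a[] in each repetition.
double worstCaseSortMillis(int n) {
    const double clocksPerMillis = double(CLOCKS_PER_SEC) / 1000;
    std::vector<int> a(n);
    long numberOfRepetitions = 0;
    const std::clock_t startTime = std::clock();
    do {
        ++numberOfRepetitions;
        for (int i = 0; i < n; ++i)
            a[i] = n - i;  // descending order: worst case for insertion sort
        insertionSort(a, n);
    } while (std::clock() - startTime < 1000);
    const double elapsedMillis = (std::clock() - startTime) / clocksPerMillis;
    return elapsedMillis / numberOfRepetitions;
}
```

The initialization is O(n) against an O(n^2) worst-case sort, so its share shrinks as n grows. If it matters, one option is to time the initialization loop alone in the same way and subtract it.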
Time Shared System • UNIX: time MyProgram (reports real, user, and system time for the run)
To-do
• 1. C++: http://www.cplusplus.com/reference/ctime/clock/
• 1’. Java (JMH): http://www.oracle.com/technetwork/articles/java/architect-benchmarking-2266277.html and http://nitschinger.at/Using-JMH-for-Java-Microbenchmarking/
• 2. Linux: http://stackoverflow.com/a/556411