1. Intel’s VTune Performance Analyzer (for Windows) Assessment
Kaushik Datta
December 5, 2007
2. Test Code
Serial and pthreaded stencil codes
Pro: Codes are well understood
Con: Codes may be too small
Full apps may have been a better test
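The slides do not show the stencil codes themselves. As a rough illustration of the kind of kernel meant here, the following is a minimal sketch of a serial 5-point Jacobi sweep on a 2D grid. The function name `stencil_sweep`, the grid layout, and the choice of a 5-point stencil are all assumptions for illustration, not the code that was actually profiled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example kernel (not the code from the slides).
 * One Jacobi sweep of a 5-point stencil over an n x n grid.
 * `in` and `out` are row-major arrays of n*n doubles.
 * Boundary cells are copied through unchanged. */
void stencil_sweep(const double *in, double *out, size_t n)
{
    assert(n >= 3);  /* need at least one interior point */
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            size_t c = i * n + j;
            if (i == 0 || j == 0 || i == n - 1 || j == n - 1) {
                out[c] = in[c];  /* boundary: keep the fixed value */
            } else {
                /* interior: average of the four nearest neighbors
                 * (3 adds + 1 multiply = 4 flops per point) */
                out[c] = 0.25 * (in[c - n] + in[c + n] + in[c - 1] + in[c + 1]);
            }
        }
    }
}
```

A kernel this small runs for only a short time per sweep, which is one reason a sampling profiler may collect few samples for it.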
3. Windows VTune Pros
Analyzes executable directly
No extra recompiling / instrumentation needed
Language / compiler independent
GUI helps on several fronts
Easier to use
Graphical perf results displayed instantly
Supports Intel quad-core processors
However, I didn’t test this
4. Windows VTune Cons
No hooks to specify what code segments to profile
Too many other processes also profiled
Possibly useful for full apps, but not kernels
Insufficient perf data about your process
Some counter values seemed wrong
Perf counters are only sampled
Continuous data collection might be available
“Thread Profiler” tool didn’t seem to profile my kernel’s spawned threads
No direct flop counts
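When a profiler gives no direct flop counts, one common workaround for a regular kernel is to count flops analytically and divide by measured wall-clock time. The sketch below assumes a hypothetical 2D stencil that updates only interior points and performs a known number of flops per point. The function names and parameters are illustrative and not part of VTune.

```c
#include <assert.h>
#include <stddef.h>

/* Analytic flop count for `sweeps` sweeps of a stencil that performs
 * `flops_per_point` floating-point operations on each interior point
 * of an n x n grid. All names here are hypothetical. */
double stencil_flops(size_t n, size_t sweeps, double flops_per_point)
{
    assert(n >= 3);  /* at least one interior point */
    double interior = (double)(n - 2) * (double)(n - 2);
    return interior * (double)sweeps * flops_per_point;
}

/* Convert a flop count and a measured wall-clock time in seconds
 * into a GFlop/s rate. */
double gflops_rate(double flops, double seconds)
{
    assert(seconds > 0.0);
    return flops / seconds / 1e9;
}
```

For example, `gflops_rate(stencil_flops(1002, 10, 4.0), t)` would give the rate for ten sweeps of a 4-flop stencil over a 1000 x 1000 interior, where `t` is a time measured separately.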
5. Available Events on Windows
ClockTicks
Instructions Retired
MMX / x87 / SIMD Instructions Retired (but no direct flop counts found)
Loads/Stores Retired
Cache Read Hits / Misses
TLB Load / Store Misses
Calls / Conditionals / Branches
Bus Accesses
6. Call Graph
7. Time Breakdown
8. Counter Monitors
9. Counter Sampling