
Initial Design of a Test Suite for Automatic Performance Analysis Tools

Bernd Mohr, Forschungszentrum Jülich, John von Neumann - Institut für Computing, Germany, b.mohr@fz-juelich.de. Jesper Larsson Träff


Presentation Transcript


  1. Initial Design of a Test Suite for (Automatic) Performance Analysis Tools • Bernd Mohr, Forschungszentrum Jülich, John von Neumann - Institut für Computing, Germany, b.mohr@fz-juelich.de • Jesper Larsson Träff, NEC Europe Ltd., C&C Research Labs, Germany, traff@ccrl-nece.de

  2. IST Working Group APART (since 1999) Automatic Performance Analysis: Resources and Tools • Forum for scientists and vendors • About 20 partners in Europe and the U.S. • http://www.fz-juelich.de/apart/ • Current Automatic Performance Tools Projects • Askalon http://www.par.univie.ac.at/project/askalon/ • Kappa-Pi http://www.caos.uab.es/kpi.html • KOJAK http://www.fz-juelich.de/zam/kojak • Paradyn http://www.cs.wisc.edu/~paradyn/ • Peridot http://wwwbode.cs.tum.edu/~gerndt/peridot/

  3. (Full, Associated, and Former) Members • European Research Centers and Universities • U.S. Research Centers and Universities • Vendors

  4. APART Terminology • Performance Property • Aspect of the performance behavior of an application • E.g., communication dominated by waiting time • Specified as a condition referring to performance data • Quantified and normalized in terms of a behavior-independent metric (severity) • Performance Problem • Performance property with “negative” implications • Performance Bottleneck • Performance problem with the highest severity
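The three terms can be made concrete in a short C sketch. Everything named here is hypothetical and not part of APART or ATS: the `property_t` record, the helper functions, and in particular the severity threshold used to decide when a property has “negative” implications, which the slide leaves open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one performance property found in a run. */
typedef struct {
    const char *name;     /* e.g. "late_sender" */
    double      severity; /* normalized, behavior-independent metric */
} property_t;

/* Assumption: a property is a problem once its severity exceeds a threshold. */
int is_problem(const property_t *p, double threshold) {
    assert(p != NULL);
    return p->severity > threshold;
}

/* The bottleneck is the problem with the highest severity; NULL if there is none. */
const property_t *find_bottleneck(const property_t *props, size_t n,
                                  double threshold) {
    const property_t *best = NULL;
    for (size_t i = 0; i < n; ++i) {
        if (!is_problem(&props[i], threshold))
            continue;
        if (best == NULL || props[i].severity > best->severity)
            best = &props[i];
    }
    return best;
}
```

With properties of severity 0.12, 0.30, and 0.01 and a threshold of 0.05, the first two are problems and the second is the bottleneck.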

  5. Example: Performance Property “Message in Wrong Order” [Figure: time-line diagram with axes Location (A, B, C) and Time, showing SEND and RECV events and a wait interval]

  6. The APART Test Suite (ATS) • Users rely on correct working of tools • Tools need to be especially well tested • Systematic approach needed • APART Test Suite • Common project inside the APART group • Every member needs this → minimize resources • Ensures re-usability • Will also allow evaluation / comparison of the different member projects • Main focus: automatic performance analysis tools • But also useful for “regular” performance tools • http://www.fz-juelich.de/apart/ats/

  7. Desired Functionality • Tests to determine whether the tool altered the semantics of the original program • Tests to see whether the recorded performance data is correct • Synthetic positive test cases for each known and defined performance property and combinations of them • Negative test cases which have no known performance problem • “Real world” size parallel applications and benchmarks • Can be partially based on existing validation suites → WWW • Probably needs to be tool specific • Collect available benchmarks and applications → WWW • Design and implementation of an ATS framework

  8. Validation Suites and Kernel Benchmarks (I) • Validation • MPI test / validation suites from Intel, IBM, ANL: http://www-unix.mcs.anl.gov/mpi/mpi-test/tsuite.html • MPI Benchmarks • PARKBENCH (PARallel Kernels and BENCHmarks): http://www.netlib.org/parkbench/ • PMB (Pallas MPI Benchmarks): http://www.pallas.com/e/products/pmb/ • SKaMPI (Special Karlsruher MPI Benchmark): http://liinwww.ira.uka.de/~skampi/

  9. Kernel Benchmarks (II) • OpenMP Benchmarks • EPCC OpenMP Microbenchmarks: http://www.epcc.ed.ac.uk/… research/openmpbench/openmp_index.html • Hybrid Benchmarks • The Los Alamos MicroBenchmarks Suite (LAMB): MPI and multithreading (Pthreads and OpenMP) programming models, based on SKaMPI and EPCC

  10. “Real World” Applications and Benchmarks The NAS Parallel Benchmarks (NPB) • http://www.nas.nasa.gov/Software/NPB/ The ASCI Purple and Blue Benchmark Codes • http://www.llnl.gov/… asci/purple/benchmarks/limited/code_list.html… asci_benchmarks/asci/asci_code_list.html NCAR Benchmarks • http://www.scd.ucar.edu/css/software/bench/

  11. Current Design of ATS Framework [Module diagram] • DISTRIBUTION: df_same(), df_cyclic2(), df_block2(), df_linear(), df_peak(), df_cyclic3(), df_block3() • WORK: do_work()

  12. The Distribution Module • A distribution is specified by • Distribution function • Distribution parameters • All distribution functions have the same signature • double distr_func(int me, int size, double sf, distr_t* dd) • me, size: member me of a group of size size • sf: scaling factor • dd: distribution parameter descriptor • Returns the value for me, calculated from me, size, and dd, scaled by sf • ATS provides a set of predefined distribution functions • Can easily be extended if needed
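A minimal sketch of what functions with this signature might look like follows. It is not the ATS source: the opaque `distr_t`, the `val1_distr_t` layout, and the function bodies are assumptions inferred from the function names; only the signature and the `low`/`high` fields of `val2_distr_t` appear on the slides.

```c
#include <assert.h>

/* Assumed layouts; the slides only name distr_t and val2_distr_t. */
typedef void distr_t;                                /* opaque parameter descriptor */
typedef struct { double val; }       val1_distr_t;   /* hypothetical */
typedef struct { double low, high; } val2_distr_t;

/* The common signature shared by all distribution functions. */
typedef double (*distr_func_t)(int me, int size, double sf, distr_t *dd);

/* Assumed behavior: every member gets the same value. */
double df_same(int me, int size, double sf, distr_t *dd) {
    (void)me; (void)size;
    return ((val1_distr_t *)dd)->val * sf;
}

/* Assumed behavior: even members get low, odd members get high. */
double df_cyclic2(int me, int size, double sf, distr_t *dd) {
    val2_distr_t *d = (val2_distr_t *)dd;
    (void)size;
    return (me % 2 == 0 ? d->low : d->high) * sf;
}

/* Assumed behavior: first half of the group gets low, second half gets high. */
double df_block2(int me, int size, double sf, distr_t *dd) {
    val2_distr_t *d = (val2_distr_t *)dd;
    assert(size > 0 && me >= 0 && me < size);
    return (me < size / 2 ? d->low : d->high) * sf;
}
```

Because all functions share one signature, a caller can hold any of them in a `distr_func_t` variable and swap distributions without touching the code that consumes the values.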

  13. Predefined Distribution Functions [Figure: value profiles of the distribution functions same, linear, peak, block2, cyclic2, block3, and cyclic3; parameter labels shown are low, med, high, val, and n]
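The remaining shapes can be sketched the same way. The parameter structs `peak_distr_t` and `val3_distr_t` and all function bodies below are guesses based on the function names and the figure labels (low, med, high, n); the real ATS implementations may differ.

```c
#include <assert.h>

typedef void distr_t;                                       /* assumed opaque descriptor */
typedef struct { double low, high; }        val2_distr_t;
typedef struct { double low, high; int n; } peak_distr_t;   /* hypothetical */
typedef struct { double low, med, high; }   val3_distr_t;   /* hypothetical */

/* Assumed: values rise linearly from low (member 0) to high (member size-1). */
double df_linear(int me, int size, double sf, distr_t *dd) {
    val2_distr_t *d = (val2_distr_t *)dd;
    assert(size > 0 && me >= 0 && me < size);
    if (size == 1)
        return d->low * sf;
    return (d->low + me * (d->high - d->low) / (size - 1)) * sf;
}

/* Assumed: member n gets high, every other member gets low. */
double df_peak(int me, int size, double sf, distr_t *dd) {
    peak_distr_t *d = (peak_distr_t *)dd;
    (void)size;
    return (me == d->n ? d->high : d->low) * sf;
}

/* Assumed: members cycle through low, med, high. */
double df_cyclic3(int me, int size, double sf, distr_t *dd) {
    val3_distr_t *d = (val3_distr_t *)dd;
    (void)size;
    switch (me % 3) {
        case 0:  return d->low * sf;
        case 1:  return d->med * sf;
        default: return d->high * sf;
    }
}
```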

  14. Current Design of ATS Framework [Module diagram] • MPI PROPERTIES • OpenMP PROPERTIES • MPI UTILS: par_do_mpi_work(), alloc_mpi_buf(), free_mpi_buf(), alloc_mpi_vbuf(), free_mpi_vbuf(), mpi_commpattern_sendrecv(), mpi_commpattern_shift() • OpenMP UTILS: par_do_omp_work() • DISTRIBUTION: df_same(), df_cyclic2(), df_block2(), df_linear(), df_peak(), df_cyclic3(), df_block3() • WORK: do_work()

  15. Example: MPI Property Function late_sender

          /* Each rank does the amount of work that df assigns to it. */
          void par_do_mpi_work(distr_func_t df, distr_t* dd, MPI_Comm c) {
            int me, sz;
            MPI_Comm_rank(c, &me);
            MPI_Comm_size(c, &sz);
            do_work(df(me, sz, 1.0, dd));
          }

          /* Repeats r times: unbalanced work, then a send/receive pattern. */
          void late_sender(double bwork, double ework, int r, MPI_Comm c) {
            val2_distr_t dd;
            int i;
            mpi_buf_t* buf = alloc_mpi_buf(base_type, base_cnt);

            dd.low  = bwork + ework;  /* ranks given the low value do the extra work */
            dd.high = bwork;          /* all other ranks do only the base work */
            for (i = 0; i < r; ++i) {
              par_do_mpi_work(df_cyclic2, &dd, c);
              mpi_commpattern_sendrecv(buf, DIR_UP, 0, 0, c);
            }
            free_mpi_buf(buf);
          }
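Why these parameters produce a late sender can be checked without MPI. The sketch below rests on two assumptions that the slide does not state: that `df_cyclic2` gives `low` to even ranks and `high` to odd ranks, and that `mpi_commpattern_sendrecv` with `DIR_UP` makes each even rank send to the next higher (odd) rank. It also ignores message transfer time. The helper names are hypothetical.

```c
#include <assert.h>

/* Assumed cyclic2 pattern: even ranks get low, odd ranks get high. */
static double cyclic2_work(int me, double low, double high) {
    return (me % 2 == 0) ? low : high;
}

/*
 * Estimated time that receiver recv_rank waits in one repetition of
 * late_sender(bwork, ework, ...), assuming rank recv_rank-1 sends to it.
 */
double late_sender_wait(int recv_rank, double bwork, double ework) {
    assert(recv_rank > 0 && recv_rank % 2 == 1);
    /* late_sender sets dd.low = bwork + ework and dd.high = bwork */
    double sender_work   = cyclic2_work(recv_rank - 1, bwork + ework, bwork);
    double receiver_work = cyclic2_work(recv_rank,     bwork + ework, bwork);
    double wait = sender_work - receiver_work;
    return wait > 0.0 ? wait : 0.0;
}
```

Under these assumptions the receiver finishes its base work first and then waits for roughly `ework`, so `ework` directly controls the severity a tool should report.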

  16. Currently Implemented Performance Property Functions • MPI Point-to-Point Communication Performance Properties • late_sender(basework, extrawork, rf, MPI_Comm); • late_receiver(basework, extrawork, rf, MPI_Comm); • MPI Collective Communication Performance Properties • imbalance_at_mpi_barrier(distr_func, distr_param, rf, MPI_Comm); • imbalance_at_mpi_alltoall(distr_func, distr_param, rf, MPI_Comm); • late_broadcast(basework, rootextrawork, root, rf, MPI_Comm); • late_scatter(basework, rootextrawork, root, rf, MPI_Comm); • late_scatterv(basework, rootextrawork, root, rf, MPI_Comm); • early_reduce(rootwork, baseextrawork, root, rf, MPI_Comm); • early_gather(rootwork, baseextrawork, root, rf, MPI_Comm); • early_gatherv(rootwork, baseextrawork, root, rf, MPI_Comm); • OpenMP Performance Properties • imbalance_in_parallel_region(distr_func, distr_param, rf); • imbalance_at_barrier(distr_func, distr_param, rf); • imbalance_in_loop(distr_func, distr_param, rf);

  17. Current Design of ATS Framework [Module diagram] • TEST PROGRAMS • MPI PROPERTIES • OpenMP PROPERTIES • MPI UTILS: par_do_mpi_work(), alloc_mpi_buf(), free_mpi_buf(), alloc_mpi_vbuf(), free_mpi_vbuf(), mpi_commpattern_sendrecv(), mpi_commpattern_shift() • OpenMP UTILS: par_do_omp_work() • DISTRIBUTION: df_same(), df_cyclic2(), df_block2(), df_linear(), df_peak(), df_cyclic3(), df_block3() • WORK: do_work()

  18. Performance Property Test Programs • Single performance property testing • Programs can be generated automatically from the performance property function signature • Generator based on Program Database Toolkit (PDT) • http://www.cs.uoregon.edu/research/paracomp/pdtoolkit/ • Property parameters become test program arguments • More extensive tests through scripting languages or an experiment management system (e.g., Zenturio) • http://www.par.univie.ac.at/project/zenturio/ • Composite performance property testing • Program containing multiple performance property functions • Complexity only limited by imagination • Currently: manually implemented

  19. Example: Single Performance Property Test Program

          #include <stdio.h>
          #include <stdlib.h>
          #include "mpi_pattern.h"

          int main(int argc, char *argv[]) {
            /* Defaults, overridden by the command-line arguments below. */
            distr_func_t df = atodf("b2:0.5:1.0");
            distr_t *dd = atodd("b2:0.5:1.0");
            int r = 1;

            MPI_Init(&argc, &argv);
            switch ( argc ) {
              case 3: r = atoi(argv[2]);      /* fall through */
              case 2: df = atodf(argv[1]);
                      dd = atodd(argv[1]);    /* fall through */
              case 1: break;
              default:
                fprintf(stderr, "usage: %s <distf> <rfac>\n", argv[0]);
                MPI_Finalize();
                return EXIT_FAILURE;          /* do not run with bad arguments */
            }
            imbalance_at_mpi_barrier(df, dd, r, MPI_COMM_WORLD);
            MPI_Finalize();
            return 0;
          }

  20. Example: Single Performance Property Test Program • Usage: imbalance_at_mpi_barrier <distribution-spec> <repetition-factor> • Example arguments: b2:0.5:1.0 2 and b2:0.1:2.0 5 • Problem: the additional property “MPI Setup/Termination Overhead” also holds!
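The distribution specification looks like a colon-separated string. The sketch below shows one way such a string could be parsed, assuming the format is `<name>:<low>:<high>` (with `b2` presumably standing for block2). `distr_spec_t` and `parse_distr_spec` are hypothetical names; this is not the ATS implementation of `atodf` or `atodd`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parsed form of a spec such as "b2:0.5:1.0". */
typedef struct {
    char   name[16];   /* short distribution name, e.g. "b2" */
    double low, high;  /* assumed meaning of the two numeric fields */
} distr_spec_t;

/* Parses "<name>:<low>:<high>"; returns 0 on success, -1 on malformed input. */
int parse_distr_spec(const char *s, distr_spec_t *out) {
    char tail;
    assert(out != NULL);
    memset(out, 0, sizeof *out);
    if (s == NULL)
        return -1;
    /* %15[^:] reads the name up to the first colon; a successful %c
       conversion means there was trailing junk after the last number. */
    if (sscanf(s, "%15[^:]:%lf:%lf%c",
               out->name, &out->low, &out->high, &tail) != 3)
        return -1;
    return 0;
}
```

Rejecting malformed strings up front keeps a mistyped argument from silently running the test with unintended work amounts.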

  21. Example: Collection of MPI Performance Properties

  22. Examples: Detail MPI Properties

  23. Example: MPI Properties in 2 Communicators

  24. EXPERT Analysis of MPI 2 Communicator Example

  25. Example: OpenMP Performance Property

  26. ATS: Status and Future Work • Initial prototype available from the APART website • List of MPI, OpenMP, and hybrid validation and benchmark suites • 1st version of ATS framework including • C version of code • Single property test program generator • Future Work • More complete collection of validation and benchmark suites • Real “real world” applications • ATS Framework • Fortran version • More complete list of property functions for MPI, OpenMP, hybrid, and sequential performance properties • Documentation
