CEPBA-Tools experiences with MRNet and Dyninst
Judit Gimenez, German Llort, Harald Servat (judit@cepba.upc.edu)
Outline
• CEPBA-Tools environment
• OpenMP instrumentation using Dyninst
• Tracing control through MRNet
• Our wish list
Where we live
Traceland… aiming at detailed analysis and flexibility in the tools
Importance of details
• Variance is important
  • Along time
  • Across processors
• Highly non-linear systems
• Microscopic effects are important
  • May have a large macroscopic impact
CEPBA-Tools
[Tool ecosystem diagram: tracing packages (MPtrace, OMPItrace, MPIDtrace, AIXtrace, LTTtrace, GPFStrace, plus Java/WAS and GT4 sources), converters (TraceDriver, trace2trace, aixtrace2prv, LTT2prv, GPFS2prv), the Nanos Compiler, Paraver traces (.prv/.pcf), Dimemas traces (.trf) and .cfg configurations, feeding the analysis and data display tools Paraver, Dimemas and Paramedir, which report metrics such as miss ratio, IPC, efficiency and bandwidth.]
CEPBA-Tools challenge
What can we say, in a short time, about an unknown application/system without looking at the source code?
OpenMP instrumentation
OMPtrace: instrumentation of OpenMP
• Insight on the application and on run-time scheduling
• Based on:
  • DiTools (SGI/IRIX): only calls to dynamic libraries
  • DPCL (IBM/AIX): functions and calls referenced within the binary
  • Dyninst (Itanium): functions and calls referenced within the binary (a sketch follows this list)
  • LD_PRELOAD (some Linux): only calls to dynamic libraries
• "Evolution" followed the available platforms, except for Itanium (NASA-Ames request)
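As an illustration of the Dyninst-based approach, the minimal sketch below (not the actual OMPtrace code) launches an application under a mutator, loads a hypothetical tracing library and inserts a probe call at the entry of a compiler-outlined parallel routine; the library name (libomptrace.so), probe name (OMPtrace_enter_parallel) and outlined-routine name are assumptions for the example.

```cpp
// Minimal Dyninst mutator sketch (illustrative, not the OMPtrace implementation).
#include "BPatch.h"
#include "BPatch_process.h"
#include "BPatch_image.h"
#include "BPatch_function.h"
#include "BPatch_point.h"
#include "BPatch_snippet.h"
#include <cstdio>

int main() {
    BPatch bpatch;

    // Launch the application under control of the mutator.
    const char *app_argv[] = { "./app", NULL };
    BPatch_process *proc = bpatch.processCreate("./app", app_argv);
    if (!proc) { fprintf(stderr, "cannot create process\n"); return 1; }

    // Make the tracing probes available inside the application (assumed library).
    proc->loadLibrary("libomptrace.so");

    BPatch_image *image = proc->getImage();

    // Locate the compiler-outlined parallel routine and the probe function.
    BPatch_Vector<BPatch_function *> region, probe;
    image->findFunction("_A_LN_par_region1", region);      // assumed symbol
    image->findFunction("OMPtrace_enter_parallel", probe); // assumed probe
    if (region.size() == 0 || probe.size() == 0) return 1;

    // Insert a call to the probe at every entry point of the outlined routine.
    BPatch_Vector<BPatch_point *> *entries = region[0]->findPoint(BPatch_entry);
    BPatch_Vector<BPatch_snippet *> args;   // probe takes no arguments here
    BPatch_funcCallExpr call(*probe[0], args);
    proc->insertSnippet(call, *entries);

    // Let the instrumented application run to completion.
    proc->continueExecution();
    while (!proc->isTerminated())
        bpatch.waitForStatusChange();
    return 0;
}
```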
OpenMP compilation and run time
[Diagram: the source program
  A() {
    !$omp parallel do
    do I=1,N
      loop body
    enddo
  }
is transformed by the compiler into a version of A() that invokes kmpc_fork_call, plus an outlined routine
  _A_LN_par_regionID {
    do I=start,end
      loop body
    enddo
  }
The OpenMP run time (libomp) dispatches the outlined routine on the worker threads and parks waiting threads in Idle().]
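The sketch below illustrates this outlining transformation in plain C++ with serial dispatch only; the names __fork_call and A_par_region1 are illustrative stand-ins, not the compiler's or runtime's actual symbols.

```cpp
// Illustrative sketch of compiler outlining for "!$omp parallel do"
// (serial dispatch; a real runtime forks worker threads via kmpc_fork_call).
#include <cstdio>

const int N = 100;
double data[N];

// Compiler-outlined body of the parallel loop: each thread receives its chunk.
static void A_par_region1(int start, int end) {
    for (int i = start; i < end; ++i)
        data[i] = 2.0 * i;                      // original loop body
}

// Stand-in for the runtime's fork call: split the iteration space and hand
// one chunk per "thread" to the outlined routine.
static void __fork_call(void (*outlined)(int, int), int n, int nthreads) {
    int chunk = (n + nthreads - 1) / nthreads;
    for (int t = 0; t < nthreads; ++t) {
        int start = t * chunk;
        int end = (start + chunk < n) ? start + chunk : n;
        outlined(start, end);                   // would run on thread t
    }
}

// What the user wrote as A() containing the "!$omp parallel do" loop.
void A() {
    __fork_call(A_par_region1, N, 4);
}

int main() {
    A();
    printf("data[10] = %g\n", data[10]);
    return 0;
}
```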
OpenMP instrumentation points
[Diagram: probe points 1-6 are placed at the entry/exit of the user function A(), of kmpc_fork_call and of the outlined routine _A_LN_par_regionID on the main and worker threads; point 7 marks the fork/join. Along the timeline each probe emits event records such as OMP_PAR,1 / USR_FCT,idA / PAR_FCT,A_LN_par_regionID ... PAR_FCT,0 / USR_FCT,0 / OMP_PAR,0, each accompanied by a hardware counter delta (HWCi, Delta).]
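A minimal sketch of what such probes could emit, in the spirit of the slide's OMP_PAR / USR_FCT / PAR_FCT event pairs; the event codes, emit() helper and counter reader are assumptions (a real tracer would read hardware counters, e.g. through PAPI).

```cpp
// Illustrative probes emitting (event, value) records with hardware counter deltas.
#include <cstdio>
#include <cstdint>

enum EventType { OMP_PAR = 1, USR_FCT = 2, PAR_FCT = 3 };

// Stand-in for reading a hardware counter.
static uint64_t read_hw_counter() { static uint64_t c = 0; return c += 1000; }

static uint64_t last_hwc = 0;

// Write one trace record: event type, value, and the counter delta since the
// previous record (the "HWCi, Delta" of the slide).
static void emit(EventType type, long value) {
    uint64_t now = read_hw_counter();
    printf("event=%d value=%ld hwc_delta=%llu\n",
           type, value, (unsigned long long)(now - last_hwc));
    last_hwc = now;
}

// Probe pairs as they would be called from the instrumentation points.
void probe_enter_parallel()            { emit(OMP_PAR, 1); }
void probe_exit_parallel()             { emit(OMP_PAR, 0); }
void probe_enter_user_function(long f) { emit(USR_FCT, f); }
void probe_exit_user_function()        { emit(USR_FCT, 0); }
void probe_enter_outlined(long region) { emit(PAR_FCT, region); }
void probe_exit_outlined()             { emit(PAR_FCT, 0); }

int main() {
    // Event sequence for one parallel region, as in the slide's timeline.
    probe_enter_parallel();
    probe_enter_user_function(42);      // idA
    probe_enter_outlined(1);            // _A_LN_par_regionID
    probe_exit_outlined();
    probe_exit_user_function();
    probe_exit_parallel();
    return 0;
}
```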
Instrumentation @ CEPBA-Tools
The issue:
• Sufficient information / sufficiently detailed
• Usable by the presentation tool
The environment evolution (1991-2007):
• From a few processes to 10,000
• Instrumenting hours of execution
• Including more and more information: hardware counters, call stack, network counters, system resource usage, MPI collective internals...
• ...from traces of a few MB to hundreds of GB
Scalability of tracing
Techniques for achieving scalability:
• User-specified on/off
• Limit file size (stop when reached, circular buffer)
• Only computing bursts + counters + statistics
• Library summarization (software counters for MPI_Iprobe / MPI_Test; a sketch follows this list)
• trace2trace utilities
• Partial views
• ... towards an autonomic tracing library
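As an illustration of software-counter summarization, the sketch below (an assumed approach, not the MPItrace source) interposes MPI_Iprobe through the standard PMPI profiling interface and emits a single summarized record for a whole busy-wait burst instead of one event per call; the emitter is a hypothetical stand-in.

```cpp
// Sketch of software-counter summarization for MPI_Iprobe via PMPI (illustrative).
#include <mpi.h>
#include <cstdio>

// Accumulated "software counters" instead of one trace event per call.
static long iprobe_calls = 0;       // total MPI_Iprobe invocations
static long iprobe_misses = 0;      // invocations that found no message

// Hypothetical emitter: would append one summarized record to the tracing
// buffer; here it just prints.
static void emit_software_counters(void) {
    printf("MPI_Iprobe: calls=%ld misses=%ld\n", iprobe_calls, iprobe_misses);
}

// Interposed wrapper: count the call, forward to the real routine, and only
// flush a summary when a message is finally found.
extern "C" int MPI_Iprobe(int source, int tag, MPI_Comm comm,
                          int *flag, MPI_Status *status) {
    int ret = PMPI_Iprobe(source, tag, comm, flag, status);
    ++iprobe_calls;
    if (*flag == 0) {
        ++iprobe_misses;            // busy-wait iteration: no event emitted
    } else {
        emit_software_counters();   // summarize the whole busy-wait burst
        iprobe_calls = iprobe_misses = 0;
    }
    return ret;
}
```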
MPItrace + MRNet
[Architecture diagram: the instrumented MPI processes are connected through an MRNet tree to a front-end running on the user's login node.]
First target with MRNet
A real problem scenario on MareNostrum:
• Some large runs occasionally show very long, degraded collectives
• Instrumenting the full run, including details of the collectives implementation, would produce a huge trace
Solution: MPItrace + MRNet
• Control which information is flushed to disk
• Discard all the details except those related to the large collectives
Implementation
• Instrumenting into a circular buffer
• Periodically:
  • The MRNet front-end requests information on the durations of the collectives
  • The "spy" thread:
    • Stops the main thread
    • Analyzes the tracing buffer
    • Collects information on the collectives
    • Sends details on their range and duration
  • The root sends back a selection mask
  • The "spy" thread:
    • Flushes the selected data to disk
    • Resumes the application
(A sketch of the buffer scan and mask-driven flush follows.)
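A minimal sketch of this step, assuming a fixed-size buffer of collective records and hiding the MRNet exchange of durations and masks behind a hypothetical helper (exchange_with_frontend); the 35 ms threshold mirrors the CPMD example on the next slide.

```cpp
// Sketch of the "spy" thread's buffer scan and mask-driven flush (illustrative;
// the MRNet send/receive is hidden behind a hypothetical helper).
#include <cstdio>
#include <vector>

struct CollectiveRecord {
    int    id;           // position in the circular buffer
    double duration_ms;  // measured duration of the collective
};

// Hypothetical exchange: ship durations to the front-end, get a 0/1 mask back.
// Here the "root's decision" is emulated locally with a duration threshold.
std::vector<int> exchange_with_frontend(const std::vector<CollectiveRecord> &recs) {
    std::vector<int> mask(recs.size(), 0);
    for (size_t i = 0; i < recs.size(); ++i)
        mask[i] = (recs[i].duration_ms >= 35.0) ? 1 : 0;
    return mask;
}

// Periodic spy-thread step: scan the buffer, obtain the mask, flush only the
// selected records, then let the application continue.
void spy_thread_step(const std::vector<CollectiveRecord> &buffer) {
    // (the main thread is stopped here in the real implementation)
    std::vector<int> mask = exchange_with_frontend(buffer);
    for (size_t i = 0; i < buffer.size(); ++i)
        if (mask[i])
            printf("flush record %d (%.1f ms)\n", buffer[i].id, buffer[i].duration_ms);
    // (the main thread is resumed here)
}

int main() {
    std::vector<CollectiveRecord> buffer = { {0, 10.0}, {1, 300.0}, {2, 12.5} };
    spy_thread_step(buffer);
    return 0;
}
```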
First traces – CPMD
[Results: the full trace is 245 MB with >15,500 collectives; filtering with LIMIT >= 35 ms keeps <85 collectives, yielding traces of <1 MB and 25 MB.]
Next steps for MPItrace + MRNet
• Analysis of MRNet
  • Evaluate the impact of topology / mapping
• Library control: maximum information, minimum data
  • Automatic switching driven by on-line analysis
    • Tracing level, type of data (counter set, instrumentation points), on/off
  • Clustering, periodicity detection
Our wish list
• Dyninst
  • Support for MPI+OpenMP instrumentation
  • Availability for PowerPC
• MRNet
  • Automatically compute the best topology based on the available resources
    • Maybe considering user preferences about mapping, dispersion degree (fan-out)...
  • Improve MRNet integration with MPI applications