MPI 3 Tools Working Group April 2009 MPI Forum Meeting
Tools Working Group
• Teleconferences every 2 weeks
• Most activity on the teleconferences
• Limited activity on the lists
• Proposals and some activity on the wikis (Martin takes excellent teleconf notes)
• https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/MPI3Tools
Working projects
• Debugger interfaces
  • Standardization of debugger attaching to MPI job
    • “MPIR” interface
    • MPI-2 dynamic process acquisition (future)
• State profiling / introspection interface
• More debugger interfaces (covered today)
  • MPI message queue debugging
  • MPI handle debugging proposal
Debugger Attach
• “MPIR” interface
  • Published in a Euro PVM/MPI paper
  • Many MPIs and debuggers implement it
  • Can attach at job launch or after the job is running
  • Discussion of standardizing “something better”
• MPI-2 dynamic job acquisition
  • Published in a Euro PVM/MPI paper
  • Implemented in TotalView, but nowhere else
State profiling
• Proposal from Sun
• Have not really talked about this yet
• Statistical sampling of MPI process state
• Many more details on the Tools wiki (I won’t do it justice here)
MPI Process Debugging
• Abstract architecture shown
• DLLs provided by MPI implementation “plug in” to debugger
• Debugger calls functions in DLL
  • Probes the memory of an MPI process
  • Returns information to the debugger
[Diagram: Debugger, MPI DLL, and four MPI Processes]
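The plug-in relationship can be sketched in C. This is only an illustration under stated assumptions, not the actual interface: the callback table `debugger_callbacks`, the function `dll_read_comm_size`, and the `fake_process` type that simulates process memory are hypothetical names invented here. The sketch shows the direction of the calls: the debugger calls into the DLL, and the DLL asks the debugger to read the MPI process's memory on its behalf.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback table the debugger hands to the DLL. */
typedef struct {
    /* Copy `len` bytes at target address `addr` into `buf`; 0 on success. */
    int (*read_memory)(void *process, uint64_t addr, void *buf, size_t len);
} debugger_callbacks;

/* Stand-in for one MPI process: a byte array plays the role of its memory. */
typedef struct {
    unsigned char memory[64];
} fake_process;

/* Debugger side: serve read requests from the fake process image. */
int fake_read_memory(void *process, uint64_t addr, void *buf, size_t len)
{
    fake_process *p = process;
    if (addr > sizeof p->memory || len > sizeof p->memory - (size_t)addr)
        return -1;                      /* read falls outside the image */
    memcpy(buf, p->memory + addr, len);
    return 0;
}

/* DLL side: fetch an int32 field (say, a communicator size) stored at
 * `addr` in the target process. The DLL never dereferences `addr` itself;
 * it always goes through the debugger's callback. */
int dll_read_comm_size(const debugger_callbacks *cb, void *process,
                       uint64_t addr, int32_t *size_out)
{
    return cb->read_memory(process, addr, size_out, sizeof *size_out);
}
```

Routing every read through a callback is what would let one DLL work with a live process, a remote process, or a core file, since only the debugger knows how to reach the target.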
3 Proposals
• How to find MPI DLLs
• New proposal for MPI handle debugging
• Standardizing existing MPI message queue debugger interface
  • Slides from Chris Gottbrath, TotalView
How the debugger finds DLLs
• Current method
  • “MPI_dll_name” (string) symbol examined in MPI process by debugger sometime during startup
  • Specifies the filename of the DLL to load
• Shortcomings
  • Only allows the MPI to specify one DLL filename
  • No specification set for when the debugger will examine this symbol’s value
    • Therefore, the DLL filename is specified at compile time
How the debugger finds DLLs
• New proposal
  • “mpimsgq_dll_locations” symbol name (for the message queue DLL)
  • If the symbol is not present, the MPI does not support it
  • If the symbol is NULL, try again later
  • If the symbol is non-NULL, it is an argv-style string array (i.e., last entry in the array is a NULL pointer)
  • Debugger traverses all strings in the array (in order) and tries to dlopen (etc.) each of them
    • Not an error if the dlopen fails
    • Stops at the first DLL that successfully opens and is a match for the debugger’s version, etc.
  • Fall back to old method if nothing works: “MPI_dll_name” symbol (backward compatibility)
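The search order in the proposal can be sketched as a small C function. Everything named here is an assumption made for illustration: the result codes, the `try_load_fn` hook, and `find_msgq_dll` do not come from the proposal. The sketch also assumes the debugger has already looked up the symbol and copied the string array out of the MPI process; the "symbol not present" case would be handled before this function is called. A real debugger's hook would wrap dlopen and its own version check. The demo hook only simulates a 64-bit debugger so that the example runs without any real DLLs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome codes for this sketch; the proposal defines none. */
typedef enum {
    DLL_LOADED,     /* a usable DLL was found                      */
    DLL_TRY_LATER,  /* symbol present but its value is still NULL  */
    DLL_FALL_BACK   /* nothing usable: fall back to "MPI_dll_name" */
} dll_search_result;

/* Hypothetical loader hook. A real debugger would dlopen() the path and
 * check that the DLL matches its version, bitness, etc. Returns a
 * non-NULL handle on success and NULL on any failure. */
typedef void *(*try_load_fn)(const char *path);

/* Walk the NULL-terminated array in order; stop at the first usable DLL. */
dll_search_result find_msgq_dll(const char *const *locations,
                                try_load_fn try_load, void **handle_out)
{
    *handle_out = NULL;
    if (locations == NULL)
        return DLL_TRY_LATER;
    for (size_t i = 0; locations[i] != NULL; i++) {
        void *handle = try_load(locations[i]);
        if (handle != NULL) {           /* a failed load is not an error */
            *handle_out = handle;
            return DLL_LOADED;
        }
    }
    return DLL_FALL_BACK;
}

/* Demo hook standing in for a 64-bit debugger: it "loads" only paths under
 * a lib64 directory and returns the path itself as a dummy handle. */
void *demo_try_load_64bit(const char *path)
{
    return strstr(path, "/lib64/") != NULL ? (void *)path : NULL;
}
```

With the two-entry array `{"/opt/ct7.1/lib/openmpi/libompitv.so", "/opt/ct7.1/lib64/openmpi/libompitv.so", NULL}`, the demo hook rejects the first path and accepts the second, so the search returns `DLL_LOADED` with the second entry as the handle.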
DLL location proposal
• Advantages
  • Allows MPI to specify multiple DLLs (MPI does not know the bitness of the debugger when MPI applications are compiled)
  • MPIs can ship multiple DLLs for different architectures
  • MPI can pick eligible plugins at run-time
  • No naming restrictions on filenames

    mpimsgq_dll_locations[0] = "/opt/ct7.1/lib/openmpi/libompitv.so";
    mpimsgq_dll_locations[1] = "/opt/ct7.1/lib64/openmpi/libompitv.so";
    mpimsgq_dll_locations[2] = NULL;
MPI Handle Debugging
• Simple premise
  • Debuggers currently show the value of an MPI handle
    • Pointer or integer value
  • What if the debugger could show more than that?
• Example information for a communicator
  • Type (inter, intra, cart, graph), intrinsic or not
  • Type characteristics: edges, nodes, remote procs, etc.
  • Rank, size, name
  • Attributes
  • Pending MPI_Requests
  • Derived MPI_Files, MPI_Wins
  • …?
MPI Handle Debugging
• DLL presents a type map during startup
  • Identifies each MPI handle type to the debugger
  • Fortran is problematic: handles are plain INTEGERs
• When debugger finds an MPI handle value
  • Downcalls into the DLL to ask for information
  • C++ and Fortran handles are translated to C handles (DLL equivalent of MPI_*_f2c calls)
• Debugger can present this information in its interface
  • Nice GUI panel (or whatever)
  • Can use all the information or not
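One way to picture the type map is as a lookup table from type names to handle categories. The sketch below is an assumption about its shape, not the proposed format: `handle_kind`, `type_map_entry`, and `classify_type` are invented names, and the table lists only the four types the prototype is said to cover. It also shows why Fortran is awkward: a variable declared INTEGER cannot be told apart from an ordinary integer by its type name alone.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handle categories, limited to the types the prototype covers. */
typedef enum {
    HANDLE_NONE,        /* not recognised as an MPI handle type */
    HANDLE_COMM,
    HANDLE_ERRHANDLER,
    HANDLE_STATUS,
    HANDLE_REQUEST
} handle_kind;

/* One row of the (hypothetical) type map the DLL presents at startup. */
typedef struct {
    const char *type_name;  /* type name as the debugger sees it */
    handle_kind kind;
} type_map_entry;

static const type_map_entry type_map[] = {
    { "MPI_Comm",       HANDLE_COMM },
    { "MPI_Errhandler", HANDLE_ERRHANDLER },
    { "MPI_Status",     HANDLE_STATUS },
    { "MPI_Request",    HANDLE_REQUEST },
};

/* Debugger side: decide whether a variable's declared type is an MPI
 * handle type. Unknown names, including Fortran's INTEGER, map to
 * HANDLE_NONE because the name alone carries no MPI meaning. */
handle_kind classify_type(const char *type_name)
{
    size_t n = sizeof type_map / sizeof type_map[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(type_map[i].type_name, type_name) == 0)
            return type_map[i].kind;
    }
    return HANDLE_NONE;
}
```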
MPI Handle Debugging
• DLLs can provide some or all of the information
  • E.g., may not have list of pending requests on a communicator
  • “Safe” ways for the DLL to say “not supported”
• Prototype proposal
  • Awaiting community / Forum feedback
  • Defined communicators, errhandlers, statuses, requests
  • Currently one DLL query function for each type
    • Returns “One Big Struct” with all the information
  • Will change the “One Big Struct” (OBS) to many query functions, each returning intrinsic data types
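The planned move from one large struct to many small queries might look roughly like the following. This is a sketch under assumptions, not the prototype's API: `query_status`, `dbg_comm`, and the three `dbg_comm_*` functions are hypothetical names. Each query returns a status code and writes one intrinsic value through an output pointer, so a DLL that lacks some piece of information can answer "not supported" without disturbing the caller's data.

```c
#include <stddef.h>

/* Hypothetical status codes; "not supported" is a normal answer, not an error. */
typedef enum {
    QUERY_OK = 0,
    QUERY_NOT_SUPPORTED = 1
} query_status;

/* Stand-in for whatever the DLL knows about one communicator. */
typedef struct {
    int rank;
    int size;
    const char *name;
} dbg_comm;

/* One small query per item, each returning an intrinsic type. */
query_status dbg_comm_rank(const dbg_comm *comm, int *rank_out)
{
    *rank_out = comm->rank;
    return QUERY_OK;
}

query_status dbg_comm_size(const dbg_comm *comm, int *size_out)
{
    *size_out = comm->size;
    return QUERY_OK;
}

/* This imaginary MPI keeps no per-communicator list of pending requests,
 * so the DLL declines and leaves the output untouched. */
query_status dbg_comm_pending_requests(const dbg_comm *comm, int *count_out)
{
    (void)comm;
    (void)count_out;
    return QUERY_NOT_SUPPORTED;
}
```

A debugger built this way can show whatever fields come back with `QUERY_OK` and simply omit the rest, which matches the "can use all the information or not" goal.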