TDB: THE INTERACTIVE DISTRIBUTED DEBUGGING TOOL FOR PARALLEL MPI PROGRAMS
Authors: A. Adamovich, M. Kovalenko (RCMS PSI RAS, Pereslavl-Zalessky, Russia)
History of the Development
• T-system, RCMS PSI RAS, since the early 1990s
• The SKIF project of the Russia-Belarus Union State, 2000-2004
T-system and its environment:
• T-system (industrial version);
• the TGCC compiler;
• the TDB interactive debugging system;
• and others.
Objectives of the Development
• Support of software design and development using computing systems of the SKIF family
  • an element of the integrated toolkit;
  • directed towards T-system support.
• Cost-effectiveness
  • reduced expenses for purchasing and maintaining the SKIF computing system
• Information independence
Predecessors and Analogues
• P2D2 (Portable Debugger for Parallel and Distributed Programs, NASA, 1994, Doreen Cheng, Robert Hood)
• TotalView (Etnus)
• DDT (Distributed Debugging Tool, Streamline Computing)
Basic Architecture Principles
The TDB architecture:
• distributed and multi-component
• open and portable
• flexible
• multi-user
The TDB Architecture: Distributed and Multi-component
1) The primary daemon
2) The secondary daemon
3) The central server
4) The client component
5) The debugging server
The TDB Architecture (2/2)
Flexible
• uses free software:
  • ACE, libxml++, libpcre, libgtk2.x, scintilla, gnome-debug-tdb (based on gnome-debug)
• can also use commercial products, for example system debuggers
TDB Features
• Debugs C, C++, and Fortran programs
• Linux for 32-bit or 64-bit processors
• Debugs parallel MPI programs
• Supported MPI implementations: LAM, MPICH, SCAMPI, MP-MPICH, DMPI
• Advanced job launch methods
• Monitoring of the states of target nodes
• Multi-user support
TDB Features
• One-touch breakpoint setting and manipulation
• Step into, over, or out of functions
• Watchpoints
• One-touch symbolic display
• Control of processes individually or collectively
• Color-coded process and node states
• Log files
TDB Features
• Groups
  • Group processes using a flexible definition language
  • Two types of groups are supported:
    • static groups and
    • dynamic groups
  • Control grouped processes like single processes (step, next, stop...) with real-time visual feedback
  • Special group commands:
    • group breakpoint,
    • group display
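The slide does not define how static and dynamic groups differ, so the following is only a rough sketch under an assumption: a static group is taken to be a fixed list of process identifiers, while a dynamic group's membership is recomputed from a condition each time it is queried. The class names, the "node:process" identifier form, and the process attributes are all hypothetical and are not TDB's definition language.

```python
class StaticGroup:
    """Assumed semantics: membership is fixed when the group is defined."""

    def __init__(self, name, member_ids):
        self.name = name
        self.member_ids = frozenset(member_ids)

    def members(self, processes):
        # Only processes that still exist are reported.
        return sorted(pid for pid in processes if pid in self.member_ids)


class DynamicGroup:
    """Assumed semantics: membership is recomputed from a condition on each query."""

    def __init__(self, name, condition):
        self.name = name
        self.condition = condition

    def members(self, processes):
        return sorted(pid for pid, info in processes.items() if self.condition(info))


# Hypothetical process table keyed by "node:process" identifiers.
processes = {
    "0:0": {"rank": 0, "state": "stopped"},
    "1:0": {"rank": 1, "state": "running"},
    "2:0": {"rank": 2, "state": "stopped"},
}

pair = StaticGroup("pair", ["0:0", "1:0"])
stopped = DynamicGroup("stopped", lambda info: info["state"] == "stopped")

print(pair.members(processes))
print(stopped.members(processes))
```

Under this reading, a dynamic group picks up a process as soon as it starts to satisfy the condition, whereas a static group never changes after it is defined.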
TDB Features
• Two process control modes:
  • active process control mode
  • group control mode
• Two GTDB operational modes:
  • active process / active group debugging mode
  • per-process debugging mode
TDB Features
• Special support for parallelizing systems:
  • T-system support:
    • Special commands t-break, t-print…
GTDB (TDB GUI client) windows and component features
• Main window:
  • Active Process window
  • Source Code display with breakpoints
  • Command buttons
  • Command component
  • Active process / Active group selection component
GTDB windows and component features
• GUI component for per-process debugging:
  • With GUI features that make the status of processes and MPI nodes easy to read
  • With the ability to pick any one of the processes
  • Full-featured subcomponent for debugging processes, similar to the main subcomponent for debugging the active process
• MPI-node/process states window, also used for selecting processes to inspect
GTDB windows and component features
• Breakpoint manipulation component window
• Configuration / Properties component window
• Various pop-up menus used for:
  • inspecting and manipulating the selected expression (print, display, watchpoints, value set...)
  • execution control (breakpoints set, disable, delete...)
GTDB – TDB Client Component
• intuitive interface and ergonomic design
• information is presented in a handy, convenient way
GTDB Node Selection Component
The user can select the exact set of computational nodes that are available for debugging MPI tasks. The list of all nodes available for MPI task debugging is obtained by sending a request to the TDB daemons. The primary TDB daemon runs on the front end, and secondary TDB daemons run on the computational nodes of the cluster. The TDB daemons act as monitor processes: the secondary daemons collect status information about the computational nodes, and the primary daemon accumulates it.
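The collect-and-accumulate flow described above can be illustrated with a small in-process model. This is only a sketch: the class names, the report fields, and the `probe` callable are hypothetical, and nothing here reflects TDB's actual daemon protocol or transport.

```python
class SecondaryDaemon:
    """Stands in for a monitor process on one computational node (hypothetical)."""

    def __init__(self, node, probe):
        self.node = node
        self.probe = probe  # callable returning True if the node can host debug tasks

    def collect(self):
        # Gather the status of this node into a report.
        return {"node": self.node, "available": bool(self.probe())}


class PrimaryDaemon:
    """Stands in for the monitor process on the front end (hypothetical)."""

    def __init__(self):
        self.reports = {}

    def accumulate(self, report):
        # Keep only the latest report per node.
        self.reports[report["node"]] = report

    def available_nodes(self):
        # What a client would receive when it asks for nodes it may debug on.
        return sorted(node for node, r in self.reports.items() if r["available"])


primary = PrimaryDaemon()
secondaries = [
    SecondaryDaemon("node1", lambda: True),
    SecondaryDaemon("node2", lambda: False),
    SecondaryDaemon("node3", lambda: True),
]
for daemon in secondaries:
    primary.accumulate(daemon.collect())

print(primary.available_nodes())  # ['node1', 'node3']
```

In this model the client talks only to the primary daemon, which matches the description of the primary daemon as the single point where node status is accumulated.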
GTDB Properties Component
Used to configure various TDB, GTDB, and MPI implementation settings.
GTDB Nodes Status Component
Shows the statuses of the processes on each MPI node:
• Green marks running processes
• Yellow marks stopped processes
• Red marks processes that have been stopped or terminated by a signal
Upper bar: common MPI-node status
• Green: all processes of the node are running
• Yellow: at least one of the processes is stopped
• Red: at least one process caught a signal
The common status bar gives the user a simpler and clearer overview of the state of the processes being debugged. All status subcomponents are implemented as button widgets: clicking one opens the corresponding process (or processes) for individual exploration in the PROCS GTDB mode.
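The rule for the upper bar can be written as a short function. This is a minimal sketch, not GTDB code: the `Status` enum and `node_status` function are hypothetical names, and two details are assumptions the slide does not state, namely that red takes precedence over yellow when a node has both a stopped and a signaled process, and that a node with no processes is shown as green.

```python
from enum import Enum


class Status(Enum):
    RUNNING = "green"
    STOPPED = "yellow"
    SIGNALED = "red"


def node_status(process_statuses):
    """Summarize the process statuses of one MPI node into the upper-bar color."""
    statuses = list(process_statuses)
    # Assumption: a signaled process outranks a merely stopped one.
    if any(s is Status.SIGNALED for s in statuses):
        return Status.SIGNALED
    if any(s is Status.STOPPED for s in statuses):
        return Status.STOPPED
    # All processes are running (assumed to hold for an empty node as well).
    return Status.RUNNING


print(node_status([Status.RUNNING, Status.STOPPED]).value)
```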
GTDB Breakpoints Component
The component is used to work with the various types of breakpoints supported in TDB:
• source line breakpoints,
• function breakpoints and
• watchpoints;
all of them may have conditions.
TDB also implements a special type of breakpoint, the so-called “group breakpoint”. A group breakpoint lets the user set a number of uniform breakpoints in a group of parallel processes. The user can set, delete, disable, or enable a group breakpoint with one command or click.
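One way to picture a group breakpoint is as a single handle that fans each operation out to every process in the group. The sketch below models only that idea; the `Process`, `Breakpoint`, and `GroupBreakpoint` classes, the "node:process" identifiers, and the `main.c:42` location are hypothetical and do not describe how TDB implements the feature.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Breakpoint:
    location: str
    condition: Optional[str] = None
    enabled: bool = True


@dataclass
class Process:
    pid: str
    breakpoints: Dict[str, Breakpoint] = field(default_factory=dict)


class GroupBreakpoint:
    """One handle that applies the same breakpoint to every process in a group."""

    def __init__(self, processes, location, condition=None):
        self.processes = list(processes)
        self.location = location
        for proc in self.processes:
            # Each process gets its own copy of the uniform breakpoint.
            proc.breakpoints[location] = Breakpoint(location, condition)

    def set_enabled(self, enabled):
        for proc in self.processes:
            bp = proc.breakpoints.get(self.location)
            if bp is not None:
                bp.enabled = enabled

    def delete(self):
        for proc in self.processes:
            proc.breakpoints.pop(self.location, None)


group = [Process("0:0"), Process("1:0"), Process("2:0")]
gb = GroupBreakpoint(group, "main.c:42", condition="i == 0")
gb.set_enabled(False)  # one call disables the breakpoint in all three processes
```

Keeping a separate `Breakpoint` object per process leaves room for a process to be inspected or adjusted individually while the group handle still covers the common case.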
The Main GTDB Window. Sample Debug Session
GTDB in the MAIN -> PROC mode. Process 2:0 is the active (selected, currently explored) process...
Example Debug Session: Debugging a Simple MPI Program
Example of defining dynamic groups using the "dgroup" command
Example Debug Session: Debugging a Simple MPI Program
We continue the execution of the processes in the masters dynamic group, and they then stop at the previously set breakpoints in the loop.
Example Debug Session: Debugging a Simple MPI Program
As we can see, the ‘i’ variable equals zero on all processes in the masters group (the "print" command was applied to the masters group). To get out of the loop, we set the ‘i’ variable to 1 on all masters.
We continue execution of the processes in the masters group but, after the loop, execution is stopped by the SIGSEGV signal.
Per Procs GTDB Debugging Mode
In the Main mode the user can work with one selected (active) process or group. In the Procs mode the user can examine any process individually. The component is implemented as two “notebooks”, one nested inside the other. The first (outer, placed vertically) notebook is the MPI-nodes notebook. Its tabs contain information about the corresponding processes and the common MPI-node statuses, colored as in the nodes status component. The second (inner, placed horizontally) notebook is a notebook of processes...
Contacts
• Max Kovalenko, madmax@botik.ru
• Alexei Adamovich, lexa@botik.ru
• Sergei Abramov, abram@botik.ru