Scheduling Considerations for building Dynamic Verification Tools for MPI Sarvani Vakkalanka, Michael DeLisi, Ganesh Gopalakrishnan, Robert M. Kirby School of Computing, University of Utah, Salt Lake City Supported by Microsoft HPC Institutes, NSF CNS-0509379 http://www.cs.utah.edu/formal_verification
Background The scientific community is increasingly employing expensive supercomputers driven by distributed programming libraries… (BlueGene/L - Image courtesy of IBM / LLNL) (Image courtesy of Steve Parker, CSAFE, Utah) …to program large-scale simulations in all walks of science, engineering, math, economics, etc.
Current Programming Realities • Code written using mature libraries (MPI, OpenMP, PThreads, …) • API calls made from real programming languages (C, Fortran, C++) • Runtime semantics determined by realistic compilers and runtimes • How best to verify codes that will run on actual platforms?
Classical Model Checking Extract a finite-state model of the concurrent program, then check properties on the model. Extraction of finite-state models for realistic programs is difficult.
Dynamic Verification (Actual Concurrent Program → Check Properties) • Avoid model extraction, which can be tedious and imprecise • The program serves as its own model • Reduce complexity through reduction of interleavings (and other methods)
Dynamic Verification (Actual Concurrent Program + One Specific Test Harness → Check Properties) • Need a test harness in order to run the code • Will explore ONLY RELEVANT INTERLEAVINGS (all Mazurkiewicz traces) for the given test harness • Conventional testing tools cannot do this !! • E.g. 5 threads, 5 instructions each: ~10^10 interleavings !!
Dynamic Verification (Actual Concurrent Program + One Specific Test Harness → Check Properties) • Need to consider all test harnesses • FOR MANY PROGRAMS, this number seems small (e.g. Hypergraph Partitioner)
Related Work • Dynamic verification tools: • CHESS • Verisoft (POPL '97) • DPOR (POPL '05) • JPF • ISP is similar to CHESS and DPOR
Dynamic Partial Order Reduction (DPOR) [Figure: three processes P0, P1, P2, each executing lock(x) … unlock(x) (lock events L0/L1/L2, unlock events U0/U1/U2); DPOR explores only the distinct orders in which the lock can be acquired]
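As a concrete instance of the program sketched above, here is a minimal C/Pthreads version (the shared counter and the exact thread count are illustrative, not from the slides): three threads each acquire and release the same mutex, so DPOR needs to explore only the 3! = 6 distinct lock-acquisition orders rather than every instruction interleaving.

/* Minimal sketch of the DPOR example: three threads, each doing
 * lock(x); ...critical section...; unlock(x).  Only the order in
 * which the threads acquire the lock is relevant, so DPOR explores
 * the 6 acquisition orders instead of all instruction interleavings. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t x = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;            /* illustrative shared state */

static void *worker(void *arg)
{
    long id = (long)arg;
    pthread_mutex_lock(&x);        /* L0 / L1 / L2 on the slide */
    counter++;                     /* the "....." critical section */
    printf("thread %ld in critical section\n", id);
    pthread_mutex_unlock(&x);      /* U0 / U1 / U2 on the slide */
    return NULL;
}

int main(void)
{
    pthread_t t[3];
    for (long i = 0; i < 3; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    printf("counter = %d\n", counter);
    return 0;
}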
ISP [Architecture: the executable's processes Proc1, Proc2, …, Procn run over a Profiler that communicates with the Scheduler; the Scheduler drives the MPI Runtime] • Run the MPI program, manifesting only/all relevant interleavings (DPOR) • Manifest ALL relevant interleavings of the MPI Progress Engine: done by DYNAMIC REWRITING of WILDCARD Receives.
Using PMPI [Figure: P0's call stack. The user function calls MPI_Send; the profiler's MPI_Send wrapper reports the call's envelope (P0: MPI_Send) to the Scheduler over a TCP socket, then issues PMPI_Send, which executes inside the MPI Runtime]
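The PMPI profiling interface is what makes this interposition possible: a wrapper library defines MPI_Send itself and forwards to PMPI_Send, so neither the application nor the MPI library needs to change. Below is a minimal sketch of such a wrapper, compiled into a library linked ahead of the MPI library; notify_scheduler_and_wait is a hypothetical placeholder for ISP's scheduler handshake over TCP, not its actual API.

/* Sketch of a PMPI interposition layer in the spirit of the slide:
 * MPI_Send is intercepted, the call's envelope is reported to the
 * scheduler, and only then is the real operation issued via PMPI_Send. */
#include <mpi.h>
#include <stdio.h>

/* Hypothetical stand-in for ISP's scheduler handshake (the real tool
 * talks to the scheduler over a TCP socket and blocks until "go"). */
static void notify_scheduler_and_wait(const char *op, int dest, int tag)
{
    fprintf(stderr, "[profiler] %s envelope: dest=%d tag=%d\n", op, dest, tag);
    /* ... block here until the scheduler replies ... */
}

int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    notify_scheduler_and_wait("MPI_Send", dest, tag);
    return PMPI_Send(buf, count, datatype, dest, tag, comm);  /* into the MPI runtime */
}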
DPOR and MPI • Implemented an implicit deadlock detection technique from a single program trace. • Issues with the MPI progress engine for wildcard receives could not be resolved. • More details can be found in our CAV 2008 paper: "Dynamic Verification of MPI Programs with Reductions in Presence of Split Operations and Relaxed Orderings"
POE [Step 1: the Scheduler collects operations from each process, sending sendNext to let them advance: P0: Isend(1, req); Barrier; Wait(req) — P1: Irecv(*, req); Barrier; Recv(2); Wait(req) — P2: Barrier; Isend(1, req); Wait(req). Nothing is issued into the MPI Runtime yet.]
POE [Step 2: the Scheduler now holds P0's Isend(1), P1's wildcard Irecv(*), and Barrier calls from all three processes; the Barriers form a complete match set, while the wildcard receive is held back until its potential senders are known.]
POE [Step 3: the matched Barriers are issued into the MPI Runtime, and collection continues past the barrier.]
POE [Step 4: beyond the barrier the Scheduler sees P1's Recv(2), P2's Isend(1), and the Wait calls. The wildcard Irecv(*) is dynamically rewritten to each potential sender in turn; in the interleaving where it matches P2's Isend(1), P1's Recv(2) and P0's Isend(1)/Wait have no match set → Deadlock! A compilable version of this example follows below.]
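For concreteness, a C/MPI rendering of the example walked through on these slides (tags and payloads are illustrative choices; run with exactly 3 ranks). Under ordinary testing the run usually completes, because the runtime tends to match the wildcard with P0's message; ISP/POE also forces the match with P2 and exposes the deadlock.

/* C/MPI rendering of the POE example (run with 3 ranks).
 * P0: Isend(to 1); Barrier; Wait
 * P1: Irecv(ANY_SOURCE); Barrier; Recv(from 2); Wait
 * P2: Barrier; Isend(to 1); Wait
 * If Irecv(*) is matched with P2's Isend, Recv(from 2) has no
 * matching send and P0's message is never received: deadlock. */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, buf = 0, recv1 = 0, recv2 = 0;
    MPI_Request req;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Isend(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else if (rank == 1) {
        MPI_Irecv(&recv1, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &req);
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Recv(&recv2, 1, MPI_INT, 2, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else if (rank == 2) {
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Isend(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}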
MPI_Waitany + POE [Example: P0: Isend(1, req[0]); Isend(2, req[1]); Waitany(2, req); Barrier — P1: Recv(0); Barrier — P2: Recv(0); Barrier. The Scheduler collects the calls and sends sendNext as before.]
MPI_Waitany + POE [Waitany(2, req) completes exactly one request: req[0] stays a valid request while req[1] becomes the invalid MPI_REQUEST_NULL. The Scheduler must track which slot each interleaving invalidates; treating the now-null req[1] as still live is an error.]
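The MPI semantics POE has to model here: MPI_Waitany completes exactly one of the pending requests and sets that slot to MPI_REQUEST_NULL, while the other slot stays live and must still be completed. A small self-contained illustration of just that behavior (message contents and tags are arbitrary; this is not ISP's test case itself):

/* Illustration of MPI_Waitany semantics (run with exactly 3 ranks).
 * Rank 0 posts two Isends; MPI_Waitany completes exactly one of them
 * and sets that slot to MPI_REQUEST_NULL; the other request is still
 * live and must be completed separately. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, buf[2] = {11, 22}, recvbuf, idx;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Request req[2];
        MPI_Isend(&buf[0], 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req[0]);
        MPI_Isend(&buf[1], 1, MPI_INT, 2, 0, MPI_COMM_WORLD, &req[1]);
        MPI_Waitany(2, req, &idx, MPI_STATUS_IGNORE);
        /* req[idx] is now MPI_REQUEST_NULL; the other slot is not. */
        printf("Waitany completed req[%d]; req[%d] still pending\n", idx, 1 - idx);
        MPI_Wait(&req[1 - idx], MPI_STATUS_IGNORE);   /* complete the other one */
    } else {
        MPI_Recv(&recvbuf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}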
MPI Progress Engine Issues [Example: P0: Irecv(1, req); Barrier; Wait(req) — P1: Barrier; Isend(0, req); Wait. If the Scheduler lets P0's Wait go into the runtime as a plain PMPI_Wait before P1's Isend has been issued, the call does not return and the Scheduler hangs; the profiler therefore works with PMPI_Irecv + PMPI_Wait so that control returns to the Scheduler.]
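One way to see the resolution hinted at on this slide: a blocking call can be decomposed into its nonblocking form plus a wait, so the profiler regains control between posting the operation and committing to block. The sketch below illustrates the idea for MPI_Recv; yield_to_scheduler is a hypothetical placeholder, and this is a conceptual sketch rather than ISP's actual wrapper code.

/* Conceptual sketch (not ISP's implementation): a blocking receive
 * decomposed into PMPI_Irecv + PMPI_Wait.  The decomposition gives the
 * profiler a point of control before the process is stuck inside a
 * PMPI call that cannot return until a matching send is issued. */
#include <mpi.h>

/* Placeholder for the handshake with the ISP scheduler. */
static void yield_to_scheduler(const char *op) { (void)op; }

int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source,
             int tag, MPI_Comm comm, MPI_Status *status)
{
    MPI_Request req;
    int rc = PMPI_Irecv(buf, count, datatype, source, tag, comm, &req);
    if (rc != MPI_SUCCESS) return rc;
    yield_to_scheduler("MPI_Recv posted");   /* scheduler can act here */
    return PMPI_Wait(&req, status);
}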
Experiments • ISP was run on 69 examples of the Umpire test suite. • Detected deadlocks in these examples that tools like Marmot cannot detect. • Produced a far smaller number of interleavings than runs without reduction. • ISP run on Game of Life ~ 500 lines of code. • ISP run on ParMETIS ~ 14K lines of code • Widely used for parallel partitioning of large hypergraphs • ISP run on MADRE • (Memory Aware Data Redistribution Engine by Siegel and Siegel, EuroPVM/MPI 08) • Found a previously KNOWN deadlock, but AUTOMATICALLY and within one second ! • Results available at: http://www.cs.utah.edu/formal_verification/ISP_Tests
Concluding Remarks • Tool available (download and try it) • Future work • Distributed ISP scheduler • Handle MPI + threads • Do a large-scale bug hunt now that ISP can execute large-scale codes.
Implicit Deadlock Detection [Figure: Scheduler trace — P0: Irecv(*, req); Recv(2); Wait(req) — P1: Isend(0, req); Wait(req) — P2: Isend(0, req); Wait(req). In the interleaving where the wildcard Irecv(*) is matched with P2's Isend, P0's Recv(2) has no matching send → Deadlock! The Scheduler detects this from the collected trace without issuing the calls into the MPI Runtime.]
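A toy sketch of the scheduler-side check this slide illustrates, under our assumption (not stated on the slides) that the scheduler keeps a list of outstanding operations: if some process is blocked on an operation for which no send/receive match can be formed, a deadlock is reported without ever issuing the calls into the MPI runtime. The data structures and match rule below are illustrative, not ISP's.

/* Toy illustration of a match-set check for implicit deadlock
 * detection: given the operations still outstanding at the end of a
 * trace, report a deadlock if a blocked receive has no possible
 * matching send. */
#include <stdio.h>

enum kind { SEND, RECV };
struct op { enum kind k; int owner; int peer; /* peer == -1 means wildcard */ };

/* Returns 1 if some outstanding send/receive pair can be matched. */
static int match_exists(const struct op *ops, int n)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (ops[i].k == RECV && ops[j].k == SEND &&
                ops[j].peer == ops[i].owner &&
                (ops[i].peer == -1 || ops[i].peer == ops[j].owner))
                return 1;
    return 0;
}

int main(void)
{
    /* Outstanding ops after Irecv(*) was matched with P2's Isend:
     * P0 is blocked in Recv(from 2) and no send to P0 remains. */
    struct op pending[] = { { RECV, 0, 2 } };
    if (!match_exists(pending, 1))
        printf("Deadlock: blocked process with no possible match set\n");
    return 0;
}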