Distributed Verification of Multi-threaded C++ Programs Stefan Edelkamp joint work with Damian Sulewski and Shahid Jabbar
Motivation: IO-HSF-SPIN • [Figure: search trace annotated with "same states in both parts", "arrives at the final state", "large jumps due to 2nd heuristic", "current state", "already seen final state", "arrives again at same final state"] • 2.9 TB • 20 days on 1 node, 8 days on 3 nodes
Overview • Software Checking in StEAM • Externalization • Virtual Addresses • Parallelization
Software Checking • Advantages + Building a model unnecessary + Learning specification language unnecessary + Checking can be done more often • Disadvantages - Code has to be executed - Huge number of states - Huge states
StEAM • Can check concurrent C++ programs • Uses a virtual machine for execution • Supports BFS, DFS, Best-First, A*, IDA* • Finds deadlocks, assertion violations, and segmentation faults
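StEAM - Checking a C++ Program • Diagram labels: model checker, igcc compiler, object code, virtual machine • Example source:
    #include <cstdlib>

    char globalChar;
    int globalBlocksize = 7;

    void allocateBlock(int size);  // declared before its use in main

    int main() {
        allocateBlock(globalBlocksize);  // pass the global block size
        return 0;
    }

    void allocateBlock(int size) {
        void *memBlock;
        memBlock = (void *) malloc(size);  // request `size` bytes of dynamic memory
    }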
StEAM - Interpreting the Object Code • The ICVM virtual machine interprets the object code of the example program (globalChar, globalBlocksize, main, allocateBlock) • Machine state: registers, text section, BSS section, data section, stack, memory pool
StEAM – Generating States • [Figure: the ICVM virtual machine generates State 1 and State 2 from the initial state; each state is drawn as a stack of components: registers, text section, BSS section, data section, stack, memory pool]
Externalization - Motivation • [Plot: time vs. problem size, internal vs. external search]
Externalization – Mini States [EJMRS 06] • Pointer to a state in RAM or on disk • Pointer to the predecessor mini state • Constant size
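The three properties of a mini state can be sketched as a small fixed-size record. This is only an illustration: the field names, the use of an index instead of a raw pointer for the predecessor, and the helper `tracePath` are assumptions, not StEAM's actual data structures.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical mini state: a constant-size handle for a (possibly huge) full state.
struct MiniState {
    std::uint64_t location;    // cache slot in RAM or byte offset in a disk file
    bool onDisk;               // tells which of the two `location` refers to
    std::int32_t predecessor;  // index of the predecessor mini state, -1 for the root
};

// The record stays the same size no matter how large the full state grows.
static_assert(std::is_trivially_copyable<MiniState>::value,
              "mini states should be plain fixed-size records");

// Follow predecessor links from `goal` back to the root and return the
// path in root-to-goal order, e.g. to report an error trail.
std::vector<std::int32_t> tracePath(const std::vector<MiniState>& minis,
                                    std::int32_t goal) {
    assert(goal >= 0 && static_cast<std::size_t>(goal) < minis.size());
    std::vector<std::int32_t> path;
    for (std::int32_t i = goal; i != -1;
         i = minis[static_cast<std::size_t>(i)].predecessor) {
        path.push_back(i);
    }
    std::reverse(path.begin(), path.end());
    return path;
}
```

Because only the handles stay in RAM, a trail can be rebuilt from the predecessor links alone, and full states are loaded from cache or disk only when they are needed.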
Externalization – Expanding a State • [Figure labels: mini states, cache, internal memory, secondary memory]
Externalization – Flushing the Cache • [Figure labels: mini states, cache, internal memory, secondary memory]
Externalization – Collapse Compression • [Figure: state components (registers, text section, BSS section, data section, stack, memory pool) with state caches and files on disk]
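The general idea behind collapse compression is to store each distinct component once and let a state refer to its components by small indices. The sketch below shows that idea in memory only; `ComponentTable`, `CollapsedState`, and their members are hypothetical names, and how StEAM lays out its caches and disk files is not shown on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using Component = std::vector<std::uint8_t>;  // raw bytes of one state component

// Stores every distinct component once and hands out a stable index for it.
class ComponentTable {
public:
    std::uint32_t intern(const Component& c) {
        auto it = index_.find(c);
        if (it != index_.end()) return it->second;  // already stored: reuse index
        std::uint32_t id = static_cast<std::uint32_t>(store_.size());
        store_.push_back(c);
        index_.emplace(c, id);
        return id;
    }
    const Component& get(std::uint32_t id) const { return store_.at(id); }
    std::size_t size() const { return store_.size(); }

private:
    std::vector<Component> store_;
    std::map<Component, std::uint32_t> index_;
};

// A collapsed state holds indices instead of the component bytes themselves.
struct CollapsedState {
    std::uint32_t stackId;
    std::uint32_t dataId;
    std::uint32_t poolId;
};
```

Successor states that change only the stack then share the data section and memory pool entries of their predecessor.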
Virtual Addresses • Programs request memory • Memory assignment is done by the system • With real addresses, moving a program between nodes is impossible • Two possible strategies • Converting the addresses before executing • Using virtual addresses
Virtual Addresses – Memory Management • [Figure: state with data, stack, text, BSS, stack pointer, program counter, and memory pool; an AVL tree maps a virtual address y to an entry (x, size), where x is the real address in RAM]
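A minimal sketch of the mapping from virtual address y to (real address x, size) follows. The slide uses an AVL tree; here `std::map` (an ordered, balanced tree in common implementations) stands in for it. The class name `VirtualMemoryPool`, the 32-bit address type, and the start address `0x1000` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

class VirtualMemoryPool {
public:
    using VAddr = std::uint32_t;

    // Reserve `size` bytes and return a node-independent virtual address.
    VAddr allocate(std::size_t size) {
        assert(size > 0);
        VAddr v = next_;
        next_ += static_cast<VAddr>(size);
        Block block{std::unique_ptr<unsigned char[]>(new unsigned char[size]()), size};
        blocks_.emplace(v, std::move(block));
        return v;
    }

    // Translate a virtual address (possibly pointing inside a block) to the
    // real address on this node; returns nullptr if nothing is mapped there.
    unsigned char* resolve(VAddr v) const {
        auto it = blocks_.upper_bound(v);  // first block starting after v
        if (it == blocks_.begin()) return nullptr;
        --it;                              // block with the largest start <= v
        VAddr offset = v - it->first;
        if (offset >= it->second.size) return nullptr;
        return it->second.data.get() + offset;
    }

private:
    struct Block {
        std::unique_ptr<unsigned char[]> data;  // real memory on this node (x)
        std::size_t size;
    };
    std::map<VAddr, Block> blocks_;  // virtual start address y -> (x, size)
    VAddr next_ = 0x1000;            // arbitrary first virtual address
};
```

Since the program only ever sees virtual addresses, a state can be shipped to another node and its blocks re-created at different real addresses without rewriting pointers inside the state.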
Virtual Addresses - Overhead • [Plot: time vs. nodes, virtual vs. real addresses]
Parallelization – Motivation • Distributed (shared) memory • Communication via MPI channels or shared RAM • Sending full states is too expensive if they are not used for expansion • Exploit externalization • Dual channel (speedup vs. load balance) • Appropriate state space partitioning
Parallelization – Hash Partitioning • Partitioning by hashing the full state • Problem: successors are often not in the same partition, which causes high communication overhead • Partitioning by hashing a partial state, e.g. the memory pool • Problem: too many states map to one hash value, which hurts load balancing
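The two partitioning choices can be contrasted in a few lines. This is a sketch under assumptions: the two-field `State`, the FNV-1a hash, and the function names are illustrative and not taken from StEAM.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical state with just two components, for illustration only.
struct State {
    std::vector<std::uint8_t> stack;
    std::vector<std::uint8_t> memoryPool;
};

// 64-bit FNV-1a; `h` lets a second component continue a running hash.
std::uint64_t fnv1a(const std::vector<std::uint8_t>& bytes,
                    std::uint64_t h = 14695981039346656037ULL) {
    for (std::uint8_t b : bytes) {
        h ^= b;
        h *= 1099511628211ULL;
    }
    return h;
}

// Full-state partitioning: any change to the state may move it to another node.
int ownerByFullState(const State& s, int nodes) {
    assert(nodes > 0);
    return static_cast<int>(fnv1a(s.memoryPool, fnv1a(s.stack)) %
                            static_cast<std::uint64_t>(nodes));
}

// Partial-state partitioning: only the memory pool decides the owner, so
// successors that leave the pool untouched stay on the same node.
int ownerByMemoryPool(const State& s, int nodes) {
    assert(nodes > 0);
    return static_cast<int>(fnv1a(s.memoryPool) %
                            static_cast<std::uint64_t>(nodes));
}
```

The trade-off from the slide is visible here: `ownerByMemoryPool` keeps successors local when the pool is unchanged, but every state sharing one pool content lands on the same node.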
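Parallelization – Incremental Tree Hashing [EM05]
$h(s) = \left(\sum_i s_i \cdot 3^i\right) \bmod 17$
$h(1,2,3,1,2,2,1,2) = \left(h(1,2) + h(3,1) \cdot 3^2 + h(2,2,1,2) \cdot 3^{2+2}\right) \bmod 17 = \left(4 + 1 \cdot 3^2 + 9 \cdot 3^{2+2}\right) \bmod 17 = 11$
$h(3,1) = (3 \cdot 3 + 1 \cdot 9) \bmod 17 = 1$
$h(2,2,1,2) = \left(h(2) + h(2,1,2) \cdot 3^1\right) \bmod 17 = (6 + 1 \cdot 3) \bmod 17 = 9$
$h(1,2) = (1 \cdot 3 + 2 \cdot 9) \bmod 17 = 4$
$h(2) = 2 \cdot 3^1 \bmod 17 = 6$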
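The slide's numbers can be reproduced with a short sketch. The hash function, base 3, and modulus 17 come from the slide; the function names `hashVector`, `combine`, and `updateComponent` are hypothetical, and components are assumed to be non-negative and indexed from 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kBase = 3;
constexpr std::uint64_t kMod = 17;

// base^exp mod 17 by repeated multiplication (exponents here are tiny).
std::uint64_t powMod(std::uint64_t base, std::size_t exp) {
    std::uint64_t result = 1;
    for (std::size_t i = 0; i < exp; ++i) result = (result * base) % kMod;
    return result;
}

// h(s) = (sum over i of s_i * 3^i) mod 17, with i starting at 1.
std::uint64_t hashVector(const std::vector<int>& s) {
    std::uint64_t h = 0, p = 1;
    for (int v : s) {
        assert(v >= 0);
        p = (p * kBase) % kMod;  // p == 3^i mod 17
        h = (h + static_cast<std::uint64_t>(v) * p) % kMod;
    }
    return h;
}

// Hash of a concatenation from the hashes of its parts:
// h(left ++ right) = (h(left) + h(right) * 3^|left|) mod 17.
std::uint64_t combine(std::uint64_t hLeft, std::size_t lenLeft, std::uint64_t hRight) {
    return (hLeft + hRight * powMod(kBase, lenLeft)) % kMod;
}

// Update the hash after component i (1-based) changes from oldV to newV,
// without rescanning the rest of the vector.
std::uint64_t updateComponent(std::uint64_t h, std::size_t i, int oldV, int newV) {
    const int m = static_cast<int>(kMod);
    std::uint64_t delta = static_cast<std::uint64_t>(((newV - oldV) % m + m) % m);
    return (h + delta * powMod(kBase, i)) % kMod;
}
```

With `combine`, only the hashes on the path from a changed leaf to the root of the tree need to be recomputed, which is what makes hashing large states cheap after a small change.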
Parallelization – Search Partitioning • Horizontal slices • Vertical slices • DFS [Holzmann & Bosnacki 2006] • Best-First, A*
Parallelization - Hardware • Cluster Vision System (PBS) • SUSE Linux 10.0 • MPI via InfiniBand • Files via GBit Ethernet • 224 nodes (464 procs), fewer than 15 used • AMD Opteron DP 50 (2.4 GHz)
Experiments: 15-Puzzle • Partial hash • [Plot: speedup and time vs. nodes]
Experiments – Depth-First Slicing • [Plot: 200 philosophers, time vs. processors] • Top result: 600 philosophers on 6 nodes, 97 KB per state • Ex collapse compression & distribution: 16 GB, 1.5 GB per node
Experiments - Bath-Tub Effect (50 philosophers, average) • [Plot: time vs. size of depth layer] • Validates Holzmann & Bosnacki
Experiment - Shared Memory • Bakery (pthread) • 4 Opteron MP 852 (2.6 GHz) • [Plot: speedup and time vs. nodes]
Conclusion • Preceding work: full externalization of states in IO-HSF-SPIN with constant-size RAM, e.g. 1.8 GB RAM, 20 days on 1 proc, 8 days on 4 procs, 2.9 TB disk [EJ06]; distribution via (g+h)-value • Problem: huge and highly dynamic states • Solution: mini states as constant-size fingerprints of states in RAM, used for dual-channel communication to combine external and parallel search with memory-pool and best-first slicing partitioning