Symbolic Synthesis of Masking Fault-Tolerant Distributed Programs. Borzoo Bonakdarpour Workshop APRETAF January 23, 2009. Joint work with Sandeep Kulkarni. Motivation. The most important goal of formal methods is achieving correctness in computing systems (programs).
Symbolic Synthesis of Masking Fault-Tolerant Distributed Programs Borzoo Bonakdarpour Workshop APRETAF January 23, 2009 Joint work with Sandeep Kulkarni
Motivation • The most important goal of formal methods is achieving correctness in computing systems (programs). • Correct-by-verification • A program is built manually. • The correctness of the program is verified by a model checker or a theorem prover. • Correct-by-construction • A program is constructed so that its correctness is guaranteed. [Figure: manual design followed by verification.]
Motivation • Automated synthesis from temporal logic specifications • Pros: • Ability to start from a null program • Capability to handle highly expressive specifications • Cons: • Highly complex decision procedures • Limited to no reusability • Automated program revision • An existing program is revised with respect to a property
Motivation • Question: • Is it possible to revise the program automatically so that it satisfies the failed property while still satisfying its existing properties? • Sources of a failed property: bugs, incomplete specification, change of environment. [Figure: a model checker takes a program and a property and returns a counterexample.]
Motivation [Figure: a revision algorithm takes a program and a property and produces a revised program.]
Motivation A one-lane bridge is controlled by two traffic signals at the two ends of the bridge.
SPECbt = {(σ0, σ1) | sig1(σ1) ≠ R ∧ sig2(σ1) ≠ R}
Controller Program:
(sig1 = G) ∧ (1 ≤ x1 ≤ 10) --{y1}--> sig1 := Y;
[] (sig1 = Y) ∧ (1 ≤ y1 ≤ 2) --{z1}--> sig1 := R;
[] (sig2 = R) ∧ (z1 ≤ 1) --{x2}--> sig2 := G;
[] ((sig1 = G) ∧ (x1 ≤ 10)) ∨ ((sig1 = Y) ∧ (y1 ≤ 2)) ∨ ((sig1 = R) ∧ (z2 ≤ 1)) --> wait;
Motivation Traffic Controller Fault Action: true --{z2}--> skip; • (sig1 = sig2 = R) ∧ (z1 ≤ 1) ∧ (z2 > 1) • (sig1 = sig2 = R) ∧ (z1 ≤ 1) ∧ (z2 = 0) • (sig1 = G) ∧ (sig2 = R) ∧ (z1 ≤ 1) ∧ (z2 = 0) • (sig1 = G) ∧ (sig2 = G) ∧ (z1 ≤ 1) ∧ (z2 = 0)
Motivation Mohamed Gouda: When does your “12 years” end?! [Timeline figure; years marked: 1981, 1982, 1986, 1989, 1992, 1993, 1994, 1999, 2000, 2005, 2007, 2008.] • Kulkarni and Arora introduce automated addition of fault-tolerance to fault-intolerant programs • Intel reports a bug in floating-point operations in Pentium processors • Clarke and Grumberg introduce counterexample-guided abstraction-refinement (CEGAR), 10^1000 reachable states • Wonham and Ramadge introduce controller synthesis • Clarke, Emerson, Sifakis, and Queille invent model checking • Bonakdarpour and Kulkarni synthesize distributed programs of size 10^50 • Emerson and Clarke propose synthesis from CTL properties • Biere and Clarke invent SAT-based model checking (10^500 reachable states) • Bonakdarpour, Kulkarni, and Ebnenasir, and Jobstmann and Bloem, independently introduce program revision (repair) techniques • Vardi and Wolper introduce automata-theoretic verification and synthesis • McMillan et al. introduce BDD-based model checking (10^20 reachable states) and find bugs in IEEE Futurebus+ • Alur and Henzinger propose verification and synthesis of real-time systems
The Synthesis Problem [Figure: the state space, containing the fault-span, which in turn contains the invariant; transitions are labeled p (program) or f (fault).]
The Issue of Distribution • Modeling distributed programs: A program consists of a set of processes. Each process p is specified by: a set Vp of variables, a set Tp of transitions, a set Rp ⊆ Vp of variables that p is allowed to read, and a set Wp ⊆ Rp of variables that p is allowed to write. • Write restrictions: if a ∉ Wp, then transitions that change the value of a (for example, between a = 0, b = 1 and a = 1, b = 1) cannot be executed by process p.
The Issue of Distribution • Read restrictions: if b ∉ Rp, process p cannot distinguish states that differ only in b, so the transitions (a = 0, b = 0) → (a = 1, b = 0) and (a = 0, b = 1) → (a = 1, b = 1) look identical to p. • Such a set of transitions forms a group. • Addition and removal of any transition must occur along with its entire group.
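The grouping rule under read restrictions can be sketched in explicit-state form. This is a minimal illustration, not the symbolic encoding used in the talk: the names `State`, `Transition`, and `groupOf` are hypothetical, states are plain vectors of small integers, and the sketch assumes that a process never changes a variable it cannot read (which follows from Wp ⊆ Rp).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using State = std::vector<int>;              // one value per variable
using Transition = std::pair<State, State>;  // (source state, target state)

// Hypothetical helper: returns the group of transition t for a process that
// can read variable i only if readable[i] is true. Variable i takes values
// 0 .. domainSize[i] - 1.
std::set<Transition> groupOf(const Transition& t,
                             const std::vector<int>& domainSize,
                             const std::vector<bool>& readable) {
    // Collect the unreadable variables; they must be unchanged by t.
    std::vector<std::size_t> hidden;
    for (std::size_t i = 0; i < readable.size(); ++i) {
        if (!readable[i]) {
            assert(t.first[i] == t.second[i]);  // Wp is a subset of Rp
            hidden.push_back(i);
        }
    }
    // Enumerate every valuation of the unreadable variables with a
    // mixed-radix counter, keeping the readable part of t fixed.
    std::set<Transition> group;
    State src = t.first, dst = t.second;
    std::vector<int> val(hidden.size(), 0);
    while (true) {
        for (std::size_t k = 0; k < hidden.size(); ++k) {
            src[hidden[k]] = val[k];
            dst[hidden[k]] = val[k];
        }
        group.insert(Transition(src, dst));
        std::size_t k = 0;
        while (k < hidden.size() && ++val[k] == domainSize[hidden[k]]) {
            val[k] = 0;  // carry into the next hidden variable
            ++k;
        }
        if (k == hidden.size()) break;  // counter wrapped around: done
    }
    return group;
}
```

With two Boolean variables a and b where only a is readable, the group of (a = 0, b = 0) → (a = 1, b = 0) also contains (a = 0, b = 1) → (a = 1, b = 1), matching the slide's example. The group size grows with the product of the domains of the unreadable variables.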
What Is Difficult About Program Revision? • Space complexity • The state explosion problem • Time complexity • NP-completeness • Identifying the complexity hierarchy of the problem • The need for designing efficient heuristics • Proofs are often helpful in identifying bottlenecks of the problem The combination of the above complexities is the worst nightmare!
What Is Difficult About Program Revision? Daniel Mosé: As that wise man said, “bridging the gap between theory and practice is easier in theory than in practice!”
The Byzantine Agreement Problem • General: decision d.g ∈ {0, 1} • Non-generals: decision d.j, d.k, d.l ∈ {0, 1, ⊥}; final? f.j, f.k, f.l ∈ {false, true} • Program: (d.j = ⊥) ∧ (f.j = false) → d.j := d.g [] (d.j ≠ ⊥) ∧ (f.j = false) → f.j := true
The Byzantine Agreement Problem • General: Byzantine? b.g ∈ {false, true} • Non-generals: Byzantine? b.j, b.k, b.l ∈ {false, true} • Faults: (b.j = b.k = b.l = b.g = false) → b.j := true [] (b.j = true) → d.j := 0|1
What Is Difficult About Program Revision? • Experimental results with enumerative (explicit) state space (the tool FTSyn) • Byzantine agreement - 3 processes • 6912 states • Time: 10s • Byzantine agreement - 4 processes • 82944 states • Time: 15m • Byzantine agreement - 5 processes • 995328 states • Out of memory!
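The reported state counts are consistent with a full product of the variable domains, under the assumption that "n processes" counts the non-generals: the general contributes d.g and b.g (2 × 2 values), and each non-general contributes d.j, f.j, and b.j (3 × 2 × 2 values). The helper name below is hypothetical; it is a small arithmetic check, not part of FTSyn.

```cpp
#include <cstdint>

// Hypothetical helper: size of the unreduced state space of the Byzantine
// agreement model with one general and `nonGenerals` non-general processes.
std::uint64_t byzantineStateCount(unsigned nonGenerals) {
    std::uint64_t count = 2 * 2;  // d.g in {0, 1}, b.g in {false, true}
    for (unsigned i = 0; i < nonGenerals; ++i) {
        count *= 3 * 2 * 2;       // d.j in {0, 1, undecided}, f.j, b.j
    }
    return count;
}
```

Under this reading, 3, 4, and 5 non-generals give 6912, 82944, and 995328 states, so each additional process multiplies the state space by 12, which is why explicit enumeration runs out of memory so quickly.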
Polynomial-Time Heuristics • Identify the state predicate ms from where faults alone violate the safety; S := S − ms [Figure: fault transitions (f) leading from the states in ms to a violation of SPEC.]
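The computation of ms can be sketched as a backward fixpoint over fault transitions. This is an explicit-state illustration under stated assumptions, not the BDD-based implementation: states are integers, the safety specification is given as a set of forbidden transitions, and the names `Edge`, `computeMs`, and `removeMs` are hypothetical.

```cpp
#include <set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;  // (source state, target state)

// A state is in ms if some fault transition from it either violates safety
// directly or leads to a state already in ms. Iterate until nothing changes.
std::set<int> computeMs(const std::vector<Edge>& faults,
                        const std::set<Edge>& badTransitions) {
    std::set<int> ms;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Edge& f : faults) {
            bool violates = badTransitions.count(f) > 0 || ms.count(f.second) > 0;
            if (violates && ms.insert(f.first).second) {
                changed = true;  // a new state was added; run another pass
            }
        }
    }
    return ms;
}

// S := S - ms, with S passed by value so the caller's set is untouched.
std::set<int> removeMs(std::set<int> invariant, const std::set<int>& ms) {
    for (int s : ms) invariant.erase(s);
    return invariant;
}
```

Only fault transitions are followed here, because the program cannot prevent a fault from firing; any state from which a chain of faults reaches a violation has to leave the invariant.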
Polynomial-Time Heuristics • Identify the state predicate ms from where faults alone violate the safety; S := S − ms • Re-compute the fault-span:
    BDD frontier  = Invariant;
    BDD current   = mgr->bddZero();
    BDD FaultSpan = Invariant;
    while (FaultSpan != current) {       // stop when no new states are added
        current = FaultSpan;
        BDD image = frontier * (P + F);  // - FaultSpan
        frontier  = Unprime(image);      // rename next-state variables to current-state variables
        FaultSpan = current + frontier;
    }
[Figure: program (p) and fault (f) transitions leaving the invariant and ms inside the fault-span.]
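The fault-span loop can be mirrored in explicit-state form to make the fixpoint visible. This is a sketch under assumptions, not the CuDD code: sets of integer states stand in for BDDs, and `Edge` and `computeFaultSpan` are hypothetical names. Unlike the snippet on the slide, the frontier here keeps only newly discovered states.

```cpp
#include <set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;  // (source state, target state)

// Forward reachability from the invariant under program and fault transitions.
std::set<int> computeFaultSpan(const std::set<int>& invariant,
                               const std::vector<Edge>& program,
                               const std::vector<Edge>& faults) {
    std::set<int> faultSpan = invariant;
    std::set<int> frontier = invariant;
    while (!frontier.empty()) {
        std::set<int> next;
        // Image of the frontier under P + F, minus states already known.
        for (const std::vector<Edge>* rel : {&program, &faults}) {
            for (const Edge& e : *rel) {
                if (frontier.count(e.first) > 0 && faultSpan.count(e.second) == 0) {
                    next.insert(e.second);
                }
            }
        }
        faultSpan.insert(next.begin(), next.end());
        frontier = next;  // only the new states need to be expanded again
    }
    return faultSpan;
}
```

The loop terminates when an iteration adds no new state, which corresponds to the `FaultSpan != current` test in the symbolic version.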
Polynomial-Time Heuristics • Identify the state predicate ms from where faults alone violate the safety; S := S − ms • Re-compute the fault-span • Identify transitions in the fault-intolerant program that may be included in the fault-tolerant program • Fixpoint? No: iterate again. Yes: resolve deadlock states • Re-computing state predicates or transition predicates does not occur often in model checking, but it happens quite often during synthesis. [Figure: states s0 and s1 with program (p) and fault (f) transitions between the invariant and the fault-span.]
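The final step needs the set of deadlock states. One plausible reading, stated here as an assumption, is that a deadlock state is a fault-span state outside the invariant with no outgoing program transition that stays inside the fault-span. The sketch below uses that definition with explicit integer states; `Edge` and `deadlockStates` are hypothetical names.

```cpp
#include <set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;  // (source state, target state)

// Fault-span states outside the invariant from which the program cannot move.
std::set<int> deadlockStates(const std::set<int>& faultSpan,
                             const std::set<int>& invariant,
                             const std::vector<Edge>& program) {
    // States that have at least one program transition staying in the fault-span.
    std::set<int> canMove;
    for (const Edge& e : program) {
        if (faultSpan.count(e.first) > 0 && faultSpan.count(e.second) > 0) {
            canMove.insert(e.first);
        }
    }
    std::set<int> deadlocks;
    for (int s : faultSpan) {
        if (invariant.count(s) == 0 && canMove.count(s) == 0) {
            deadlocks.insert(s);
        }
    }
    return deadlocks;
}
```

Each state returned would then need either a new recovery transition or removal from the fault-span, which changes the fault-span and can trigger another round of re-computation.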
Experimental Results • Polynomial-time sound BDD-based heuristics • The tool SYCRAFT (http://www.cse.msu.edu/~borzoo/sycraft) • C++ • CuDD (Colorado University Decision Diagram Package) • Platform • Dedicated PC • 2.2GHz AMD Opteron processor • 1.2GB RAM
Experimental Results • Goal: • Identifying various bottlenecks of our synthesis heuristics • Fault-span generation • Deadlock resolution • Adding recovery • State elimination • Cycle detection and resolution • Memory usage • Total synthesis time
Experimental Results Performance of synthesizing the Byzantine agreement program
Experimental Results • Observations • 10^50 reachable states • State elimination (deadlock resolution) is the most serious bottleneck • We run out of time before we run out of space • Size of state space by itself is not a bottleneck
Experimental Results
------------------------------------------------------------
UNCHANGED ACTIONS:
------------------------------------------------------------
1- (d.j==2) & !(f.j==1) & !(b.j==1)    (d.j:=dg)
------------------------------------------------------------
REVISED ACTIONS:
------------------------------------------------------------
2- (b.j==0) & (d.j==1) & (d.k==1) & (f.j==0)    (f.j:=1)
3- (b.j==0) & (d.j==0) & (d.l==0) & (f.j==0)    (f.j:=1)
4- (b.j==0) & (d.j==0) & (d.k==0) & (f.j==0)    (f.j:=1)
5- (b.j==0) & (d.j==1) & (d.l==1) & (f.j==0)    (f.j:=1)
------------------------------------------------------------
NEW RECOVERY ACTIONS:
------------------------------------------------------------
6- (b.j==0) & (d.j==0) & (d.l==1) & (d.k==1) & (f.j==0)    (d.j:=1)
7- (b.j==0) & (d.j==1) & (d.l==0) & (d.k==0) & (f.j==0)    (d.j:=0)
8- (b.j==0) & (d.j==0) & (d.l==1) & (d.k==1) & (f.j==0)    (d.j:=1), (f.j:=1)
9- (b.j==0) & (d.j==1) & (d.l==0) & (d.k==0) & (f.j==0)    (d.j:=0), (f.j:=1)
------------------------------------------------------------
Experimental Results The effect of exploiting human knowledge (Each non-general process is allowed to finalize its decision if no two non-generals are undecided.)
Experimental Results Performance of synthesizing token ring mutual exclusion with multi-step recovery
Experimental Results Multi-step vs. single-step recovery for synthesizing token ring mutual exclusion
Open Problems • Exploiting techniques from model checking • State space generation (e.g., clustering and partitioning) • Symmetry reduction • Counterexample-guided abstraction-refinement (CEGAR) • SMT/QBF-based methods • Distributed/parallel techniques
Open Problems • Multidisciplinary research problems • Revising hybrid systems • Synthesizing programs with multiple concerns (e.g., security, communication, real-time, fault-tolerance, distribution) in epistemic logic • Program synthesis using graph mining and machine learning techniques • Biologically-inspired revision/synthesis techniques