Getting Rid of Store-Buffers in TSO Analysis
Mohamed Faouzi Atig, Uppsala University, Sweden
Ahmed Bouajjani, LIAFA, University of Paris 7, France
Gennaro Parlato ✓, University of Southampton, UK
Sequential consistency memory model (SC)
[Figure: threads T1 … Tn, all connected to a single shared memory]
• Write(var,val): sh_mem[var] ← val (immediately visible to all threads)
• Read(var): returns sh_mem[var]
• SC =
 - actions of different threads are interleaved in any order
 - actions of the same thread keep their execution order
• WMM (weak memory models) =
 - for performance reasons, modern multiprocessors reorder memory operations of the same thread
Total Store Ordering (TSO)
[Figure: each thread Ti reaches the shared memory through its own store buffer Mi; M1 holds the pending writes (x,4) (z,7) (y,3), Mn holds (z,4) (y,4)]
• Each thread has its own store buffer (FIFO)
• Write(var,val): the pair (var,val) is sent to the buffer
• Memory update = execution of a Write taken from some buffer
• Read(var) returns val if
 - (var,val) is the last write to var still in the store buffer, or
 - the buffer does not contain any Write to var, and sh_mem(var) = val
• A fence requires that the store buffer is empty
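The TSO rules on this slide can be sketched as a small executable model. This is only an illustration: the class name `TSOMemory` and its method names are assumptions, not part of the talk. The fence is modelled by draining the buffer, which is one way of making it empty.

```python
from collections import deque


class TSOMemory:
    """Toy TSO memory: one FIFO store buffer per thread (illustrative names)."""

    def __init__(self, n_threads, init):
        self.mem = dict(init)                           # shared memory
        self.buf = [deque() for _ in range(n_threads)]  # per-thread store buffers

    def write(self, tid, var, val):
        # A write does not touch memory; the pair is appended to the buffer.
        self.buf[tid].append((var, val))

    def read(self, tid, var):
        # Prefer the most recent pending write to var in the thread's own buffer.
        for pending_var, pending_val in reversed(self.buf[tid]):
            if pending_var == var:
                return pending_val
        return self.mem[var]

    def update(self, tid):
        # Memory update: apply the oldest pending write of thread tid.
        var, val = self.buf[tid].popleft()
        self.mem[var] = val

    def fence(self, tid):
        # Assumption: the fence is modelled by draining the buffer until empty.
        while self.buf[tid]:
            self.update(tid)


m = TSOMemory(2, {"x": 0})
m.write(0, "x", 4)
own_view = m.read(0, "x")     # thread 0 reads its own buffered write
other_view = m.read(1, "x")   # thread 1 still reads the old memory value
m.fence(0)
after_fence = m.read(1, "x")  # the write is now visible to thread 1
```

The gap between `own_view` and `other_view` is the behaviour that separates TSO from SC.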
Correct under SC, wrong under TSO: Dekker’s mutual exclusion protocol
Thread 1
a: y:=1
b: r1:=x
c: if (r1==0) then
d: critical section
Thread 2
1: x:=1
2: r2:=y
3: if (r2==0) then
4: critical section
Bad schedule for TSO: a b c d 1 2 3 4. With x and y initially 0 in shared memory, the write y:=1 can still be in Thread 1's store buffer when Thread 2 reads y, so both threads are in the critical section!
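The claim can be checked by brute force. The sketch below explores every schedule of the two-thread fragment, once with direct writes (SC) and once with per-thread FIFO buffers (TSO). It assumes x and y start at 0, and the encoding of states as tuples is an illustrative choice, not taken from the talk.

```python
def _set(tup, i, value):
    """Return a copy of tuple `tup` with position i replaced by value."""
    return tup[:i] + (value,) + tup[i + 1:]


def _store(mem, var, val):
    """Return memory (a sorted tuple of pairs) with var set to val."""
    return tuple(sorted({**dict(mem), var: val}.items()))


def final_registers(tso):
    """Explore all schedules and return the set of final (r1, r2) values."""
    programs = (("y", "x"), ("x", "y"))  # per thread: (variable written, variable read)
    start = ((0, 0), (None, None), (("x", 0), ("y", 0)), ((), ()))
    seen, stack, finals = {start}, [start], set()
    while stack:
        pcs, regs, mem, bufs = stack.pop()
        if pcs == (2, 2):                 # both threads have done their read
            finals.add(regs)
        succs = []
        for t in (0, 1):
            wvar, rvar = programs[t]
            if pcs[t] == 0 and tso:       # TSO write: goes to the store buffer
                succs.append((_set(pcs, t, 1), regs, mem,
                              _set(bufs, t, bufs[t] + ((wvar, 1),))))
            elif pcs[t] == 0:             # SC write: goes straight to memory
                succs.append((_set(pcs, t, 1), regs, _store(mem, wvar, 1), bufs))
            elif pcs[t] == 1:             # read: own buffer first, then memory
                pending = [val for var, val in bufs[t] if var == rvar]
                val = pending[-1] if pending else dict(mem)[rvar]
                succs.append((_set(pcs, t, 2), _set(regs, t, val), mem, bufs))
            if bufs[t]:                   # memory update: flush the oldest write
                (var, val), rest = bufs[t][0], bufs[t][1:]
                succs.append((pcs, regs, _store(mem, var, val), _set(bufs, t, rest)))
        for s in succs:
            if s not in seen:
                seen.add(s)
                stack.append(s)
    return finals


sc_results = final_registers(tso=False)
tso_results = final_registers(tso=True)
```

Both threads enter the critical section exactly when the final registers are `(0, 0)`. That pair is absent from `sc_results` and present in `tso_results`.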
Verification for TSO?
• For finite-state programs, reachability is non-primitive recursive [Atig, Bouajjani, Burckhardt, Musuvathi, POPL’10]
• What shall we do?
 - Symbolic representation of the store buffers? [Linden, Wolper, SPIN’10]: regular model checking
 - Our approach: reduce the analysis from TSO to SC
 - can be done only with approximations …
What is this talk about
If we restrict to executions where each thread is executed at most k times with no interruption (for a fixed k), we can translate any concurrent program P_TSO (recursion, thread creation, heap, …) into another program P_SC such that:
• P_SC (under SC) simulates all possible executions of P_TSO (under TSO) where each thread is executed at most k times
• P_SC has no buffer at all: the store buffers are simulated using 2k copies of the shared variables as locals
• P_SC has linear size in the size of P_TSO
• Advantage: off-the-shelf SC tools can be used for the analysis of TSO programs
Code-to-code translation from TSO to SC
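k-round (for each thread) reachability
[Figure: processes P1 … Pi …, each consisting of a thread Ti and its store buffer Mi, connected to the shared memory]
Run = $(T_{i_1} + M_{i_1})^{+}\,(T_{i_2} + M_{i_2})^{+} \cdots$, where the first factor is a round of $P_{i_1}$, the second a round of $P_{i_2}$, and so on.
A k-round run: ∀i, # rounds of Pi ≤ k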
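The k-round condition is easy to test on a concrete schedule. This minimal sketch assumes a run is given as a list of process ids, one per step (a thread step or a memory update of that process); the representation and the function names are illustrative only.

```python
from itertools import groupby


def rounds_per_process(schedule):
    """Count, for each process id, its maximal uninterrupted blocks in the schedule."""
    counts = {}
    for pid, _block in groupby(schedule):
        counts[pid] = counts.get(pid, 0) + 1
    return counts


def is_k_round(schedule, k):
    """True iff every process is scheduled in at most k rounds."""
    return all(n <= k for n in rounds_per_process(schedule).values())


schedule = [1, 1, 2, 2, 2, 1, 3, 1]  # process 1 runs in three separate rounds
counts = rounds_per_process(schedule)
```

Here process 1 has three rounds and processes 2 and 3 have one each, so the schedule is a 3-round run but not a 2-round run.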
Compositional reasoning
$[(T_i + M_i)^{*}]^{k}$
[Figure: rounds 0, 1, 2 of a thread, with round j paired with (Mask_j, Buff_j)]
Getting rid of store-buffers
• Mask_i is a copy of the shared vars as Booleans (as locals)
• Buff_i is a copy of the shared vars (as locals)
[Figure: the pairs (Mask_0, Buff_0), (Mask_1, Buff_1), (Mask_2, Buff_2)]
Invariant:
[Figure: a store buffer whose pending writes to x, y, z are partitioned by the round (0, 1, 2) in which they update the shared memory, next to the pairs (Mask_0, Buff_0), (Mask_1, Buff_1), (Mask_2, Buff_2)]
At each time in the simulation:
• Mask_i[var] = 1 iff there is a store for var in the store buffer that updates the shared memory at round i
• Buff_i[var] contains the last value sent for var
Simulation
• Before simulation:
 - Masks set to False
 - r_SC ← 0; r_TSO ← 0
• Simulation:
 - All statements not involving shared vars are executed
 - Write(var,val):
   Mask_{r_TSO}[var] ← T;
   Queue_{r_TSO}[var] ← val;
 - Read(var):
   Let i be the greatest index s.t. i >= r_SC and Mask_i[var] = 1;
   if such an i exists (i >= 0), return Queue_i[var];
   else return var
• End of round (update shared vars):
 For all var: if Mask_{r_SC}[var] == 1 then var ← Buff_{r_SC}[var]
(Queue_i and Buff_i name the same copy of the shared vars.)
[Figure: rounds 0, 1, 2 with their pairs (Mask_0, Buff_0), (Mask_1, Buff_1), (Mask_2, Buff_2)]
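The write, read and end-of-round rules above can be sketched for a single thread as follows. The class `RoundSim`, its method names and the explicit `delay` call (standing in for the nondeterministic increase of r_TSO) are assumptions made for this illustration. The sketch does no bounds checking beyond round k.

```python
class RoundSim:
    """Store-buffer-free simulation of one thread (hypothetical names)."""

    def __init__(self, shared, k):
        self.shared = shared  # dict holding the shared variables
        self.k = k
        self.mask = [dict.fromkeys(shared, False) for _ in range(k + 1)]
        self.buff = [dict.fromkeys(shared) for _ in range(k + 1)]
        self.r_sc = 0   # round currently being simulated
        self.r_tso = 0  # round in which new writes reach the shared memory

    def delay(self, rounds=1):
        # Stands in for the nondeterministic choice of a later flush round.
        self.r_tso = min(self.k, self.r_tso + rounds)

    def write(self, var, val):
        self.mask[self.r_tso][var] = True
        self.buff[self.r_tso][var] = val

    def read(self, var):
        # The greatest i >= r_sc with a pending store to var wins.
        for i in range(self.k, self.r_sc - 1, -1):
            if self.mask[i][var]:
                return self.buff[i][var]
        return self.shared[var]

    def end_round(self):
        # Apply the stores scheduled for the current round, then advance.
        for var, pending in self.mask[self.r_sc].items():
            if pending:
                self.shared[var] = self.buff[self.r_sc][var]
        self.r_sc += 1
        self.r_tso = max(self.r_tso, self.r_sc)


shared = {"x": 0, "y": 0}
sim = RoundSim(shared, k=2)
sim.delay()                     # later writes will reach memory in round 1
sim.write("x", 4)
seen_by_writer = sim.read("x")  # served from Buff_1, not from memory
sim.end_round()                 # round 0 ends: nothing to flush
after_round0 = shared["x"]
sim.end_round()                 # round 1 ends: x is written back
after_round1 = shared["x"]
```

The writer sees its own value immediately. The shared variable changes only when the round chosen at write time ends, which is what the FIFO buffer would have produced.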
Skeleton of the translation

Shared sh_vars;
Thread_i()
Begin
  locals l_vars, r_TSO, r_SC, sim, Mask0, Buff0, …, Maskk, Buffk;
  Init(); // initialize Masks to False, r_SC=0, r_TSO=0, sim=0
  stmt_1; stmt_2; … stmt_n;
end

Each stmt_j is replaced by: before(); stmt_j; after();

before() {
  // start round
  if (!sim) {
    lock; sim=1; r_SC++;
    if (r_TSO < r_SC) r_TSO = r_SC;
  }
  while(*) r_TSO++; // nondeterministically delay later writes
}

after() {
  if (*) { // end round
    Update_shared(r_SC, Mask, Queue);
    sim=0; unlock;
  }
}
Characteristics of the translation
• For fixed k, P_SC is linear in the size of P_TSO
 - 2k copies of the shared variables as locals (no store buffer)
• P_SC and P_TSO are in the same class
 - no restriction on the programs is imposed
• The reachable shared states are the same in P_SC and P_TSO:
 a state S is reachable in P_TSO with at most k rounds per thread iff S is reachable in P_SC
Bounding Store Ages
• Observation:
 - When r_SC = 1, (Mask_0, Buff_0) are not used any longer
 - Reuse the Mask and Queue variables
• Translation: (Mask_j, Buff_j) are used circularly (modulo k+1)
• k store-ages:
 - Unbounded rounds!
 - Constraint: each write pair remains in the store buffer for at most k rounds
[Figure: the pairs (Mask_0, Buff_0), (Mask_1, Buff_1), (Mask_2, Buff_2), (Mask_0, Buff_0), … reused in a cycle]
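A sketch of the circular reuse, under assumptions: slots are indexed by round number modulo k+1, a slot's mask is cleared when its round ends, and r_TSO is never allowed to run more than k rounds ahead of r_SC. The class and method names are hypothetical and the details of the actual translation may differ.

```python
class CircularSim:
    """Mask/Buff slots reused modulo k + 1 (a sketch; names are hypothetical)."""

    def __init__(self, shared, k):
        self.shared = shared
        self.k = k
        self.mask = [dict.fromkeys(shared, False) for _ in range(k + 1)]
        self.buff = [dict.fromkeys(shared) for _ in range(k + 1)]
        self.r_sc = 0   # current round, unbounded
        self.r_tso = 0  # flush round of new writes, at most r_sc + k

    def delay(self):
        # Postpone later writes by one round, respecting the store-age bound.
        if self.r_tso < self.r_sc + self.k:
            self.r_tso += 1

    def write(self, var, val):
        slot = self.r_tso % (self.k + 1)
        self.mask[slot][var] = True
        self.buff[slot][var] = val

    def read(self, var):
        # Scan the window of live rounds from the newest to the current one.
        for r in range(self.r_sc + self.k, self.r_sc - 1, -1):
            slot = r % (self.k + 1)
            if self.mask[slot][var]:
                return self.buff[slot][var]
        return self.shared[var]

    def end_round(self):
        slot = self.r_sc % (self.k + 1)
        for var, pending in self.mask[slot].items():
            if pending:
                self.shared[var] = self.buff[slot][var]
                self.mask[slot][var] = False  # free the slot for a later round
        self.r_sc += 1
        self.r_tso = max(self.r_tso, self.r_sc)


shared = {"x": 0}
sim = CircularSim(shared, k=1)
history = []
for r in range(6):          # more rounds than there are slots
    sim.delay()
    sim.write("x", r + 1)   # this write stays buffered for one round
    sim.end_round()
    history.append(shared["x"])
```

With k = 1 there are only two slots, yet the loop runs six rounds. Each value reaches the shared memory one round after it was written, and the most recent write is still pending at the end.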
How can we use this code-to-code translation?
Corollaries
Decidability results for TSO reachability:
• Our code-to-code translation is a linear reduction TSO → SC
• Decidability is inherited from SC
Tools for SC → Tools for TSO (our code-to-code translation as a plug-in)
A convenient way to get new tools for TSO …
[Figure: Concurrent Program → TSO→SC translation → Instrumentation for the SC tool → SC tool]
SC tools:
• Bounded model checking:
 - ESBMC (FSE’11)
 - Poirot (by MSR)
 - Storm (CAV’09)
 - …
• Boolean programs:
 - Boom, Boppo
 - GETAFIX (PLDI’09)
 - jMoped (SPIN’08)
 - …
• CHESS (MSR)
• Sequentialization + sequential tools
Experiments
• POIROT: an SMT-based bounded model checker for SC programs
• Errors due to TSO discovered in a few seconds!
• POIROT can also be a model checker for TSO!
Conclusions
We have proposed a code-to-code translation from TSO to SC:
• it allows existing and future tools designed for SC to be used to analyze programs running under TSO
• it is an under-approximation (error finding)
• the restriction imposed on the analyzed runs is useful for finding errors in programs
Beyond TSO? Generic approach?
Thanks!