Reducing Verification Complexity of a Multicore Coherence Protocol Using Assume/Guarantee. Xiaofang Chen 1 Yu Yang 1 Ganesh Gopalakrishnan 1 Ching-Tsun Chou 2. 1 University of Utah 2 Intel Corporation. * Supported by Intel SRC Customization Award 2005-TJ-1318 and NSF CNS-0509379.
Reducing Verification Complexity of a Multicore Coherence Protocol Using Assume/Guarantee • Xiaofang Chen (1), Yu Yang (1), Ganesh Gopalakrishnan (1), Ching-Tsun Chou (2) • (1) University of Utah, (2) Intel Corporation • Supported by Intel SRC Customization Award 2005-TJ-1318 and NSF CNS-0509379 • FMCAD 2006
Hierarchical Cache Coherence Protocols • Chip-level protocols • Intra-cluster protocols • Inter-cluster protocols • [Figure: clusters connected to directories (dir) and memories (mem)]
Verification Challenges • No public domain benchmarks • Greater complexity, with more • Corner cases • State space
A Multicore Coherence Protocol Remote Cluster 1 Home Cluster Remote Cluster 2 L1 Cache L1 Cache L1 Cache L1 Cache L1 Cache L1 Cache L2 Cache+Local Dir L2 Cache+Local Dir L2 Cache+Local Dir RAC RAC RAC Global Dir Main Memory
Protocol Features • Both levels use MESI protocols • Level-1: FLASH • Level-2: DASH • Silent drop on non-Modified cache lines • Network channels are non-FIFO
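To make the non-FIFO feature concrete, here is a minimal Python sketch of what it means for a model checker: messages in flight on a channel may be delivered in any order, so every permutation has to be explored. The function names and the pairing of these two messages on one channel are illustrative assumptions, not taken from the protocol model.

```python
from itertools import permutations

def delivery_orders(in_flight):
    """All orders in which a non-FIFO channel may deliver its in-flight messages."""
    return sorted(set(permutations(in_flight)))

def fifo_order(in_flight):
    """A FIFO channel allows exactly one order: the order of sending."""
    return [tuple(in_flight)]

# Hypothetical pair of messages sent in this order on one channel.
sent = ["Fwd_ReqEx", "NACK"]
orders = delivery_orders(sent)
```

With two messages in flight the non-FIFO channel already doubles the number of delivery orders, which is one source of the extra corner cases.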
An Example Scenario • Clusters: Home Cluster, Remote Cluster 1, Remote Cluster 2 • Message sequence: 1 Req_Ex, 2 Req_Ex, 3 Fwd_ReqEx, 4.1 Fwd_ReqEx, 4.2 Silent-drop, 5 NACK • [Figure: cache lines marked Excl and Invld across the three clusters]
Complexity of the Protocol • Multiplicative effect of four protocols running concurrently • Model checking failed after exploring 161,876,000 states
Intuitively, We Want to … • Split a hierarchical protocol into several smaller ones • Verify the smaller protocols • A/G proof
A/G Approach Abstraction Constraining … Original protocol
For Our 2-Level Protocol • Verification by building two smaller protocols • M1 • M2
Abstracted Protocol #1 Home Cluster L1 Cache L1 Cache Remote Cluster 1 Remote Cluster 2 L2 Cache+Local Dir’ L2 Cache+Local Dir L2 Cache+Local Dir’ RAC RAC RAC Global Dir Main Memory
Abstracted Protocol #2 Remote Cluster 1 L1 Cache L1 Cache Home Cluster Remote Cluster 2 L2 Cache+Local Dir L2 Cache+Local Dir’ L2 Cache+Local Dir’ RAC RAC RAC Global Dir Main Memory
Verification Methodology • Abstraction • Fixing real bugs in M • Refinement • Counter-example guided refinement • Adding new verification obligations
Abstraction • States • Projection • Transitions • Overapproximation
Abstraction on States Intra-cluster details Inter-cluster details
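Projection on states can be sketched in a few lines of Python. The variable names h1, h2, r11, r12, r21, r22 follow the soundness-proof slide, where M1 keeps h1, h2, r12 and r22; the state values used here are hypothetical.

```python
# Variables of the full protocol M and the subset kept by abstracted protocol M1.
M_VARS = ("h1", "h2", "r11", "r12", "r21", "r22")
M1_VARS = ("h1", "h2", "r12", "r22")

def project(state, kept_vars):
    """Abstract a concrete state by dropping every variable not in kept_vars."""
    return {v: state[v] for v in kept_vars}

# Two hypothetical concrete states that differ only in a dropped variable (r11).
s_a = {"h1": "E", "h2": "I", "r11": "I", "r12": "I", "r21": "I", "r22": "I"}
s_b = dict(s_a, r11="S")

abs_a = project(s_a, M1_VARS)
abs_b = project(s_b, M1_VARS)
```

Both concrete states map to the same abstract state, which is how projection shrinks the state space of M1.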
Abstracting Transitions • Rule-based system: guard → action; • Relaxing guards • Relaxing expr values • Remove stmt • Original rule: Procs[p].WbMsg.Cmd = WB_Wb → Procs[p].L2.Data := Procs[p].WbMsg.Data; Procs[p].L2.HeadPtr := L2; … • Abstracted rule: true → Procs[p].L2.Data := d; …
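The following Python sketch illustrates why relaxing the guard to true and replacing the assigned value with an arbitrary d gives an overapproximation. It models only the statements shown on the slide (the elided ones are ignored), and the tiny value domains and field names such as `wb_cmd` and `l2_data` are assumptions made for the example.

```python
from itertools import product

DATA = (0, 1)               # hypothetical data domain
CMDS = ("WB_Wb", "None")    # hypothetical command domain

def concrete_rule(s):
    """WbMsg.Cmd = WB_Wb -> L2.Data := WbMsg.Data; L2.HeadPtr := L2"""
    if s["wb_cmd"] != "WB_Wb":
        return []                       # guard is false: rule does not fire
    return [{**s, "l2_data": s["wb_data"], "l2_head": "L2"}]

def abstract_rule(a):
    """true -> L2.Data := d, for any d (guard and value relaxed, HeadPtr removed)"""
    return [{"l2_data": d} for d in DATA]

def project(s):
    """Keep only the variable that survives abstraction."""
    return {"l2_data": s["l2_data"]}

def overapproximates():
    """Every concrete step must be matched by an abstract step after projection."""
    for cmd, wb_data, l2_data in product(CMDS, DATA, DATA):
        s = {"wb_cmd": cmd, "wb_data": wb_data, "l2_data": l2_data, "l2_head": "None"}
        for t in concrete_rule(s):
            if project(t) not in abstract_rule(project(s)):
                return False
    return True
```

The abstract rule can also fire where the concrete one cannot, which is the source of the bogus errors that refinement later removes.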
Detecting Bugs in M • When a real error is found in Mi • Fix bug in M • Regenerate Mi’s • Iterate the process
Refinement • When a bogus error is found in Mi • Analyze and find the problematic rule g → a • Locate the original rule G → A in M • Add a new VO G => P in one abstracted protocol • Strengthen the rule into g ∧ P → a
Details of Refinement (I) 1. False alarm found • Remote cluster-1 can modify its L2 line arbitrarily 1 M1 true → …
Details of Refinement (II) 2. Locate the original rule in M before abstraction • Guard: when the local dir receives a WB from an L1 cache 1 M1 Procs[p].WbMsg.Cmd = WB → …
Details of Refinement (III) 3. Strengthen the problematic rule from step 1 • L2 may modify its line only when the local dir is exclusive • true & Procs[p].L2.State = Excl → …
Details of Refinement (IV) 4. Why is strengthening sound? 1 3 M1
Details of Refinement (V) 4. We can add a new VO in M2 • Procs[p].WbMsg.Cmd = WB => Procs[p].L2.State = Excl
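The refinement step walked through in parts (I) to (V) can be sketched in Python as two small helpers: one strengthens a rule g → a into g ∧ P → a, the other checks a verification obligation G => P over a set of states. The helper names, the dictionary encoding of states, and the sample states are hypothetical; in practice the VO would be checked by the model checker over all reachable states of M2.

```python
def strengthen(rule, p):
    """Turn rule (g, a) into (g and p, a)."""
    guard, action = rule
    return (lambda s: guard(s) and p(s), action)

def vo_holds(states, G, P):
    """Check the obligation G => P on every given state."""
    return all(P(s) for s in states if G(s))

# P: the local dir is exclusive.  G: the local dir receives a WB from an L1 cache.
P = lambda s: s["l2_state"] == "Excl"
G = lambda s: s["wb_cmd"] == "WB"

# Abstracted rule in M1: true -> (L2 line modified), here setting a data field.
abstract_rule = (lambda s: True, lambda s: {**s, "l2_data": 1})
strong_guard, strong_action = strengthen(abstract_rule, P)

# Hypothetical sample of states standing in for the reachable states of M2.
m2_states = [
    {"wb_cmd": "WB", "l2_state": "Excl"},
    {"wb_cmd": "None", "l2_state": "Shrd"},
]
```

The strengthened rule no longer fires when the line is not exclusive, and the assumption P it relies on is discharged as a guarantee in the other abstracted protocol.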
Soundness of the Approach • Goal • If M1 and M2 are both model checked correct w.r.t. the coherence property Ф of M, then M must also be correct w.r.t. Ф
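The goal can be written as a single implication. This is only a restatement under assumed notation: VO_1 and VO_2 stand for the verification obligations added to M1 and M2 during refinement, and ⊨ means that the model checker reports the property as holding.

```latex
\[
  \bigl(M_1 \models \Phi \land \mathit{VO}_1\bigr)
  \;\land\;
  \bigl(M_2 \models \Phi \land \mathit{VO}_2\bigr)
  \;\Longrightarrow\;
  M \models \Phi
\]
```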
Soundness Proof • Temporal Induction • Initial states • Each common var has the same value in M, M1 and M2 • Each newly added VO is checked in M1 and M2 • Each coherence property is checked • Suppose soundness in state s
Soundness Proof (II) • M: g → a, (h1, h2, r11, r12, r21, r22) → (h1’, h2’, r11’, r12’, r21’, r22’) • M1: g1 & p1 → a1, (h1, h2, r12, r22) → (h1’, h2’, r12’, r22’) • M2: g2 & p2 → a2, (h1, r11, r12, r22) → (h2’, r11’, r12’, r22’)
Experiment Results • A real bug found • 10 iterations of refinements • The size of each error trace is < 12 • One person-day of work
Reduction • 64-bit Murphi on IA-64, with 20GB of memory
More Results • Another 2-level hierarchical cache coherence protocol
Conclusion • Developed a 2-level hierarchical protocol • Proposed a compositional approach • Abstraction • Bug fixing • Refinement • Proved the soundness
Related Work • FMCAD’04 • Chou et al., A simple method for parameterized verification of cache coherence protocols • CHARME’99 • McMillan, Verification of infinite state systems by compositional model checking
For Details http://www.cs.utah.edu/formal_verification/
Another Decomposing Approach • Split protocols hierarchically • Intra-cluster protocol • Inter-cluster protocol
Intra-cluster Protocol Cluster L1 Cache L1 Cache L2 Cache+Local Dir Environment RAC
Inter-cluster Protocol Remote Cluster 1 Home Cluster Remote Cluster 2 L2 Cache+Local Dir’ L2 Cache+Local Dir’ L2 Cache+Local Dir’ RAC RAC RAC Global Dir Main Memory
About the Bug IACK