Towards Certified Compositional Compilation for Concurrent Programs
Hanru Jiang, Hongjin Liang, Xinyu Feng
University of Science and Technology of China
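Compositional Compilation
• [Diagram: source modules S1 and S2 (e.g. C) interact with each other; Compiler 1 translates S1 to T1 and Compiler 2 translates S2 to T2; the target modules T1 and T2 (e.g. Asm) interact in the same way.]
• Real-world source programs may consist of multiple components, which are compiled independently.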
Concurrent Program Compilation
• [Diagram: source threads S1 and S2 (e.g. C) run in parallel composition; Compiler 1 and Compiler 2 compile them separately to T1 and T2 (e.g. Asm), which again run in parallel composition.]
• How do we specify and verify correctness?
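Motivating example – OS Verification
• [Diagram: the system consists of C code with inline assembly, main(…){ … }, plus hand-written assembly (mov … … ret). The C part is compiled to assembly (main: mov … add …), while the assembly part goes through an identity transformation.]
• The example combines concurrency with separate compilation.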
Compositional CompCert [Stewart et al. POPL’15, Beringer et al. ESOP’14]
• [Diagram: Module S1 and Module S2 interact through external function calls; Compiler 1 and Compiler 2 compile them to Module T1 and Module T2, which also interact through external function calls.]
• Does not have concurrency!
Our Work
• A framework for certified compositional compilation for race-free concurrent programs
• Key semantics components + proof structures
• (Ongoing) Coq implementation: extend Compositional CompCert to concurrency
Outline of this talk
• Background
• Semantics components and Proof structures
• Footprint-preserving simulation
Background - Compilation Correctness for Closed Programs
• [Diagram: Compiler translates a source program S (e.g. C) to a target program T (e.g. Asm).]
• Correct(Compiler): ∀ S, T. T = Compiler(S) ⟹ T ⊑ S
• Semantic preservation (T ⊑ S): T has no more observable behaviors (e.g. I/O events produced by print) than S.
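As a minimal sketch of the semantic-preservation statement above, refinement can be read as inclusion of behavior sets. The names `refines`, `source`, `good_target` and `bad_target` are hypothetical, and behaviors are simplified here to finite tuples of event names, which is an assumption rather than the talk's definition.

```python
from typing import FrozenSet, Tuple

Trace = Tuple[str, ...]  # a finite sequence of observable events (simplifying assumption)


def refines(target_behaviors: FrozenSet[Trace], source_behaviors: FrozenSet[Trace]) -> bool:
    """T refines S when every observable behavior of T is also a behavior of S."""
    return target_behaviors <= source_behaviors


# Hypothetical source program: prints "a", then nondeterministically "b" or "c".
source = frozenset({("print a", "print b"), ("print a", "print c")})

# A target that resolves the nondeterminism is fine: it has fewer behaviors.
good_target = frozenset({("print a", "print b")})

# A target that emits an event the source never emits is not a refinement.
bad_target = frozenset({("print a", "print d")})
```

With these definitions, `refines(good_target, source)` holds and `refines(bad_target, source)` does not.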
Compilation Correctness Proof by Simulations (as in CompCert) [Leroy et al.]
• [Diagram: a simulation relates each target state to a source state. Every target step is matched by zero or more source steps ("*"), and a target step emitting an observable event e (e.g. I/O) is matched by source steps emitting the same event e.]
• For closed programs only. NOT compositional.
Background - Compositional CompCert [Stewart et al. POPL’15, Beringer et al. ESOP’14]
• [Diagram: Module S1 and Module S2 interact through external function calls; Compiler 1 and Compiler 2 compile them to Module T1 and Module T2, which also interact through external function calls.]
Background - Compositional CompCert
    // Module S2
    void g(int *x){ *x = 3; }
    // Module S1
    extern void g(int *x);
    int f(){
      int a = 0, b = 0;
      g(&b);
      return a + b;
    }
    // Module T2
    ...
    // Module T1
    ... g(&b); ...
• Interaction occurs at external calls.
• Interaction points at source and target are aligned.
• External modules may access shared resources.
• Optimizations should not move code across external calls unless only local variables are involved.
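The restriction on optimizations around external calls can be illustrated with a small model of the two modules. This is only a sketch: the pointer `int *x` is modeled as a one-element Python list, and `f_unsound` and `f_sound` are hypothetical compiler outputs invented for illustration, not transformations taken from CompCert.

```python
def g(x):
    """Module S2: *x = 3, with the pointer modeled as a one-element list."""
    x[0] = 3


def f():
    """Module S1 as written: b escapes to g, so it must be re-read after the call."""
    a, b = 0, [0]
    g(b)
    return a + b[0]


def f_unsound():
    """Hypothetical bad optimization: b's initial value 0 is propagated past g(&b)."""
    a, b = 0, [0]
    g(b)
    return a + 0  # stale value of b: the external call's write is ignored


def f_sound():
    """Acceptable optimization: a is local and never escapes, so a = 0 can be folded away."""
    b = [0]
    g(b)
    return b[0]
```

Here `f()` and `f_sound()` agree, while `f_unsound()` returns a different result because it moved knowledge about `b` across the external call.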
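Simulations in Compositional CompCert [Stewart et al. POPL’15, Beringer et al. ESOP’14]
• [Diagram: source and target run in simulation up to an external call, with zero or more source steps ("*") per target step. At the external call the environment changes the source state under condition R and the target state under condition r, leaving the code S’ and T’ unchanged; then both sides resume.]
• Env. conditions R and r model general callee behaviors. The idea comes from RGSim [Liang et al. POPL’12].
• Compositional w.r.t. module linking.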
Compositional Compilation Correctness in Compositional CompCert [Stewart et al. POPL’15, Beringer et al. ESOP’14]
• Module linking (S1 | S2): modules yield control at external calls only.
• Compositionality: T1 ≼ S1 and T2 ≼ S2 imply T1 | T2 ≼ S1 | S2, which in turn gives T1 | T2 ⊑ S1 | S2.
• ≼: simulation; ⊑: refinement (semantic preservation)
Background - Compositional CompCert
    // Module S2
    void g(int *x){ *x = 3; }
    // Module S1
    extern void g(int *x);
    int f(){
      int a = 0, b = 0;
      g(&b);
      return a + b;
    }
    // Module T2
    ...
    // Module T1
    ... g(&b); ...
• Interaction occurs at external calls.
• Interaction points at source and target are aligned.
• This does NOT work for general concurrency:
  1) interaction between modules (threads) may occur at ANY program point;
  2) the target may have more interleavings than the source (interaction points at source and target are not aligned).
Gap between Compositional CompCert and Concurrency
• [Diagram: the Compositional CompCert simulation, in which source and target interact with the environment (conditions R and r) only at aligned external calls.]
• Concurrency: interaction may occur at any time, and more often at the target than at the source.
How to support concurrency?
• Proposal in Compositional CompCert: race-free programs running in a nonpreemptive semantics soundly approximate race-free programs in an interleaving semantics [Beringer et al. ESOP’14, p. 10].
Data-Race-Freedom (DRF)
• A race occurs if two threads access the same location at the same time and at least one of the accesses is an update.
• Race: x = 1; || x = 2;
• Race: x = 1; || r = x;
• No race: r1 = x; || r2 = x;
• No race: lock.acq(); x = 1; lock.rel(); || lock.acq(); x = 2; lock.rel();
• Race: lock.acq(); x = 1; lock.rel(); || x = 2;
• Races are defined based on SC semantics. This is not the same notion of race as in the C11 memory model.
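The five examples above can be checked mechanically with a small explorer. This is a sketch under assumptions: threads are straight-line lists of hypothetical operations (`"r"`, `"w"`, `"acq"`, `"rel"`), lock ownership is not tracked, and "access at the same time" is approximated as "in some reachable SC state, the next operations of two threads conflict".

```python
from typing import List, Tuple

Op = Tuple[str, str]  # ("r", loc), ("w", loc), ("acq", lock) or ("rel", lock)


def conflicts(a: Op, b: Op) -> bool:
    """Two memory accesses conflict if they hit the same location and one writes."""
    mem = ("r", "w")
    return a[0] in mem and b[0] in mem and a[1] == b[1] and "w" in (a[0], b[0])


def has_race(threads: List[List[Op]]) -> bool:
    """Explore every SC interleaving; report a race if, in some reachable state,
    the next operations of two different threads conflict."""
    seen = set()
    stack = [(tuple(0 for _ in threads), frozenset())]  # (program counters, held locks)
    while stack:
        pcs, held = stack.pop()
        if (pcs, held) in seen:
            continue
        seen.add((pcs, held))
        nxt = [t[pc] if pc < len(t) else None for t, pc in zip(threads, pcs)]
        for i in range(len(nxt)):
            for j in range(i + 1, len(nxt)):
                if nxt[i] and nxt[j] and conflicts(nxt[i], nxt[j]):
                    return True
        for i, op in enumerate(nxt):
            if op is None or (op[0] == "acq" and op[1] in held):
                continue  # thread finished, or blocked on a held lock
            new_held = held
            if op[0] == "acq":
                new_held = held | {op[1]}
            elif op[0] == "rel":
                new_held = held - {op[1]}
            stack.append((pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:], new_held))
    return False


# The slide's examples, encoded with the hypothetical operation tuples.
locked_write = [("acq", "lock"), ("w", "x"), ("rel", "lock")]
write_write = [[("w", "x")], [("w", "x")]]
read_write = [[("w", "x")], [("r", "x")]]
read_read = [[("r", "x")], [("r", "x")]]
both_locked = [locked_write, locked_write]
one_unlocked = [locked_write, [("w", "x")]]
```

The explorer is exhaustive only for small, loop-free threads like these; it is meant to make the definition concrete, not to scale.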
Folklore theorem: DRF programs in an interleaving semantics behave the same as in a non-preemptive semantics
• Interleaving (no race):
    r1 = 1; lock.acq(); x = r1 + 1; y = x + 1; lock.rel();  ||  r2 = 2; lock.acq(); x = r2 + 1; y = x + 1; lock.rel();
• Non-preemptive (yield control at certain points only; the code between yields runs sequentially):
    r1 = 1; yield; x = r1 + 1; y = x + 1; yield;  |  r2 = 2; yield; x = r2 + 1; y = x + 1; yield;
• Result: x = 2, y = 3; or x = 3, y = 4
• (Sequential) Compositional CompCert is now sound!
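For this particular example, the folklore theorem can be checked by brute force: enumerate every schedule under both semantics and compare the final values of x and y. The sketch below is an illustration, not a proof; the instruction encoding (`"set"`, `"x"`, `"y"`, `"acq"`, `"rel"`, `"yield"`) and the helper names are hypothetical.

```python
def step(op, tid, state):
    """Execute one instruction of thread tid; return the new state, or None if blocked."""
    regs, x, y, lock = state
    kind = op[0]
    if kind == "set":      # r := constant (thread-local register)
        regs = regs[:tid] + (op[1],) + regs[tid + 1:]
    elif kind == "x":      # x := r + 1
        x = regs[tid] + 1
    elif kind == "y":      # y := x + 1
        y = x + 1
    elif kind == "acq":
        if lock is not None:
            return None    # lock is held by another thread
        lock = tid
    elif kind == "rel":
        lock = None
    # "yield" changes no state; it only marks a switch point
    return (regs, x, y, lock)


def outcomes(threads, preemptive):
    """Collect all final (x, y) pairs. Preemptive: switch after any instruction.
    Non-preemptive: a scheduled thread runs up to and including its next yield.
    The non-preemptive mode assumes programs without acq/rel, as on the slide."""
    results, seen = set(), set()
    n = len(threads)
    stack = [((0,) * n, ((0,) * n, 0, 0, None))]
    while stack:
        pcs, state = stack.pop()
        if (pcs, state) in seen:
            continue
        seen.add((pcs, state))
        if all(pc == len(t) for pc, t in zip(pcs, threads)):
            results.add((state[1], state[2]))
            continue
        for tid, t in enumerate(threads):
            pc, cur = pcs[tid], state
            while pc < len(t):
                nxt = step(t[pc], tid, cur)
                if nxt is None:
                    break
                cur, pc = nxt, pc + 1
                if preemptive or t[pc - 1][0] == "yield":
                    break
            if pc != pcs[tid]:
                stack.append((pcs[:tid] + (pc,) + pcs[tid + 1:], cur))
    return results


def locked(k):
    return [("set", k), ("acq",), ("x",), ("y",), ("rel",)]


def yielding(k):
    return [("set", k), ("yield",), ("x",), ("y",), ("yield",)]


interleaving = outcomes([locked(1), locked(2)], preemptive=True)
non_preemptive = outcomes([yielding(1), yielding(2)], preemptive=False)
```

Both `interleaving` and `non_preemptive` come out as the two results listed on the slide, depending on which thread runs its critical region last.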
How to support concurrency?
• Proposal in Compositional CompCert: race-free programs running in a nonpreemptive semantics soundly approximate race-free programs in an interleaving semantics [Beringer et al. ESOP’14, p. 10].
• A convincing argument, but without formal proofs. The proof is non-trivial!
Outline of this talk
• Background
• Semantics framework
• Semantics components and Proof structures
• Footprint-preserving simulation
Compositional CompCert for Race-Free Concurrency
    r1 = 1; lock.acq(); x = r1 + 1; y = x + 1; lock.rel();  ||  r2 = 2; lock.acq(); x = r2 + 1; y = x + 1; lock.rel();
• Interactions between modules occur only at the boundaries of critical regions (CRs), i.e., lock.acq() and lock.rel() are treated as external function calls.
• The rest is compiled as sequential code.
• How to prove the correctness?
Compositional CompCert for Race-Free Concurrency
    r1 = 1; lock.acq(); x = r1 + 1; y = x + 1; lock.rel();  ||  r2 = 2; lock.acq(); x = r2 + 1; y = x + 1; lock.rel();
• Compile as sequential code.
• Given T1 = Comp(S1), T2 = Comp(S2) and DRF(S1 || S2), show T1 || T2 ⊑ S1 || S2.
• How to prove the correctness?
Key semantics components
• Concurrency and data-race freedom:
  • S1 || S2: interleaving semantics, with transitions labeled with footprints (sets of memory locations for reads and writes)
  • DRF(S1 || S2): defined in terms of footprint disjointness
• Non-preemptive semantics:
  • S1 | S2: threads yield control only at the boundaries of CRs; transitions are also labeled with footprints
  • DRF(S1 | S2): defined in terms of footprint disjointness
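A footprint-labeled transition can be sketched as a pair of read and write sets, with a conflict test that captures the failure of "footprint disjointness". The class `Footprint`, its methods and `block_footprint` are hypothetical names, and the sketch deliberately leaves out how lock-protected regions are exempted from the race check, which the slides do not spell out.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Footprint:
    rs: FrozenSet[str] = frozenset()  # memory locations read
    ws: FrozenSet[str] = frozenset()  # memory locations written

    def union(self, other: "Footprint") -> "Footprint":
        return Footprint(self.rs | other.rs, self.ws | other.ws)

    def conflicts(self, other: "Footprint") -> bool:
        """True when one footprint writes a location the other reads or writes."""
        return bool(self.ws & (other.rs | other.ws) or other.ws & self.rs)


def block_footprint(steps: Iterable[Footprint]) -> Footprint:
    """Accumulate the footprints of the steps executed between two switch points."""
    fp = Footprint()
    for step_fp in steps:
        fp = fp.union(step_fp)
    return fp


# "x = r1 + 1; y = x + 1": r1 is a register, so only x and y appear in footprints.
region = block_footprint([
    Footprint(ws=frozenset({"x"})),
    Footprint(rs=frozenset({"x"}), ws=frozenset({"y"})),
])
local_only = Footprint()                    # "r1 = 1" touches no shared memory
reader = Footprint(rs=frozenset({"x"}))     # a step that only reads x
```

Two read-only footprints never conflict, a purely local step conflicts with nothing, and `region` conflicts with any step that reads or writes x or y.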
Proof Structures
• Goal (?): T1 || T2 ⊑ S1 || S2, where ⊑ is refinement
• Premises: DRF(S1 || S2), T1 = Comp(S1), T2 = Comp(S2)
Proof Structures
• Goal (?): T1 || T2 ⊑ S1 || S2
• Reusing Compositional CompCert: T1 ≼ S1 and T2 ≼ S2 give T1 | T2 ≼ S1 | S2, hence T1 | T2 ⊑ S1 | S2
• Premises: DRF(S1 || S2), T1 = Comp(S1), T2 = Comp(S2)
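Proof Structures
• Goal: T1 || T2 ⊑ S1 || S2, obtained from T1 | T2 ⊑ S1 | S2 by the folklore theorem and a trivial step
• Still open (?): DRF(T1 || T2), which the folklore theorem needs
• T1 ≼ S1 and T2 ≼ S2 give T1 | T2 ≼ S1 | S2, hence T1 | T2 ⊑ S1 | S2
• Premises: DRF(S1 || S2), T1 = Comp(S1), T2 = Comp(S2)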
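Proof Structures
• Goal: T1 || T2 ⊑ S1 || S2, obtained from T1 | T2 ⊑ S1 | S2 by the folklore theorem and a trivial step
• Still open (?): DRF(T1 || T2), and below it DRF(T1 | T2)
• Does the simulation ensure DRF-preservation, i.e., do DRF(S1 | S2) and T1 | T2 ≼ S1 | S2 give DRF(T1 | T2)?
• Cannot derive the DRF preservation from Compositional CompCert!
• T1 ≼ S1 and T2 ≼ S2 give T1 | T2 ≼ S1 | S2, hence T1 | T2 ⊑ S1 | S2
• Premises: DRF(S1 || S2), T1 = Comp(S1), T2 = Comp(S2)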
Proof Structures
• Goal: T1 || T2 ⊑ S1 || S2, obtained from T1 | T2 ⊑ S1 | S2 by the folklore theorem and a trivial step
• Still open (?): DRF(T1 || T2), and below it DRF(T1 | T2)
• Does the simulation ensure DRF-preservation, i.e., do DRF(S1 | S2) and T1 | T2 ≼ S1 | S2 give DRF(T1 | T2)?
• Cannot derive the DRF preservation from Compositional CompCert!
• Idea: introduce a new footprint-preserving simulation to ensure DRF-preservation.
• T1 ≼ S1 and T2 ≼ S2 give T1 | T2 ≼ S1 | S2, hence T1 | T2 ⊑ S1 | S2
• Premises: DRF(S1 || S2), T1 = Comp(S1), T2 = Comp(S2)
Outline of this talk
• Background
• Semantics components and Proof structures
• Footprint-preserving simulation
How to prove DRF-preservation?
• Wanted: DRF(S1 | S2), T1 ≼ S1 and T2 ≼ S2 imply DRF(T1 | T2).
• However, the simulation ≼ in Compositional CompCert does not ensure DRF-preservation.
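Our Solution: Footprint-Preserving Simulations
• [Diagram: the simulation diagram of Compositional CompCert, with source steps labeled with footprints FP and target steps labeled with footprints fp, which the simulation relates. Interaction with the environment (conditions R and r) now happens at yields instead of external calls.]
• Footprint: the sets of locations being read and written.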
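To see why relating fp to FP can give DRF-preservation, the sketch below assumes one plausible matching condition: target writes stay within source writes, and target reads stay within source reads or writes. That condition, the name `within`, and the use of identical location names on both sides (no memory injection) are assumptions for illustration; the slides do not state the exact relation. Under this assumption, a brute-force check over a tiny location universe confirms that non-conflicting source footprints can only be matched by non-conflicting target footprints.

```python
from itertools import combinations, product


def subsets(locs):
    """All subsets of a small list of location names."""
    return [frozenset(c) for n in range(len(locs) + 1) for c in combinations(locs, n)]


def conflict(a, b):
    """Footprints are (reads, writes) pairs; they conflict on a shared written location."""
    (rs1, ws1), (rs2, ws2) = a, b
    return bool(ws1 & (rs2 | ws2) or ws2 & rs1)


def within(fp, FP):
    """Assumed matching condition: the target footprint fp stays inside the source footprint FP."""
    t_rs, t_ws = fp
    s_rs, s_ws = FP
    return t_ws <= s_ws and t_rs <= (s_rs | s_ws)


def preserves_non_conflict(locs):
    """Exhaustively check: matched targets of non-conflicting sources never conflict."""
    fps = list(product(subsets(locs), repeat=2))  # every (reads, writes) pair
    for S1, S2, T1, T2 in product(fps, repeat=4):
        if within(T1, S1) and within(T2, S2) and not conflict(S1, S2):
            if conflict(T1, T2):
                return False
    return True


# A compiled step that writes y although the source step only reads x is rejected.
source_fp = (frozenset({"x"}), frozenset())
extra_write = (frozenset({"x"}), frozenset({"y"}))
fewer_accesses = (frozenset(), frozenset())
```

A simulation without such a footprint condition would accept `extra_write`, which is exactly the kind of target step that could race with another thread writing y.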
Our Solution: Footprint-Preserving Simulations
• Compositionality: T1 ≼ S1 and T2 ≼ S2 imply T1 | T2 ≼ S1 | S2, which gives T1 | T2 ⊑ S1 | S2 as in Compositional CompCert.
• DRF-preservation: T1 | T2 ≼ S1 | S2 and DRF(S1 | S2) imply DRF(T1 | T2).
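Proof Structures
• Goal (!): T1 || T2 ⊑ S1 || S2, obtained from T1 | T2 ⊑ S1 | S2 by the folklore theorem and a trivial step
• Resolved (!): DRF(T1 || T2), via DRF(T1 | T2)
• Resolved (!): DRF(S1 | S2) and T1 | T2 ≼ S1 | S2 give DRF(T1 | T2), because the simulation is footprint-preserving
• T1 ≼ S1 and T2 ≼ S2 give T1 | T2 ≼ S1 | S2, hence T1 | T2 ⊑ S1 | S2
• Premises: DRF(S1 || S2), T1 = Comp(S1), T2 = Comp(S2)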
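One possible reading of the completed proof structure, written as a chain of implications. The labels on steps (2), (3) and (5) follow the slides; steps (4) and (6), the attribution of "trivial" to step (8), and the assumption that ⊑ is transitive are this sketch's reading rather than statements from the talk.

```latex
\begin{align*}
&(1)\ T_i = \mathrm{Comp}(S_i) \;\Rightarrow\; T_i \preccurlyeq S_i \quad (i = 1, 2)
  && \text{module-wise compiler correctness} \\
&(2)\ T_1 \preccurlyeq S_1 \,\wedge\, T_2 \preccurlyeq S_2 \;\Rightarrow\; T_1 \mid T_2 \preccurlyeq S_1 \mid S_2
  && \text{compositionality} \\
&(3)\ T_1 \mid T_2 \preccurlyeq S_1 \mid S_2 \;\Rightarrow\; T_1 \mid T_2 \sqsubseteq S_1 \mid S_2
  && \text{as in Compositional CompCert} \\
&(4)\ \mathrm{DRF}(S_1 \parallel S_2) \;\Rightarrow\; \mathrm{DRF}(S_1 \mid S_2)
  && \text{assumed} \\
&(5)\ \mathrm{DRF}(S_1 \mid S_2) \,\wedge\, T_1 \mid T_2 \preccurlyeq S_1 \mid S_2 \;\Rightarrow\; \mathrm{DRF}(T_1 \mid T_2)
  && \text{DRF-preservation} \\
&(6)\ \mathrm{DRF}(T_1 \mid T_2) \;\Rightarrow\; \mathrm{DRF}(T_1 \parallel T_2)
  && \text{assumed} \\
&(7)\ \mathrm{DRF}(T_1 \parallel T_2) \;\Rightarrow\; T_1 \parallel T_2 \sqsubseteq T_1 \mid T_2
  && \text{folklore theorem} \\
&(8)\ S_1 \mid S_2 \sqsubseteq S_1 \parallel S_2
  && \text{trivial} \\
&(9)\ T_1 \parallel T_2 \sqsubseteq T_1 \mid T_2 \sqsubseteq S_1 \mid S_2 \sqsubseteq S_1 \parallel S_2
  && \text{from (3), (7), (8)}
\end{align*}
```

Step (8) is read as trivial because every non-preemptive schedule is also an interleaving schedule, so the non-preemptive program cannot exhibit behaviors the interleaving one lacks.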
Coq Implementation (Ongoing)
• Mostly reuse Compositional CompCert proofs
• Add footprints to the languages (Clight, Cminor, RTL, Asm, …)
• Extend the simulation definition with footprint preservation
• Extend the compositionality proofs and compilation correctness proofs
• + DRF-preservation proofs
• + Proof of the folklore theorem (DRF programs in the interleaving semantics behave the same as in the non-preemptive semantics)
Conclusion
• A framework for certified compositional compilation for DRF programs
• A compositional footprint-preserving simulation that gives DRF-preservation
Thank you! Questions?
Full Framework
• T1 || T2 ⊑ S1 || S2
• DRF(T1 || T2)
• T1 | T2 ⊑ S1 | S2
• NPDRF(T1 | T2)
• T1 | T2 ≼ S1 | S2, with Det(T1) and Det(T2)
• NPDRF(S1 | S2)
• T1 | T2 ≼ S1 | S2
• DRF(S1 || S2)
• T1 ≼ S1
• T2 ≼ S2