Building “Correct” Compilers

Presentation Transcript


  1. Building “Correct” Compilers K. Vikram and S. M. Nazrul A.

  2. Outline • Introduction: Setting the high level context • Motivation • Detours • Automated Theorem Proving • Compiler Optimizations thru Dataflow Analysis • Overview of the Cobalt System • Forward optimizations in cobalt • Proving Cobalt Optimizations Correct • Profitability Heuristics • Pure Analyses • Concluding Remarks

  4. Introduction The Seven Grand Challenges • In Vivo ⇔ In Silico • Science for Global Ubiquitous Computing • Memories for Life • Scalable Ubiquitous Computing Systems • The Architecture of the Brain and Mind • Dependable Systems Evolution • Journeys in Non-classical computations

  6. Introduction Dependable Systems Evolution • A long standing problem • Loss of financial resources, human lives • Compare with other engineering fields! • Non-functional requirements • Safety, Reliability, Availability, Security, etc.

  7. Introduction Why the sudden interest? • Was difficult so far, but now … • Greater Technology Push • Model checkers, theorem provers, programming theories and other formal methods • Greater Market Pull • Increased dependence on computing

  8. Introduction A small but significant step Building Correct Compilers

  9. Outline • Introduction: Setting the high level context • Motivation • Detours • Automated Theorem Proving • Compiler Optimizations thru Dataflow Analysis • Overview of the Cobalt System • Forward optimizations in cobalt • Proving Cobalt Optimizations Correct • Profitability Heuristics • Pure Analyses • Concluding Remarks

  10. Motivation Why are correct compilers hard to build? • Bugs don’t manifest themselves easily • Where is the bug – program or compiler? • Possible solutions • Check semantic equivalence of the two programs (translation validation, etc.) • Prove compilers sound (manually) • Drawbacks? • Conservative, Difficult, Actual code not verified

  11. Motivation Testing (Diagram: Source → compiler → Compiled Prog; run on input, then DIFF the output against the expected output.) • To get benefits, must: • run over many inputs • compile many test cases • No correctness guarantees: • neither for the compiled prog • nor for the compiler

  12. Motivation Verify each compilation (Diagram: Source → compiler → Compiled Prog, with a Semantic DIFF between Source and Compiled Prog.) • Translation validation • [Pnueli et al 98, Necula 00] • Credible compilation • [Rinard 99] • Compiler can still have bugs. • Compile time increases. • “Semantic Diff” is hard.

  13. Motivation Proving the whole compiler correct (Diagram: Source → compiler → Compiled Prog, with a Correctness checker applied to the compiler.)

  14. Motivation Proving the whole compiler correct • Option 1: Prove compiler correct by hand. • Proofs are long… • And hard. • Compilers are proven correct as written on paper. What about the implementation? (Diagram: compiler, Correctness checker, and hand-written Proofs; the connection to the implementation is labeled “Link?”.)

  15. Motivation gcc-bugs mailing list Searched for “incorrect” and “wrong” in the gcc-bugs mailing list. Some of the results: • c/9525: incorrect code generation on SSE2 intrinsics • target/7336: [ARM] With -Os option, gcc incorrectly computes the elimination offset • optimization/9325: wrong conversion of constants: (int)(float)(int) (INT_MAX) • optimization/6537: For -O (but not -O2 or -O0) incorrect assembly is generated • optimization/6891: G++ generates incorrect code when -Os is used • optimization/8613: [3.2/3.3/3.4 regression] -O2 optimization generates wrong code • target/9732: PPC32: Wrong code with -O2 -fPIC • c/8224: Incorrect joining of signed and unsigned division • … And this is only for February 2003! On a mature compiler!

  16. Motivation Need for Automation • This approach: proves compiler correct automatically. (Diagram: compiler → Correctness checker → Automatic Theorem Prover.)

  17. The Challenge This seems really hard! (Diagram: the task of proving a compiler correct handed to an Automatic Theorem Prover; the complexity of proving a compiler correct is contrasted with the complexity that an automatic theorem prover can handle.)

  18. Outline • Introduction: Setting the high level context • Motivation • Detours • Automated Theorem Proving • Compiler Optimizations thru Dataflow Analysis • Overview of the Cobalt System • Forward optimizations in cobalt • Proving Cobalt Optimizations Correct • Profitability Heuristics • Pure Analyses • Concluding Remarks

  19. Automated Theorem Proving Brief detour thru ATP • Started with AI applications • Reasoning about FOL: sound and complete • 1965: Unification and Resolution • Combinatorial Explosion. SAT (NP-Complete) and FOL (undecidable in general; only semi-decidable) • Refinements of Resolution, Term Rewriting, Higher order Logics • Interactive Theorem Proving • Efficient Implementation Techniques • Coq, Nuprl, Isabelle, Twelf, PVS, Simplify, etc.

  20. Outline • Introduction: Setting the high level context • Motivation • Detours • Automated Theorem Proving • Compiler Optimizations thru Dataflow Analysis • Overview of the Cobalt System • Forward optimizations in cobalt • Proving Cobalt Optimizations Correct • Profitability Heuristics • Pure Analyses • Concluding Remarks

  21. Optimizations Focus on Optimizations • Optimizations are the most error prone • Only phase that performs transformations that can potentially change semantics • Front-end and back-end are relatively static

  22. Optimizations Common Optimizations • Constant Propagation: replace constant valued variables with constants • Common sub-expression elimination: avoid recomputing value if value has been computed earlier in the program • Loop invariant removal: move computations into less frequently executed portions of the program • Strength Reduction: replace expensive operations (multiplication) with simpler ones (addition) • Dead code removal: eliminate unreachable code and code that is irrelevant to the output of the program

  23. Optimizations Constant Propagation Examples

  24. Optimizations Constant Propagation Condition • Suppose x is used at program point p • If • on all possible execution paths from START of procedure to p • x has constant value c at p • then replace x by c

  25. Optimizations The Analysis Algorithm • Build the control flow graph (CFG) of the program • Make flow of control explicit • Perform symbolic evaluation to determine constants • Replace constant-valued variable uses by their values and simplify expressions and control flow

  26. Optimizations Building the CFG

  27. Optimizations Building the CFG • Composed of Basic Blocks • Straight line code without any branches or merges of control flow • Nodes of CFG • Statements (basic blocks)/switches/merges • Edges of CFG • Possible control flow sequence
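The construction on this slide can be sketched in a few lines. The following is a minimal illustration, not the presenters' implementation: the tuple encoding of instructions (`"assign"`, `"goto"`, `"branch"`, `"return"`) and the helper names `basic_blocks` and `cfg_edges` are hypothetical, and the sketch folds switches and merges into ordinary basic-block nodes rather than giving them separate node kinds.

```python
def basic_blocks(instrs):
    """Split instructions into basic blocks, returned as half-open index ranges.

    Hypothetical encoding: ("assign", x, e), ("goto", target),
    ("branch", cond, target), ("return",). Targets are indices into instrs.
    """
    if not instrs:
        return []
    leaders = {0}  # the first instruction always starts a block
    for i, ins in enumerate(instrs):
        if ins[0] in ("goto", "branch"):
            leaders.add(ins[-1])              # a jump target starts a block
            if i + 1 < len(instrs):
                leaders.add(i + 1)            # so does the instruction after a jump
        elif ins[0] == "return" and i + 1 < len(instrs):
            leaders.add(i + 1)
    starts = sorted(leaders)
    ends = starts[1:] + [len(instrs)]
    return list(zip(starts, ends))


def cfg_edges(instrs, blocks):
    """Return the set of (from_block, to_block) control-flow edges."""
    block_of_start = {s: b for b, (s, _) in enumerate(blocks)}
    edges = set()
    for b, (_, end) in enumerate(blocks):
        last = instrs[end - 1]
        if last[0] in ("goto", "branch"):
            edges.add((b, block_of_start[last[-1]]))   # taken jump
        if last[0] not in ("goto", "return") and end < len(instrs):
            edges.add((b, block_of_start[end]))        # fall-through
    return edges


# if (c) { y := 3 } else { y := 2 }; z := y
instrs = [
    ("assign", "x", "1"),   # 0
    ("branch", "c", 4),     # 1
    ("assign", "y", "2"),   # 2
    ("goto", 5),            # 3
    ("assign", "y", "3"),   # 4
    ("assign", "z", "y"),   # 5
    ("return",),            # 6
]
blocks = basic_blocks(instrs)
edges = cfg_edges(instrs, blocks)
```

Here block 0 ends in the branch (the switch), and block 3 is the merge point reached from both arms.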

  28. Optimizations Symbolic Evaluation • Assign each variable the bottom value initially • Propagate changes in variable values as statements are executed • Based on the idea of Abstract Interpretation

  29. Optimizations Symbolic Evaluation • Flow Functions • x := e state@out = state@in{eval(e, state@in)/x} • Confluence Operation • join over all incoming edges

  30. Optimizations Symbolic Evaluation • Flow Functions • x := e state@out = ƒ (state@in) • Confluence Operation • join over all incoming edges
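To make the flow function and the confluence operation concrete, here is a small sketch for constant propagation. It assumes a three-level lattice (`BOTTOM` for "no information yet", integer constants, `TOP` for "not a constant"); the names `join`, `eval_expr`, and `flow_assign`, and the tuple encoding of expressions, are illustrative choices and not taken from the slides.

```python
BOTTOM = "BOTTOM"   # no information yet
TOP = "TOP"         # known not to be a single constant


def join(a, b):
    """Confluence of two abstract values."""
    if a == BOTTOM:
        return b
    if b == BOTTOM:
        return a
    return a if a == b else TOP


def join_states(s1, s2):
    """Pointwise join of two state vectors (dicts from variable to value)."""
    return {v: join(s1.get(v, BOTTOM), s2.get(v, BOTTOM))
            for v in s1.keys() | s2.keys()}


def eval_expr(e, state):
    """Abstractly evaluate e: an int, a variable name, or (op, e1, e2) with op in '+', '*'."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return state.get(e, BOTTOM)
    op, left, right = e
    a, b = eval_expr(left, state), eval_expr(right, state)
    if BOTTOM in (a, b):
        return BOTTOM
    if TOP in (a, b):
        return TOP
    return a + b if op == "+" else a * b


def flow_assign(x, e, state_in):
    """Flow function for x := e, i.e. state@out = state@in{eval(e, state@in)/x}."""
    state_out = dict(state_in)
    state_out[x] = eval_expr(e, state_in)
    return state_out
```

For instance, `flow_assign("x", ("+", "y", 1), {"y": 5})` maps `x` to `6`, while joining `{"x": 1}` with `{"x": 2}` at a merge maps `x` to `TOP`.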

  31. Optimizations The Dataflow analysis algorithm • Associate one state vector with each edge of CFG. Initialize all entries to ⊥ • Set all entries on outgoing edge from START to ⊤ • Evaluate the expression and update the output edge • Continue till a fixed point is reached
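The iteration described here is commonly implemented with a worklist. The sketch below is one possible rendering under stated assumptions: it keeps one state vector per node output (equivalent to giving every outgoing edge of a node the same vector), it assumes the state leaving START is "not a constant" for every variable, and every name in it (`constant_propagation`, `nodes`, `succs`) is hypothetical.

```python
from collections import deque

BOTTOM, TOP = "BOTTOM", "TOP"   # assumed lattice: BOTTOM < constants < TOP


def join(a, b):
    if a == BOTTOM:
        return b
    if b == BOTTOM:
        return a
    return a if a == b else TOP


def eval_expr(e, state):
    """e is an int, a variable name, or (op, e1, e2) with op in '+', '*'."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return state[e]
    op, left, right = e
    a, b = eval_expr(left, state), eval_expr(right, state)
    if BOTTOM in (a, b):
        return BOTTOM
    if TOP in (a, b):
        return TOP
    return a + b if op == "+" else a * b


def constant_propagation(nodes, succs, variables, entry="START"):
    """nodes: name -> None (no-op) or (x, e) for x := e; succs: name -> successors.

    Assumes the entry node has no predecessors.
    """
    preds = {n: [] for n in nodes}
    for n, targets in succs.items():
        for t in targets:
            preds[t].append(n)
    out = {n: {v: BOTTOM for v in variables} for n in nodes}
    out[entry] = {v: TOP for v in variables}       # unknown inputs at START
    work = deque(succs.get(entry, []))
    while work:
        n = work.popleft()
        state = {v: BOTTOM for v in variables}
        for p in preds[n]:                          # join over all incoming edges
            state = {v: join(state[v], out[p][v]) for v in variables}
        if nodes[n] is not None:
            x, e = nodes[n]
            state[x] = eval_expr(e, state)          # right-hand side uses the input state
        if state != out[n]:                         # changed: revisit successors
            out[n] = state
            work.extend(succs.get(n, []))
    return out                                      # fixed point reached


# START -> a: y := 5 -> (b: x := y | c: x := 5) -> d: z := x + y
nodes = {"START": None, "a": ("y", 5), "b": ("x", "y"),
         "c": ("x", 5), "d": ("z", ("+", "x", "y"))}
succs = {"START": ["a"], "a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
result = constant_propagation(nodes, succs, ["x", "y", "z"])
```

In this example both arms assign the same constant to `x`, so the join at `d` keeps `x` constant and `z` evaluates to a constant as well; changing node `c` to `x := 6` would make `x` and `z` non-constant at `d`.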

  32. Optimizations Example Evaluation

  33. Optimizations Termination Condition • If each flow function ƒ is monotonic • i.e. x ≤ y => ƒ (x) ≤ ƒ (y) • And if the lattice is of finite height • The dataflow algorithm terminates
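The argument behind this condition can be spelled out briefly. As a simplifying assumption, let $F$ denote one round of the algorithm applied to all state vectors at once (every flow function and join together); $F$ is monotonic because it is built from monotonic functions and joins. The symbols $F$, $s_k$, and $h$ are introduced here for illustration only.

```latex
\begin{align*}
s_0 &= \bot, \qquad s_{k+1} = F(s_k) \\
s_0 &\le s_1 && \text{($\bot$ is the least element)} \\
s_k \le s_{k+1} \;&\Rightarrow\; F(s_k) \le F(s_{k+1}) \;\Rightarrow\; s_{k+1} \le s_{k+2} && \text{(monotonicity)} \\
\bot = s_0 &\le s_1 \le s_2 \le \cdots && \text{(ascending chain)}
\end{align*}
```

If the lattice has finite height $h$, the chain can increase strictly at most $h$ times, so $s_{k+1} = s_k$ for some $k \le h$, and that $s_k$ is the fixed point at which the algorithm stops.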

  34. Optimizations Other Optimizations (Table classifying analyses along two axes; columns: All Paths, Any Path; rows: Forward Flow, Backward Flow.)

  35. Outline • Introduction: Setting the high level context • Motivation • Detours • Automated Theorem Proving • Compiler Optimizations thru Dataflow Analysis • Overview of the Cobalt System • Forward optimizations in cobalt • Proving Cobalt Optimizations Correct • Profitability Heuristics • Pure Analyses • Concluding Remarks

  36. Overview Making the problem easier (Diagram: the task of proving the compiler correct is handed to an Automatic Theorem Prover.)

  37. Overview Making the problem easier (Task: proving the optimizer correct.) • Only prove optimizer correct. • Trust front-end and code-generator.

  38. Overview Making the problem easier (Task: proving the optimizer correct.) • Write optimizations in Cobalt, a domain-specific language.

  39. Overview Making the problem easier (Task: proving the optimizer correct.) • Write optimizations in Cobalt, a domain-specific language. • Separate correctness from profitability.

  40. Overview Making the problem easier (Task: proving the optimizer correct.) • Write optimizations in Cobalt, a domain-specific language. • Separate correctness from profitability. • Factor out the hard and common parts of the proof, and prove them once by hand.

  41. Overview The Design Interpreter Input Output Cobalt Program

  42. Overview The Design

  43. Overview The Compiler (Diagram: Source Code, e.g. if (…) { x := …; } else { y := …; } …; → Front End → Back End → Binary Executable.)

  44. Overview Results • Cobalt language • realistic C-like IL, operates on a CFG • implemented const prop and folding, branch folding, CSE, PRE, DAE, partial DAE, and simple forms of points-to analyses • Correctness checker for Cobalt opts • using the Simplify theorem prover • Execution engine for Cobalt opts • in the Whirlwind compiler

  45. Overview Cobalt → Rhodium → ?

  46. Overview Caveats • May not be able to express your opt in Cobalt: • no interprocedural optimizations for now. • optimizations that build complicated data structures may be difficult to express. • A sound Cobalt optimization may be rejected by the correctness checker. • Trusted computing base (TCB) includes: • front-end and code-generator, execution engine, correctness checker, proofs done by hand once

  47. Outline • Introduction: Setting the high level context • Motivation • Detours • Automated Theorem Proving • Compiler Optimizations thru Dataflow Analysis • Overview of the Cobalt System • Forward optimizations in cobalt • Proving Cobalt Optimizations Correct • Profitability Heuristics • Pure Analyses • Concluding Remarks

  48. Forward Optimizations Constant Prop (straight-line code) • statement y := 5 • statements that don’t define y • statement x := y • REPLACE x := y with x := 5

  49. Forward Optimizations Adding arbitrary control flow (Diagram: y := 5 appears on each incoming path.) • if statement y := 5 • is followed by statements that don’t define y • until statement x := y • then transform statement to x := 5 (REPLACE x := y with x := 5)

  50. Forward Optimizations Constant prop in English if statement y := 5 is followed by statements that don’t define y until statement x := y then transform statement to x := 5
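The English rule above can be read as a check over backward paths in the CFG. The following sketch is written in Python rather than Cobalt and is only an illustration of the rule's meaning: a use `x := y` is rewritten when every backward path from it reaches a statement `y := c` (same constant `c`) before any other definition of `y`. The function name `const_prop_rewrite` and the dict-based CFG encoding are hypothetical.

```python
def const_prop_rewrite(stmts, preds):
    """stmts: node -> None (no-op) or (lhs, rhs), rhs an int constant or a variable name.
    preds: node -> list of predecessor nodes.
    Returns a rewritten copy of stmts; decisions are based on the original statements.
    """
    def reaching_constant(node, y):
        """Return c if every backward path from node hits `y := c` first, else None."""
        found = set()
        seen = set()
        stack = list(preds.get(node, []))
        if not stack:
            return None                      # node is the entry: nothing precedes it
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            stmt = stmts[n]
            if stmt is not None and stmt[0] == y:
                if not isinstance(stmt[1], int):
                    return None              # y is redefined by a non-constant
                found.add(stmt[1])
                continue                     # this path is covered; stop walking it
            before = preds.get(n, [])
            if not before:
                return None                  # reached entry without any y := c
            stack.extend(before)
        return found.pop() if len(found) == 1 else None

    result = dict(stmts)
    for node, stmt in stmts.items():
        if stmt is None:
            continue
        lhs, rhs = stmt
        if isinstance(rhs, str):             # statement of the form x := y
            c = reaching_constant(node, rhs)
            if c is not None:
                result[node] = (lhs, c)      # transform to x := c
    return result


# entry -> (b1: y := 5 | b2: y := 5) -> m: w := 1 -> use: x := y
stmts = {"entry": None, "b1": ("y", 5), "b2": ("y", 5),
         "m": ("w", 1), "use": ("x", "y")}
preds = {"b1": ["entry"], "b2": ["entry"], "m": ["b1", "b2"], "use": ["m"]}
rewritten = const_prop_rewrite(stmts, preds)
```

Because both paths into `use` pass through `y := 5` and nothing in between defines `y`, the statement becomes `x := 5`. If one arm assigned a different constant, or if `m` redefined `y`, the statement would be left unchanged.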
