Translation Validation of Compilers for Model-based Programming
Supratik Mukhopadhyay supratik@csee.wvu.edu
Research Heaven, West Virginia
Why Model-based Programming?
• The most effective way to amortize software development cost is to make the software plug and play
• Immobots are programmed by specifying component models of hardware and software behavior to support plug and play
• Development of model libraries reduces design time, facilitates reuse, and amortizes modeling costs
• Reduces sensitivity to modeling inaccuracies and hardware errors
• Validation can be done in an early phase
Model-based Development at NASA
• Much-publicized use of the Remote Agent autonomy architecture on Deep Space 1
• The Mode Identification and Recovery (MIR) component uses the Lisp-based Livingstone (L1) Integrated Vehicle Health Management (IVHM) system
• Accepts models of the system's components; infers the overall behavior of the system
• Being used in the next-generation shuttle project for vehicle health management
Livingstone: How it works
Model in JMPL → JMPL Compiler → Model in XMPL → Livingstone Executable → System Behavior
Livingstone (L2) source in C++ → C++ Compiler → Livingstone Executable
Are these translations correct?
In other words…
• Is the right model getting fed to Livingstone?
• Is Livingstone correctly inferring the behavior of the system?
Things can go wrong…
for (i = 0; i <= max; i++) { … }
Control flow: i = 0; test 0 <= i <= max; if yes, execute the body and i++; if no, exit the loop.
For implementations that disregard arithmetic overflow to improve performance, the loop may not terminate: if max is the largest representable value, the guard i <= max remains true after i overflows and wraps around.
Things can go wrong…
Actual machines have finite stack sizes, while programming-language semantics allow unbounded recursion: a source program that is correct under unbounded recursion may crash on the target machine.
Why do we care?
Validate these: the JMPL-to-XMPL translation and the C++ compilation of the Livingstone (L2) source into the Livingstone executable.
Validating high-level source code is useless if correctness does not transfer to the machine code that is finally executed.
Why Validate Translations?
• Mistrust in compilers is one of the reasons why safety-critical software is certified at the level of machine or assembly code. Results:
• increased time and cost
• error-prone
• difficult to maintain; no modularity
• difficult to reuse
• vulnerability to 'self-modifying' code
• Question:
• How to bridge such a huge gap in the software development cycle?
Why Validate Translations?
• Answer 1:
• Hoare, Mueller-Olm et al.: verify the compiler.
• Feasible?
• Too complicated; too many details
• Equally time-consuming and costly
• 'Freezes' updates to the compiler
• Answer 2:
• Validate each run of the compiler individually
• Manageable; no need to descend into low-level compiler details
• Independent of the particular compiler; depends only on the source and target languages
Why is the model-based landscape so special?
• Involves concurrency and components
• embedded and real-time aspects
• More high-level than traditional programs
• Procedural (Livingstone C++)
• Object-oriented (source of L2)
• Declarative (JMPL)
• Object-oriented to unstructured
• Declarative to declarative
• Declarative to procedural (e.g., MPL to SMV)
• Dynamics
• Optimizations
Which parts are important?
Source Code → Scan → Parse → Generate Code → Target Code
Code generation assigns correct target programs to ASTs; it is the most interesting stage, where bugs are most likely.
So what do we need?
• Source code and target code represented in a common semantic framework
• A refinement mapping established from the target code to the source code
• Consideration: XMPL is in prefix notation
• Consideration: in the containers for "equals", "or", etc., XMPL allows n-ary arguments, whereas JMPL allows exactly two
Translation Validation Technology Developed
• Use a symbolic logical semantic framework: Quantified Propositional Temporal Logic (QPTL) with fixpoints (for loops)
• Translate both the source and the target program into their logical semantics (QPTL formulas)
• Developed an automatic tool to generate logical semantics from C++ source code; can handle multi-threading in the source program
• Developed a classification methodology for acceptable and unacceptable failures in the target program
Translation Validation Technology Illustrated
• The tool obtains the logical semantics (QPTL formulas) of C++ source code bottom-up.
• For an assignment annotated {φ} x = e; {ψ}, with A the set of acceptable failures:
φ = ◊(A ∨ ψ[x → e])
Establishing Refinement Mapping
• Refinement = the logical semantics of the target code entails that of the source code
• Refinement checking is done using a tool called the Temporal Logic Verifier (TLV)
• TLV implements a decision procedure for QPTL, but not for the fixpoint part
• TLV is programmable; the decision procedure for the fixpoint part is being implemented on top of TLV in TLV-Basic
Pipeline: source and target code are mapped by the automatic tool, the refinement calculus, and the abstract framework into TLV, which either answers "yes" or produces a counterexample.
Refinement of Source Code
• Tool built using Lex/Yacc and 500 lines of Awk code
• Used our tool to automatically generate the logical semantics of methods in the L2 code written in C++
• 1000 lines of code handled in less than 10 seconds
• The refinement calculus for JMPL is currently being implemented
Abstraction of Target Code
• Currently developing an abstraction calculus for the assembly and machine language of the Pentium 4
• An abstraction calculus for XMPL is being implemented
State Space Explosion
• Abstraction and refinement lead to state explosion
• Need to be less ambitious
• More "abstract" methods coming up
New Methods for Refinement Checking
• Randomized refinement checking
• at each branching point, pretend to go along all branches with different probabilities
• Bounded refinement checking and refinement testing
• bound the size of the models built by TLV; experiments show this is faster at finding counterexamples
• automatically generate (based on the specifications of the source code) a sequence of models and check whether they are counterexamples
Validating Compiler Optimizations
• Optimizations are a potential cause of introduced errors
• Code motion can convert a terminating program into a non-terminating one, and vice versa
• Most compiler optimizations are conveniently represented as rewrite rules of the form I → I′, φ, where φ is a logical side condition
Rewriting and Static Analysis
Pipeline: Source Code → Optimizer → Optimized Code
• Developed a preliminary tool for validating compiler optimizations by combining rewriting and static analysis
• Binds free variables in conditions to program locations and program variables
Translation Validation: System Architecture
The source code passes through the refinement tool, and the compiler's target code through the abstraction tool; the translation validator feeds both to TLV. TLV either produces a counterexample (fault indication: Not OK) or a proof script, which a rudimentary proof checker confirms (OK).
Current status
• Automatic tool for the logical semantics of C++ code: implemented
• Abstraction calculus for Pentium 4 assembly code: developed, currently under implementation
• Preliminary tool for validating compiler optimizations: implemented
• Refinement calculus for JMPL: developed, to be implemented
• Experiments with the new methods for refinement checking conducted
• Found bounded refinement checking to be faster in some cases
• Preliminary case studies on the Livingstone source code
• Translated several methods of Livingstone to their logical semantics
• Maximum ~1400 lines, taking < 12 seconds
To do… (next quarter)
• Developing and implementing abstraction calculi for XMPL and the Pentium 4 machine language
• Studying and developing an abstraction calculus for the PowerPC machine language
• Completing the pending implementations
• More rigorous case studies
Related Work
• Translation validation for synchronous languages (Pnueli et al.)
• Proof-carrying compilation (Necula et al.)
• Compiler verification (Hoare, Mueller-Olm et al.)
Lessons learnt
• Semi-automatic tools for translation validation are possible
• Features of model-based programming provide both advantages (less data dependency) and disadvantages (communication)
• Use a combination of techniques
• Supratik's law:
• Software reliability can be transferred from source to target code (reliability can be compiled)