An Example of Translation and Proof using Higher-Order Abstract Syntax Michael W. Whalen Advanced Technology Center Rockwell Collins Inc.
Code Generation Requirements • Automatic • Formally-Defined • Formal description of source/target language • Proof that generated code implements specification • Correctly-Implemented • Transparent transliteration of translation rules • Implementation should be rigorously tested • Usable for Safety-Critical Systems • Human-Understandable and traceable • Necessary for fault analysis, code instrumentation • Required by regulatory agencies • Fast enough for target environment
Aspects of Translation • 1. Foundations: Language Semantics and Proof • 2. Formal Architecture: How do we create a formal translation approach from foundations? • 3. Application: Applying Semantics and Proofs to RSML-e Translator • 4. Implementation: Designing a Translator that transparently implements rules
Formal Definition of Compiler Correctness [Diagram: RSML-e Syntax is mapped by the Compiler Definition to Generated Program Syntax; RSML-e Semantics maps RSML-e Syntax to an Output; Generated Program Semantics maps Program Syntax to an Output; Proof: Same Outputs]
Operational Semantics and Proof • Operational semantics provides framework for evaluation, static semantics, and transformations • Several different “flavors” of operational semantics • SOS, Natural Semantics, Abstract Machines • We want formalism that leads to elegant transformations and proofs
Managing Identifiers • Large part of translation and proof complexity • Explicit Environments • “Environment Carrying” functions [Plotkin: SOS, Despeyroux: Mini-ML] • Renaming over scopes [Drossopoulou: Java] • Implicit Environments • Substitution as meta-rule [Pierce PL Book] • Lambda variables in object language • Metalevel support [Hannan93, Whalen05] • Lambda variables in metalanguage • Proofs describing substitution behavior provided by metalogic
Example: Extended Natural Semantics • Concrete Syntax: function sum(y: int; z: int) : int { return y + z; } • Higher-Order Abstract Syntax: (function_def int (param int (λy. (param int (λz. (body (binary_expr (lit_expr y) plus (lit_expr z)))))))) • Evaluation Rules: (shown as a figure on the slide)
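The encoding above can be approximated in a general-purpose language. The following is a minimal sketch, assuming Python closures stand in for the metalanguage's λ-abstraction; the class names (`FunctionDef`, `Param`, `Body`, `Binary`, `Lit`) and the `call` helper are hypothetical, chosen only to mirror the constructors in the term, and this is not the λProlog code from the talk. It shows the point of HOAS: binding a parameter is ordinary function application, so no environment or substitution function is written by hand.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Lit:
    """(lit_expr v): v is whatever value the enclosing binder supplied."""
    value: int


@dataclass
class Binary:
    """(binary_expr lhs op rhs)."""
    lhs: "Expr"
    op: str
    rhs: "Expr"


@dataclass
class Body:
    """(body expr): the end of the parameter list."""
    expr: "Expr"


@dataclass
class Param:
    """(param type (λx. rest)): the binder is a host-language function."""
    type_name: str
    binder: Callable[[int], "Params"]


@dataclass
class FunctionDef:
    """(function_def return_type params)."""
    return_type: str
    params: "Params"


Expr = Union[Lit, Binary]
Params = Union[Param, Body]


def eval_expr(expr: Expr) -> int:
    """Evaluate an expression whose variables have already been bound."""
    if isinstance(expr, Lit):
        return expr.value
    if isinstance(expr, Binary) and expr.op == "plus":
        return eval_expr(expr.lhs) + eval_expr(expr.rhs)
    raise ValueError(f"unsupported expression: {expr!r}")


def call(fn: FunctionDef, args: list) -> int:
    """Apply a function definition to arguments, one binder per argument."""
    node = fn.params
    for arg in args:
        if not isinstance(node, Param):
            raise TypeError("too many arguments")
        node = node.binder(arg)  # substitution is just application
    if not isinstance(node, Body):
        raise TypeError("too few arguments")
    return eval_expr(node.expr)


# function sum(y: int; z: int) : int { return y + z; }
sum_fn = FunctionDef(
    "int",
    Param("int", lambda y: Param("int", lambda z: Body(
        Binary(Lit(y), "plus", Lit(z))))),
)
```

Calling `call(sum_fn, [2, 3])` walks the two `param` binders and evaluates the body with `y` and `z` already in place.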
Example: Extended Natural Semantics: Typing • Higher-Order Abstract Syntax: (function_def int (param int (λy. (param int (λz. (body (binary_expr (lit_expr y) plus (lit_expr z)))))))) • Typing Rules:
Extended Natural Semantics: Transformation • Higher-Order Abstract Syntax: (function_def int (param int (λy. (param int (λz. (body (binary_expr (lit_expr y) plus (lit_expr z))… • is transformed into: (function_def int (param int (λy. (param int (λz. (body (binary_expr (unary_expr minus (lit_expr y)) plus (unary_expr minus (lit_expr z))… • Transformation Rules:
ENS Transformation, Expanded • Rule: (∀x ((λz.<Body>) x) ((λz.<Body*>) x)) • Instantiating new constant c for x: (λz.<Body>) c • Applying c for z in Body: Body[z := c] • Apply trans rule here; rule premises define how Body is transformed to Body': Body[z := c] is transformed to Body'[z := c] • Several functions may match Body', replacing zero or more instances of c with z: { (λz.<Body''>) c, (λz.<Body'''>) c, … } • However, only one function can match the ∀, because the c must be new: it cannot exist outside the scope of the ∀, so all c's must be replaced by z's.
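The steps on this slide can be imitated outside λProlog. The sketch below is an illustration under stated assumptions, not the talk's rules: the term classes, `fresh_const`, `replace`, and `negate_operands` are hypothetical names, and the transformation is the operand-negating example from the previous slide. To transform under a binder, it applies the binder to a fresh constant `c`, transforms the resulting first-order body, and then rebuilds a binder by replacing every `c` with the new bound variable. In λProlog the last step comes for free from higher-order unification; here it is written out explicitly.

```python
import itertools
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Const:
    """A fresh constant standing in for a bound variable."""
    name: str


@dataclass(frozen=True)
class Lit:
    value: Any          # an int, or a Const while under a binder


@dataclass(frozen=True)
class Unary:
    op: str
    operand: Any


@dataclass(frozen=True)
class Binary:
    lhs: Any
    op: str
    rhs: Any


@dataclass(frozen=True)
class Lam:
    """λz.<Body>, represented by a host-language function."""
    fn: Callable[[Any], Any]


_counter = itertools.count()


def fresh_const() -> Const:
    """Return a constant that cannot occur in any existing term."""
    return Const(f"c{next(_counter)}")


def replace(term, c: Const, v):
    """Replace every occurrence of constant c in term by v."""
    if isinstance(term, Lit):
        return Lit(v) if term.value == c else term
    if isinstance(term, Unary):
        return Unary(term.op, replace(term.operand, c, v))
    if isinstance(term, Binary):
        return Binary(replace(term.lhs, c, v), term.op,
                      replace(term.rhs, c, v))
    if isinstance(term, Lam):
        return Lam(lambda x: replace(term.fn(x), c, v))
    raise TypeError(f"unknown term: {term!r}")


def negate_operands(term):
    """Rewrite each (lit_expr X) to (unary_expr minus (lit_expr X))."""
    if isinstance(term, Lit):
        return Unary("minus", term)
    if isinstance(term, Unary):
        return Unary(term.op, negate_operands(term.operand))
    if isinstance(term, Binary):
        return Binary(negate_operands(term.lhs), term.op,
                      negate_operands(term.rhs))
    if isinstance(term, Lam):
        c = fresh_const()                       # instantiate new constant c
        new_body = negate_operands(term.fn(c))  # Body[z := c] to Body'[z := c]
        # c is new, so abstracting over all of its occurrences is the
        # only way to rebuild the binder.
        return Lam(lambda z: replace(new_body, c, z))
    raise TypeError(f"unknown term: {term!r}")


source = Lam(lambda y: Lam(lambda z: Binary(Lit(y), "plus", Lit(z))))
target = negate_operands(source)
```

Applying `target.fn(1).fn(2)` yields the negated form of `1 + 2`, with no constant `c` left in the result.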
Aspects of Translation • 1. Foundations: Language Semantics and Proof • 2. Formal Architecture: How do we create a formal translation approach using foundations? • 3. Application: Applying Semantics and Proofs to RSML-e Translator • 4. Implementation: Designing a Translator that transparently implements rules
Notions of Completeness and Determinism [Diagram: Source Syntax is mapped by the Compiler Rules to Program Syntax; RSML Semantics Rules map Source Syntax to an Output; Program Semantics Rules map Program Syntax to an Output] • Are rules deterministic? • Are rules complete?
Correctness Obligations for SOS Rules • Despeyroux’s obligations: • Obligations for deterministic language: • Obligations are equivalent if source semantics are complete.
Translation in Layers [Diagram labels: RSML-e; Semantics Rules; Translation Rules; Completeness Proofs; …; C, Ada, Java, …]
Evaluation Rules in Translation • Source AST Grammar: ... v_expr ::= unknown | id(expr list) | expr. ... • Target AST Grammar: ... v_expr ::= if expr then v_expr else v_expr | expr. ... • New Syntax: if expr then v_expr else v_expr • Evaluation rules for new syntax: • Target Evaluation Rules: Source Evaluation Rules - Rules for Removed Syntax, plus the evaluation rules for new syntax
Translation Proof Structure • Describe the correctness of contexts: • Describe equivalence of program states: • Describe completeness obligation using evaluation rules for source and target languages + transformation rules:
Aspects of Translation • 1. Foundations: Language Semantics and Proof • 2. Formal Architecture: How do we create a formal translation approach from foundations? • 3. Application: Applying Semantics and Proofs to RSML-e Translator • 4. Implementation: Designing a Translator that transparently implements rules
Source Language: RSML-e • RSML-e is a Reactive Synchronous Dataflow Language • Reactive: Specification reacts to changes in external environment at discrete intervals • Synchronous: those reactions take (logically) zero time • Dataflow: value of object (variable or interface) can be computed as soon as objects on which it is dependent have been computed. • Specification consists of Variables and Interfaces • Variables maintain internal state of model • Interfaces describe interaction with the external environment • Two-state model • Values of variables from previous step can be referenced
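A minimal sketch of the two-state, synchronous evaluation model described above, assuming each variable has one assignment rule that may read the current inputs, variables already computed in this step, and the values from the previous step. The `step` function and the variable names `above_threshold` and `crossings` are hypothetical and are not taken from any RSML-e specification.

```python
def step(assignments, inputs, prev):
    """Compute one synchronous step.

    assignments: list of (name, rule) pairs in dependency order, so a
                 variable is computed only after the ones it depends on.
    inputs:      values received from the external environment.
    prev:        variable values from the previous step (two-state model).
    """
    cur = {}
    for name, rule in assignments:
        cur[name] = rule(inputs, cur, prev)
    return cur


# Hypothetical specification: count upward crossings of an altitude threshold.
assignments = [
    ("above_threshold",
     lambda inp, cur, prev: inp["altitude"] > 2000),
    ("crossings",
     lambda inp, cur, prev: prev["crossings"] + (
         1 if cur["above_threshold"] and not prev["above_threshold"] else 0)),
]

state = {"above_threshold": False, "crossings": 0}
history = []
for altitude in [1500, 2500, 2600, 1800, 2100]:
    state = step(assignments, {"altitude": altitude}, state)
    history.append(state["crossings"])
```

Each call to `step` corresponds to one reaction: it logically takes zero time, and only the immediately preceding state is visible to the rules.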
Source Language: RSML-e [Figure: the Altitude Switch Specification evaluated over a sequence of Input Frames (Reset_Receiver, Clock, DOI_Receiver, ...), showing the Frame Being Evaluated, the Evaluation Result, and the Output Frames (Fault_Sender, DOICmd_Sender, <empty>, ...)]
Source Language: RSML-e • Each variable or interface has an assignment:
Translation: Intermediate Languages • We move the language successively closer to an imperative language • RSML-p: Move from the RSML-e synchronous specification language to a synchronous programming language: remove undefined and case lists. • RSML-t: Switch from a structural to a nominal type system • RSML-v: Switch from two-state variables to one-state variables • SIMPLr: Add imperative, rather than functional, assignments to variables (subset of Ada) • SIMPL: Remove record assignments from SIMPLr (subset of C, Java)
Example: RSML-e to RSML-p • This transformation does two things: • Replaces assignment case lists with assignment expressions • Removes undefined_val from the type system • To remove undefined_val we transform all variables in the specification • var x : T; becomes var x : record{ val: T, def: Boolean };
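The variable transformation can be sketched as follows. This is an illustration only: `UNDEFINED`, `Wrapped`, `wrap`, and `similar` are hypothetical names, the record's `def` field is spelled `defined` because `def` is a Python keyword, and the choice of a default for `val` when the source value is undefined is an assumption. The `similar` function is one plausible reading of a value similarity relation between source and target values; the relation used in the talk is not preserved in this text.

```python
from dataclasses import dataclass
from typing import Any

# Stand-in for the source language's undefined_val (hypothetical).
UNDEFINED = object()


@dataclass(frozen=True)
class Wrapped:
    """Target-side record{ val: T, def: Boolean }."""
    val: Any
    defined: bool


def wrap(source_value, default):
    """Translate a source value into the target record.

    An undefined source value becomes a record whose flag is False;
    its val field holds an arbitrary default of type T (an assumption).
    """
    if source_value is UNDEFINED:
        return Wrapped(default, False)
    return Wrapped(source_value, True)


def similar(source_value, target: Wrapped) -> bool:
    """Do a source value and a target record describe the same value?"""
    if source_value is UNDEFINED:
        return not target.defined
    return target.defined and target.val == source_value
```

With this reading, `similar(v, wrap(v, d))` holds for every source value `v`, which is the kind of invariant a proof obligation over program states would need to preserve.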
Transformation Rules • Expressions • Declarations
Proof Obligations Context Relation: State Variable Value Similarity Relation: State Relation:
Proof Obligation: Expressions Expression Obligation: Lemma about deref:
Example Proof: pre_expr • Transformation Rule: • RSML-e Evaluation Rule: • From deref Lemma: • From definition of ≈, and from premise Vals ≠ undefined_val, Valt = … with V2 = Vals. • Now, we can derive:
Aspects of Translation • 1. Foundations: Language Semantics and Proof • 2. Formal Architecture: How do we create a formal translation approach from foundations? • 3. Application: Applying Semantics and Proofs to RSML-e Translator • 4. Implementation: Designing a Translator that transparently implements rules
Implementation • Prototype Translator In λProlog • Transparently Implements ENS Rules becomes…
Implementation • Translator Stats • Source Code: approximately 100 KB in 27 source/header files
Implementation • Translation Results • Teyjus Needs Garbage Collection!
Discussion • Original work was in first-order system • Used ID-substitution (Drossopoulou) • Requires additional rules describing which ids should be substituted (e.g. no record fields) • Required significant additional lemmas about how terms behave under id substitutions • I was struggling to complete proofs (and bored) due to sheer number of details related to identifiers
Discussion • HOAS and λProlog made my dissertation much more straightforward • Language descriptions became simpler • Translation became much simpler • Use of implication allowed immediate and simple constructions of compiler environment • Relations over correct environments are straightforward to construct • Proofs became much simpler • No substitution lemmas [Pierce, Despeyroux] • Proofs 2-3x shorter
Binding I: Removing Names • One goal of HOAS: make identifier names irrelevant • I was not totally able to do this: • Record fields are still keyed by id • λ-bindings assume a specific order, but record expressions allow arbitrary order • Question: is it possible (or a good idea) to remove field identifiers?
Binding II: Adding Variables • Translation from higher-level to lower-level language often requires introduction of new variables • Difficult to motivate translation rules at first • Led to some odd rule constructions where bindings and code were constructed “in parallel” • Example: moving from a language with record-creation expressions (a la ML) to one that does not (a la C)
Remove Record Expressions Example • Given: type a = record { f1 : int, f2 : real } ; • Want to change something like: [f1 : 2+y, f2 : 3.1] • Into: create_a(2+y, 3.1) • Need to create: fun create_a( f1 : int, f2 : real): a = var r_result : a ; in r_result.f1 = f1 ; r_result.f2 = f2 ; return r_result ; end
Remove Record Expressions Example • Rule: create_type_fn_body Var Type Fields StmtList Block • Var is the fresh constant bound to the r_result local variable • Type is the return type of Var • Fields describes the remaining fields to be assigned within the record • StmtList defines the field assignments performed thus far • Block is the returned function block
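The accumulation pattern of the rule (remaining fields shrink while the statement list grows) can be sketched over plain strings. This is a simplified illustration, assuming text output rather than HOAS terms; `create_type_fn` and `rewrite_record_expr` are hypothetical helpers, and the generated syntax copies the example on the previous slide.

```python
def create_type_fn(type_name, fields):
    """Generate the text of a create_<type> function for a record type.

    fields: list of (field_name, field_type) pairs in declaration order.
    """
    params = ", ".join(f"{name} : {ftype}" for name, ftype in fields)
    remaining = list(fields)   # plays the role of Fields in the rule
    stmts = []                 # plays the role of StmtList in the rule
    while remaining:
        name, _ = remaining.pop(0)
        stmts.append(f"r_result.{name} = {name} ;")
    lines = [
        f"fun create_{type_name}( {params} ): {type_name} =",
        f"  var r_result : {type_name} ;",
        "in",
    ]
    lines += [f"  {stmt}" for stmt in stmts]
    lines += ["  return r_result ;", "end"]
    return "\n".join(lines)


def rewrite_record_expr(type_name, fields, field_exprs):
    """Rewrite [f1 : e1, f2 : e2] into create_<type>(e1, e2).

    field_exprs maps field names to expression text; arguments are
    emitted in declaration order, whatever order the record
    expression listed its fields in.
    """
    args = ", ".join(field_exprs[name] for name, _ in fields)
    return f"create_{type_name}({args})"


fields_a = [("f1", "int"), ("f2", "real")]
generated = create_type_fn("a", fields_a)
call_text = rewrite_record_expr("a", fields_a, {"f2": "3.1", "f1": "2+y"})
```

Emitting arguments in declaration order is one way to cope with record expressions that list fields in arbitrary order while the generated function's parameters are positional.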
Binding II: Adding Variables • Similar project at RCI: Translating Lustre to several languages (NuSMV, PVS, SAL) • Lustre supports PRE-operator that allows reference to previous values of variables • Fibonacci: x = pre(pre(x, 0), 0) + pre(x,1) ; • To translate to C, we must introduce additional variables for each pre-operator • Seems tricky to do in HOAS!
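To make the extra variables concrete, here is a hand-written sketch of what an imperative translation of the Fibonacci equation could look like, with one state variable per pre-operator. It assumes pre(e, init) yields init on the first step and the previous value of e afterwards; the names `pre_x_1`, `pre_x_0`, and `pre_pre_x` are hypothetical. Python is used here in place of C to keep the example short.

```python
def make_fib_step():
    """Return a step function for x = pre(pre(x, 0), 0) + pre(x, 1)."""
    state = {
        "pre_x_1": 1,     # pre(x, 1): previous x, initially 1
        "pre_x_0": 0,     # pre(x, 0): previous x, initially 0
        "pre_pre_x": 0,   # pre(pre(x, 0), 0): previous value of pre(x, 0)
    }

    def step():
        x = state["pre_pre_x"] + state["pre_x_1"]
        # Update the outer pre first: it must capture the value that
        # pre(x, 0) had during this step, before that is overwritten.
        state["pre_pre_x"] = state["pre_x_0"]
        state["pre_x_0"] = x
        state["pre_x_1"] = x
        return x

    return step


fib_step = make_fib_step()
first_six = [fib_step() for _ in range(6)]
```

The three state variables do not appear in the source equation; a translator has to invent them and order their updates, which is the binding problem this slide raises.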
Binding III: Non-Lexical Scoping • Many languages allow forward references to identifiers • Java • Lustre/SCADE • I changed the RSML-e semantics to disallow forward references • (How) Can we represent “global” scopes in HOAS? • Alternately, can we add environments for “global” ids and still get most of the HOAS benefits?
Working in a Positivist Logic • It would be difficult to write semantics and translator entirely without the use of cut • List non-membership in static semantics • Evaluation rule for not-equal expressions • Occasional use of set data structure • Cuts were not used in rules that referenced structures that could contain meta-level variables or universal constants • These uses could affect correctness of reasoning • How will my use of cut affect reasoning in a formal framework?
Tool Support • λProlog gripes • No syntax for naming commonly used types, which makes for long type descriptions • Syntax allows misplaced comma to conjoin two rule instances • New symbol for reverse implication in rule instance? (<-) • New rule begins with turnstile? (|-) • Implication (=>) binds tighter than and (,) • Teyjus gripes • No garbage collector • No warnings on single use of variable • No warnings on rule declaration without definition • No warnings on non-use of bound variable within term • No debugger
Conclusion • Formal approach can be used for real translators • Difficulty is dependent on choice of formalism • Original work was in natural semantics • Much simpler with extended natural semantics • Some things are still tricky to do in HOAS • A few improvements to tools would really benefit serious users
Conclusion • SIMPL – “Small Imperative Language” semantics may be useful to others • I didn’t want to write it • YAILS - boring • However, I needed a small subset of Ada/Java/C • Literature semantics are cleaner, but no clear correspondence to “real” languages • Supports basic records, arrays, block structuring, functions • Recursion could be added easily • However, matching C/Java syntax for recursion would be harder
Future Work • Generalizing work to other source languages • Lustre, SCR • Adding other target languages • Extensive testing (if actually to be used on a DO-178B development effort) • Teyjus Improvements • Optimizations
Contact Information • Crisys Research Group • on the web: http://www.cs.umn.edu/crisys • Mike Whalen • e-mail: mwwhalen@rockwellcollins.com • phone: (612) 625-4543 • Mats P.E. Heimdahl • e-mail: heimdahl@cs.umn.edu • phone: (612) 625-2068