Extensible Untrusted Code Verification

Robert Schneck, with George Necula and Bor-Yuh Evan Chang
May 14, 2003, OSQ Retreat

Flexibility for Code Producers
• A host receives code from an untrusted agent
• Before executing the code, the host wants to verify certain properties (e.g. memory safety)
• The host does not want to restrict the code producer
  • ... to a particular type system
  • ... to particular software conventions
• Then how do we verify the code?
[Figure: code passing from the untrusted side to the trusted side; the verification step is marked "?"]

An Untrusted Verifier
• The code producer supplies the verifier along with the code
• Too hard to prove correctness of the verifier...
• Embed the untrusted verifier as an extension in a trusted framework (the Open Verifier)
[Figure: code and verifier extension on the untrusted side; OpenVer hosting the verifier extension on the trusted side]

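The Open Verifier
[Figure: the trusted Core and Decoder, the untrusted Extension, and the code]
• The core passes the current state s to the decoder and to the extension
• The decoder (trusted) answers:
  • the instruction at state s is safe if P holds
  • proceed to next states D
• The extension (untrusted) answers:
  • a proof of P
  • proceed to next states E, and a proof that E covers D
• The core continues with the next states E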
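The interaction on this slide can be pictured as a worklist loop. What follows is a minimal sketch under strong simplifying assumptions, not the prototype's code: a state is a pc plus a set of atomic facts, the check `p <= s.facts` stands in for checking the extension's proof of P, and `covers` is a purely syntactic stand-in for the proof that E covers D. All names (`State`, `covers`, `verify`, `toy_decoder`, `toy_extension`) are hypothetical.

```python
from typing import Callable, FrozenSet, List, NamedTuple, Tuple

class State(NamedTuple):
    pc: int
    facts: FrozenSet[str]  # read as a conjunction of atomic facts at this pc

def covers(e_states: List[State], d_states: List[State]) -> bool:
    """Stand-in for 'E covers D': each decoder state d is matched by an
    extension state e at the same pc that assumes no more than d provides."""
    return all(any(e.pc == d.pc and e.facts <= d.facts for e in e_states)
               for d in d_states)

def verify(initial: State,
           decoder: Callable[[State], Tuple[FrozenSet[str], List[State]]],
           extension: Callable[[State, List[State]], List[State]]) -> bool:
    """Toy core loop; terminates only if finitely many states are reachable."""
    seen, work = set(), [initial]
    while work:
        s = work.pop()
        if s in seen:            # state already handled, re-use it
            continue
        seen.add(s)
        p, d_states = decoder(s)             # trusted: safety condition P and D
        if not p <= s.facts:                 # stand-in for checking the proof of P
            return False
        e_states = extension(s, d_states)    # untrusted: proposed next states E
        if not covers(e_states, d_states):   # stand-in for checking 'E covers D'
            return False
        work.extend(e_states)
    return True

# Hypothetical two-instruction program: pc 1 stores to a, pc 2 halts.
UPDATED = "m = (upd m' a 1)"

def toy_decoder(s: State) -> Tuple[FrozenSet[str], List[State]]:
    if s.pc == 1:
        return frozenset({"addr a"}), [State(2, s.facts | {UPDATED})]
    return frozenset(), []                   # halt: always safe, no next states

def toy_extension(s: State, d_states: List[State]) -> List[State]:
    # Generalize: forget the exact memory contents.
    return [State(d.pc, d.facts - {UPDATED}) for d in d_states]

START = State(1, frozenset({"addr a"}))
```

With these definitions `verify(START, toy_decoder, toy_extension)` succeeds, while an extension that proposes no next states for pc 1 is rejected because nothing covers the decoder's state.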

The Decoder
• The decoder is the canonical symbolic evaluator
• Examples of decoding a state (pc = 5 ∧ A) [table of examples not preserved]
• The decoder only handles hardware conventions
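To make "symbolic evaluator" concrete, here is a minimal sketch of a decoder for a small store/read/jump instruction set. The instruction encoding, the tuple representation of facts, and the names `SymState`, `PROGRAM`, `rename` and `decode` are assumptions made for illustration; the prototype's decoder works on MIPS assembly and is not shown in the slides.

```python
from typing import Dict, List, NamedTuple, Tuple

Fact = Tuple[str, ...]            # e.g. ("inv", "m") or ("nelist", "a")

class SymState(NamedTuple):
    pc: int
    facts: Tuple[Fact, ...]       # read as a conjunction

# Hypothetical encoding of a five-instruction program.
PROGRAM: Dict[int, tuple] = {
    1: ("store", "a", "1"),       # store a <- 1
    2: ("read", "s", "b"),        # s <- read b
    3: ("jump_if", "odd(s)", 5),  # if odd(s) then jump 5
    4: ("store", "s", "a"),       # store s <- a
    5: ("halt",),
}

def rename(fact: Fact, old: str, new: str) -> Fact:
    """Replace every occurrence of the token `old` in a fact."""
    return tuple(new if tok == old else tok for tok in fact)

def decode(state: SymState) -> Tuple[Fact, List[SymState]]:
    """Return (local safety condition, next states) for the instruction at pc."""
    instr = PROGRAM[state.pc]
    op, nxt = instr[0], state.pc + 1
    if op == "store":
        _, addr, val = instr
        old_m = f"m_{state.pc}"   # plays the role of the existentially bound m'
        facts = tuple(rename(f, "m", old_m) for f in state.facts)
        facts += (("eq", "m", f"(upd {old_m} {addr} {val})"),)
        return ("addr", addr), [SymState(nxt, facts)]
    if op == "read":
        _, dest, addr = instr     # assumes dest is not mentioned in earlier facts
        facts = state.facts + (("eq", dest, f"(sel m {addr})"),)
        return ("addr", addr), [SymState(nxt, facts)]
    if op == "jump_if":
        _, cond, target = instr   # one next state per branch
        return ("true",), [SymState(target, state.facts + ((cond,),)),
                           SymState(nxt, state.facts + (("not", cond),))]
    return ("true",), []          # halt: no next states
```

Note that the sketch knows nothing about lists or typing; it only tracks memory updates and branch conditions, which is the sense in which a decoder handles hardware conventions only.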

Soundness and Trustworthiness
• We have proven the soundness of the algorithm used by the core of the Open Verifier
• The trusted code base (core, decoder, proof checker) could be small and simple
  • thus easy to trust
• We need to ensure the extension is memory safe
  • Use the extension to verify itself; this one time, run it in a separate address space
• What about the extensions?
  • What does it take to write an extension?
  • How much can extensions do?

A Type System of Lists
• Code producer uses accessible memory for lists
  • “1” is a list (the empty list)
  • any even address containing a list is a list
  • nothing else is a list
• Consider the program:
    1: store a ← 1
    2: s ← read b
    3: if odd(s) then jump 5
    4: store s ← a
    5: halt
[Figure: memory diagram with cells a, b and s, the value 1, and the address 16]
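The three rules defining lists can be read as a direct membership test. The sketch below assumes memory is modelled as a Python dict from accessible addresses to their contents; the function name `is_list` and the example addresses are hypothetical.

```python
def is_list(value: int, memory: dict) -> bool:
    """Check the list rules against a memory given as {address: contents}."""
    seen = set()
    while value != 1:                 # "1" is the empty list
        if value % 2 != 0:            # a non-empty list must be an even address
            return False
        if value not in memory:       # ... that is accessible
            return False
        if value in seen:             # a cycle never reaches 1: "nothing else is a list"
            return False
        seen.add(value)
        value = memory[value]         # ... and whose contents are again a list
    return True

# Address 24 holds 16, address 16 holds the empty list.
example_memory = {16: 1, 24: 16}
```

Here `is_list(24, example_memory)` holds because the chain 24, 16, 1 ends in the empty list, whereas an odd value other than 1, or an address that points back to itself, is rejected.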

Informal Proof Obligations
    1: store a ← 1
    2: s ← read b
    3: if odd(s) then jump 5
    4: store s ← a
    5: halt
• Proof obligations (informal): “a and b are accessible addresses, and if the contents of b after storing 1 at a is even then it is an accessible address”
  • Too low level
• Code producer would prefer instead: “a and b are non-empty lists”
  • Simpler and easier to prove (using the definition of lists)
• How can the code producer achieve this effect?
[Figure: memory diagram with cells s, b and a, and the value 1]

Typing Rules
    1: store a ← 1
    2: s ← read b
    3: if odd(s) then jump 5
    4: store s ← a
    5: halt
Initial state: pc = 1 ∧ nelist a ∧ nelist b ∧ inv m
Decoder local safety: addr a
Decoder next state: pc = 2 ∧ nelist a ∧ nelist b ∧ ∃m’. inv m’ ∧ m = (upd m’ a 1)
Extension next state: pc = 2 ∧ nelist a ∧ nelist b ∧ inv m

Typing Rules
    1: store a ← 1
    2: s ← read b
    3: if odd(s) then jump 5
    4: store s ← a
    5: halt
Initial state: pc = 2 ∧ nelist a ∧ nelist b ∧ inv m
Decoder local safety: addr b
Decoder next state: pc = 3 ∧ nelist a ∧ nelist b ∧ inv m ∧ s = (sel m b)
Extension next state: pc = 3 ∧ nelist a ∧ nelist b ∧ inv m ∧ list s

Producing Proofs
• Using the typing lemmas can be completely automated
  • We use a Prolog interpreter where the Prolog program consists of the typing rules:
      nelist a :- list a, even a
  • Each individual program is then handled automatically
• Proving the typing rules is hard
  • They become lemmas to be proven by hand in Coq
      Definition nelist [a:val] := (addr a) /\ (even a).
      Definition list [a:val] := (a = 1) \/ (nelist a).
      Lemma rule : (a:val)(list a) -> (even a) -> (nelist a).
      Proof...
  • But they only need to be proven once
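The automated half of this split amounts to backward chaining over the typing rules. As a rough illustration, the sketch below is a tiny Horn-clause prover in Python standing in for the Prolog interpreter; it is not the interpreter used in the prototype. The rule table holds the `nelist` rule from the slide plus one rule, `addr a :- nelist a`, that is an assumption read off the Coq definition of `nelist`. The names `RULES` and `prove` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Each predicate maps to a list of rule bodies; all predicates share one argument.
RULES: Dict[str, List[Tuple[str, ...]]] = {
    "nelist": [("list", "even")],   # nelist a :- list a, even a
    "addr": [("nelist",)],          # assumed: nelist a is defined as addr a /\ even a
}

def prove(pred: str, term: str, facts: Set[Tuple[str, str]], depth: int = 10) -> bool:
    """Try to derive pred(term) from known facts by backward chaining."""
    if (pred, term) in facts:       # the goal is already a known fact
        return True
    if depth == 0:                  # crude bound so cyclic rule sets terminate
        return False
    return any(all(prove(p, term, facts, depth - 1) for p in body)
               for body in RULES.get(pred, []))

known = {("list", "a"), ("even", "a")}
```

With `known` as above, both `prove("nelist", "a", known)` and `prove("addr", "a", known)` succeed, while dropping the `even` fact makes the `nelist` goal fail. The prover only applies rules; whether each rule is sound is exactly what the hand-written Coq lemmas establish.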

How to Construct an Extension
• Instantiate all the typing predicates and rules as definitions and lemmas
• Use the typing rules in an automated theorem prover
• Package the type-checker state into a logical predicate
  • requires recognizing invariants which may be implicit in the type-checker
• Could simply import an existing type checker (e.g. a bytecode verifier)
[Figure: components of an extension: Type Checker, OpenVer Wrapper, Theorem Prover, Proofs of Lemmas]

What can extensions do?
• Software conventions of stacks and function calls
  • The program had better use the stack safely
• A low-level extension to prove run-time functions
  • allocator, garbage collector
• Working on an extension for the simple object-oriented language Cool

Experience so far...
• We have built a prototype implementation
• 5500 lines of ML in the trusted framework
  • 1000 lines to parse MIPS assembly code
  • 2000 lines in the logic and proof checker
  • 120 lines in the decoder
• An extension for the lists example
  • which also handles the stack and allocation
  • 1000 lines in the “standard” type checker
  • 900 lines to package for the OpenVer and tie it all together
  • 600 lines for the Prolog interpreter
  • 250 lines for the Prolog rules
  • ??? lines of Coq proof
    • compare 4000 lines of Coq script for Java native-code

The extension produces different next states...
• generalization
  • Only need that memory satisfies an invariant, not its contents
• re-use of states
  • loop invariants
  • A program to effect y ← y0 + x0 (for positive x)
      1: while (x > 0) do {
      2:   x ← x - 1
      3:   y ← y + 1
      4: }
  • Invariant on line 1: (x ≥ 0 ∧ x + y = x0 + y0)
• “indirect” states
  • The decoder can’t handle a state without a literal pc
  • An indirect jump could implement function return, method dispatch, switch statement, exception handling...
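The loop invariant can be exercised by running the loop and checking the invariant at every visit to the loop head. This is only a runtime check on concrete inputs, not the proof an extension would supply; the function names `invariant` and `run_checked` are hypothetical, and the sketch assumes a non-negative starting x.

```python
def invariant(x: int, y: int, x0: int, y0: int) -> bool:
    """The invariant at the loop head: x >= 0 and x + y = x0 + y0."""
    return x >= 0 and x + y == x0 + y0

def run_checked(x0: int, y0: int) -> tuple:
    """Run the loop, asserting the invariant each time the loop head is reached."""
    x, y = x0, y0
    assert invariant(x, y, x0, y0)        # holds on entry when x0 >= 0
    while x > 0:
        x = x - 1
        y = y + 1
        assert invariant(x, y, x0, y0)    # preserved by one iteration
    # On exit x <= 0 and x >= 0, so x = 0 and therefore y = x0 + y0.
    return x, y
```

Because the invariant describes every iteration at once, a verifier that sees it at the loop head can re-use that single state instead of unrolling the loop.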
