Proof-Carrying Code


  1. Proof-Carrying Code

  2. Programmable mobile devices • By 2003, one in five people will own a mobile communications device. • Nokia expects to sell 500M Java-enabled phones in 2003. • Most of these devices will be power and memory limited.

  3. Mobile/Wireless Devices • In ’97, 101M mobile phones vs 82M PCs. (40% vs 14%.) • 95% of phones will be WAP enabled by ’04. • 64Mbits of RAM in 2002. • Battery life a primary factor. • Efficiency and bandwidth will still be precious.

  4. Cheese and the Sum Total of Human Knowledge

  5. The Code Safety Problem

  6. Code Safety • Is this safe to execute? • [Diagram: Code, CPU, Trusted Host]

  7. Approach 1: Trust the Code Producer • Trust is based on personal authority, not program properties. • Scaling problems? • [Diagram: Code, sig, PK1, PK2, Trusted 3rd Party, CPU, Trusted Host]

  8. Approach 2: Baby-sit the Program • Expensive. • Limited in expressive power. (Why?) • E.g., Software Fault Isolation [Wahbe & Lucco], Inline Reference Monitors [Schneider]. • [Diagram: Code, Execution monitor, CPU, Trusted Host]

  9. Approach 3: Java • Limited in expressive power. • Expensive and/or big. • [Diagram: Code, Verifier, Interp/JIT, CPU, Trusted Host]

  10. Approach 4: Formal Verification • Flexible and powerful. • But really really really hard and must be correct. • [Diagram: Code, Theorem Prover, CPU, Trusted Host]

  11. A Key Idea: Explicit Proofs • [Diagram: Code, Certifying Prover, Proof, Proof Checker, CPU, Trusted Host]

  12. A Key Idea: Explicit Proofs • Certifying prover: no longer need to trust this component. • [Diagram: Code, Certifying Prover, Proof, Proof Checker, CPU]

  13. Proof-Carrying Code • Proof: reasonable in size (0-10%). • Certifying prover: no longer need to trust this component. • Proof checker: simple, small (<52KB), and fast. • [Diagram: Code, Certifying Prover, Proof, Proof Checker, CPU]

  14. But... • ...How to generate the proofs? • Proving theorems about real programs is hard. • Most useful safety properties of low-level programs are undecidable. • Theorem-proving systems are unfamiliar to programmers and hard to use even for experts.

  15. The Role of Programming Languages • Civilized programming languages can provide “safety for free”. • Well-formed/well-typed ⇒ safe. • Idea: Arrange for the compiler to “explain” why the target code it generates preserves the safety properties of the source program.

  16. Certifying Compilers [Necula & Lee, PLDI’98] • Intuition: • Compiler “knows” why each translation step is semantics-preserving. • So, have it generate a proof that safety is preserved. • “Small theorems about big programs.” • Don’t try to verify the whole compiler, but only each output it generates.

  17. Automation via Certifying Compilation • Looks and smells like a compiler. • % spjc foo.java bar.class baz.c -ljdk1.2.2 • [Diagram: Source code, Certifying Compiler, Object code, Certifying Prover, Proof, Proof Checker, CPU]

  18. Overview of the Necula/Lee Approach to PCC

  19. High-Level Architecture • [Diagram: Code, Explanation, Agent, Verification condition generator, Checker, Safety policy, Host]

  20. Reference Interpreters • A reference interpreter (RI) is a standard interpreter extended with instrumentation to check the safety of each instruction before it is executed, and to abort execution if anything unsafe is about to happen. • In other words, an RI is capable only of safe execution.

  21. Reference Interpreters, cont’d • The reference interpreter is never actually implemented. • The point will be to prove (by using the proof rules given in the safety policy) that execution of the code on the RI never aborts, and thus execution on the real hardware will be identical to execution on the RI.
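The reference-interpreter idea from the two slides above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the three-instruction language, the names `run_on_reference_interpreter` and `UnsafeExecution`, and the policy expressed as sets of readable and writable addresses. It is not the RI from the presentation, which targets a restricted x86 and is never actually implemented.

```python
class UnsafeExecution(Exception):
    """Raised when the next instruction would violate the safety policy."""


def run_on_reference_interpreter(program, memory, readable, writable):
    """Execute a toy straight-line program, checking each instruction first.

    program:  list of tuples, e.g. ("load", dst, base_reg, offset)
    memory:   dict mapping address -> value
    readable: set of addresses the (hypothetical) policy allows reading
    writable: set of addresses the (hypothetical) policy allows writing
    Registers are assumed to be set by a "const" before they are used.
    """
    regs = {}
    for op, *args in program:
        if op == "const":                      # const dst, value
            dst, value = args
            regs[dst] = value
        elif op == "load":                     # load dst, base, offset
            dst, base, offset = args
            addr = regs[base] + offset
            if addr not in readable:           # safety check before executing
                raise UnsafeExecution(f"read of address {addr} not allowed")
            regs[dst] = memory.get(addr, 0)
        elif op == "store":                    # store src, base, offset
            src, base, offset = args
            addr = regs[base] + offset
            if addr not in writable:           # safety check before executing
                raise UnsafeExecution(f"write to address {addr} not allowed")
            memory[addr] = regs[src]
        else:
            raise UnsafeExecution(f"unknown instruction {op!r}")
    return regs
```

In PCC the goal is to prove statically that the abort branches are never taken, so the checks never have to run on the real machine.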

  22. Sample Reference Interpreter

  23. High-Level Architecture • [Diagram: Code, Explanation, Agent, Verification condition generator, Checker, Safety policy, Host]

  24. The Safety Policy • The RI can be viewed as defining a safety policy • RI language is a restriction of x86 assembly language • Must prove that a given program always makes progress on the RI • We introduce verification conditions (VCs), whose truth implies that the corresponding instruction has a defined execution on the RI.

  25. Verification Conditions • The point of the verification conditions, then, is to provide such progress theorems for each instruction in the program. • In other words, a VC’s validity says that the corresponding instruction has a defined execution in the s86 operational semantics.

  26. The VCGen • The verification condition generator (VCGen) examines each instruction. • It essentially encodes the operational semantics of the language. • It checks some simple properties. • E.g., direct jumps go to legal addrs. • It invokes the Checker when dangerous instructions are encountered.

  27. The VCGen, cont’d • Examples of dangerous instructions: • memory operations • procedure calls • procedure returns • For each such instruction, VCGen creates a verification condition (VC). • A VC is a logical predicate whose truth implies the instruction is safe.
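A minimal sketch of the VCGen idea, assuming a toy straight-line instruction set. The instruction names (`mov`, `addi`, `load4`, `store4`, `assume_nonzero`), the function `vcgen`, and the predicate name `safewr4` are assumptions for illustration; only `saferd4`, `sel4`, `add`, and `csubneq` appear in the slides. The sketch scans the code once, tracks each register as a symbolic expression, and emits one VC per memory operation.

```python
def sexp(e):
    """Render a nested tuple as an s-expression string."""
    if isinstance(e, tuple):
        return "(" + " ".join(sexp(x) for x in e) + ")"
    return str(e)


def vcgen(program, initial_regs):
    """Single symbolic scan over straight-line code.

    Returns a list of (assumptions, goal) pairs, one per memory operation.
    """
    regs = dict(initial_regs)      # register -> symbolic value
    assumptions = []               # facts learned so far
    vcs = []
    for op, *args in program:
        if op == "mov":                        # mov dst, src_reg
            dst, src = args
            regs[dst] = regs[src]
        elif op == "addi":                     # addi dst, src_reg, constant
            dst, src, k = args
            regs[dst] = ("add", regs[src], k)
        elif op == "assume_nonzero":           # models a null check that passed
            assumptions.append(("csubneq", regs[args[0]], 0))
        elif op == "load4":                    # load4 dst, base_reg, offset
            dst, base, off = args
            addr = ("add", regs[base], off)
            vcs.append((list(assumptions), ("saferd4", addr)))
            regs[dst] = ("sel4", "rm", addr)   # value is "whatever memory holds"
        elif op == "store4":                   # store4 src_reg, base_reg, offset
            src, base, off = args
            addr = ("add", regs[base], off)
            vcs.append((list(assumptions), ("safewr4", addr, regs[src])))
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return vcs
```

For example, `vcgen([("assume_nonzero", "ebx"), ("load4", "ecx", "ebx", 4)], {"ebx": "src_1"})` yields a single VC whose goal renders as `(saferd4 (add src_1 4))` under the assumption `(csubneq src_1 0)`.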

  28. Examples of Safety Properties • Memory safety. • Which addresses are readable / writable; when, and what values. • Type safety. • What values can be stored and used in operations. • System call safety. • Which system routines can be called and when.

  29. Examples of Safety Policies, cont’d • Action sequence safety. • E.g., no network send after reading a file. • Resource usage safety. • E.g., instruction counts, stack limits, etc.
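The action-sequence example above ("no network send after reading a file") can be pictured as a two-state automaton over a trace of events. The sketch below is illustrative only: the event names `"read_file"` and `"send"` and the function names are hypothetical, and it checks a recorded trace at run time, whereas PCC aims to prove the property before the code runs.

```python
def is_safe_trace(events):
    """Return True if no "send" event occurs after a "read_file" event."""
    has_read_file = False          # automaton state: has a file been read yet?
    for event in events:
        if event == "read_file":
            has_read_file = True
        elif event == "send" and has_read_file:
            return False           # the bad thing happened
    return True


def within_instruction_budget(instruction_count, limit):
    """Toy resource-usage policy: a fixed cap on executed instructions."""
    return instruction_count <= limit
```

Both are safety properties in the sense used in the deck: a violation is always witnessed by a finite prefix of the execution.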

  30. What Can’t Be Enforced? • Informally: • Safety properties. Yes. • “No bad thing will happen.” • Liveness properties. Not yet. • “A good thing will eventually happen.” • Information-flow properties. ? • “Confidentiality will be preserved.”

  31. Example of type safety giving us VC validity?

  32. Example: Source Code

         public class Bcopy {
             public static void bcopy(int[] src, int[] dst) {
                 int l = src.length;
                 int i = 0;
                 for (i = 0; i < l; i++) {
                     dst[i] = src[i];
                 }
             }
         }

  33. Example: Target Code

         ANN_LOCALS(_bcopy__6arrays5BcopyAIAI, 3)
         .text
         .align 4
         .globl _bcopy__6arrays5BcopyAIAI
         _bcopy__6arrays5BcopyAIAI:
             cmpl $0, 4(%esp)
             je L6
             movl 4(%esp), %ebx
             movl 4(%ebx), %ecx
             testl %ecx, %ecx
             jg L22
             ret
         L22:
             xorl %edx, %edx
             cmpl $0, 8(%esp)
             je L6
             movl 8(%esp), %eax
             movl 4(%eax), %esi
         L7:
             ANN_LOOP(INV = { (csubneq ebx 0), (csubneq eax 0), (csubb edx ecx), (of rm mem)},
                      MODREG = (EDI,EDX,EFLAGS,FFLAGS,RM))
             cmpl %esi, %edx
             jae L13
             movl 8(%ebx, %edx, 4), %edi
             movl %edi, 8(%eax, %edx, 4)
             incl %edx
             cmpl %ecx, %edx
             jl L7
             ret
         L13:
             call __Jv_ThrowBadArrayIndex
             ANN_UNREACHABLE
             nop
         L6:
             call __Jv_ThrowNullPointer
             ANN_UNREACHABLE
             nop

  34. Cut Points • Each loop entry must be annotated as a cut point. • VCGen requires this so that checking can be performed in a single scan of the code. • As a convenience, the modified registers are also declared in the cut annotations.

  35. Example: Target Code • VCGen requires annotations in order to simplify the process.

         ANN_LOCALS(_bcopy__6arrays5BcopyAIAI, 3)
         .text
         .align 4
         .globl _bcopy__6arrays5BcopyAIAI
         _bcopy__6arrays5BcopyAIAI:
             cmpl $0, 4(%esp)
             je L6
             movl 4(%esp), %ebx
             movl 4(%ebx), %ecx
             testl %ecx, %ecx
             jg L22
             ret
         L22:
             xorl %edx, %edx
             cmpl $0, 8(%esp)
             je L6
             movl 8(%esp), %eax
             movl 4(%eax), %esi
         L7:
             ANN_LOOP(INV = { (csubneq ebx 0), (csubneq eax 0), (csubb edx ecx), (of rm mem)},
                      MODREG = (EDI,EDX,EFLAGS,FFLAGS,RM))
             cmpl %esi, %edx
             jae L13
             movl 8(%ebx, %edx, 4), %edi
             movl %edi, 8(%eax, %edx, 4)
             incl %edx
             cmpl %ecx, %edx
             jl L7
             ret
         L13:
             call __Jv_ThrowBadArrayIndex
             ANN_UNREACHABLE
             nop
         L6:
             call __Jv_ThrowNullPointer
             ANN_UNREACHABLE
             nop

  36. Example: Source Code

         public class Bcopy {
             public static void bcopy(int[] src, int[] dst) {
                 int l = src.length;
                 int i = 0;
                 for (i = 0; i < l; i++) {
                     dst[i] = src[i];
                 }
             }
         }

  37. The VCGen Process (1)

         _bcopy__6arrays5BcopyAIAI:
             cmpl $0, src
             je L6
             movl src, %ebx
             movl 4(%ebx), %ecx
             testl %ecx, %ecx
             jg L22
             ret
         L22:
             xorl %edx, %edx
             cmpl $0, dst
             je L6
             movl dst, %eax
             movl 4(%eax), %esi
         L7:
             ANN_LOOP(INV = …

         A0 = (type src_1 (jarray jint))
         A1 = (type dst_1 (jarray jint))
         A2 = (type rm_1 mem)
         A3 = (csubneq src_1 0)
         ebx := src_1
         ecx := (sel4 rm_1 (add src_1 4))
         A4 = (csubgt (sel4 rm_1 (add src_1 4)) 0)
         edx := 0
         A5 = (csubneq dst_1 0)
         eax := dst_1
         esi := (sel4 rm_1 (add dst_1 4))

  38. The VCGen Process (2)

         L7:
             ANN_LOOP(INV = { (csubneq ebx 0), (csubneq eax 0), (csubb edx ecx), (of rm mem)},
                      MODREG = (EDI, EDX, EFLAGS,FFLAGS,RM))
             cmpl %esi, %edx
             jae L13
             movl 8(%ebx,%edx,4), %edi
             movl %edi, 8(%eax,%edx,4)
             …

         A3
         A5
         A6 = (csubb 0 (sel4 rm_1 (add src_1 4)))
         edi := edi_1
         edx := edx_1
         rm := rm_2
         A7 = (csubb edx_1 (sel4 rm_2 (add dst_1 4)))
         !!Verify!! (saferd4 (add src_1 (add (imul edx_1 4) 8)))

  39. The Checker (1) • The checker is asked to verify that

         (saferd4 (add src_1 (add (imul edx_1 4) 8)))

     under assumptions

         A0 = (type src_1 (jarray jint))
         A1 = (type dst_1 (jarray jint))
         A2 = (type rm_1 mem)
         A3 = (csubneq src_1 0)
         A4 = (csubgt (sel4 rm_1 (add src_1 4)) 0)
         A5 = (csubneq dst_1 0)
         A6 = (csubb 0 (sel4 rm_1 (add src_1 4)))
         A7 = (csubb edx_1 (sel4 rm_2 (add dst_1 4)))

     • The checker looks in the PCC for a proof of this VC.

  40. The Checker (2) • In addition to the assumptions, the proof may use axioms and proof rules defined by the host, such as

         szint : pf (size jint 4)
         rdArray4: {M:exp} {A:exp} {T:exp} {OFF:exp}
                   pf (type A (jarray T)) ->
                   pf (type M mem) ->
                   pf (nonnull A) ->
                   pf (size T 4) ->
                   pf (arridx OFF 4 (sel4 M (add A 4))) ->
                   pf (saferd4 (add A OFF)).

  41. The Checker (3) • A proof for

         (saferd4 (add src_1 (add (imul edx_1 4) 8)))

     in the Java specification looks like this (excerpt):

         (rdArray4 A0 A2 (sub0chk A3) szint (aidxi 4 (below1 A7)))

     • This proof can be easily validated via LF type checking.
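Proof checking of this kind can be sketched as a small recursive function over proof terms. The sketch below is a toy, not LF type checking: proofs are nested tuples, `check` and `check_vc` are hypothetical names, and the only rules are `szint` and `rdArray4` (with the premises from the slide above) plus `sub0chk`, whose meaning ("from `(csubneq E 0)` conclude `(nonnull E)`") is assumed here because the slides do not define it. The array-index premise is supplied as a hypothetical assumption `H_idx` rather than derived with `aidxi` and `below1`.

```python
def check(proof, assumptions):
    """Return the formula established by `proof`, or raise ValueError."""
    if isinstance(proof, str):
        if proof == "szint":                       # axiom from the host
            return ("size", "jint", 4)
        if proof in assumptions:                   # named assumption, e.g. "A0"
            return assumptions[proof]
        raise ValueError(f"unknown axiom or assumption {proof!r}")
    rule, *subproofs = proof
    facts = [check(p, assumptions) for p in subproofs]
    if rule == "sub0chk":                          # assumed meaning, see lead-in
        if len(facts) == 1 and len(facts[0]) == 3 \
                and facts[0][0] == "csubneq" and facts[0][2] == 0:
            return ("nonnull", facts[0][1])
        raise ValueError("sub0chk: premise is not (csubneq E 0)")
    if rule == "rdArray4":
        if len(facts) != 5:
            raise ValueError("rdArray4 takes five subproofs")
        try:
            a, t = facts[0][1], facts[0][2][1]     # from (type A (jarray T))
            m = facts[1][1]                        # from (type M mem)
            off = facts[4][1]                      # from (arridx OFF 4 ...)
        except (IndexError, TypeError):
            raise ValueError("rdArray4: malformed premise")
        expected = [
            ("type", a, ("jarray", t)),
            ("type", m, "mem"),
            ("nonnull", a),
            ("size", t, 4),
            ("arridx", off, 4, ("sel4", m, ("add", a, 4))),
        ]
        if facts != expected:
            raise ValueError("rdArray4: premises do not match the rule")
        return ("saferd4", ("add", a, off))
    raise ValueError(f"unknown rule {rule!r}")


def check_vc(proof, assumptions, goal):
    """True exactly when `proof` is well formed and proves `goal`."""
    try:
        return check(proof, assumptions) == goal
    except ValueError:
        return False
```

The point of the design is the one the deck makes: the checker only matches premises against rules, so it can stay small and simple even when finding the proof was hard.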

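  42. VC Explosion • [Diagram: branch on a == b (a := x on one arm, c := x on the other), then branch on a == c (a := y on one arm, c := y on the other), then the call f(a,c).] • a=b => (x=c => safef(y,c) ∧ x<>c => safef(x,y)) ∧ a<>b => (a=x => safef(y,x) ∧ a<>x => safef(a,y)) • Exponential growth in size of the VC is possible.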
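The doubling can be reproduced with a small weakest-precondition calculation. This is a sketch under assumptions: formulas and statements are encoded as nested tuples, and the names `subst`, `wp`, and `count` are hypothetical helpers, not part of any PCC tool. Pushing the postcondition `safef(a,c)` backwards through each two-way branch copies it into both arms, so n consecutive branches give 2^n copies.

```python
def subst(formula, var, expr):
    """Replace every occurrence of variable `var` in `formula` by `expr`."""
    if formula == var:
        return expr
    if isinstance(formula, tuple):
        return tuple(subst(part, var, expr) for part in formula)
    return formula


def wp(stmt, post):
    """Weakest precondition of a toy statement with respect to `post`."""
    kind = stmt[0]
    if kind == "assign":                       # ("assign", var, expr)
        _, var, expr = stmt
        return subst(post, var, expr)
    if kind == "if":                           # ("if", (l, r), then, else): tests l == r
        _, (left, right), then_stmt, else_stmt = stmt
        return ("and",
                ("implies", ("eq", left, right), wp(then_stmt, post)),
                ("implies", ("neq", left, right), wp(else_stmt, post)))
    if kind == "seq":                          # ("seq", s1, s2, ...)
        for s in reversed(stmt[1:]):
            post = wp(s, post)
        return post
    raise ValueError(f"unknown statement kind {kind!r}")


def count(formula, head):
    """Number of subformulas whose head symbol is `head`."""
    if not isinstance(formula, tuple):
        return 0
    return (formula[0] == head) + sum(count(p, head) for p in formula[1:])


# The two branches from the slide, followed by the call f(a,c).
diamond1 = ("if", ("a", "b"), ("assign", "a", "x"), ("assign", "c", "x"))
diamond2 = ("if", ("a", "c"), ("assign", "a", "y"), ("assign", "c", "y"))
vc = wp(("seq", diamond1, diamond2), ("safef", "a", "c"))
```

Here `vc` has the same shape as the formula on the slide and contains four `safef` obligations; chaining six such branches yields 64.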

  43. VC Explosion • [Diagram: branch on a == b (a := x on one arm, c := x on the other), join point annotated INV: P(a,b,c,x), then branch on a == c (a := y on one arm, c := y on the other), then the call f(a,c).] • (a=b => P(x,b,c,x) ∧ a<>b => P(a,b,x,x)) ∧ (∀a’,c’. P(a’,b,c’,x) => a’=c’ => safef(y,c’) ∧ a’<>c’ => safef(a’,y)) • Growth can usually be controlled by careful placement of just the right “join-point” invariants.

  44. Stack Slots • Each procedure will want to use the stack for local storage. • This raises a serious problem because a lot of information is lost by VCGen (such as the value) when data is stored into memory. • We avoid this problem by assuming that procedures use up to 256 words of stack as registers.

  45. Other Approaches to PCC

  46. Typed Assembly Language [Morrisett, et al., ’98] • Use modern type theory to develop a static type system for machine code. • Prove decidability of typechecking. • Prove soundness of type system. • Developing such a type system is very hard, but done only once.

  47. TAL

         fact: ALL rho.{r1:int, sp:{r1:int, sp:rho}::rho}
             jgz r1, positive
             mov r1,1
             ret
         positive:
             push r1        ; sp : int::{r1:int,sp:rho}::rho
             sub r1,r1,1
             call fact[int::{r1:int,sp:rho}::rho]
             imul r1,r1,r2
             pop r2         ; sp : {r1:int,sp:rho}::rho
             ret

  48. Eliminating VCGen • We can eliminate VCGen by using the logic to encode a global invariant on states, Inv(S). • Then, the proof must show: • Inv(S0) • ∀S:State. Inv(S) => Inv(Step(S)) • ∀S:State. Inv(S) => SP(S)
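The three proof obligations above can be made concrete on a finite toy machine, where they can simply be checked by enumeration. This is only an illustration: `check_global_invariant` and the example machine are hypothetical, `safe` stands for the safety policy SP, and a real foundational proof establishes the obligations by logical argument over all machine states rather than by brute force.

```python
def check_global_invariant(states, initial, step, inv, safe):
    """Check Inv(S0), Inv(S) => Inv(Step(S)), and Inv(S) => SP(S)."""
    if not inv(initial):                 # Inv(S0)
        return False
    for s in states:
        if inv(s):
            if not inv(step(s)):         # preservation under one step
                return False
            if not safe(s):              # the invariant implies the policy
                return False
    return True


# Toy machine: a 3-bit counter that advances by 2; state 7 is "unsafe".
STATES = range(8)

def step(s):
    return (s + 2) % 8

def safe(s):
    return s != 7
```

With `inv = lambda s: s % 2 == 0` and initial state 0, all three obligations hold, so every reachable state is safe. The trivial invariant `lambda s: True` is preserved but fails the third obligation at state 7.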

  49. Foundational PCC • Appel and Felty [’00] develop a semantic model of types, starting from the foundations of mathematical logic. • This model is used to construct the global invariant. • Hamid, Shao, et al. define the global invariant to be a syntactic well-formedness condition on machine states.

  50. Temporal-logic PCC • Bernard and Lee [’02] define the global invariant via a temporal-logic specification. • A trusted generic program then interprets these specifications to extract verification conditions.
