This paper discusses the importance of maintaining proofs while optimizing code to ensure security and functionality. It explores the concept of proof-carrying code and how it can be used to guarantee safety and accessibility. The paper also presents a framework for proof compilation and discusses the applicability of this approach to various types of program optimizations.
State your reasons, or how to keep proofs while optimizing code
Ando Saabas, Institute of Cybernetics
EXCS kick-off meeting, 18 September 2008
Outline • Background – extensible systems • Proof carrying code – general overview • Proof compilation
Background • Mobile code and extensible systems are popular (and increasingly so) • [Figure: code loaded into a host: device driver into an operating system, applet into a web browser, loaded procedures into a database server]
What should the extensions guarantee? • Security is a concern • Safety properties: • Accesses only its own memory • Doesn’t leak sensitive data • Uses resources properly • Doesn’t eat up all the memory • Holds a limited number of locks • Functional properties • Provides the functionality it promises
Runtime monitoring • A monitor detects attempts to violate the safety policy and stops the execution • Relatively simple; effective for many properties • Inflexible (no guarantees on functional properties) • Computationally expensive
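The monitoring idea can be sketched in a few lines. This is only an illustration under stated assumptions: the policy ("an extension may touch only its own memory region") is taken from the safety properties listed earlier, while the names `MonitoredMemory`, `PolicyViolation` and `extension` are hypothetical and not from the talk.

```python
class PolicyViolation(Exception):
    """Raised when the monitored code attempts a forbidden access."""


class MonitoredMemory:
    """Wraps a memory array and checks every access against a region."""

    def __init__(self, size, region):
        self.cells = [0] * size
        self.region = region   # indices the extension is allowed to touch
        self.checks = 0        # number of runtime checks performed so far

    def _check(self, addr):
        self.checks += 1
        if addr not in self.region:
            raise PolicyViolation(f"access to address {addr} denied")

    def read(self, addr):
        self._check(addr)
        return self.cells[addr]

    def write(self, addr, value):
        self._check(addr)
        self.cells[addr] = value


def extension(mem, start, count):
    """Untrusted code (hypothetical): fills `count` cells from `start`."""
    for offset in range(count):
        mem.write(start + offset, offset)


mem = MonitoredMemory(size=16, region=range(4, 8))
extension(mem, 4, 4)           # stays inside its own region
try:
    extension(mem, 6, 4)       # runs past the end of its region
    stopped = False
except PolicyViolation:
    stopped = True             # the monitor stopped the execution
```

Every single access pays for a check (`mem.checks` grows with each read or write), which is the cost the slide refers to, and the monitor says nothing about whether the extension computes what it promised.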
PKI Digital signatures • “Company X produced this software” • Simple, well established techniques • No direct connection with program semantics
Proof Carrying Code • Proof-Carrying Code is based on the idea that the code producer should provide some evidence that the program she distributes is safe and/or functionally correct. • The program is shipped with a certificate that attests that it has the desired properties. • Before running a program, the code user checks this certificate and only runs the code if it is safe
Proof carrying code • [Figure: the code travels together with its proof; a proof checker sits on the user side]
PCC: An analogy • The user doesn’t try to solve a problem, only checks a solution
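The gap between solving and checking can be illustrated with any problem whose solutions are hard to find but easy to verify. The transcript does not say which problem the slide used; subset sum is an assumption chosen here for brevity, and `find_subset` and `check_subset` are hypothetical names.

```python
from itertools import combinations


def find_subset(numbers, target):
    """Producer side: exhaustive search, exponential in len(numbers)."""
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if sum(numbers[i] for i in indices) == target:
                return list(indices)
    return None


def check_subset(numbers, target, certificate):
    """User side: a single linear pass over the claimed solution."""
    if certificate is None:
        return False
    distinct = len(set(certificate)) == len(certificate)
    in_range = all(0 <= i < len(numbers) for i in certificate)
    # `and` short-circuits, so the sum is only computed for valid indices
    return (distinct and in_range
            and sum(numbers[i] for i in certificate) == target)


numbers = [3, 34, 4, 12, 5, 2]
certificate = find_subset(numbers, 9)               # expensive search
accepted = check_subset(numbers, 9, certificate)    # cheap check
rejected = check_subset(numbers, 9, [0, 1])         # a wrong certificate
```

The checker never trusts the producer: a wrong or malformed certificate is simply rejected, which is the role the proof checker plays in PCC.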
PCC vs Digital Signatures • A digital signature identifies the origin of the program • A PCC certificate identifies the meaning of the program • A digital signature is a syntactic checksum • A PCC certificate is a semantic checksum
Where would proofs come from? • For basic safety properties, they can be inferred automatically • For more complex safety and/or functional correctness properties, the code producer would use some verification environment to prove the source program correct • But programs are distributed in compiled form
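PCC framework • [Figure: Code producer: specification and source program enter the program verification environment, which yields the program proof; the compiler produces the binary; the proof compiler (marked "?") is to produce the proof shipped with it. Code user: a proof checker checks the proof.]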
Proof compilation • For non-optimizing compilers it is easy: proof compilation is (almost) identity • Not so if optimizations take place
Dead code elimination
Original program: • while i < n • s = s + c; • p = p * c; • i++;
Optimized program: • while i < n • s = s + c; • skip; • i++;
Precondition: $\{ i = 0 \land s = 0 \land p = 1 \}$ becomes $\{ \exists p.\ i = 0 \land s = 0 \land p = 1 \}$
Invariant: $\{ i \le n \land s = c * i \land p = c^{i} \}$ becomes $\{ \exists p.\ i \le n \land s = c * i \land p = c^{i} \}$
Postcondition: $\{ s = c * n \land p = c^{n} \}$ becomes $\{ \exists p.\ s = c * n \land p = c^{n} \}$
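A minimal sketch of the analysis behind this example, assuming a loop whose body is a list of assignments and assuming that only `s` is observed after the loop. Note that plain liveness would keep `p = p * c`, because `p` is used by that very assignment in the next iteration; the sketch therefore ignores uses that occur inside assignments which are themselves dead (often called strong liveness). The function name and the triple encoding of statements are hypothetical.

```python
def eliminate_dead_code(cond_uses, body, live_after_loop):
    """Replace dead assignments in a loop body with 'skip'.

    body is a list of (target, used_variables, source_text) triples.
    """
    live_head = set(cond_uses) | set(live_after_loop)
    while True:
        # Walk the body backwards, starting from what is live at the
        # loop head (the body's end flows back to the head).
        live = set(live_head)
        for target, uses, _ in reversed(body):
            if target in live:                 # assignment is needed
                live = (live - {target}) | set(uses)
        new_head = live_head | live
        if new_head == live_head:              # fixpoint reached
            break
        live_head = new_head

    # Second pass: rewrite the body using the stable information.
    live = set(live_head)
    result = []
    for target, uses, text in reversed(body):
        if target in live:
            live = (live - {target}) | set(uses)
            result.append(text)
        else:
            result.append("skip")
    result.reverse()
    return result


loop_body = [
    ("s", {"s", "c"}, "s = s + c"),
    ("p", {"p", "c"}, "p = p * c"),
    ("i", {"i"}, "i++"),
]
optimized = eliminate_dead_code({"i", "n"}, loop_body, {"s"})
unchanged = eliminate_dead_code({"i", "n"}, loop_body, {"s", "p"})
```

With only `s` observed, the assignment to `p` becomes `skip`; once `p` is observed as well, nothing is removed. Since `p` no longer carries a meaningful value in the optimized program, the assertions can no longer speak about it directly, which is why it ends up existentially quantified.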
Constant propagation
Original program: • c = 5; • while i < n • s = s + c; • p = p * c; • i++;
Optimized program: • c = 5; • while i < n • s = s + 5; • p = p * 5; • i++;
Precondition: $\{ i = 0 \land s = 0 \land p = 1 \}$
Invariant: $\{ i \le n \land s = c * i \land p = c^{i} \}$ becomes $\{ i \le n \land s = c * i \land p = c^{i} \land c = 5 \}$
Postcondition: $\{ s = c * n \land p = c^{n} \}$
Proof compilation • For non-optimizing compilers it is easy: proof compilation is (almost) identity • Not so if optimizations take place • There are many different optimizations, and each has its own particular effect on the proof • Need a systematic approach for dealing with this
State your reasons • There is always a reason why certain parts of code can be modified during optimization • These reasons should be stated (recorded) in the assertions. But how do we know exactly where and what is to be recorded?
Original program: • c = 5; • while i < n • s = s + c; • p = p * c; • i++;
Optimized program: • c = 5; • while i < n • s = s + 5; • p = p * 5; • i++;
Invariant: $\{ i \le n \land s = c * i \land p = c^{i} \}$ becomes $\{ i \le n \land s = c * i \land p = c^{i} \land c = 5 \}$ (we know c is always 5 in the loop)
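Why the reason has to be recorded can be seen by testing whether the loop body preserves the invariant. The sketch below checks this on a small finite sample of states; it is a test, not a proof, and the sample ranges and function names are assumptions made for illustration. The invariant considered is i ≤ n, s = c * i, p = c^i, with and without the extra fact c = 5.

```python
from itertools import product


def inv_original(st):
    return (st["i"] <= st["n"]
            and st["s"] == st["c"] * st["i"]
            and st["p"] == st["c"] ** st["i"])


def inv_with_reason(st):
    # Same invariant with the reason for the optimization recorded.
    return inv_original(st) and st["c"] == 5


def body_original(st):
    return {**st, "s": st["s"] + st["c"], "p": st["p"] * st["c"],
            "i": st["i"] + 1}


def body_optimized(st):
    return {**st, "s": st["s"] + 5, "p": st["p"] * 5, "i": st["i"] + 1}


def is_preserved(inv, body, states):
    """Every state satisfying inv and the guard must satisfy inv again
    after one execution of the loop body."""
    return all(inv(body(st)) for st in states
               if inv(st) and st["i"] < st["n"])


def sample_states():
    """A small, non-exhaustive set of states satisfying inv_original."""
    states = []
    for c, n in product(range(1, 7), range(0, 4)):
        for i in range(n + 1):
            states.append({"c": c, "n": n, "i": i,
                           "s": c * i, "p": c ** i})
    return states


states = sample_states()
original_ok = is_preserved(inv_original, body_original, states)
optimized_without_reason = is_preserved(inv_original, body_optimized, states)
optimized_with_reason = is_preserved(inv_with_reason, body_optimized, states)
```

On this sample the old invariant is not preserved by the optimized body (for example with c = 1 the body adds 5 to `s`, breaking s = c * i), while the invariant strengthened with c = 5 is preserved.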
Enter type systems • Optimizations are mostly based on dataflow analyses • Dataflow analyses can be described as type systems • Type systems can have an optimization component • Type annotations can show us what needs to be stated where when transforming proofs
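As a rough illustration of an analysis read as a type system with an optimization component, the sketch below treats "known constant value" as the type of a variable, restricted to straight-line code (loops would need a fixpoint and are omitted). The tuple encoding of expressions and the names `type_of`, `rewrite` and `derive` are assumptions for this example, not the formulation used in the talk.

```python
def type_of(expr, env):
    """Constant value of expr under the type environment, or None."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "var":
        return env.get(expr[1])
    left, right = type_of(expr[1], env), type_of(expr[2], env)
    if left is None or right is None:
        return None
    return left + right if kind == "add" else left * right


def rewrite(expr, env):
    """Optimization component: replace variables of constant type."""
    kind = expr[0]
    if kind == "const":
        return expr
    if kind == "var":
        value = env.get(expr[1])
        return expr if value is None else ("const", value)
    return (kind, rewrite(expr[1], env), rewrite(expr[2], env))


def derive(program, env=None):
    """Type each assignment; return (pre-type, optimized statement)
    pairs plus the final type environment."""
    env = dict(env or {})
    derivation = []
    for target, expr in program:
        pre = dict(env)                      # annotation before statement
        optimized = (target, rewrite(expr, env))
        value = type_of(expr, env)
        if value is None:
            env.pop(target, None)            # target is no longer constant
        else:
            env[target] = value
        derivation.append((pre, optimized))
    return derivation, env


program = [
    ("c", ("const", 5)),
    ("s", ("add", ("var", "s"), ("var", "c"))),
    ("p", ("mul", ("var", "p"), ("var", "c"))),
]
derivation, final_env = derive(program)
```

The pre-type attached to each statement (here `{"c": 5}` before the assignments to `s` and `p`) is the kind of annotation that indicates what has to be stated in the assertion at that point, namely c = 5.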
PCC framework • [Figure: as before. Code producer: specification and source program enter the program verification environment, which yields the program proof; compiler and proof compiler (marked "?") produce the proof. Code user: proof checker.]
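PCC framework • [Figure: an analyzer produces a type derivation for the source program; the type derivation drives both the program optimizer and the proof optimizer, which turns the program proof into a proof for the optimized program]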
How applicable is the approach? • The approach works on all classical program optimizations • Scales to complicated optimizations that change code structure, such as partial redundancy elimination • Can be used for optimizations that require bidirectional analyses, which covers many non-trivial bytecode transformations • Can be applied to both high-level and CFG-based program and analysis descriptions • Implementation for Java bytecode analyses: • dead store elimination • load-pop pair elimination • store-load pair elimination, etc.
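To give a flavour of one of the listed bytecode transformations, here is a deliberately simplified sketch of load-pop pair elimination. It assumes straight-line stack code without jumps and only removes pairs that become adjacent; handling the general case needs the analyses mentioned above. The instruction encoding and the small interpreter `run` are hypothetical and do not model real JVM bytecode.

```python
def eliminate_load_pop(code):
    """Drop ('load', x) immediately followed by ('pop',): the value is
    pushed and discarded at once, so neither instruction has an effect."""
    result = []
    for instr in code:
        if instr == ("pop",) and result and result[-1][0] == "load":
            result.pop()      # cancel the pending load against this pop
        else:
            result.append(instr)
    return result


def run(code, store):
    """Tiny stack-machine interpreter used to compare both programs."""
    store = dict(store)
    stack = []
    for instr in code:
        op = instr[0]
        if op == "push":
            stack.append(instr[1])
        elif op == "load":
            stack.append(store[instr[1]])
        elif op == "store":
            store[instr[1]] = stack.pop()
        elif op == "pop":
            stack.pop()
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown instruction {instr!r}")
    return stack, store


code = [
    ("load", "x"), ("load", "y"), ("pop",),
    ("push", 1), ("add",), ("store", "x"),
    ("load", "y"), ("pop",),
]
optimized = eliminate_load_pop(code)
before = run(code, {"x": 2, "y": 7})
after = run(optimized, {"x": 2, "y": 7})
```

Both programs end with the same stack and store, and the optimized one is half as long.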
Conclusions • It is important to get the basic notions and tools right: type systems are exactly the right tool for describing what optimizations are doing • They lend themselves to formal reasoning about optimizations: proving soundness, certain optimality results, etc. • Soundness of the optimization makes it possible to transform a program’s proof along with the program, guided by its analysis type derivation: record and exploit what you know, and it will come out right
Acknowledgements • I would like to thank the Estonian Doctoral School in ICT, the EITSA Tiger University Plus programme and the Estonian Association of Information Technology and Telecommunications (ITL) for their financial support.