Formalising Java Safety – An overview Pieter H. Hartel phh@ecs.soton.ac.uk 2003.10.02 (Thu) 박숙영 HPCC Lab
Contents • Introduction • Methodology • Java Semantics • The compiler • Java extensions • Small footprint devices • Conclusions
Introduction 1/2 • Java is a safe programming language • Type safe and memory safe • The two main features • Java does not offer pointer arithmetic • Java offers references to objects • Unused objects are automatically garbage collected • Java is a strongly typed language • Java performs runtime checks to avoid array index errors
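The runtime check on array indices can be observed directly. The following is a minimal sketch; the class name `BoundsCheckDemo` and the helper `read` are illustrative names, not part of the original slides.

```java
final class BoundsCheckDemo {
    // Reads data[index] and reports whether the JVM accepted the access.
    static String read(int[] data, int index) {
        try {
            return "value " + data[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            // The JVM rejects the access instead of reading adjacent memory.
            return "rejected";
        }
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30};
        System.out.println(read(data, 1)); // prints "value 20"
        System.out.println(read(data, 3)); // prints "rejected"
    }
}
```

Because there is no pointer arithmetic, the only way to reach an array element is through an index, and every such access passes through this check.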
Introduction 2/2 • Class loader • Accepts and loads JVM programs into the Java runtime environment • Byte code verifier • Another type checker, operating on the JVM byte codes • Both do their work before execution of the code from a newly loaded class starts
Methodology 1/4 • Formal specifications • The semantics of Java • The semantics of the JVM language • The Java to JVM compiler • The runtime support, that is parts of the Java API, including all java.* classes.
Methodology 2/4 • The methodology to build these specifications • Construct clear and concise formal specifications of the relevant components • Validate the specifications by animating them, and by stating and proving relevant properties of the components. • Refine the specifications into implementations • Create all specifications in machine-readable form
Methodology 3/4 • Principal difficulties • Multi-threading, exception handling, object orientation and garbage collection • Careful consideration • Ambiguous, inconsistent, incomplete • Reference implementation is complex
Methodology 4/4 • Popular assumptions • Unlimited memory • Individual storage locations can hold all primitive data types • Individual JVM program locations can hold all byte code instructions
Java and JVM language features • IM: Imperative core consisting of basic data, expressions and statements • OO: Object orientation, i.e. objects, classes, interfaces, and arrays • TY: The Java type system, or byte code verification in the JVM • CL: Class loading • EH: Exception handling • MT: Multi-threading, monitors, synchronisation • GC: Garbage collection
Java Semantics • Table 1
Object Orientation • Alves-Foss and Lam [1] • Denotational semantics of most of Java • Detail on the various basic data types in Java • Better understanding
The type system 1/2 • Based on simple subtyping • One novel feature • Java offers interfaces as a way of providing multiple inheritance • Drossopoulou and Eisenbach [24] • Static semantics and dynamic semantics of a relatively small subset of Java • Drossopoulou et al. [23] • Extend their subset to include exception handling • Syme [55] • Uses the DECLARE system to give proofs • Uncovered 40 errors made during the translation • Found two non-trivial errors in the handwritten proofs of Drossopoulou and Eisenbach
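How interfaces give a class more than one supertype can be shown in a few lines. This is a minimal sketch; `Source`, `Sink`, `Cell`, and `InterfaceDemo` are hypothetical names chosen for illustration.

```java
// Two unrelated interfaces.
interface Source { String get(); }
interface Sink { void put(String s); }

// A class may implement several interfaces and is a subtype of each.
final class Cell implements Source, Sink {
    private String content = "";
    public String get() { return content; }
    public void put(String s) { content = s; }
}

final class InterfaceDemo {
    static String roundTrip(String s) {
        Cell cell = new Cell();
        Sink sink = cell;     // Cell is a subtype of Sink ...
        Source source = cell; // ... and also of Source
        sink.put(s);
        return source.get();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello")); // prints "hello"
    }
}
```

A type system formalisation has to account for both assignments being well typed, although `Source` and `Sink` are unrelated to each other.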
The type system 2/2 • Nipkow and von Oheimb [45] • Prove type soundness of a subset similar to that of Drossopoulou et al. • Use Isabelle/HOL to machine-check the proofs from the outset • Higher degree of confidence in the correctness of the specifications and the proofs • Not able to validate the specifications • Due to the lack of support for generating executable semantics [58] • Glesner and Zimmermann [26] • Specify the type system for a small fragment of Java
Class Loader • Wragg et al. [62] • Offer a model of class loading for a relatively small subset of Java to study one of Java’s more experimental features (binary compatibility)
Multi-threading • Börger and Schulte [10] and Cenciarelli et al. [13] • Multi-threading at the Java level • Study the issues left open by the official Sun documentation
The compiler • Diehl [21] • Compilation schemes from a subset of Java that excludes exception handling, multi-threading and garbage collection to the corresponding subset of the JVM • Operational semantics of this JVM subset • Rose [50] • Natural semantics of a subset of Java • Static type systems for both (Java, JVM) • A specification of the compiler for the subsets
The Abstract State Machine approach 1/3 • Börger and Schulte • Working on formal specifications of Java, the JVM and the compiler • Based on the Abstract State Machine formalism • Full semantic account: in Gurevich [29] • Specify a modular semantics of a subset of the JVM [11] and a subset of Java [10] • Modular approach • The two subsets do not entirely coincide
The Abstract State Machine approach 2/3 • [7] • Reduces the subsets of Java and the JVM to omit multi-threading, class loading and arrays • Main result • Informal theorem stating the correctness of the compiler • Two papers revisit exception handling and object initialisation • [8]: On problems with the initialisation of objects • [9]: Exception handling mechanism of Java, the JVM, and the compiler • Main result: Formulation of the correctness of compiling exception handling, with a full proof
The Abstract State Machine approach 3/3 • Stärk [53] • Uses the specification of Java and the JVM from Börger and Schulte [11,10] • Presents a compiler from the imperative core of Java • Gives a correctness proof of the compiler • A forthcoming book [6] • More complete specification of Java, the JVM, the compiler, the byte code • [9] • Mechanical checking of the specification • Wallace [60] • Includes multi-threading, exception handling • Excludes class loading and garbage collection
Java extensions • The safety of Java programs • By using program verification techniques • Fewer design and implementation problems • Smart cards
Model checking 1/3 • Demartini et al. [18], Havelund et al. [31] • Show how core features of Java can be mapped onto the Promela language of the SPIN model checker • Model multi-threading and objects (Havelund et al. also model exceptions) • Model the objects using Promela’s arrays (one array element per instance of the class) • The resulting models quickly grow too large to model check effectively • Only check for safety properties (assertions, deadlock) • Do not provide support for the checking of liveness properties
Model checking 2/3 • One of the most useful features of the SPIN model checker • Its ability to display scenarios leading to problems (deadlock) • Demartini et al. • Relate these scenarios back to the original Java sources • More user friendly than the approach of Havelund et al.
Model checking 3/3 • Jensen et al. [33] • Use model checking to verify properties of Java programs, with a more abstract approach • Static analysis techniques • To reduce a Java program to a control flow graph • Method calls, method returns, assertions • Defines the state transitions of the abstract Java program • Example [38] • How the system can be used to model Java’s sandbox • The stack inspection introduced by Java 2
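The stack inspection idea mentioned on this slide can be illustrated with a small abstract model. This is a sketch under simplifying assumptions, not the real `java.security` API and not the model of Jensen et al.: `Frame`, `checkPermission`, and the permission string are hypothetical, and the code assumes Java 9 or later for `List.of` and `Set.of`.

```java
import java.util.List;
import java.util.Set;

final class StackInspectionDemo {
    // One call-stack frame: the permissions of the code that owns the frame,
    // and whether that code explicitly vouched for its callers.
    static final class Frame {
        final Set<String> permissions;
        final boolean privileged;
        Frame(Set<String> permissions, boolean privileged) {
            this.permissions = permissions;
            this.privileged = privileged;
        }
    }

    // stack.get(0) is the most recent frame. Every frame inspected must hold
    // the permission; inspection stops at the first privileged frame.
    static boolean checkPermission(List<Frame> stack, String permission) {
        for (Frame frame : stack) {
            if (!frame.permissions.contains(permission)) return false;
            if (frame.privileged) return true; // older callers are not consulted
        }
        return true;
    }

    public static void main(String[] args) {
        Frame library = new Frame(Set.of("file.read"), false);
        Frame trustedLibrary = new Frame(Set.of("file.read"), true);
        Frame applet = new Frame(Set.of(), false);
        // The applet calls the library: the applet's frame lacks the permission.
        System.out.println(checkPermission(List.of(library, applet), "file.read"));        // prints "false"
        // The library vouches for its caller, so the applet is not inspected.
        System.out.println(checkPermission(List.of(trustedLibrary, applet), "file.read")); // prints "true"
    }
}
```

A model checker working on the control flow graph explores all call sequences that can reach such a check, rather than one concrete stack as here.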
Theorem proving 1/3 • Detlefs et al.: extended static checking for Modula-3 [20] and Java [52] • Requires the programmer to annotate programs with pre- and post-conditions • The compiler is able to generate and prove the verification conditions • The system of Detlefs et al. • Does not require the programmer to annotate programs with loop invariants and variants • Derives loop invariants automatically • Assumes that loops are executed at most once • Powerful • The type checker < the system < full verification
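To give a feel for pre- and post-condition annotations, here is a minimal sketch. The `//@ requires` and `//@ ensures` comments follow the annotation style used by ESC/Java; an ordinary compiler treats them as comments, so the sketch mirrors the pre-condition with an explicit runtime check. The class `IsqrtDemo` and the method `isqrt` are illustrative names.

```java
final class IsqrtDemo {
    //@ requires x >= 0;
    //@ ensures \result >= 0 && \result * \result <= x;
    static int isqrt(int x) {
        // Runtime mirror of the pre-condition; a static checker would instead
        // try to prove that every caller satisfies it.
        if (x < 0) throw new IllegalArgumentException("precondition violated: x >= 0");
        int r = 0;
        // Invariant a checker would need here: r * r <= x.
        // The product is computed as a long to avoid int overflow.
        while ((long) (r + 1) * (r + 1) <= x) {
            r++;
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println(isqrt(17)); // prints "4"
    }
}
```

The loop invariant in the comment is the kind of fact that full verification asks the programmer to supply and that the system described above tries to do without.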
Theorem proving 2/3 • The LOOP project of Jacobs et al. • Full verification of Java programs • Use a tool based on denotational semantics to translate Java into the higher order logic of widely used theorem provers (PVS [32], Isabelle/HOL [57]) • Properties • Termination of a method • Invariants on the fields of a class
Theorem proving 3/3 • Poetzsch-Heffter and Müller [47] • An operational/axiomatic semantics of a subset of Java • Prove the soundness of the axiomatic semantics with respect to the operational semantics • Embedded in HOL • Mechanical checking of the soundness proof would be feasible • Moore [39] • A new version of a small subset of Cohen’s specification [15] of the JVM • Shows the capabilities of the ACL2 theorem prover
Controlling type casts • Java’s lack of parametric polymorphism • Requires programmers to insert type casts in their programs • Example • When storing an object of a user class MyObject • One must remember to cast the raw object back into the user class MyObject when retrieving the information • Erroneous type casts: cause unexpected runtime exceptions • Pizza [46] and Generic Java [12] • Automatically insert the required type casts • Generic Java • No cast inserted by the compiler will fail
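The difference between hand-written and compiler-inserted casts can be sketched as follows. The second method uses the generics syntax that later Java versions adopted, shown here as a stand-in for the Generic Java proposal; `CastDemo`, `rawStyle`, and `genericStyle` are illustrative names, and `MyObject` is the user class from the slide.

```java
import java.util.ArrayList;
import java.util.List;

final class CastDemo {
    static final class MyObject {
        final String name;
        MyObject(String name) { this.name = name; }
    }

    // Without parametric polymorphism the container holds plain Objects,
    // so the programmer must cast on retrieval, and the cast can fail.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String rawStyle(Object stored) {
        List list = new ArrayList();
        list.add(stored);
        try {
            MyObject o = (MyObject) list.get(0); // hand-written cast
            return o.name;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    // With a type parameter the compiler rejects wrongly typed elements
    // and inserts the cast itself.
    static String genericStyle(MyObject stored) {
        List<MyObject> list = new ArrayList<>();
        list.add(stored);
        return list.get(0).name; // no explicit cast in the source
    }

    public static void main(String[] args) {
        System.out.println(rawStyle(new MyObject("purse")));     // prints "purse"
        System.out.println(rawStyle("not a MyObject"));          // prints "ClassCastException"
        System.out.println(genericStyle(new MyObject("purse"))); // prints "purse"
    }
}
```

In the generic version the erroneous call `genericStyle("not a MyObject")` does not compile, which moves the error from run time to compile time.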
Controlling execution time • Ideally, Java safety would guarantee that computations terminate (within certain bounds) • Denial of service attacks would then be prevented • Execution time is one of the most difficult resources to control
Code certification 1/2 • Necula and Lee [40]: proof carrying code (PCC) • Automatic verification technique (assembly level programs) • The producer • Expresses a safety property in terms of pre- and post-conditions on the program • Annotates the program with loop invariants etc. • Generates a proof of the safety property (by hand or using a mechanical proof assistant) • The consumer • Receives the code and the proof • Mechanically checks that the proof is consistent with the program • The program satisfies the safety property • Does not need to trust the producer • Relies only on a small trusted infrastructure (type checker)
Code certification 2/2 • The problems of the PCC approaches • The size of a proof: exponential in the size of the program [42] • The amount of redundancy • Necula and Lee [41] • Reduce a proof of size n to a proof of size √n by avoiding some redundancy • Program verification requires special skills • To formulate properties • To discover appropriate loop invariants • To drive mechanical theorem provers etc. • It is essential that tools are automatic, or at least require as little programmer intervention as possible
Small footprint devices • Small footprint devices • Mobile phones, PDAs, K Virtual Machine: 128KB of RAM • Smart card • A few hundred bytes of RAM and a dozen or so KB of EEPROM • Java Card VM (JCVM) • 3 disadvantages • The full potential and flexibility of client server software development cannot be realised • Java applets running on the smallest embedded controllers cannot be verified appropriately before they are run • The freedom of code migration is restricted • Based on the Split VM concept • Pushes part of the byte code verification from the loading to the compilation/linking phase • JVM byte code ☞ JCVM format • The conversion verifies the byte code, optimises it, and prepares the code for loading into the device
Byte code compression • Clausen et al. [14] • Retain JVM byte codes • Propose to compress them for the benefit of embedded systems • The compression technique • Replaces commonly occurring sequences of instructions • With a new ‘macro’ instruction • 30% loading time increase ☞ 30% space saving
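The macro-instruction idea can be sketched in a few lines. This is a simplified illustration, not the scheme of Clausen et al.: it only considers sequences of length two, represents instructions as strings without spaces, and all names (`MacroCompressDemo`, `mostFrequentPair`, `compress`, `expand`, the macro name `M`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MacroCompressDemo {
    // Finds the adjacent instruction pair that occurs most often
    // (null if the code has fewer than two instructions).
    static String[] mostFrequentPair(List<String> code) {
        Map<String, Integer> counts = new HashMap<>();
        String[] best = null;
        int bestCount = 0;
        for (int i = 0; i + 1 < code.size(); i++) {
            String a = code.get(i), b = code.get(i + 1);
            int c = counts.merge(a + " " + b, 1, Integer::sum);
            if (c > bestCount) { bestCount = c; best = new String[] {a, b}; }
        }
        return best;
    }

    // Replaces each non-overlapping occurrence of the pair by the macro instruction.
    static List<String> compress(List<String> code, String[] pair, String macro) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < code.size()) {
            boolean match = i + 1 < code.size()
                    && code.get(i).equals(pair[0]) && code.get(i + 1).equals(pair[1]);
            if (match) { out.add(macro); i += 2; } else { out.add(code.get(i)); i++; }
        }
        return out;
    }

    // Expands the macro again; this is the extra work paid at load time.
    static List<String> expand(List<String> code, String[] pair, String macro) {
        List<String> out = new ArrayList<>();
        for (String instr : code) {
            if (instr.equals(macro)) { out.add(pair[0]); out.add(pair[1]); } else { out.add(instr); }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList("aload_0", "getfield", "iload_1",
                "aload_0", "getfield", "iadd", "aload_0", "getfield");
        String[] pair = mostFrequentPair(code);
        List<String> small = compress(code, pair, "M");
        System.out.println(small);                                 // prints "[M, iload_1, M, iadd, M]"
        System.out.println(expand(small, pair, "M").equals(code)); // prints "true"
    }
}
```

The sketch makes the trade-off on the slide visible: the compressed program has five instructions instead of eight, and expansion is additional work before the code can run.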
Class file conversion 1/3 • Hartel et al. [30]: the Java Secure Processor (JSP) • Provide a complete specification of an early version of the JCVM • Excludes multi-threading, garbage collection and exception handling • Validated using the letos tool • Methodological point [56] • Earlier JSP • The full JVM ☞ cutting back unwanted features • Newer KVM • Scratch ☞ adding features as required • The developers of the picoPERC version of the JVM [44] • Offer a core VM (64KB) • Provide tools to add further functionality to the core VM
Class file conversion 2/3 • Lanet and Requet [35] • B-method • To study one particular aspect of the conversion from JVM to JCVM code • Their results include • A specification of the constraints imposed by the byte code verifier for a small subset of the JVM • A specification of the semantics of this subset of the JVM byte codes • A specification of the semantics of the corresponding subset of the JCVM byte codes • A proof that the specification of the JCVM subset is a data refinement of the JVM subset
Class file conversion 3/3 • Denney and Jensen [19] • Study an aspect of the conversion complementary to that studied by Lanet and Requet • The conversion of JVM class files to JCVM class files by ‘tokenisation’ • Replaces names in the class files • Reducing the size of the class files • Speeding up the loading process • Use the Coq theorem prover to mechanically check their proofs • Use an elegant method to parameterise their operational semantics over name resolution
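The effect of tokenisation can be sketched as a table that maps each symbolic name to a small integer. This is an illustration of the general idea only, not the actual Java Card converter or the formalisation of Denney and Jensen; `TokeniseDemo`, `buildTable`, `tokenise`, and the example names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TokeniseDemo {
    // Assigns each distinct name the next free token, in order of first appearance.
    static Map<String, Integer> buildTable(List<String> names) {
        Map<String, Integer> table = new LinkedHashMap<>();
        for (String name : names) {
            table.putIfAbsent(name, table.size());
        }
        return table;
    }

    // Replaces every name by its token; the names themselves need not be shipped.
    static List<Integer> tokenise(List<String> names, Map<String, Integer> table) {
        List<Integer> tokens = new ArrayList<>();
        for (String name : names) {
            tokens.add(table.get(name));
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("java/lang/Object", "Purse", "debit", "Purse");
        Map<String, Integer> table = buildTable(names);
        System.out.println(tokenise(names, table)); // prints "[0, 1, 2, 1]"
    }
}
```

Resolving an integer token is an array lookup rather than a string comparison, which is where the smaller files and faster loading come from; the correctness question is whether token-based resolution finds the same entities as name-based resolution.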
Byte code verification revisited 1/2 • Split VM concept • Off-line verification: Signing the results digitally (signature) • Posegga and Vogt [49,48] • To use a model checker (SMV) to perform off-line byte code verification for smart cards • Posegga et al. [27] • Propose to implement a tiny proof checker on a smart card
Byte code verification revisited 2/2 • Rose and Rose [51] • Use Necula and Lee’s proof carrying code (PCC) method to ‘split’ the byte code verifier • The verification • To reconstruct the types associated with all local variables and stack locations of JVM code • The certification • To check, based on the reconstructed types, that each instruction is correctly typed • Advantage • The certification process is simple • Only the certification needs to be trusted, not the verification
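The split between reconstructing types and checking them can be sketched on a toy stack machine. This is a simplified illustration under stated assumptions, not the system of Rose and Rose: the instruction set (`iconst`, `aconst`, `iadd`, `pop`, `ifeq n`, `goto n`, `return`) is made up, only the operand stack is typed (no local variables, no subtyping), instructions are assumed to be well formed, and all Java names are hypothetical. A stack type is a string over `I` (int) and `A` (reference), with the top of the stack last.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class SplitVerifierDemo {
    // Stack type after executing instr on stack type in, or null on a type error.
    static String step(String instr, String in) {
        switch (instr.split(" ")[0]) {
            case "iconst": return in + "I";
            case "aconst": return in + "A";
            case "iadd":   return in.endsWith("II") ? in.substring(0, in.length() - 1) : null;
            case "pop":    return in.isEmpty() ? null : in.substring(0, in.length() - 1);
            case "ifeq":   return in.endsWith("I") ? in.substring(0, in.length() - 1) : null;
            case "goto":
            case "return": return in;
            default:       return null;
        }
    }

    // Indices of the instructions that may execute directly after instruction i.
    static int[] successors(String instr, int i) {
        String[] parts = instr.split(" ");
        switch (parts[0]) {
            case "goto":   return new int[] { Integer.parseInt(parts[1]) };
            case "ifeq":   return new int[] { Integer.parseInt(parts[1]), i + 1 };
            case "return": return new int[0];
            default:       return new int[] { i + 1 };
        }
    }

    // Verification (off-device, untrusted): reconstruct the stack type before
    // each instruction with a worklist; returns null if the code is ill typed.
    static String[] reconstruct(String[] code) {
        String[] types = new String[code.length];
        if (code.length == 0) return types;
        types[0] = "";
        Deque<Integer> work = new ArrayDeque<>();
        work.push(0);
        while (!work.isEmpty()) {
            int i = work.pop();
            String out = step(code[i], types[i]);
            if (out == null) return null;
            for (int s : successors(code[i], i)) {
                if (s < 0 || s >= code.length) return null;
                if (types[s] == null) { types[s] = out; work.push(s); }
                else if (!types[s].equals(out)) return null;
            }
        }
        return types;
    }

    // Certification (on-device, trusted): one linear pass that checks each
    // instruction against the supplied types, with no iteration or search.
    static boolean certify(String[] code, String[] types) {
        if (types == null || types.length != code.length) return false;
        if (code.length > 0 && !"".equals(types[0])) return false;
        for (int i = 0; i < code.length; i++) {
            if (types[i] == null) continue; // claimed unreachable; any jump to it fails below
            String out = step(code[i], types[i]);
            if (out == null) return false;
            for (int s : successors(code[i], i)) {
                if (s < 0 || s >= code.length || !out.equals(types[s])) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] code = { "iconst", "iconst", "iadd", "ifeq 0", "return" };
        System.out.println(certify(code, reconstruct(code))); // prints "true"
    }
}
```

The design point is that `certify` does not trust `reconstruct`: a wrong or tampered type table is rejected by the linear check, so only the simpler half has to run on the device and be trusted.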
Conclusions • On modelling garbage collection, and the Java API. • On building more appropriate theories for programming language semantics modelling. • On simplifying and modularising the individual components of Java implementations. • On reducing the size of the trusted computing base, so that flaws are less likely to compromise the security of the system as a whole. • On considering formal specification, validation and provably correct implementation as a whole, rather than in separation. • On presenting clear and concise formalisations of systems, which are accessible to the designers and implementors of these systems. • On using machine-readable specifications.