Inferable Object-Oriented Typed Assembly Language
Ross Tate, Juan Chen, Chris Hawblitzel
Typed Assembly Languages
• Compilers are great
  • but they make mistakes
  • and can introduce vulnerabilities
• Typed assembly language
  • includes a proof of (memory) safety
  • verified by a trusted proof checker
  • no need to trust the compiler
• Certifying compilers
  • generate typed assembly language
  • traditionally use “type-preservation”
[Diagram: C# → Certifying Compiler → TAL → Trusted Proof Checker]
[Diagram: Source Program → Type-Preserving Compiler (IR1 → Optimizations/Conversions → IR2 → Optimizations/Conversions → x86) → Proof Checker. Every stage carries class/function signatures (sigs) and type/proof annotations (annots).]
• Burden to preserve types at each stage
• Hard to adopt in existing compilers
• Types/proofs increase size of executable
[Diagram: Source Program → Traditional Compiler (IR1 → Optimizations/Conversions → IR2 → Optimizations/Conversions → x86, carrying sigs at every stage) → Type Inference (infers proof annotations) → x86 with sigs and annots → Proof Checker]
• Signature information is already preserved in traditional compilers
• Easy to change compiler to write sig info to file
  • Requires little change
  • Smaller annotation size
• Can inference be effective enough?
Effectiveness of Type Inference
• Capable of type checking all C# features except:
  • Exceptions and Delegates
  • matters of implementation, not due to theoretical limitations
Broken C# Pseudo-Assembly

    // a could actually be an ArrayList; b could actually be a LinkedList
    bool bad(a, b : List) {
        vt = a.vtable;     // grabs a’s vtable
        mp = vt.isEmpty;   // grabs a’s implementation of isEmpty
        c = mp(b);         // calls a’s isEmpty with b as “this”
        return c;
    }

a’s implementation of isEmpty may fail to work on b.
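The failure mode above can be imitated in a high-level language. The following is a minimal sketch, not code from the talk: the classes `ArrayList` and `LinkedList`, their fields `size` and `head`, and the method name `is_empty` are all hypothetical stand-ins. Python raises an `AttributeError` where real assembly would silently read the wrong memory offset, so the sketch only shows that the mismatched call is ill-defined, not what the hardware would do.

```python
class ArrayList:
    """Hypothetical List subclass backed by an array; tracks a `size` field."""

    def __init__(self):
        self.size = 0

    def is_empty(self):
        return self.size == 0


class LinkedList:
    """Hypothetical List subclass backed by nodes; tracks a `head` field."""

    def __init__(self):
        self.head = None

    def is_empty(self):
        return self.head is None


def bad(a, b):
    vt = type(a)       # stands in for a.vtable
    mp = vt.is_empty   # a's implementation of isEmpty
    return mp(b)       # called with b as "this"
```

When `a` and `b` share a class, `bad` behaves normally. When `a` is an `ArrayList` and `b` is a `LinkedList`, `ArrayList.is_empty` looks for a `size` field that `b` does not have.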
Broken C# Pseudo-Assembly
Traditional TAL [PLDI ‘08]

    // More specific function signature: a and b are each instances of some
    // (possibly different) subclass of List
    bool bad(a, b : ∃γ≪List. Ins(γ)) {
        // Pseudo-instruction for the type checker:
        // a is given type exactly Ins(α) where α ≪ List; α must be fresh
        open a as Ins(α);
        // Via signature & memory layout information, vt is given type VTable(α)
        vt = a.vtable;
        // mp is given type (∃γ≪α. Ins(γ)) → bool; the “this” pointer must belong to α
        mp = vt.isEmpty;
        // b is given type exactly Ins(β) where β ≪ List; β must be fresh
        open b as Ins(β);
        // Checks that there is some γ extending α such that b has type Ins(γ)
        c = mp(pack b as ∃γ≪α. Ins(γ));
        return c;
    }

The check fails since b has type Ins(β) and β does not extend α.
Broken C# Pseudo-Assembly

Inferable TAL:

    bool bad(a, b : ∃γ≪List. Ins(γ)) {
        vt = a.vtable;
        mp = vt.isEmpty;
        c = mp(b);
        return c;
    }

Traditional TAL [PLDI ‘08]:

    bool bad(a, b : ∃γ≪List. Ins(γ)) {
        open a as Ins(α);
        vt = a.vtable;
        mp = vt.isEmpty;
        open b as Ins(β);
        c = mp(pack b as ∃γ≪α. Ins(γ));
        return c;
    }

Inferable TAL:
• No pack annotations
• No open annotations
• No loop invariants!
• Use type inference instead
Inference Strategy
• Always open existential types as soon as possible
• Use subtyping in place of pack (subsumes using open and pack):

      θ : ∆’ → ∆      τ ≤ τ’[θ]
      ---------------------------
           ∃∆.τ ≤ ∃∆’.τ’

  Given a valid substitution of variables θ : ∆’ → ∆ such that the bodies are subtypes after substitution (τ ≤ τ’[θ]), the existential types are subtypes (∃∆.τ ≤ ∃∆’.τ’).
• Use abstract interpretation over existential types
  • Requires subtyping and join algorithms
  • Subtyping alone of bounded existential types is undecidable!
• Designed a category-theoretic framework for existential types
  • Constructive: includes abstract algorithms for inference
  • Instructive: specifies type design guidelines
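The subtyping rule for existential types can be sketched for one very restricted case. This is an illustration under stated assumptions, not the algorithm from the talk: bodies are limited to `Ins(x)`, `Ins` is treated as invariant, ≪ is treated as reflexive, the right-hand existential binds at most the one variable used in its body, and the class hierarchy `PARENT` and the function names are hypothetical. Under those assumptions the substitution θ is forced, so no search is needed.

```python
# Hypothetical class hierarchy: child -> parent; None marks the root.
PARENT = {"Object": None, "List": "Object",
          "ArrayList": "List", "LinkedList": "List"}


def extends(x, y, bounds):
    """True if class or type variable x ≪ y (reflexive).

    Type variables climb through their declared bound, classes through PARENT.
    """
    while x is not None:
        if x == y:
            return True
        x = bounds[x] if x in bounds else PARENT.get(x)
    return False


def exists_subtype(delta, x, delta2, y, context):
    """Check ∃delta. Ins(x) ≤ ∃delta2. Ins(y).

    delta, delta2 and context map type variables to their upper bounds;
    context holds the variables that are already open.
    """
    bounds = {**context, **delta}          # open the left-hand existential
    theta = {y: x} if y in delta2 else {}  # θ : delta2 → delta is forced here
    # θ must be valid: each image has to respect the (substituted) bound.
    for var, image in theta.items():
        bound = delta2[var]
        bound = theta.get(bound, bound)
        if not extends(image, bound, bounds):
            return False
    # Bodies: Ins is invariant, so Ins(x) ≤ Ins(y[θ]) needs equal arguments.
    return x == theta.get(y, y)
```

With `context = {"α": "List", "β": "List"}`, the query `exists_subtype({}, "β", {"γ": "α"}, "γ", context)` corresponds to Ins(β) ≤ ∃γ≪α. Ins(γ) and is rejected, because β's bound chain never reaches α.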
Type Checking with iTalX

    // Inference strategy: immediately open a and b
    bool bad(a, b : ∃γ≪List. Ins(γ)) {
        vt = a.vtable;
        mp = vt.isEmpty;
        // Type check: Ins(β) ≤ ∃γ≪α. Ins(γ)
        // Check fails: β does not extend α
        c = mp(b);
        return c;
    }

Inferred type state:
    ∃α, β : α≪List, β≪List.
    a : Ins(α)
    b : Ins(β)
    vt : VTable(α)
    mp : (∃γ≪α. Ins(γ)) → bool

Signature information:
• α ≪ List ⇒ Ins(α) has fields:
    vtable : VTable(α)
    ⋮
• α ≪ List ⇒ VTable(α) has fields:
    ⋮
    isEmpty : (⋯) → bool
    ⋮
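The walk-through above can be approximated by a toy checker for straight-line code. This is a minimal sketch under assumptions, not iTalX itself: the instruction tuples (`load_vtable`, `load_method`, `call`), the fresh variable names `t0`, `t1`, and the hierarchy `PARENT` are all hypothetical, and there are no branches, so no join algorithm is needed.

```python
from itertools import count

# Hypothetical class hierarchy: child -> parent; None marks the root.
PARENT = {"Object": None, "List": "Object"}


def extends(x, y, bounds):
    """True if class or type variable x ≪ y (treated as reflexive)."""
    while x is not None:
        if x == y:
            return True
        x = bounds[x] if x in bounds else PARENT.get(x)
    return False


def check(params, body):
    """Return None if `body` type checks, otherwise an error message."""
    fresh = count()
    bounds, env = {}, {}
    for name, bound in params:           # name : ∃γ≪bound. Ins(γ)
        var = f"t{next(fresh)}"          # open immediately with a fresh variable
        bounds[var] = bound
        env[name] = ("Ins", var)
    for instr in body:
        op = instr[0]
        if op == "load_vtable":          # dst = src.vtable
            _, dst, src = instr
            kind, var = env[src]
            if kind != "Ins":
                return f"{src} is not an instance"
            env[dst] = ("VTable", var)
        elif op == "load_method":        # dst = src.isEmpty
            _, dst, src = instr
            kind, var = env[src]
            if kind != "VTable":
                return f"{src} is not a vtable"
            env[dst] = ("Method", var)   # (∃γ≪var. Ins(γ)) → bool
        elif op == "call":               # dst = fn(arg)
            _, dst, fn, arg = instr
            fkind, recv = env[fn]
            akind, avar = env[arg]
            if fkind != "Method" or akind != "Ins":
                return f"cannot call {fn} on {arg}"
            # Ins(avar) ≤ ∃γ≪recv. Ins(γ) holds only if avar extends recv.
            if not extends(avar, recv, bounds):
                return f"{arg} : Ins({avar}) but {avar} does not extend {recv}"
            env[dst] = ("Bool", None)
        else:
            return f"unknown instruction {op}"
    return None


BAD = [("load_vtable", "vt", "a"), ("load_method", "mp", "vt"),
       ("call", "c", "mp", "b")]
GOOD = [("load_vtable", "vt", "a"), ("load_method", "mp", "vt"),
        ("call", "c", "mp", "a")]
```

Calling `check([("a", "List"), ("b", "List")], BAD)` reports that `b`'s type variable does not extend `a`'s, while the same parameters with `GOOD` are accepted because the method is invoked on the object it was loaded from.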
Expressiveness of iTalX
• iTalX is capable of handling the following features:
  • Classes, interfaces, generics, and multiple inheritance
  • Dynamic dispatch and dynamic casts
  • Covariant arrays as classes, and array-bounds checks
  • By-reference parameters (ref), structs, and value types
  • Jump tables and complex stack manipulation
• iTalX is also robust with respect to many optimizations
• iTalX should be able to handle the remaining features:
  • Delegates and exceptions
• In experiments, iTalX currently verifies 97.9% of methods
Efficiency of iTalX
Inferring assembly-level types is affordable
Type Annotation Size
Type annotation size is significantly reduced
Implementation Burden of TAL
• Changes to an Existing Compiler (Bartok)
  • Type Preservation [PLDI ‘08]: 19,000 lines of code; cut across code base
  • Assembly-Level Type Inference: 5,000 lines of code; modular addition to code base
• Type Checker + Type Inference
  • Type Preservation [PLDI ‘08]: 13,800 lines of code
  • Assembly-Level Type Inference: 15,100 lines of code; could be separated to reduce trusted computing base
Conclusion
• Type inference at the assembly level is
  • expressive enough to verify C# with optimizations
  • flexible enough to accommodate new language features
  • efficient enough to use regularly during compilation
  • compact enough to include in executable binaries
  • modular enough to retrofit onto existing compilers
Thank You!