Typed Assembly Languages and Security Automatons

Typed Assembly Languages and Security Automatons Ben Watson The George Washington University CS 297 Security and Programming Languages June 2, 2005

Problems with Assembly • Assembly is completely data-agnostic: it doesn’t care, doesn’t want to care what type of data you’re moving around • If C is the rope to hang yourself with, then assembly is parking your car on the train tracks. At rush hour. With a full tank of gas. And you can’t leave your car. Good luck.

TALx86 • Typed Assembly Language for Intel x86 processor • Implements subset of Intel IA32 instruction set • Designed to be “realistic”, as in compilable and usable on a real computer by real people. • Allow compilation from multiple high-level languages

TALx86 • Designed to overcome specific limitations in Java bytecode • JVML has semantic errors that could have been discovered with a formal model • Difficult to compile other high-level languages to bytecode • Difficult to extend Java itself due to bytecode limitations (impossible to correctly compile Scheme to bytecode, for example) • Bytecode interpretation is slow, thus JIT is often used—but this is an afterthought, not fundamental

Alternate solution(because I like .Net) • Many of these problems that TALx86 was designed to address were also addressed in .Net • Formal design models were used • CLR designates minimal feature set for supported languages • Scheme compilations is possible, along with dozens of other languages • JITted code part of the design of runtime and language • MSIL, like bytecode, is a typed intermediate assembly language

TALx86 Features • Most basic assembly features • Stack-allocation • Type-checking • Arrays and unions • Recursive types (i.e., linked lists) • Annotations

Popcorn • Standard C isn’t strongly typed and thus can’t be represented as TALx86 • A strongly-typed C-based language • Support for polymorphism, abstract types, tagged unions, and exceptions • Won’t discuss too much

TALx86 Compilation Process • Main.tal – assembly listing • Main_i.tali – import interfaces • Main_e.tale – export interfaces

TALx86 Annotations • Import/export interfaces (for type-checking separate object files) • Type constructors (how to declare new types) • Preconditions on code labels (register must have type X before code entered) • Types for static data • Type coercions (converting one type to another) • Macros (type checker can treat entire section as single action)

int i = n+1; int s = 0; While (--i > 0) s+=i; mov eax, ecx ;i=n inc eax ;++i mov ebx, 0 ;s=0 jmp test body: {eax: B4, ebx: B4} add ebx, eax ;s+=i test: {eax: B4, ebx: B4} dec eax ;--i cmp eax, 0 ;i>0 jg body TALx86: Register Preconditions • B4: 4-byte integer

TALx86: Supporting C/Win32 calling conventions • Predicate to describe state of stack • {esp: sptr {eax: B4}::B4::se} • The stack must contain a pointer to a 4-byte int, an int (function argument), then nothing else • Can be generalized to any stack “shape” • :Ts {esp: sptr {eax: B4, esp: sptr B4::}::B4::}

TALx86 • A type verifier checks the validity of each instruction in each label’s block • The type checker is programmed for the semantics of TALx86/IA32 instructions • Additional rules for • Memory allocation • But not deallocation! (hence the use of a garbage collector) • Arrays • Lists, structs

TALx86: Optimizations • Abbreviations (to take less space in source) • Remove repetitions (i.e., a stack and its return address have the same type) • Forward branch targets need no precondition

TALx86: A foundation for security • TALx86 gives you assurance when you compile your code that type safety is enforced • It does not add security per se • For that, let’s move on to…

Type Systems for Expressive Security Policies • Assumes a strongly typed language (such as TALx86) • Uses security automata • Can always be enforced by runtime checks • Can rewrite programs to obey policy

Security Automaton

Enforcing Security Automatons • Code Instrumentation • Auxiliary code (usually for monitoring purposes) • C/C++ usually has to be recompiled • .Net and Java don’t always (runtime environment + reflection)

Enforcing Security Automatons Formal: Let next = send(current) If next = bad then halt else send()

Enforcing Security Automatons Example: Before: … Send(); …

Enforcing Security Automatons After: … State nextState = GetNextState(currentState); if (nextState == badState) { thrownew SecurityException(); } Send(); …

Security Instrumentation Systems • SASI • Security Automata SFI Implementation • Software Fault Isolation • An implementation of security automatons and instrumentation • Slows down native code, but only slightly • Reimplemented Java security manager—at least as efficient

Enforcing Security Automatons • But what if someone hacks the state-checking code? • Prove that the state-checking code is correct • Augment the type system to include value states, similar to TALx86 • Associate predicates with each enforceable statement • These are decidable at compile-time

Enforcing Security Automatons • Optimizations • Some predicates are always true, given a certain state – the check can be removed • Perform control-flow analysis that propagates proven predicates throughout program • Possibly proving further predicates that can be removed

Benefits of Previous Techniques • Prevents things like buffer overflow attacks • More confidence in machine-level code • Stronger high-level  low-level mapping

References • Morrisett, et al. TALx86: A realistic typed assembly language, ACM SIGPLAN Workshop on Compiler Support for System Software, pages 25-35, Atlanta, GA, USA, May 1999 • Walker, David. A type system for expressive security properties. Twenty-Seventh ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 254-267, Boston, MA, USA, January 2000

Typed Assembly Languages and Security Automatons