Untrustworthy Programming Languages
Andrew Kennedy, MSR Cambridge
Do you trust your programming language?
• Modern programming platforms promise security:
  • The Java security model is based on a customizable "sandbox" in which Java software programs can run safely, without potential risk to systems or users (java.sun.com/security)
  • The .NET Common Language Runtime implements its own secure execution model that is independent of the host platform (Don Box, MSDN magazine)
• Most articles emphasise type-safety (=> memory safety) of the JVM or CLR
• And of course, special-purpose mechanisms such as Code Access Security (stack-walking), permissions, crypto, etc.
• But that’s not the whole story…
The way it was
• In the past, programming language abstractions:
  • made languages “high-level”, i.e. far from the raw metal of the machine
  • supported good software engineering
  • protected programmers from themselves & others
• If the language contained holes, it was “just” a programming problem
• In any case, nothing was enforced “underneath” except at coarse boundaries (machine, system/user, process)
But now...
• The programming model is part of the security model
  • in particular, its type system
  • but also, other aspects…
• Programmers will assume that abstractions are enforced underneath...
• ...and use them to write secure code.
Eiffel, 1989
• Cook, W.R. (1989). A Proposal for Making Eiffel Type-Safe. In Proceedings of ECOOP'89, S. Cook (ed.), pp. 57-70. Cambridge University Press.
• Bertrand Meyer, on unsoundness of Eiffel: “Eiffel users universally report that they almost never run into such problems in real software development.”
Secure programming platforms
• Java source → Java compiler → JVML (bytecodes), executed on the JVM (Java Virtual Machine)
• C#, C++, Visual Basic → C# compiler, C++ compiler, VB compiler → CIL, executed on the .NET CLR (Common Language Runtime)
Type safety
• Ensures
  • data safety: can access memory only through typed objects
  • code safety: can access components only according to their interface
• Isolates software processes (“Application Domains” in .NET)
  • used for downloadable plug-ins for UI in next version of Windows
• Importance of type safety is now widely appreciated
  • Microsoft would issue an immediate “critical update” if a type safety bug were discovered
[Insert war stories here]
Type loophole => anything goes
• Exploit a type loophole to execute arbitrary code. Here’s a recipe.
• Define a delegate type D, and create a delegate object from an empty method:
    delegate void D();
    public static void DoNothing() { }
    D d = new D(DoNothing);
• Define a SpoofD class with an int field spoofing the (internal) function pointer field of the delegate type:
    class SpoofD { public int fptr; ... }
• Now pretend that the delegate object has type SpoofD (via the type loophole):
    SpoofD sd = ...loophole magic...(d);
• Set the spoof function pointer field to the address of your malicious code:
    sd.fptr = my_bad_code;
• Invoke the delegate (d and sd refer to the same object):
    d();
Beyond type safety
• How do programmers reason about security properties of their code? Or about their code at all?
• We might hope that:
  • A C# programmer can reason about code armed only with the C# language spec and specs for libraries used by the code
• Unfortunately, it seems that a C# programmer also needs:
  • Some understanding of how C# is translated into IL
  • Some understanding of the behaviour of IL
  • Some understanding of parts of the standard library not mentioned in the language spec or used by the program
Example 1: “Privacy through override”
• In C# (and Java), overridden methods cannot be invoked directly except by the overriding method
• This property has been used by programmers for security purposes:
    class InsecureWidget {
      // No checking of argument
      virtual void Put(string s);
      …
    }
    class SecureWidget : InsecureWidget {
      // Validate argument and pass on
      override void Put(string s) {
        Validate(s);
        base.Put(s);
      }
    }
    …
    SecureWidget sw = new SecureWidget();
    // We can’t avoid validation of arguments to Put, can we?
• Oh, yes we can! Direct (non-virtual) call on the superclass method, in IL:
    ldloc sw
    ldstr "Invalid string"
    call instance void InsecureWidget::Put(string)
Analysis
• What went wrong?
  • In C#, overridden methods can only be invoked through “base” calls
  • In IL, they can be called directly
  • So there are programs in IL that can provoke behaviour not possible from C#
• What is a good way to characterize this?
  • Translation from C# to IL fails to be fully abstract
  • See “Protection in Programming Language Translation”, Abadi, 1998
• How can we fix it?
  • Not easily: IL was designed for multiple languages, with conflicting goals
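The direct call can also be sketched without hand-written IL, by emitting the same three instructions at run time. This is an illustration under assumptions, not the slide's method: the names Stored, DirectCall and Build are hypothetical, the validation rule is a stand-in for Validate, and the sketch assumes a full-trust runtime where System.Reflection.Emit.DynamicMethod is available.

```csharp
using System;
using System.Reflection.Emit;

public class InsecureWidget
{
    public string Stored = "";                      // hypothetical field, to observe the effect
    // No checking of argument
    public virtual void Put(string s) { Stored = s; }
}

public class SecureWidget : InsecureWidget
{
    // Validate argument and pass on (stand-in rule for Validate)
    public override void Put(string s)
    {
        if (s.StartsWith("Invalid", StringComparison.Ordinal))
            throw new ArgumentException("rejected");
        base.Put(s);
    }
}

public static class DirectCall
{
    // Emits: ldarg.0; ldarg.1; call instance void InsecureWidget::Put(string); ret
    public static Action<InsecureWidget, string> Build()
    {
        var put = typeof(InsecureWidget).GetMethod("Put");
        var dm = new DynamicMethod("DirectPut", typeof(void),
            new[] { typeof(InsecureWidget), typeof(string) },
            typeof(DirectCall).Module, true);
        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Call, put);                 // 'call', not 'callvirt': no virtual dispatch
        il.Emit(OpCodes.Ret);
        return (Action<InsecureWidget, string>)
            dm.CreateDelegate(typeof(Action<InsecureWidget, string>));
    }
}
```

Calling sw.Put("Invalid string") on a SecureWidget is rejected, whereas DirectCall.Build()(sw, "Invalid string") reaches InsecureWidget.Put on the same object without validation.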
An ideal: full abstraction
• Ensure that all abstractions of the programming language are enforced by the runtime
  • programmers don’t have to know what’s underneath
  • if they understand the programming language, they understand the platform programming model
• Ensure that translation from C# to IL is fully abstract
[Diagram: properties that hold of the C# program also hold of the IL program it is translated to]
Full abstraction
• Two programs are equivalent if they have the same behaviour in all contexts of the language, e.g.
    class Secret {
      private int f;
      public Secret(int fv) { f = fv; }
      public void Set(int fv) { f = fv; }
    }
  ≈
    class Secret {
      public Secret(int fv) { }
      public void Set(int fv) { }
    }
• A translation is “fully abstract” if it respects equivalence
• For us:
  • the “translation” is from source language (C# etc) to MSIL
  • if there exist contexts (e.g. other code) in MSIL that can distinguish equivalent source programs, then the translation fails to be fully abstract
Full abstraction for Java
• Translation from Java to JVML is not quite fully abstract (Abadi, 1998)
• At least one failure: access modifiers in inner classes
  • a late addition to the language
  • not directly supported by the JVM
  • compiled by translation => impractical to make fully abstract without changing the JVM
Full abstraction for C#?
• A number of failures
• Excuse: multiple languages target the CLR, with different goals
  • The JVM was designed for a single language: Java. (Almost) full abstraction was probably an accident, though in retrospect it’s a good thing.
• For C#/CLR, we can catalogue failures of full abstraction and propose fixes
  • either: change the translation from C# to IL
  • or: reduce expressivity of IL (fewer IL contexts)
  • or: increase the expressivity of C# (more C# contexts)
• At least: document the failures, educate programmers, provide tools to spot insecure programming patterns
Example 2: Encapsulation of object state
• Programmer expectation: instances of types whose API ensures immutability are immutable.
  • Ex: String, DateTime, Int32
• Boxing shouldn’t make any difference, should it?
    // A dictionary keyed on strings
    class StringDict {
      private Hashtable dict;
      public object Get(string s) { return dict[s]; }
      internal void Set(string s, object o) { … }
      …
    }
    static StringDict personalData;
    // In a module far away…
    // We cannot update from here
    object salary = personalData.Get("Salary");
• Oh, yes we can! Just get a pointer to the interior of the box, in IL:
    ldloc salary
    unbox int32
    ldc.i4 1000000
    stind.i4
Example 2: Encapsulation of object state
• An equivalence that is not preserved:
    public static int Foo(int x) {
      object y = (object) x;
      Bar(y);
      return x;
    }
  ≈
    public static int Foo(int x) {
      object y = (object) x;
      Bar(y);
      return (int) y;
    }
• Fix?
  • In CLR type system: disallow update after unboxing
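The two versions of Foo can be told apart once Bar is allowed to write through the box. The sketch below is an assumption-laden illustration: it uses System.Runtime.CompilerServices.Unsafe.Unbox (available in current .NET) as a stand-in for the unbox/stind.i4 sequence, and the names BoxDemo, FooReturnsX and FooReturnsUnboxed are hypothetical.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class BoxDemo
{
    // Stand-in for untrusted code: overwrites the int stored inside the box, in place.
    public static void Bar(object y)
    {
        Unsafe.Unbox<int>(y) = 1000000;
    }

    // First version from the slide: returns the original argument.
    public static int FooReturnsX(int x)
    {
        object y = (object)x;
        Bar(y);
        return x;
    }

    // Second version from the slide: returns whatever is now in the box.
    public static int FooReturnsUnboxed(int x)
    {
        object y = (object)x;
        Bar(y);
        return (int)y;
    }
}
```

With this Bar, FooReturnsX(1) still returns its argument while FooReturnsUnboxed(1) returns the overwritten value, so a context that can update a boxed int distinguishes two programs that pure C# reasoning treats as equivalent.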
Example 3: this is a valid object instance?
• Instance methods are always invoked on a valid instance, surely?
    class Foo {
      // Instance registered for privileged action
      private static Foo registered = null;
      // Only called from this module (internal access)
      internal void Register() { registered = this; }
      public void Bar() {
        if (this == registered) {
          // Perform privileged action
        }
      }
    }
    // We can’t execute privileged action from another module
• Oh, yes we can! Just call-direct-with-null, in IL:
    ldnull
    call instance void Foo::Bar()
Example 3: this is a valid object instance?
• An equivalence that is not preserved:
    class C {
      public bool Foo() { return true; }
      …
    }
  ≈
    class C {
      public bool Foo() { return this != null; }
      …
    }
• Fix?
  • In C# compiler: explicit check-for-null at start of method
  • In CLR: check-for-null at call-site (as with virtual call)
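A sketch of both the attack and the compiler-side fix, again emitting the IL at run time rather than writing it by hand. Assumptions: Bar returns a bool (true where the slide's version would perform the privileged action) so the effect is observable, GuardedBar and NullThis are hypothetical names, and the runtime permits DynamicMethod.

```csharp
using System;
using System.Reflection.Emit;

public class Foo
{
    // Instance registered for privileged action
    private static Foo registered = null;

    internal void Register() { registered = this; }

    // Original shape: true means the privileged branch would run.
    public bool Bar() { return this == registered; }

    // Fix from the slide: explicit null check at the start of the method.
    public bool GuardedBar()
    {
        if (this == null) throw new NullReferenceException();
        return this == registered;
    }
}

public static class NullThis
{
    // Emits: ldnull; call instance bool Foo::<methodName>(); ret
    public static Func<bool> Build(string methodName)
    {
        var target = typeof(Foo).GetMethod(methodName);
        var dm = new DynamicMethod("CallOnNull", typeof(bool),
            Type.EmptyTypes, typeof(NullThis).Module, true);
        var il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldnull);
        il.Emit(OpCodes.Call, target);   // 'call' performs no null check on 'this'
        il.Emit(OpCodes.Ret);
        return (Func<bool>)dm.CreateDelegate(typeof(Func<bool>));
    }
}
```

Before any instance is registered, NullThis.Build("Bar")() is expected to take the privileged branch because null == null, while NullThis.Build("GuardedBar")() should fail with a NullReferenceException.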
Example 4: Exceptions are instances of System.Exception?
    try {
      // perform some action, to completion
    } catch (Exception e) {
      // undo action whenever an exception was thrown in try-block
    }
    // Action either ran to completion, or was fully undone
• Not necessarily! From IL, can throw any object:
    newobj instance void System.Object::.ctor()
    throw
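One defensive pattern that does not depend on the type of the thrown object is to undo in a finally block guarded by a completion flag. The sketch below is a minimal illustration; Transactional, Apply and the list-based "action log" are hypothetical names chosen for the example. (A C# general catch clause, `catch { … }`, is another option, and CLR versions from 2.0 onwards can wrap non-Exception objects in RuntimeWrappedException, but the flag pattern needs neither.)

```csharp
using System;
using System.Collections.Generic;

public static class Transactional
{
    // Runs 'action'; if it does not run to completion, for any reason,
    // removes every entry it appended to 'log'.
    public static void Apply(List<string> log, Action action)
    {
        int mark = log.Count;       // state to roll back to
        bool completed = false;
        try
        {
            action();
            completed = true;       // only reached if nothing was thrown
        }
        finally
        {
            if (!completed)
                log.RemoveRange(mark, log.Count - mark);
        }
    }
}
```

The undo logic runs whenever control leaves the try block abnormally, so it no longer relies on the assumption that everything thrown derives from System.Exception.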
Example 5: Booleans are two-valued?
    void Foo(bool b) {
      bool c = !b;
      if (!c != b) {
        Console.WriteLine("This cannot happen");
      }
    }
• Oh yes it can! Pass a “boolean” with value 2, in IL:
    ldc.i4 2
    call void Foo(bool)
Example 5: Booleans are two-valued?
• An equivalence that is not preserved:
    static bool Foo(bool x, bool y) {
      return (x == false) == (y == false);
    }
  ≈
    static bool Foo(bool x, bool y) {
      return x==y;
    }
• Fix?
  • Change C# compilation of == and != for bool so that it cares only about zero/non-zero-ness
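The proposed fix can be approximated in source by normalising both operands before comparing them. In this sketch the out-of-range bool is manufactured with Unsafe.As rather than with hand-written IL, which is an assumption of the example, and BoolDemo and its method names are hypothetical.

```csharp
using System.Runtime.CompilerServices;

public static class BoolDemo
{
    // Reinterprets a raw byte as a bool, e.g. 2 gives a "third" boolean value.
    public static bool FromByte(byte raw) => Unsafe.As<byte, bool>(ref raw);

    // Compiled as a plain value comparison, so it is sensitive to the underlying bits.
    public static bool PlainEquals(bool x, bool y) => x == y;

    // Cares only about zero/non-zero-ness, as in the slide's first Foo.
    public static bool NormalizedEquals(bool x, bool y) => (x == false) == (y == false);
}
```

For ordinary true/false arguments the two methods agree. For BoolDemo.FromByte(2) compared with true, NormalizedEquals reports equality, whereas PlainEquals typically does not, because it compares the underlying values 2 and 1.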
Weak abstractions
• Some abstractions aren’t broken; they’re just a bit weak
  • arrays are always mutable
    • developers forget this and define “readonly” properties with array types
  • run-time types break “privacy by subsumption”
    • a solution to the array problem would be to return the array as an IEnumerable (a read-only enumerator)
    • but run-time types let the programmer “cast” back to the array
• Other abstractions are broken not by IL but by library classes
  • e.g. delegates (closures) would “encapsulate” code & object state if it weren’t for the System.Delegate.Target and System.Delegate.Method properties.
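These weaknesses are visible from ordinary C#, with no IL involved. A small sketch follows; Settings, Counter and Leaks are hypothetical names, and handing out a copy is just one possible mitigation.

```csharp
using System;
using System.Collections.Generic;

public class Settings
{
    private readonly int[] ports = { 80, 443 };

    // Looks read-only, but the caller receives the array object itself.
    public IEnumerable<int> Ports => ports;

    // One mitigation: hand out a copy, so a cast back only reaches the copy.
    public IEnumerable<int> PortsCopy => (int[])ports.Clone();

    public int FirstPort => ports[0];
}

public class Counter
{
    private int count;
    public int Next() => ++count;
}

public static class Leaks
{
    // Run-time types let the caller recover the array behind the interface.
    public static void OverwriteFirst(IEnumerable<int> view, int value)
    {
        if (view is int[] array) array[0] = value;
    }

    // A delegate exposes the object it was created over.
    public static object TargetOf(Func<int> f) => f.Target;
}
```

Leaks.OverwriteFirst(settings.Ports, 0) changes the private array, while the same call on settings.PortsCopy only touches a throwaway copy. Similarly, for Func<int> f = counter.Next, Leaks.TargetOf(f) returns the Counter instance itself.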
Why bother?
• Even if the translation from C# to IL were fully abstract, reasoning about C# programs would still be hard.
  • Programmers make mistakes in writing secure code
  • Tools for automating reasoning about programs are still in their infancy
  • There are many other pitfalls in the language
• So why bother about full abstraction?
• Because it’s a great starting point:
  • The ability to reason about C# programs “in C#” is hugely simplifying
  • Even better: if we could cut down to a subset of C# that suffices
Formalize?
• Proofs of full abstraction are hard
  • We don’t have a complete formal model of C#
  • We don’t have a complete formal model of IL
• So what to do?
  • Optimist: even if we can’t formalize, we can identify failures, and fix them all
  • Pessimist: we can never be sure that we have full abstraction. Instead, focus on certain patterns, and prove that these are watertight. Examples:
    • prove that integers are safe!
    • prove that private fields don’t leak
Conclusions
• The programming model is a vital part of the security story for .NET and Java
• Programmers need to know what they can trust
• “Full abstraction” is the ideal
  • My choice would be to fix the holes we know about
  • Might be hard to do
  • If we can’t or won’t, we should educate developers
• Type safety is now taken for granted as a necessity
• In the future, full abstraction also?