
Language-Based Safety Mechanisms

Stanford University CS 444A, Autumn 99: Software Development for Critical Applications. Armando Fox & David Dill, {fox,dill}@cs.stanford.edu


Presentation Transcript


  1. Language-Based Safety Mechanisms. Stanford University CS 444A, Autumn 99: Software Development for Critical Applications. Armando Fox & David Dill, {fox,dill}@cs.stanford.edu

  2. Concepts Overview & Outline • Static approaches • “Safe by design” (limiting the language) • Static analysis/type-safe languages • Dynamic approaches • interpreters and sandboxes • Dynamic dataflow analysis • A few examples (and problems) • Java, the Exokernel, VMware, SFI, Janus, Interface Compilation • As usual…each bullet is the subject of volumes of papers…this is just an introduction to the landscape

  3. Contrast With David’s “Req Spec” • RS is about verifying a program (or FSM) in the abstract • SFI is about securing them in practice • The two are complementary • Ex: “Transitions in FSM cover all possibilities” • What is “all”, really? • Recall: dreaming up desired emergent properties • Compare: Intel P6 bus protocol verification vs. implementation validation

  4. What Is “Safety” in this context? • Primary emphasis: prevent buggy/malicious app from doing harm to others • Don’t interfere with other apps directly (read/write their data or files) • Don’t interfere with other apps indirectly (hog OS resources so other apps are denied service) • Don’t crash or corrupt the OS • particularly important, since OS usually is the “trusted arbiter” of limited resources • Non-goal: stability of the isolated app.

  5. Techniques • Two basic families of techniques: 1. Limit things at runtime 2. Limit things at compile time • Many schemes use a combination of both • Runtime schemes typically rely on some OS and/or hardware support

  6. Background: “The Thin Red Line” • Separates untrusted user space(s) from trusted kernel space • Kernel manages hardware, shared resources, … • If you can bend the kernel to your will, you can do serious damage • Typical implementation: hardware VM support • Each user process has its own page tables (managed by the kernel) • Certain addresses mapped to kernel pages [Figure: user pages and kernel pages; programming model showing user code and kernel code]

  7. Call Gates • Call gates (or call descriptors, or traps, or…) • Controlled breach in the thin red line • Typically involve an address space change, which relies on VM; so they are slow and expensive • Implementation often uses exception-handling capability of processor [Figure: user code and kernel code]

  8. Background: Virtual Machines • In practice, a VM provides a combination of a language execution environment and a “pseudo-OS” runtime system • “guest” VM may virtualize hardware resources differently from “host” OS • Safety is often not a primary goal of a VM • The “guest” and “host” OS’s may be the same or different with respect to… • Machine language/programmer-visible architecture • Virtualization of resources • Common flavor to various approaches: Control access to “unsafe” language/VM features

  9. VM Examples • Java: artificial-machine-in-a-real-machine • Provides a language, a runtime, and OS-like abstractions (network, filesystems, etc.) • Centralized Java Security Manager enforces security policies • For the most part, runs in user mode • VMware: virtualize any x86 OS inside any other (well, almost) • Every VM “sees” x86 protected-mode environment • Within a VM, policies enforced by guest OS • Across VM’s, virtualized hardware is isolated • User must grant a certain level of trust to VMware host program

  10. What Can You Do With This? • Limit what the language can express • “Unsafe” operations are defined out of existence • “Never put off till runtime what you can do at compile time” • Limit what can be done at runtime • Perhaps in combination with language limiting • Each approach has pros and cons

  11. Static Analysis, Type-Safe Languages • Goal: To limit the damage a program can do, limit what can be expressed in the source language • Assumes binaries are tamper-evident • Assumes only trusted tools used to build binaries • Assumes trusted tools are working correctly! • Language features/limitations may allow you to prove some invariants • Example: Backward branching disallowed ⇒ finite-length programs finish in finite time • Example: Pointers disallowed ⇒ dangling pointer dereferences vanish • Contrast: SFI or inserting guard code

  12. Example: Spin and Modula-3 • SPIN (Bershad et al., early 90’s): a user-extensible microkernel • Extension language: Modula-3, a type-safe, object-oriented language • Why type safety? • Why object oriented? • The extension checker and compiler

  13. Limiting the Language • Goal: To limit the damage a program can do, limit what can be expressed in the source language • Assumes binaries are tamper-evident • Assumes only trusted tools used to build binaries • Assumes trusted tools are working correctly! • Language features/limitations may allow you to prove some invariants • Example: Backward branching disallowed ⇒ finite-length programs finish in finite time • Example: Pointers disallowed ⇒ dangling pointer dereferences vanish • “Never put off till runtime what you can do at compile time”
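
The “backward branching disallowed” example can be sketched as a load-time check. This is a minimal illustration, not code from the lecture: the three-opcode instruction set and the names `verify_forward_only` and `run` are invented for the sketch. Because every accepted jump moves the program counter strictly forward, a program of n instructions executes at most n steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy instruction set, invented for this sketch. */
enum opcode { OP_NOP, OP_JUMP, OP_HALT };

struct insn {
    enum opcode op;
    size_t target;  /* absolute index of the jump destination (OP_JUMP only) */
};

/* Accept a program only if every jump goes strictly forward.
 * A target equal to n means "jump past the end", which terminates. */
bool verify_forward_only(const struct insn *prog, size_t n)
{
    for (size_t pc = 0; pc < n; pc++) {
        if (prog[pc].op == OP_JUMP &&
            (prog[pc].target <= pc || prog[pc].target > n))
            return false;  /* backward jump, self-jump, or out of range */
    }
    return true;
}

/* Run a verified program and return the number of steps executed.
 * pc strictly increases on every step, so the result is at most n. */
size_t run(const struct insn *prog, size_t n)
{
    assert(verify_forward_only(prog, n));
    size_t pc = 0, steps = 0;
    while (pc < n) {
        steps++;
        if (prog[pc].op == OP_HALT)
            break;
        pc = (prog[pc].op == OP_JUMP) ? prog[pc].target : pc + 1;
    }
    return steps;
}
```

The check is done once, before execution, so the interpreter loop itself needs no step counter or timeout to guarantee termination.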

  14. Pros & Cons of Static Analysis - Requires that code be written in that specific language • Sometimes it’s actually desirable to have a simpler language! (e.g. Exokernel generalized packet filter) • Other times languages may be too limited or awkward • May also rely on integrity of tool chain - Languages with rich type systems and class hierarchies confound this approach • Checking virtual function calls • Casting between “safe” types (e.g. int to enum)

  15. Static analysis, cont’d. - Relies on integrity of interpreter or binaries • What if the Java guys forgot some of the security checks? • VM interpreter may need semi-privileged access to get at the “real” resources controlled by the host OS • Or at least, OS must verify signed code segments (ActiveX does this) + May allow strong formal proofs of program safety • Usually done by showing that a particular high-level construct can never produce “unsafe” low-level code • Can prove from the source code, if transformations are “correctness-preserving” (or “semantics preserving”)

  16. At Runtime: Classic SFI and Janus • SFI: “If program stays in its sandbox, it can’t damage other programs.” • Dangerous operations/references surrounded by interpolated “guard code” • Dangerous references can also be “pinned” to sandbox by overwriting upper address bits • Note, this breaks program correctness! But focus of SFI is preventing harm to others, not to oneself • Janus: “If program can’t make system calls, it can’t damage the OS [and therefore other programs].” • Some programs break because they don’t check system call results
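
Pinning a reference to the sandbox by overwriting its upper address bits might look like the sketch below. The 64 KiB segment size, the base `0x12340000`, and the names `sfi_pin`, `sfi_store` and `sfi_load` are assumptions made for illustration; a byte array stands in for the segment so the example runs as an ordinary user program.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for this sketch: one 64 KiB segment, base aligned to its size. */
#define SANDBOX_BITS 16u
#define OFFSET_MASK  ((UINT32_C(1) << SANDBOX_BITS) - 1u)  /* 0x0000FFFF */
#define SANDBOX_BASE UINT32_C(0x12340000)

/* Stand-in for the memory of the sandboxed segment. */
uint8_t sandbox[1u << SANDBOX_BITS];

/* Keep the low offset bits and overwrite the upper bits with the segment tag.
 * No comparison or branch is needed: every result lies inside the segment. */
uint32_t sfi_pin(uint32_t addr)
{
    return (addr & OFFSET_MASK) | SANDBOX_BASE;
}

/* What a rewritten store would do: pin first, then write. */
void sfi_store(uint32_t addr, uint8_t value)
{
    uint32_t pinned = sfi_pin(addr);
    sandbox[pinned & OFFSET_MASK] = value;
}

/* What a rewritten load would do: pin first, then read. */
uint8_t sfi_load(uint32_t addr)
{
    uint32_t pinned = sfi_pin(addr);
    return sandbox[pinned & OFFSET_MASK];
}
```

A stray store to `0xDEADBEEF` silently lands at `0x1234BEEF` inside the sandbox. The program now computes the wrong thing, which is the loss of correctness the slide notes, but no other program's memory is touched.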

  17. Pros & cons of runtime approaches + Use high-confidence machine-level mechanisms • Based on hardware-level mechanisms, e.g. VM, traps • In practice, hardware implementation errors for these are extremely rare (why?) + Can be used with arbitrary “legacy” code - No onus on programmer to make potential error conditions explicit (e.g. assertions) • So runtime has no idea what to do to “recover” - Doesn’t guarantee correct behavior--only safety to others

  18. Dynamic Dataflow Analysis • Potentially unsafe operations must always be denied, to be conservative • If done statically, renders code impotent • Idea: quarantine the data that may be “contaminated” by user (taintperl works this way)

          print STDERR "Enter file name:";
          $x = <STDIN>;               # $x is tainted (user input)
          # …more code…
          $z = "/tmp/safe_file.txt";  # $z is clean
          $y = "$sysdir/$x";          # $y is tainted
          system("cat $y");           # disallowed!
          system("cat $z");           # OK
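
The same quarantine idea can be sketched in C by carrying a taint bit alongside each string. This is an illustration of the technique only, not how taintperl is implemented; `struct tstr`, `ts_clean`, `ts_from_user`, `ts_concat` and `guarded_system` are hypothetical names, and `guarded_system` only decides whether a command would be allowed rather than actually running it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A string plus a flag saying whether user input flowed into it. */
struct tstr {
    char text[256];
    bool tainted;
};

/* Append src to dst, truncating if needed; dst stays NUL-terminated. */
static void append_bounded(char *dst, size_t cap, const char *src)
{
    size_t used = strlen(dst);
    size_t room = cap - used - 1;
    size_t n = strlen(src);
    if (n > room)
        n = room;
    memcpy(dst + used, src, n);
    dst[used + n] = '\0';
}

/* Program constants are clean. */
struct tstr ts_clean(const char *s)
{
    struct tstr t = { "", false };
    append_bounded(t.text, sizeof t.text, s);
    return t;
}

/* Anything read from the user starts out tainted. */
struct tstr ts_from_user(const char *s)
{
    struct tstr t = ts_clean(s);
    t.tainted = true;
    return t;
}

/* Taint propagates: the result is tainted if either operand is. */
struct tstr ts_concat(struct tstr a, struct tstr b)
{
    struct tstr out = a;
    append_bounded(out.text, sizeof out.text, b.text);
    out.tainted = a.tainted || b.tainted;
    return out;
}

/* Stand-in for system(): returns -1 for a tainted command, 0 otherwise.
 * A real guard would call system(cmd->text) in the allowed case. */
int guarded_system(const struct tstr *cmd)
{
    return cmd->tainted ? -1 : 0;
}
```

Only operations that reach a sensitive sink with tainted data are refused, so the program can still do useful work with clean values such as `$z` in the slide.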

  19. Interface Compilation • Problem: an interface is a syntactic abstraction that usually carries no semantics • Semantics might be useful for… • Special-case optimizations (e.g. file I/O, specialization by call site) • Safety of called proc, or error handling in case of failure • Is the interface too narrow? • Semantic type info may be lost (Unix) • Semantic properties such as “liveness” are not preserved across the interface (hidden state) - example to follow

  20. Exploiting Semantics • Example 1: File I/O

          fd = open(filename, O_RDONLY);
          /* …do some file operations… */
          close(fd);
          /* …more code… */
          read(fd, buf, 4096);   /* certain to fail! */

      • Example 2: type impoverishment

          ssize_t read(int fd, void *buf, size_t n);

      • What if buf is unaligned or not big enough? • No way to tell from call syntax
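
A checker that knows the liveness semantics of `open`/`close`/`read` could flag Example 1 before the program ever runs. The sketch below is an assumption-laden illustration: it works on a hand-built, straight-line list of calls rather than on real compiler intermediate form, and `struct call` and `first_dead_use` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three interface calls the checker understands. */
enum call_kind { CALL_OPEN, CALL_READ, CALL_CLOSE };

/* One call site; fd_var identifies the variable holding the descriptor. */
struct call {
    enum call_kind kind;
    int fd_var;
};

#define MAX_VARS 8

/* Walk a straight-line call sequence, tracking which descriptor variables
 * are live. Returns the index of the first read or close on a dead
 * descriptor, or -1 if there is none. */
int first_dead_use(const struct call *calls, size_t n)
{
    bool live[MAX_VARS] = { false };
    for (size_t i = 0; i < n; i++) {
        int v = calls[i].fd_var;
        assert(v >= 0 && v < MAX_VARS);
        switch (calls[i].kind) {
        case CALL_OPEN:
            live[v] = true;
            break;
        case CALL_READ:
            if (!live[v])
                return (int)i;
            break;
        case CALL_CLOSE:
            if (!live[v])
                return (int)i;
            live[v] = false;
            break;
        }
    }
    return -1;
}
```

For the open, close, read sequence in Example 1 the function reports the `read` at index 2. Handling branches and loops would need a proper dataflow analysis, which this sketch leaves out.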

  21. Interface Compilation With MAGIK • Provides abstractions for dealing with interfaces • Iterators over the function calls • Accessors for the data structures manipulated by each call: what type? Compile-time constant? Access to internal fields of structure? Etc. • Allows programmer to write C-like code “extensions” using these functions and accessors • Original source and extensions are compiled together into common intermediate form • Intermediate form can be optimized using traditional methods before machine targeting

  22. IC as an Orthogonal Mechanism • Can retrofit existing “legacy” code (provided source is available) • Admits of incremental improvements • Safety concerns/development can be kept separate from mainline logic for maintainability • Some cool implemented examples • Type-aware I/O for C • Safe signal handling (prevents calling non-reentrant library functions inside a signal handler) • Common thread: uses semantic information that cannot be extracted from source alone • Compare with “emergent properties” in req. spec.
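
The safe-signal-handling check can be approximated by comparing the functions a handler calls against an allowlist. This is a sketch under assumptions, not the MAGIK extension itself: the callee names are supplied by hand instead of being iterated from compiler intermediate form, the allowlist is a small subset of the functions POSIX designates async-signal-safe, and `is_signal_safe` and `first_unsafe_callee` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Small illustrative subset of POSIX async-signal-safe functions. */
static const char *const SIGNAL_SAFE[] = {
    "_exit", "write", "read", "close", "kill"
};

/* True if the named function is on the allowlist. */
bool is_signal_safe(const char *name)
{
    size_t count = sizeof SIGNAL_SAFE / sizeof SIGNAL_SAFE[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(name, SIGNAL_SAFE[i]) == 0)
            return true;
    }
    return false;
}

/* Given the names of the functions a signal handler calls, return the first
 * one that is not on the allowlist, or NULL if the handler looks clean. */
const char *first_unsafe_callee(const char *const *callees, size_t n)
{
    assert(callees != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!is_signal_safe(callees[i]))
            return callees[i];
    }
    return NULL;
}
```

A handler that calls `write` and then `printf` would be rejected at `printf`. The property being checked is not visible in the C types of either function, which is the point of the “semantic information” bullet above.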

  23. Lessons? Anyone? • Limits of virtual machines and static analysis • Assumes tools are trustworthy, from a security standpoint • But…buggy == untrustworthy • End-to-end argument suggests falling back on runtime SFI?
