Monoculture and Diversity
Nora Sovarel and Joel Winstead
21 September 2004

What is monoculture?
“the cultivation or growth of a single crop or organism especially on agricultural or forest land”
Merriam-Webster Online

Monoculture in Biology
The Irish Potato Famine, 1845-1850
• About half of Ireland’s population depended on the potato crop
• The fungus Phytophthora infestans appeared in Ireland in 1845
• Every potato farm in Ireland was vulnerable
• Consequences for Ireland:
  • 1 million people died
  • 1-2 million emigrated

What about computing?
Most statistics agree that Microsoft has at least 90% of the OS market. For example, thecounter.com reports:
• Win XP 56%
• Win 98 20%
• Win 2000 15%
• Win NT 1%
• Win 95 + Win 3x less than 1%
• http://www.thecounter.com/stats/2004/August/os.php

Monocultures in Computing
• Operating Systems – 90% Microsoft
• Browsers – IE, Opera, Netscape
• Web Servers – Apache, IIS
• Routers – 85% Cisco
• Processors – x86, Sparc

Why are we in this situation?
• Users – single interface
• System Administrators – uniform software configurations
• Software Companies
  • Lower distribution and maintenance costs
  • Compatibility and file formats

What are the consequences?
• Same vulnerabilities for everyone
• One worm/virus for majority of systems
• Virus writers also like economy of scale:
  • “write once, exploit everywhere”

What can we do?
• opposite of monoculture
• diversity
• more than one

Diversity as a defense
If we’re not all running exactly the same code:
• A single attack cannot compromise everybody
  • epidemic attacks cease to scale
• An attacker won’t know what specific attack to use against a particular target
  • targeted attacks become more expensive

How many?
Are 10 variants of each piece of software and hardware enough?
• normal operations are disrupted when only a small fraction of computers is attacked
  • Witty worm
• applications show the same vulnerabilities across operating systems

We need more....
• We need every system to look different to the attacker
• We need all systems to look exactly the same to the users and administrators
• We need to be able to deploy and patch systems quickly and economically

Can we have the benefits without the disadvantages?
• Same user interface
• Different vulnerabilities
• Can the right kind of diversity be generated automatically, without side-effects?

Roadmap
• Threat Model
• Classes of attacks
• Diversity defences:
  • Address space randomization
  • Pointer randomization
  • Instruction set randomization
  • Keyed hash functions
• Effectiveness of these defences

Threat Model
• Threat: automated, destructive worms
  • Require quick, automated, remote infection
  • “Write-once, exploit everywhere”
• Assume attacker knows code, but not key material
• We are not:
  • defending against local attackers
  • defending against expensive brute-force attacks
  • defending against targeted attacks
• Goal: make cost of automated infection high
  • Crashing program is better than spreading worm

Classes of Attacks
• Code injection attacks
• Existing code attacks
• Algorithmic complexity attacks

Code Injection Attacks
• Stack Smashing Attack
• SQL Code Injection
• Perl Code Injection
• Double Pointer Attacks

Stack Smashing

    #include <stdio.h>
    #include <unistd.h>

    void foo(int a, int b, int c);

    int main(int argc, char *argv[])
    {
        ...
        foo(a, b, c);
        ...
        if (everything_is_kosher) {
            execl("/bin/sh", "sh", (char *)NULL);   /* spawn a shell */
        }
    }

    void foo(int a, int b, int c)
    {
        char buf[100];
        ...
        gets(buf);   /* no bounds check: longer input overwrites the stack beyond buf */
        ...
    }

(Figure: stack contents before the overflow: return addr, argc, argv, a, b, c, return addr, buf[ ])
(Figure: the same stack with a malicious payload written through buf[ ])

Stack Smashing
• Payload overwrites return address
• New address can point to injected code or existing code
• The payload can also overwrite local variables
• Pointers to code can also occur in other places
  • virtual functions, callbacks
• Runtime type information on the heap can also be overwritten
See “Smashing the Stack for Fun and Profit” in Phrack #49 for more
(Figure: the stack after the attack, with the return address overwritten and malicious code in place)

Existing Code Attacks
• Format String Attack
• Data Modification Attack
• Integer Overflow
• Return-to-libc attacks

Why do these attacks work?
• The way code, stack, and data are laid out in memory is fairly predictable
(Figure: one address space with Shared Libraries, Stack, Heap, and Code segments)

Defence Through Diversity
• Solution: randomise layout of address space
(Figure: two address-space layouts side by side, each with Shared Libraries, Stack, Heap, and Code segments)

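One way to see whether a given system randomises its layout is to record one address from each region and compare the values across two runs of the same program. The sketch below does only that; the names `layout_sample`, `sample_layout`, and `print_layout` are hypothetical, and whether the printed values change between runs depends on the platform and its settings.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record holding one sample address per memory region. */
struct layout_sample {
    uintptr_t stack_addr; /* address of a local variable  */
    uintptr_t heap_addr;  /* address returned by malloc   */
    uintptr_t data_addr;  /* address of a static variable */
};

static int static_marker;

/* Fill in one address from each region; returns 0 on success, -1 on failure. */
int sample_layout(struct layout_sample *out)
{
    int local_marker = 0;
    void *heap_block = malloc(16);

    assert(out != NULL);
    if (heap_block == NULL)
        return -1;

    out->stack_addr = (uintptr_t)&local_marker;
    out->heap_addr  = (uintptr_t)heap_block;
    out->data_addr  = (uintptr_t)&static_marker;
    free(heap_block);
    return 0;
}

/* Print the sample so that two runs of the program can be compared by eye. */
void print_layout(const struct layout_sample *s)
{
    printf("stack %#jx\nheap  %#jx\ndata  %#jx\n",
           (uintmax_t)s->stack_addr, (uintmax_t)s->heap_addr,
           (uintmax_t)s->data_addr);
}
```

Calling `sample_layout` and then `print_layout` from `main` and running the binary twice gives a quick, informal check: identical output on every run suggests a predictable layout.
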
What does this buy us?
• This can be done at link time
  • low overhead
• Attacker must know or guess what address to jump to
• The starting addresses of code, stack, heap, and library segments add some entropy
  • On a 32-bit system, about 16 bits for each segment
• Is this enough?

Attacking Address Space Randomization
• Attacker needs address of only one function to make successful attack
• Information leaks can reveal this
  • format string vulnerability
• 16 bits can be brute-forced
  • Shacham et al. show how to do this in 216 seconds over a network

Can we use a larger key?
• Can’t get more than 20 bits without changing virtual memory system
• We can add padding to stack and code
• We can rearrange functions and data structures in memory
  • but this is tricky for shared libraries
• But an attacker needs only one address to succeed
• 64-bit address spaces may help

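The 16-bit and 20-bit figures above can be turned into rough attacker work factors. The sketch below assumes a uniformly random offset that stays fixed between probes, so an attacker who tries each candidate once needs (N + 1) / 2 probes on average for N candidates. The function names are hypothetical, and the model says nothing about how long one probe takes.

```c
#include <assert.h>
#include <stdint.h>

/* Number of equally likely candidate addresses for `bits` bits of entropy. */
uint64_t candidate_count(unsigned bits)
{
    assert(bits < 64);
    return (uint64_t)1 << bits;
}

/* Expected number of probes when every candidate is tried at most once
 * and the layout does not change between probes: (N + 1) / 2. */
double expected_probes(unsigned bits)
{
    return ((double)candidate_count(bits) + 1.0) / 2.0;
}
```

Under these assumptions, 16 bits gives 65,536 candidates and about 32,768 expected probes, and 20 bits gives about 524,288. Going from 16 to 20 bits therefore multiplies the attacker's work by only about 16.
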
Address Space Obfuscation and Randomization
• start address
• reorder
• gaps
• encryption

Defenses - Stack
• Canary Value
• Write/Executable Pages
• Padding
• Local Variables Reordering
• Parameter Reordering

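The canary idea from the list above can be simulated without touching a real stack. In this sketch a struct stands in for a stack frame, with a canary word placed after the buffer. The struct, the function names, and the canary value in the usage note are all hypothetical. A real compiler-inserted canary sits between the locals and the return address and is checked in the function epilogue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a stack frame: a buffer followed by a canary word. */
struct fake_frame {
    char     buf[16];
    uint32_t canary;
};

/* Copy `len` bytes to the start of the frame, as an unchecked copy would.
 * `len` is capped at the frame size so the simulation itself stays in bounds. */
void unchecked_copy(struct fake_frame *f, const char *src, size_t len)
{
    assert(len <= sizeof *f);
    memcpy(f, src, len);
}

/* Returns 1 if the canary still holds the expected value, 0 if it was overwritten. */
int canary_intact(const struct fake_frame *f, uint32_t expected)
{
    return f->canary == expected;
}
```

A caller would store a secret value in `canary`, run the copy, and abort instead of returning if `canary_intact` reports 0. A copy that fits inside `buf` leaves the canary alone, while one that covers the whole frame replaces it.
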
Defenses – Memory Layout Randomization
• Base Address Randomization
  • stack
  • heap
  • text
  • DLL

Defenses – Memory Layout Randomization
• Reordering of static variables
• Reordering of routines
• Gaps in heap
• Gaps between routines

Pointer Encryption
• Rearranging address spaces doesn’t give us a very large key
• Can we have diversity not just in how memory is laid out, but in what pointers mean?
• What if we encrypted all pointers in the program?
• We could use a larger key
• Attacker must guess key in order to overwrite a return address with something meaningful

PointGuard
• Developed by Cowan et al. at Immunix
• All pointers stored in memory are encrypted
• Pointers are decrypted immediately before dereference
• Pointers are encrypted before storing in memory
• An attacker must guess key in order to generate valid pointer to attack code

PointGuard code transformation
• Unlike address space transformations, requires compiler changes
• Cleartext pointers appear only in registers
• Registers are not vulnerable to modification
• Encryption must be fast and efficient
• We don’t want to encrypt non-pointer data, because that would mean encrypting the buffer containing the attacker’s pointer
• Accessing libraries is tricky

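The encrypt-on-store, decrypt-before-dereference pattern can be sketched by hand, assuming XOR with a per-process key as the cipher. The slides describe a compiler that inserts these operations automatically. Here they are explicit function calls, and the `pg_` names and the default key value are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-process key. A real system would draw it from a
 * random source at program start rather than use a fixed constant. */
static uintptr_t pg_key = (uintptr_t)0x5bd1e995u;

/* Install a new key; a zero key would leave pointers unencrypted. */
void pg_set_key(uintptr_t key)
{
    assert(key != 0);
    pg_key = key;
}

/* Encrypt a pointer before it is stored in memory. */
uintptr_t pg_encrypt(void *p)
{
    return (uintptr_t)p ^ pg_key;
}

/* Decrypt a stored value immediately before it is dereferenced. */
void *pg_decrypt(uintptr_t stored)
{
    return (void *)(stored ^ pg_key);
}
```

If an attacker overwrites a stored pointer with a raw address, `pg_decrypt` turns that address into a different, key-dependent value, so the program is more likely to crash than to jump where the attacker intended.
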
Effectiveness of PointGuard
• Overhead is low
  • but requires recompilation
  • interaction with non-PG-aware code is tricky
• Defends against most code injection and return-to-existing-code attacks
• Does not defend against all data modification attacks
• Information leaks may reveal ciphertext, allowing attacker to guess key

What if code gets in anyway?
• The previous techniques work by preventing an attacker from jumping to malicious code in the system
• What if we didn’t think of every way that could happen?
• Defense-in-depth:
  • make sure injected code won’t run no matter how control is transferred

What must an attacker know?
• An attacker must know how to write code to run on the targeted system
  • SPARC exploit code will not run on x86
• What if no two computers had the same instruction set?
  • It would be difficult or impossible to write exploit code that will run everywhere

Instruction Set Randomization
Kc, Keromytis, and Prevelakis:
• Encrypt the program’s instructions with a different key for each copy of the program
• Decrypt each instruction at runtime immediately before execution
• Attacker must know key in order to write code that will decrypt to something meaningful
• An unsuccessful attack will cause an illegal instruction, an illegal address, or some other exception

How many bits do we need?
• Strong symmetric cryptography typically requires a 128-bit key or larger to resist known-plaintext attacks
  • Large performance penalty to decrypt
• If we assume attacker doesn’t have our ciphertext, we can use much smaller key
• 32-bit XOR may be good enough if our goal is to prevent large-scale automated worms

Encoding schemes
• XOR:
  • each word in legitimate code is XORed with the same key
• Bit permutation:
  • The bits in each word are rearranged according to a key
  • log2(32!) ≈ 118 bits of key space for a 32-bit word (storing the permutation as 32 five-bit indices takes 160 bits)
  • Can move bits from one instruction to another
• In practice, effective key size is smaller:
  • more than one way to encode an instruction
  • more than one harmful instruction

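Both encoding schemes are small enough to sketch directly on 32-bit words. The `isr_` function names and the representation of the permutation key as an array of 32 destination positions are assumptions made for illustration, not the authors' implementation.

```c
#include <assert.h>
#include <stdint.h>

/* XOR scheme: the same operation encodes and decodes a word. */
uint32_t isr_xor(uint32_t word, uint32_t key)
{
    return word ^ key;
}

/* Bit permutation: perm[i] is the destination position of source bit i.
 * perm must contain each value 0..31 exactly once. */
uint32_t isr_permute(uint32_t word, const uint8_t perm[32])
{
    uint32_t out = 0;
    for (unsigned i = 0; i < 32; i++) {
        assert(perm[i] < 32);
        out |= ((word >> i) & 1u) << perm[i];
    }
    return out;
}

/* Inverse permutation: move the bit at position perm[i] back to position i. */
uint32_t isr_unpermute(uint32_t word, const uint8_t perm[32])
{
    uint32_t out = 0;
    for (unsigned i = 0; i < 32; i++) {
        assert(perm[i] < 32);
        out |= ((word >> perm[i]) & 1u) << i;
    }
    return out;
}
```

Legitimate code is encoded once with the key and decoded just before execution, so it round-trips unchanged. Injected code was never encoded, so the decoder turns it into a key-dependent word that the attacker cannot predict without the key.
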
Variable-sized instructions
• x86 instructions vary in size
• Some instructions are 1 byte
  • 8-bit key insufficient
• Padding with NOPs has cost
  • generally requires source code
• Solution 1:
  • Pad branch targets only
• Solution 2:
  • Encrypt words, not instructions

x86 Implementation
• Authors modified Bochs x86 emulator to decrypt code at runtime
• Encrypted image consisting of kernel and statically-linked binaries
• Cost of emulation is high for CPU-bound processes
• Not so bad for I/O-bound processes
• Reprogrammable processors could reduce overhead (Transmeta Crusoe)

Interpreted Languages
• Some code injection attacks use VBScript, SQL, Perl, or shell languages
• Append key material to keywords:
  • e.g. foreach becomes foreach12345
• Overhead is negligible
  • The languages are interpreted anyway
• Error messages may reveal key

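The keyword-tagging idea amounts to a check in the interpreter's lexer: a token counts as a keyword only if it carries the key as a suffix. The sketch below shows that check in isolation. The function name `is_keyed_keyword` is hypothetical, and the key `12345` is the one from the slide's example.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `token` is exactly `keyword` followed by `key`, otherwise 0.
 * An untagged keyword, or one tagged with the wrong key, is rejected. */
int is_keyed_keyword(const char *token, const char *keyword, const char *key)
{
    size_t klen;

    assert(token != NULL && keyword != NULL && key != NULL);
    klen = strlen(keyword);
    if (strncmp(token, keyword, klen) != 0)
        return 0;                            /* token does not start with the keyword */
    return strcmp(token + klen, key) == 0;   /* the rest must be exactly the key */
}
```

Script text injected by an attacker contains plain `foreach`, which fails this check, so the interpreter would report a syntax error instead of running it. As the slide warns, that error message must not echo the expected keyword.
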
Libraries
• Libraries present a problem:
  • Use different keys for applications and libraries
  • Use single key for all system libraries
  • Change the key from time to time
• Or:
  • Statically link everything so that library code uses same key as application

Other issues
• Self-modifying code won’t run (Yes, gcc sometimes generates this)
• Significant performance penalty
• Attacker with ciphertext could brute-force the key offline
• No defense against local attackers
  • May be okay for defense against worms
• Does not resist existing code attacks
• Does not resist data corruption attacks

Algorithmic Complexity Attacks
• The Linux networking code uses hash tables to classify packets
• Hash tables, binary trees, and other data structures have good performance in average case
  • But poor performance in worst case
• An attacker who knew the hash function could deliberately generate collisions
• This can force worst-case behavior
• This can cause denial of service

Diversity as a Defense
• Attacker can find collisions only if he knows hash function
• What if every copy used a different hash function?
• Solution: keyed hash functions
  • Every copy uses same code
  • Every copy uses a different key
• Attacker cannot force collisions without key

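The "same code, different key" structure can be sketched with a simple hash whose starting state depends on a per-installation key. The example below seeds an FNV-1a style loop with the key purely for illustration. That choice and the function names are assumptions, and a seeded FNV is not a vetted defence against deliberate collisions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a style hash whose initial state is mixed with a secret key.
 * Every copy runs this same code; only `key` differs between copies. */
uint32_t keyed_hash(const void *data, size_t len, uint32_t key)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u ^ key;   /* FNV offset basis mixed with the key */

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;               /* FNV prime */
    }
    return h;
}

/* Map an input to one of `nbuckets` hash-table buckets. */
size_t bucket_for(const void *data, size_t len, uint32_t key, size_t nbuckets)
{
    assert(nbuckets > 0);
    return keyed_hash(data, len, key) % nbuckets;
}
```

With the key chosen at random when the table is created, an attacker who knows the code still cannot tell in advance which inputs land in the same bucket on a particular machine.
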
Effectiveness
• The techniques presented are orthogonal
• Other attacks:
  • integer overflow
  • data modification
• Other threat models:
  • local attacker
  • determined remote attacker
  • denial of service

Other approaches
• StackGuard, StackShield, MemGuard, etc.
• bounds checking, canaries, non-executable stack and heap
• Safe library routines, wrappers
• Sandboxes and safe languages (Java)
• Static analysis to detect (or prove the absence of) buffer overflows

Will this prevent catastrophic failures?
3. Things will be much like they are now: persistent threats, common annoyances, but people will still trust the Internet for semi-critical tasks.
4. Technologies have emerged (and been successfully deployed) that make epidemic attacks a thing of the past. The Internet will be trusted for the most critical tasks.
Do these techniques give us hope for (4)?