Binary Concolic Execution for Automatic Exploit Generation Todd Frederick
Vulnerabilities are everywhere… Binary Concolic Execution
An exploit
[Diagram labels: 1987; Robert Morris (rtm); Finger Server; shell#; payload bytes DD8F2F736800DD8F2F62696ED05E5ADD00DD00DD5ADD03D05E5CBC3B]
The problem: exploiting vulnerable code
• Find an exploit state in a program
  • Use a known existing vulnerability
  • Previous work automatically finds vulnerable states [Giffin, Jha, Miller 2006]
• Find input that drives the program down a path to the exploit state
  • Analyze program control flow
  • Walk through the program, finding inputs to reach the current point
  • Explore paths in the program to reach the vulnerability
The problem
[Diagram labels: normal input; exploit; Program]
Assume we know of a vulnerability

Running example
[Diagram labels: Program; prompt "login:"; inputs "good" and "bad"; outputs "password:" and "Using backdoor!"]
Working with binary code
[Diagram labels: exploit; Program]
    8048282: lea 0x4(%esp),%ecx
    8048286: and $0xfffffff0,%esp
    8048289: pushl 0xfffffffc(%ecx)
    804828c: push %ebp
    804828d: mov %esp,%ebp
    804828f: push %ebx
    8048290: push %ecx
    8048291: sub $0x10,%esp
    8048294: call 8048210 <prompt>
    8048299: mov $0x3,%eax
    804829e: mov $0x0,%ebx
    80482a3: mov $0x80bd884,%ecx
    80482a8: mov $0x10,%edx
    80482ad: int $0x80
    80482af: mov %eax,0xfffffff0(%ebp)
    80482b2: movzbl 0x80bd886,%eax
    80482b9: movsbl %al,%edx
    80482bc: movzbl 0x80bd884,%eax
    80482c3: movsbl %al,%eax
    80482c6: mov %edx,%ecx
    80482c8: sub %eax,%ecx
    80482ca: mov %ecx,%eax
    80482cc: cmp $0x2,%eax
    80482cf: jne 8048302 <main+0x80>
    80482d1: movzbl 0x80bd886,%eax
    80482d8: movsbl %al,%edx
    80482db: movzbl 0x80bd885,%eax
    80482e2: movsbl %al,%eax
    80482e5: mov %edx,%ecx
    80482e7: sub %eax,%ecx
    80482e9: mov %ecx,%eax
    80482eb: cmp $0x3,%eax
    80482ee: jne 8048302 <main+0x80>
    80482f0: movzbl 0x80bd886,%eax
    80482f7: cmp $0x64,%al
    80482f9: jne 8048302 <main+0x80>
    80482fb: call 804825c <backdoor>
    8048300: jmp 8048307 <main+0x85>
    8048302: call 8048236 <login>
    8048307: mov $0x1,%eax
    804830c: mov $0x0,%ebx
    8048311: int $0x80
    8048313: mov %eax,0xfffffff4(%ebp)
    8048316: mov $0x0,%eax
    804831b: add $0x10,%esp
    804831e: pop %ecx
    804831f: pop %ebx
    8048320: pop %ebp
    8048321: lea 0xfffffffc(%ecx),%esp
    8048324: ret
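The three cmp/jne pairs in the listing can be read as source-level checks on the first three input bytes. The following is a minimal Python sketch of that logic, not the original program. It assumes the bytes at 0x80bd884, 0x80bd885 and 0x80bd886 are input[0], input[1] and input[2], and that unread buffer bytes are zero; the function name `dispatch` is hypothetical.

```python
def dispatch(data: bytes) -> str:
    """Mirror the three cmp/jne checks in the listing (illustrative only)."""
    # Assumption: bytes that were never read stay zero in the buffer.
    data = data.ljust(3, b"\0")
    b0, b1, b2 = data[0], data[1], data[2]
    if b2 - b0 != 2:       # cmp $0x2,%eax  ; jne -> login
        return "login"
    if b2 - b1 != 3:       # cmp $0x3,%eax  ; jne -> login
        return "login"
    if b2 != 0x64:         # cmp $0x64,%al  ; jne -> login (0x64 is 'd')
        return "login"
    return "backdoor"      # call <backdoor>, then jmp past <login>
```

Under these assumptions only an input starting with "bad" reaches the backdoor branch, since 'd' - 'b' is 2 and 'd' - 'a' is 3.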
Conceptual approach
[Diagram components: Program; Symbolic Executor; Path Conditions; Solver; Generated Input]
• Run program, tracking variables as expressions instead of actual (concrete) values
• Collect expressions along the current path
• Find concrete input to satisfy these expressions

Conceptual approach
[Diagram components: Program; Symbolic Executor; Path Conditions; Solver; Generated Input; Path Selector]
• Exponential number of paths
• Limit and prioritize the paths we will explore
Traditional symbolic execution
    read_input()
    if( input[2]-input[0] == 2 )
        if( input[2]-input[1] == 3 )
            if( input[2] == 'd' )
                backdoor()    // then exit, skipping login()
    login()                   // reached when any check fails
Traditional symbolic execution: states at each step of the same program
• After read_input(): symbolic memory buffer: input[0],input[1],input[2]
• At the first branch, two states with the same symbolic memory:
  • Path condition: input[2]-input[0] != 2
  • Path condition: input[2]-input[0] == 2
• At the second branch, two states:
  • Path condition: input[2]-input[0] == 2 && input[2]-input[1] == 3
  • Path condition: input[2]-input[0] == 2 && input[2]-input[1] != 3
• At the third branch, two states:
  • Path condition: input[2]-input[0] == 2 && input[2]-input[1] == 3 && input[2] == 'd'
  • Path condition: input[2]-input[0] == 2 && input[2]-input[1] == 3 && input[2] != 'd'
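The state splitting on these slides can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: predicates are kept as plain strings rather than real expressions, and the names `BRANCHES`, `negate` and `explore` are hypothetical.

```python
# Branch predicates of the running example, kept as text for readability.
BRANCHES = [
    "input[2]-input[0] == 2",
    "input[2]-input[1] == 3",
    "input[2] == 'd'",
]

def negate(term: str) -> str:
    """Negate a predicate of the form 'lhs == rhs'."""
    return term.replace("==", "!=")

def explore(depth=0, condition=()):
    """Split the state at every branch; return (path condition, target) pairs."""
    if depth == len(BRANCHES):
        return [(condition, "backdoor")]
    term = BRANCHES[depth]
    # A failed check goes straight to login(), so that path ends here.
    not_taken = [(condition + (negate(term),), "login")]
    # A passed check continues to the next branch.
    taken = explore(depth + 1, condition + (term,))
    return not_taken + taken

for condition, target in explore():
    print(" && ".join(condition), "->", target)
```

The running example has only four paths, but every state is kept alive at once; in programs where both sides of each branch continue, the number of states doubles per branch.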
Problems with symbolic execution
• Must maintain exponentially many symbolic states
• Expressions may be difficult or infeasible to solve
Solution: Run program concretely and symbolically
[Diagram labels: Concrete Execution; Concolic Execution; Symbolic Execution]
Concolic execution overview
[Diagram components: Program; Input; Concrete Executor; Instructions; Symbolic Executor; Path Conditions; Solver; Generated Input; Path Selector]
• Symbolic execution follows concrete path
• Some expressions use concrete values
Concolic execution
• Advantages
  • Track less state in parallel by following a single path at a time
  • Simplify expressions by substituting concrete values for difficult subexpressions
• Disadvantage
  • Concrete values only hold for a specific set of concrete inputs, so mixing concrete values and expressions may produce inaccurate expressions
Concolic execution example: input "good" (running example program)
• Before read_input(): symbolic memory buffer and concrete memory buffer are empty
• After read_input(): symbolic memory buffer: input[0],input[1],input[2]; concrete memory buffer: g,o,o,d
• if( input[2]-input[0] == 2 ) is false for this input; path condition: input[2]-input[0] != 2
• Target path condition: input[2]-input[0] == 2; generated input: egg
Concolic execution example: input "egg"
• Before read_input(): symbolic memory buffer and concrete memory buffer are empty
• After read_input(): symbolic memory buffer: input[0],input[1],input[2]; concrete memory buffer: e,g,g
• First check is true; path condition: input[2]-input[0] == 2
• Second check is false; path condition: input[2]-input[0] == 2 && input[2]-input[1] != 3
• Target path condition: input[2]-input[0] == 2 && input[2]-input[1] == 3; generated input: port
Concolic execution example: input "port"
• Before read_input(): symbolic memory buffer and concrete memory buffer are empty
• After read_input(): symbolic memory buffer: input[0],input[1],input[2]; concrete memory buffer: p,o,r,t
• First check is true; path condition: input[2]-input[0] == 2
• Second check is true; path condition: input[2]-input[0] == 2 && input[2]-input[1] == 3
• Third check is false; path condition: input[2]-input[0] == 2 && input[2]-input[1] == 3 && input[2] != 'd'
• Target path condition: input[2]-input[0] == 2 && input[2]-input[1] == 3 && input[2] == 'd'; generated input: bad
Concolic execution example: input "bad"
• Before read_input(): symbolic memory buffer and concrete memory buffer are empty
• After read_input(): symbolic memory buffer: input[0],input[1],input[2]; concrete memory buffer: b,a,d
• First check is true; path condition: input[2]-input[0] == 2
• Second check is true; path condition: input[2]-input[0] == 2 && input[2]-input[1] == 3
• Third check is true; path condition: input[2]-input[0] == 2 && input[2]-input[1] == 3 && input[2] == 'd'
• backdoor() is reached: Success
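The walk-through can be condensed into a short loop: run concretely, record the branch outcomes, flip the last one, and ask a solver for a matching input. The Python sketch below is an illustration only. It assumes a brute-force search over lowercase letters in place of a real solver, and the names `CHECKS`, `run`, `solve` and `concolic` are hypothetical. Because any satisfying input is acceptable, the intermediate inputs it finds need not be "egg" or "port".

```python
from itertools import product

# Branch predicates over the first three input bytes (running example).
CHECKS = [
    lambda s: s[2] - s[0] == 2,
    lambda s: s[2] - s[1] == 3,
    lambda s: s[2] == ord("d"),
]

def run(data):
    """Concrete run: return the branch outcomes along the executed path."""
    outcomes = []
    for check in CHECKS:
        taken = check(data)
        outcomes.append(taken)
        if not taken:              # a failed check falls through to login()
            break
    return outcomes

def solve(wanted):
    """Brute-force stand-in for the solver: find bytes matching `wanted`."""
    letters = range(ord("a"), ord("z") + 1)
    for cand in product(letters, repeat=3):
        if all(CHECKS[i](cand) == w for i, w in enumerate(wanted)):
            return bytes(cand)
    return None

def concolic(seed):
    """Return the list of inputs tried, ending with one that reaches backdoor()."""
    data, tried = seed, [seed]
    while True:
        outcomes = run(data)
        if len(outcomes) == len(CHECKS) and all(outcomes):
            return tried
        # Keep the path prefix, flip the last branch outcome, and solve.
        wanted = outcomes[:-1] + [not outcomes[-1]]
        data = solve(wanted)
        if data is None:
            raise RuntimeError("no input satisfies the flipped path condition")
        tried.append(data)

print(concolic(b"good"))
```

Only one path is tracked per run, which is the advantage the slides attribute to concolic execution; the cost is one program run per flipped branch.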
Inaccurate expressions
• Some variables depend on input
• Replacing these variables with concrete values may yield inaccurate expressions
• Solving an inaccurate path condition may produce input that does not take the desired path
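A small constructed case may make this concrete. In the Python sketch below, `hard` is a hypothetical stand-in for an operation too difficult to model symbolically, `program` is a made-up two-input branch (not the running example), and `solve` is a brute-force placeholder for a solver. Replacing `y` with the value seen in one run drops its dependence on `a`, so the solver is free to pick an `a` for which the concrete value no longer holds.

```python
def hard(x: int) -> int:
    """Hypothetical operation that is too difficult to model symbolically."""
    return (x * 37 + 11) % 256

def program(a: int, b: int) -> bool:
    y = hard(a)            # y depends on input a
    return y == b          # the branch we want to take

def concretized_condition(seed_a: int):
    """Path condition with y replaced by its concrete value from one run."""
    y_concrete = hard(seed_a)
    return lambda a, b: y_concrete == b    # no longer mentions a

def solve(cond):
    """Brute-force stand-in for a solver: first satisfying (a, b) pair."""
    for a in range(256):
        for b in range(256):
            if cond(a, b):
                return a, b
    return None

cond = concretized_condition(seed_a=1)
a, b = solve(cond)
print((a, b), "takes the branch:", program(a, b))
```

One common remedy, stated here as an assumption rather than something the slides describe, is to add the constraint that `a` keeps its seed value whenever an expression derived from it has been concretized.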
Concolic execution system design
[Diagram components: Program; Input; Concrete Executor; Instructions; Symbolic Executor; Path Conditions; Solver; Generated Input; Path Selector]

Concolic execution system design
[Diagram components: Program; Input; Concrete Executor (Dyninst, ProcControlAPI); Instructions; Symbolic Executor (SymEval); Path Conditions; STP (Solver); Generated Input; Path Selector]
Concrete execution components
[Diagram components: Concrete Executor; Dyninst; ProcControlAPI]
• Concrete Executor
  • Redirects program input
  • Reads actual values of instruction operands
  • Tracks path taken
• Dyninst
  • Assists with static analysis
• ProcControlAPI
  • Runs program using single-stepping or breakpoints
Symbolic execution components
[Diagram components: Symbolic Executor; SymEval]
• Symbolic Executor
  • Symbolic memory
  • Identify input
  • Update symbolic memory
  • Extract conditional predicates
• SymEval
  • Represents instruction semantics as ASTs
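To illustrate how symbolic memory and ASTs fit together, the Python sketch below replays a simplified form of the instructions at 80482b2 through 80482cc and extracts the first branch predicate. It is an assumption-laden illustration and does not use or imitate the SymEval API: the node classes, the `(op, dst, src)` trace format and `INPUT_BASE` are all hypothetical, and the sign extensions (movsbl) are treated as plain moves.

```python
from dataclasses import dataclass

INPUT_BASE = 0x80BD884    # assumed address of input[0]

@dataclass(frozen=True)
class Input:
    index: int

@dataclass(frozen=True)
class Sub:
    left: object
    right: object

def show(node) -> str:
    """Render an AST node as text."""
    if isinstance(node, Input):
        return f"input[{node.index}]"
    return f"{show(node.left)}-{show(node.right)}"

def extract_predicate(trace):
    """Update symbolic memory per instruction; return the cmp predicate."""
    mem = {}                                   # register name -> AST
    for op, dst, src in trace:
        if op == "load":                       # identify input bytes
            mem[dst] = Input(src - INPUT_BASE)
        elif op == "mov":
            mem[dst] = mem[src]
        elif op == "sub":
            mem[dst] = Sub(mem[dst], mem[src])
        elif op == "cmp":                      # conditional predicate
            return f"{show(mem[dst])} == {src}"
    return None

# Simplified trace of 80482b2..80482cc.
TRACE = [
    ("load", "eax", 0x80BD886),
    ("mov", "edx", "eax"),
    ("load", "eax", 0x80BD884),
    ("mov", "ecx", "edx"),
    ("sub", "ecx", "eax"),
    ("mov", "eax", "ecx"),
    ("cmp", "eax", 2),
]

print(extract_predicate(TRACE))
```

The extracted text matches the first check of the running example; a real tool would keep the predicate as an AST so that it can be handed to the solver.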
Path searching components
[Diagram components: STP (Solver); Path Conditions; Path Selector]
• STP (Solver)
  • Designed for program analysis applications
  • Handles bit-vector data types
• Path Conditions
  • One term for each branch taken
• Path Selector
  • Decides where to branch off from current path
  • Is a depth-first search for now
  • Other strategies will use static CFG analysis
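A depth-first path selector can be sketched as follows. This is one possible reading of "depth-first search", not the presented implementation: the path is a list of branch outcomes (one per path-condition term), and the function name `dfs_select` and the `explored` set are hypothetical.

```python
def dfs_select(path, explored):
    """Pick the deepest branch on `path` whose opposite outcome is unexplored.

    Returns the outcome prefix to hand to the solver, or None when every
    alternative along this path has already been tried.
    """
    for depth in range(len(path) - 1, -1, -1):
        # Keep the outcomes before `depth`, flip the outcome at `depth`.
        flipped = tuple(path[:depth]) + (not path[depth],)
        if flipped not in explored:
            explored.add(flipped)
            return flipped
    return None

explored = set()
path = [True, False]                   # e.g. the run with input "egg"
print(dfs_select(path, explored))      # deepest alternative first
print(dfs_select(path, explored))      # then back up one branch
print(dfs_select(path, explored))      # nothing left on this path
```

Working from the deepest branch upward keeps the search close to the current path; a strategy guided by static CFG analysis would replace the loop order with a priority based on distance to the vulnerability.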