Marple: A Demand-Driven Path-Sensitive Buffer Overflow Detector
Wei Le and Mary Lou Soffa
University of Virginia
Motivation: Buffer Overflow
• 20 years since exploited by the Morris worm
• Always a popular attack vector
• E.g., 482 new exploitable vulnerabilities, 204 of them buffer overflows, reported by SecuriTeam in 2007
• Persist because of legacy code and because many companies still depend heavily on C and C++
Challenge: Reduce Attacks
• Detect and report where vulnerabilities occur
• Determine the cause and remove it
• Be automatic and usable with manageable manual effort
• Scale to large software
Our Goals and Overall Approach
A framework, Marple, for detecting buffer overflow:
• As precise as possible
• Helpful for understanding and removing overflows
• Scalable
• Key idea: identify paths that lead to buffer overflow
• Approach:
  • Interprocedural and path-sensitive, for precision and to help diagnosis
  • Demand-driven, for scalability
Outline of the Talk
• Value of paths and path classification
• Demand-driven analysis
• Vulnerability model
• Framework summary
• Experiments
• Conclusions
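Path-Insensitive: Detecting an Overflow
[Figure: control-flow graph. Node 1: i = strlen(a→q_user). Node 2: i ≥ sizeof(buf0)? Yes branch, node 3: buf = xalloc(i+1). No branch, node 4: buf = buf0. Node 5: strcpy(buf, a→q_user). At the merge before node 5, only the combined facts are kept: (i ≥ sizeof(buf0)) ∨ (i < sizeof(buf0)) and buf = xalloc(i+1) ∨ buf0.]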
Path-Sensitive: Detecting an Overflow
[Figure: the same control-flow graph, nodes 1 to 5. At node 5, strcpy(buf, a→q_user), the facts are kept separately for each path: on the yes path, i ≥ sizeof(buf0) with buf = xalloc(i+1); on the no path, i < sizeof(buf0) with buf = buf0.]
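The contrast between the two slides above can be sketched in a few lines of Python. This is a toy model of the five-node example, not Marple's implementation; `BUF0_SIZE` and both function names are hypothetical, and the buffer capacity is modeled as a plain integer.

```python
BUF0_SIZE = 8  # hypothetical value of sizeof(buf0)

def path_sensitive_safe(i):
    """Follow the single path that an input of length i actually takes."""
    if i >= BUF0_SIZE:
        capacity = i + 1       # node 3: buf = xalloc(i + 1)
    else:
        capacity = BUF0_SIZE   # node 4: buf = buf0
    # node 5: strcpy writes i characters plus the terminating NUL
    return i + 1 <= capacity

def path_insensitive_safe(i):
    """Merge both branches at node 5, forgetting which condition held."""
    possible_capacities = {i + 1, BUF0_SIZE}  # buf = xalloc(i+1) or buf0
    # With the branch condition lost, every merged capacity must be assumed.
    return all(i + 1 <= c for c in possible_capacities)
```

Under these assumptions the path-sensitive check accepts every input length, while the merged check raises a false alarm for any `i >= BUF0_SIZE`, because it pairs the long string with the small buffer.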
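Path-Insensitive: Reporting an Overflow
[Figure: control-flow graph from wu-ftpd 2.6.2, realpath.c. Node 1: branch (y/n). Node 2: rootd = 1. Node 3: rootd = 0. Node 4: strlen(wbuf) + rootd + 1 + strlen(resolved) > LEN? Yes branch, node 5: exit. No branch, node 6: rootd == 0? Yes branch, node 7: strcat(resolved, “/”). Node 8: strcat(resolved, wbuf).]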
Path-Sensitive: Reporting an Overflow
[Figure: the same wu-ftpd 2.6.2 realpath.c control-flow graph (nodes 1 to 8), with individual paths to node 8, strcat(resolved, wbuf), labeled Safe, Infeasible, and Overflow.]
Five Types of Paths
• Infeasible: no input can exercise the path
• Safe: no input can overflow the buffer
• Vulnerable: users can write any content to the buffer
• Overflow-user-independent: the buffer content is statically determinable
• Don’t-know: the buffer status cannot be judged statically
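One way to read the five categories is as the outcome of a short decision sequence. The sketch below is an assumption about how the categories relate, not the tool's actual logic; the four boolean inputs and the order in which they are tested are hypothetical.

```python
from enum import Enum

class PathType(Enum):
    INFEASIBLE = "infeasible"
    SAFE = "safe"
    VULNERABLE = "vulnerable"
    OVERFLOW_USER_INDEPENDENT = "overflow-user-independent"
    DONT_KNOW = "don't-know"

def classify(feasible, resolved, can_overflow, user_controlled):
    """Map hypothetical analysis facts about one path to a path type."""
    if not feasible:
        return PathType.INFEASIBLE           # no input exercises the path
    if not resolved:
        return PathType.DONT_KNOW            # static analysis could not decide
    if not can_overflow:
        return PathType.SAFE                 # no input overflows the buffer
    if user_controlled:
        return PathType.VULNERABLE           # user input reaches the buffer
    return PathType.OVERFLOW_USER_INDEPENDENT
```

For example, `classify(True, True, True, False)` yields `PathType.OVERFLOW_USER_INDEPENDENT`: the path overflows, but with content the user does not control.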
Demand-Driven Analysis for Buffer Overflow
• Two steps:
  • Find all potential overflow statements in the program
  • Examine paths backward from each potential overflow statement to the program entry to see whether an overflow can occur
• Benefits: scalability and natural parallelism
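The two steps can be sketched as a scan followed by a backward path search. This is a minimal illustration, not Marple's algorithm: the node numbering follows the realpath.c graph as read from the earlier slide, and `PREDS`, `STATEMENTS`, `find_pos`, and `backward_paths` are hypothetical names.

```python
# Predecessor map for the eight-node graph (an assumed reading of the figure).
PREDS = {1: [], 2: [1], 3: [1], 4: [2, 3], 5: [4], 6: [4], 7: [6], 8: [6, 7]}

STATEMENTS = {
    4: "if (...)",
    7: 'strcat(resolved, "/")',
    8: "strcat(resolved, wbuf)",
}

def find_pos(statements):
    """Step 1: flag statements that write a buffer through an unchecked call."""
    risky = ("strcpy", "strcat")
    return [n for n, text in sorted(statements.items()) if text.startswith(risky)]

def backward_paths(pos, entry, preds):
    """Step 2: enumerate acyclic paths from pos back to the entry node."""
    stack = [[pos]]
    found = []
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == entry:
            found.append(path)
            continue
        for p in preds[node]:
            if p not in path:  # skip cycles
                stack.append(path + [p])
    return found
```

Starting from node 8, the search finds four paths and never visits node 5 (the exit branch), which illustrates why working backward from the statement of interest touches only part of the program. Each starting statement can also be searched independently of the others.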
Vulnerability Model
5-tuple (POS, δ, UPS, γ, r), where POS and UPS are finite sets, and
• POS: set of potential overflow statements
• δ: mapping POS → Q, where Q is the set of buffer queries
• UPS: set of statements where queries are updated
• γ: mapping UPS → E, where E is the set of equations
• r: general security policy to judge the termination of the search
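As a data structure, the 5-tuple might look like the sketch below. The representation is an assumption made for illustration: queries are plain strings, each equation is a function on a string length, and the policy is a predicate over a length and a buffer size. None of these encodings come from the slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set

@dataclass
class VulnerabilityModel:
    pos: Set[str]                           # POS: potential overflow statements
    delta: Dict[str, str]                   # δ: POS -> Q (query raised at each POS)
    ups: Set[str]                           # UPS: statements that update queries
    gamma: Dict[str, Callable[[int], int]]  # γ: UPS -> E (update equations)
    r: Callable[[int, int], bool]           # r: security policy

# Hypothetical instance for the two strcat calls in the realpath.c example.
model = VulnerabilityModel(
    pos={"strcat(resolved, wbuf)"},
    delta={"strcat(resolved, wbuf)": "s < l"},
    ups={'strcat(resolved, "/")'},
    gamma={'strcat(resolved, "/")': lambda s: s + 1},  # appends one character
    r=lambda s, l: s < l,  # safe when the string length stays below the size
)
```

Keeping the statement sets, queries, and equations in a model separate from the analyzer means the same search could, in principle, be reused with a different model.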
Demand-Driven Analysis: An Example
[Figure: the wu-ftpd realpath.c control-flow graph (nodes 1 to 8), with the declaration char resolved[LEN] ……, annotated with the query as it propagates backward from node 8. Notation: s = strlen(resolved) + strlen(wbuf), l = sizeof(resolved), f = wbuf. Queries shown, from the overflow statement back toward the entry: Q(s < l, f), Q(s+1 < l, f), Q(LEN − rootd < l, f), Q(LEN < l, f). One branch is marked Infeasible and the final query is marked Solved.]
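The backward propagation in this example can be traced with a small Python sketch. It is a simplified walk-through under stated assumptions, not Marple's analyzer: `LEN` is given a hypothetical value, the query is reduced to a single integer offset, and the node numbers follow the figure as read from the slide.

```python
LEN = 16  # hypothetical value of sizeof(resolved)

# The four backward paths from node 8 to the entry (node 1).
PATHS = [
    [8, 6, 4, 2, 1],
    [8, 7, 6, 4, 2, 1],
    [8, 6, 4, 3, 1],
    [8, 7, 6, 4, 3, 1],
]

def resolve(path):
    """Walk one path backward from node 8, updating the query s + extra < LEN."""
    extra = 0          # characters appended between this point and node 8
    need_rootd = None  # value of rootd implied by the branch taken at node 6
    for succ, node in zip(path, path[1:]):  # succ follows node in program order
        if node == 7:
            extra += 1  # strcat(resolved, "/") adds one character
        elif node == 6:
            # Edge 6 -> 7 is "rootd == 0"; rootd is only ever 0 or 1 here.
            need_rootd = 0 if succ == 7 else 1
        elif node in (2, 3):
            rootd = 1 if node == 2 else 0
            if rootd != need_rootd:
                return "infeasible"
            # Node 4 lets execution continue only if s + rootd + 1 <= LEN.
            s_max = LEN - rootd - 1
            return "safe" if s_max + extra < LEN else "overflow"
    return "don't-know"
```

With these assumptions, the path through `rootd = 0` and the appended "/" resolves to Q(LEN < l) with l = LEN, which is false, so that path overflows by one byte; the path through `rootd = 1` without the "/" is safe, and the two remaining paths are infeasible.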
Marple Framework
[Figure: block diagram with two components. The Vulnerability Model supplies POS, Queries, Equations, and Policy. The Demand-Driven Path-Sensitive Analyzer takes the Program and Source and runs these stages: Detect Infeasible Paths, Raise Queries, Propagate Queries, Update Queries, Evaluate Queries (yes/no), Propagate Answers. Outputs: Path Classification and Root Cause Information, which Assist Diagnosis.]
User Scenario
[Figure, built up over four slides: paths from the program Entry to a POS. The first slide shows Entry, a point labeled A, and the POS. Later slides label path segments Overflow-user-independent and Vulnerable, and the final slide marks the Root Cause.]
Experiments
• Goals
  • More precisely find vulnerabilities
  • False positives in vulnerable set
  • Scalable
  • Help in diagnosis
  • Comparison with other tools
• Experimental Setup
  • Microsoft Phoenix, Disolver
  • BugBench, Buffer Overflow Benchmark, MechCommander2 (570.9K)
Results: Detection
• Detected 14 out of 16 documented overflows
  • 1 don’t-know: library call
  • 1 missing: function pointers
• Generated 1 false positive due to integer range analysis
• Reported 57 new overflows
  • same path of different buffers
Results: Path-Sensitivity
• All types of paths occur
• 108 don’t-knows from bc
  • 43 complex pointers
  • 28 recursive procedures
  • 15 loops
  • 12 non-linear operations
  • 8 library calls
Results: Root Cause
• Highlight statements that update the query during analysis as root cause information
• On average, fewer than 10 statements highlighted
• Path-sensitive root causes exist
Marple with Static Tools
• Used Buffer Overflow Benchmark: 14 programs
  • “Bad” version: several overflows marked
  • “Good” version: overflows fixed
• Static tools: Archer, Boon, UNO, Splint, and Polyspace (commercial tool)
• Criteria: probability of detection and probability of false alarms
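The two criteria can be computed from per-location tool output roughly as follows. This is a sketch under an assumed definition: detection is counted on the marked overflows in the “bad” versions, and false alarms on the corresponding fixed locations in the “good” versions. The function name and the sample data are hypothetical, not results from the talk.

```python
def rates(flagged_in_bad, flagged_in_good):
    """Each list holds one bool per marked location: did the tool warn there?"""
    p_d = sum(flagged_in_bad) / len(flagged_in_bad)    # probability of detection
    p_f = sum(flagged_in_good) / len(flagged_in_good)  # probability of false alarm
    return p_d, p_f

# Made-up example: four marked locations in each version.
bad = [True, True, False, True]     # warnings on real overflows
good = [False, True, False, False]  # warnings on fixed code
p_d, p_f = rates(bad, good)
```

With this made-up data the tool detects three of four overflows and raises one false alarm, so `p_d` is 0.75 and `p_f` is 0.25. An ideal tool would score a detection probability of 1 with a false-alarm probability of 0.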
Marple with Static Tools
[Figure: ROC curve after Zitser, Lippmann, and Leek (FSE). Axes: P(f), probability of false alarms, and P(d), probability of detection. Points as (P(f), P(d)): Ideal Tool (0, 1); Marple-A (0.04, 0.49); Marple-B (0.42, 0.88); PolySpace (0.5, 0.87); Splint (0.43, 0.57). ARCHER, UNO, and BOON are plotted without coordinates.]
• Marple-A: using only Vulnerable/Overflow paths
• Marple-B: Marple-A + Don’t-know
Performance
• Visited: 43% of nodes; 52% of procedures
• Memory: 2.5GB
• Time
  • MechCommander2 (575K lines): 35.4 minutes
  • Archer: 121 lines/sec
  • IPSSA: 155 lines/sec
  • Marple: 254 lines/sec
Related Work
• Static detection for buffer overflow
  • ARCHER [03xie], BOON [00wagner], ESPx [06hackett], Prefast [ms], Prefix [00bush], Splint [96evans]
• Path-sensitive analysis for defects
  • ARCHER [03xie], ESPx [06hackett], ESP [02das], IPSSA [03livshits], MOPS [02check], Prefix [00bush]
• Demand-driven approach
  • A general framework [96Duesterwald]
  • Applications to dataflow computation [96Duesterwald], infeasible path detection [97bodik], memory leaks [06Orlovich], postmortem analysis [04Manevich]
Conclusions
• An interprocedural, demand-driven, path-sensitive buffer overflow detector for large software
• A categorization of paths to assist diagnosis
• The identification of vulnerable path segments and the statements relevant to the root cause
• Our results demonstrate that Marple is scalable and can report buffer overflows with low false positive rates and rich diagnostic information