CSSV: Towards a Realistic Tool for Statically Detecting All Buffer Overflows in C
Nurit Dor (TAU), Michael Rodeh (IBM Research Haifa), Mooly Sagiv (TAU), Greta Yorsh (TAU)?
Seminar in Program Analysis for Cyber-Security, Ittay Eyal, March 2011
Example

    void RTC_Si_SkipLine(const INT32 NbLine, char ** const PtrEndText)
    {
        INT32 indice;
        for (indice=0; indice<NbLine; indice++) {
            **PtrEndText = '\n';
            (*PtrEndText)++;
        }
        **PtrEndText = '\0';
        return;
    }
Core C
• Control-flow statements:
    • if, goto, break, or continue
• Expressions are side-effect free and cannot be nested
• All assignments are statements
• Declarations do not have initializations
• Address-of formal variables is not allowed
The example translated to Core C:

    void SkipLine(int NbLine, char** PtrEndText)
    {
        int indice;
        char* PtrEndLoc;
        indice = 0;
    begin_loop:
        if (indice >= NbLine) goto end_loop;
        PtrEndLoc = *PtrEndText;
        *PtrEndLoc = '\n';
        *PtrEndText = PtrEndLoc + 1;
        indice = indice + 1;
        goto begin_loop;
    end_loop:
        PtrEndLoc = *PtrEndText;
        *PtrEndLoc = '\0';
    }
Contracts
• Describe input, side-effects and output:
    • Requires
    • Modifies
    • Ensures
    void SkipLine(int NbLine, char** PtrEndText)
        requires is_within_bounds(*PtrEndText) &&
                 *PtrEndText.alloc > NbLine &&
                 NbLine >= 0
        modifies *PtrEndText,
                 *PtrEndText.is_nullt,
                 *PtrEndText.strlen
        ensures  *PtrEndText.is_nullt &&
                 *PtrEndText.strlen == 0 &&
                 *PtrEndText == [*PtrEndText]pre + NbLine;

    void SkipLine(int NbLine, char** PtrEndText)
    {
        int indice;
        char* PtrEndLoc;
        indice = 0;
    begin_loop:
        if (indice >= NbLine) goto end_loop;
        PtrEndLoc = *PtrEndText;
        *PtrEndLoc = '\n';
        *PtrEndText = PtrEndLoc + 1;
        indice = indice + 1;
        goto begin_loop;
    end_loop:
        PtrEndLoc = *PtrEndText;
        *PtrEndLoc = '\0';
    }
    void main()
    {
        char buf[SIZE];
        char *r, *s;
        r = buf;
        SkipLine(1,&r);
        fgets(r,SIZE-1,stdin);
        s = r + strlen(r);
        SkipLine(1,&s);
    }
P → inline(P)
• Function entry point:
    • Assume pre-conditions.
    • Store inputs ([x]pre) in temporary variables for the post-condition check.
• Return:
    • Set return_valueP.
• Function exit:
    • Assert post-conditions.
• Function call and its result assertion:
    • Assert pre-conditions.
    • Assume post-conditions (possibly w.r.t. inputs).
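The inlining steps above can be sketched in plain C for the SkipLine contract. This is only an illustration under stated assumptions: the `Region` struct and the helpers `alloc_from`, `pre_holds`, `post_holds`, `SkipLine_inlined` and `call_SkipLine` are hypothetical names, `alloc` is modelled as the number of bytes from the pointer to the end of its buffer, and the checks run at run time, whereas the tool discharges them statically.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a buffer: base address and allocation size. */
typedef struct { char *base; size_t size; } Region;

/* Assumed meaning of p.alloc: bytes from p to the end of the buffer. */
static long alloc_from(const Region *rg, const char *p) {
    return (long)((rg->base + rg->size) - p);
}

/* Pre-condition of the contract: within bounds, alloc > NbLine, NbLine >= 0. */
static int pre_holds(const Region *rg, int NbLine, char *const *PtrEndText) {
    const char *p = *PtrEndText;
    int in_bounds = p >= rg->base && p < rg->base + rg->size;
    return in_bounds && alloc_from(rg, p) > NbLine && NbLine >= 0;
}

/* Post-condition: empty null-terminated string, pointer advanced by NbLine. */
static int post_holds(int NbLine, char *const *PtrEndText, const char *pre) {
    return **PtrEndText == '\0' && *PtrEndText == pre + NbLine;
}

/* Callee side: store the input, run the body, assert the post-condition. */
static void SkipLine_inlined(int NbLine, char **PtrEndText) {
    char *pre = *PtrEndText;            /* [*PtrEndText]pre */
    int indice = 0;
    while (indice < NbLine) {
        **PtrEndText = '\n';
        (*PtrEndText)++;
        indice++;
    }
    **PtrEndText = '\0';
    assert(post_holds(NbLine, PtrEndText, pre));   /* function exit */
}

/* Caller side: check the pre-condition before the call.
   Returns 0 (and does not call) when the pre-condition would fail. */
static int call_SkipLine(const Region *rg, int NbLine, char **PtrEndText) {
    if (!pre_holds(rg, NbLine, PtrEndText)) return 0;
    SkipLine_inlined(NbLine, PtrEndText);
    return 1;
}
```

Returning 0 instead of aborting at the call site is a design choice of this sketch, made so that a failing pre-condition can be observed without terminating the program.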
Pointer Analysis
• Goal: determine which objects may be updated through a pointer.
• A whole-program points-to state is calculated first.
• A per-procedure points-to state is then derived from it.
Pointer Analysis

    foo(char *p, char *q)
    {
        char local[100];
        …
        p = local;
        *q = 0;
        …
    }

    main()
    {
        char s[10], t[20], r[30];
        char *temp;
        foo(s,t);
        foo(s,r);
        …
        temp = s;
        …
    }

[Figure: points-to graph over the nodes local, p, q, s, t, r, temp]
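A whole-program points-to state for the foo/main example can be sketched with one bit set per pointer. This is a deliberately naive, flow-insensitive illustration, not the Golf analysis the implementation uses; the enums and the functions `may_point_to`, `analyze` and `may_update` are hypothetical names introduced here.

```c
#include <assert.h>

/* Objects (arrays) and pointers of the example; names are illustrative. */
enum Obj { LOCAL, S, T, R, NUM_OBJ };
enum Ptr { P, Q, TEMP, NUM_PTR };

/* Record that pointer ptr may point to object obj. */
static void may_point_to(unsigned pts[NUM_PTR], enum Ptr ptr, enum Obj obj) {
    pts[ptr] |= 1u << obj;
}

/* Flow-insensitive pass: every assignment in the program contributes,
   regardless of the order in which statements execute. */
static void analyze(unsigned pts[NUM_PTR]) {
    for (int i = 0; i < NUM_PTR; i++) pts[i] = 0;
    may_point_to(pts, P, S);     /* foo(s,t): p = s */
    may_point_to(pts, Q, T);     /* foo(s,t): q = t */
    may_point_to(pts, P, S);     /* foo(s,r): p = s */
    may_point_to(pts, Q, R);     /* foo(s,r): q = r */
    may_point_to(pts, P, LOCAL); /* p = local       */
    may_point_to(pts, TEMP, S);  /* temp = s        */
}

/* Can a write through ptr (such as *q = 0) update obj? */
static int may_update(const unsigned pts[NUM_PTR], enum Ptr ptr, enum Obj obj) {
    return (int)((pts[ptr] >> obj) & 1u);
}
```

Under this sketch the write `*q = 0` may update t or r but not s, which is the kind of fact the later stages need when they check a write through a pointer.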
Pointer Analysis: Parametrization for foo

The same program, viewed from inside foo: the caller's buffers are represented by parameter nodes.

[Figure: points-to graph for foo over the nodes local, p, q, PARAM #1, PARAM #2]
C2IP
• Input: Inline(P) and pointer info
• Output: integer program

Constraint variables for a location l:
• l.val: possible values.
• l.offset: w.r.t. base address.
• l.aSize: allocation size.
• l.is_nullt: null terminated?
• l.len: string length (with \0).
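To make the constraint variables concrete, the sketch below bundles them into one struct and shows how a pointer increment and a write check might look in integer terms. It is a simplification under assumptions: the slides attach the variables to locations, while this sketch merges a pointer and its buffer into one hypothetical `IntState`, and the functions `ptr_add` and `write_is_safe` are illustrative rather than the translation rules of C2IP.

```c
#include <assert.h>

/* Hypothetical bundle of the integer variables for one pointer and the
   buffer it points into. */
typedef struct {
    int offset;    /* l.offset: distance from the base address */
    int aSize;     /* l.aSize: allocation size of the buffer   */
    int is_nullt;  /* l.is_nullt: 1 if null terminated         */
    int len;       /* l.len: string length                     */
} IntState;

/* p = q + k: in integer terms only the offset changes. */
static IntState ptr_add(IntState q, int k) {
    q.offset += k;
    return q;
}

/* Check generated for a write *p = c: the offset must lie in [0, aSize). */
static int write_is_safe(IntState p) {
    return p.offset >= 0 && p.offset < p.aSize;
}
```

For example, a pointer at offset 0 into a 4-byte buffer passes the check, while the same pointer advanced by 4 does not.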
C to Integer Program: Expression Check
C to Integer Program: Constructs to Statements
C to Integer Program: Complexity

Notation
• V: the number of variables and allocation sites.
• S: the number of C expressions.

Integer program complexity
• O(V) constraint variables.
• Each pointer may point to O(V) locations, so a single C expression can expand into O(V) integer-program statements.
• Total complexity: O(S·V).
Integer Analysis
• Calculates the linear inequalities that hold at each program point.
• Conservative: the inequalities over-approximate every possible execution.
• Each assertion is verified against the inequalities.
Integer Analysis

    void main()
    {
        char buf[SIZE];
        char *r, *s;
        r = buf;
        SkipLine(1,&r);
        fgets(r,SIZE-1,stdin);
        s = r + strlen(r);
        SkipLine(1,&s);
    }

Precondition to verify: *PtrEndText.alloc > NbLine
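The check on `*PtrEndText.alloc > NbLine` can be followed by hand with a much weaker domain than the one the tool uses. The sketch below swaps polyhedra for simple intervals, which is an assumption made for brevity; the `Interval` type and all function names are hypothetical. It relies on two facts: after `SkipLine(1,&r)` the contract gives `r == buf + 1`, and `fgets(r, SIZE-1, stdin)` stores at most SIZE-2 characters before the terminator.

```c
#include <assert.h>

/* Closed integer interval [lo, hi]; a stand-in for the polyhedra domain. */
typedef struct { int lo, hi; } Interval;

static Interval iv_add(Interval a, Interval b) {
    return (Interval){ a.lo + b.lo, a.hi + b.hi };
}

/* c - a for a constant c. */
static Interval iv_const_minus(int c, Interval a) {
    return (Interval){ c - a.hi, c - a.lo };
}

/* x > k is proved only if it holds for every value in the interval. */
static int proves_gt(Interval x, int k) {
    return x.lo > k;
}

/* First call SkipLine(1,&r): r == buf, so alloc == size. */
static int first_call_verified(int size) {
    Interval r_alloc = { size, size };
    return proves_gt(r_alloc, 1);
}

/* Second call SkipLine(1,&s), assuming size >= 2. */
static int second_call_verified(int size) {
    Interval r_off   = { 1, 1 };          /* r == buf + 1 after the first call */
    Interval len     = { 0, size - 2 };   /* strlen(r) after fgets             */
    Interval s_off   = iv_add(r_off, len);
    Interval s_alloc = iv_const_minus(size, s_off);
    return proves_gt(s_alloc, 1);
}
```

In this sketch the first call is verified for any size above 1, while the second is not: the lower bound of `s_alloc` is 1, so `alloc > 1` cannot be established and the call would be flagged.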
Integer Analysis - Contracts
To optimize the contracts, do the following:
• Assume True preconditions.
• Use ASPost [6] to calculate the linear inequalities at the exit point.
• Deduce the postconditions.
• Use AWPre to calculate, backwards, the most liberal preconditions.

[6] P. Cousot and N. Halbwachs. Automatic discovery of linear constraints among variables of a program. In Symp. on Princ. of Prog. Lang., 1978.
Implementation
• C → CoreC: based on the AST-Toolkit [32]
• Points-to analysis: Golf [8, 9]
• Integer analysis: Polyhedra library [6, 19]

[6] P. Cousot and N. Halbwachs. Automatic discovery of linear constraints among variables of a program. In Symp. on Princ. of Prog. Lang., 1978.
[8] M. Das. Unification-based pointer analysis with directional assignments. In SIGPLAN Conf. on Prog. Lang. Design and Impl., 2000.
[9] M. Das, B. Liblit, M. Fähndrich, and J. Rehof. Estimating the impact of scalable pointer analysis on optimization. In Static Analysis Symp., 2001.
[19] B. Jeannet. New polka library. Available at "http://www.irisa.fr/prive/Bertrand.Jeannet/newpolka.html".
[32] Microsoft Research. AST-toolkit. 2002.
Empirical Results
• Source from two real-world projects:
    • String manipulation library from EADS Airbus code: 11 procedures, 400 lines.
    • Part of the WEB2c converter: 8 procedures, 460 lines.
Conclusion
• Not easy to analyze C.
• Plenty of techniques and tools.
• High false positive ratio without hand-crafted contracts.
• Experimental results section is slim.
• High variance for little data.
• (They had to write all contracts…)
• What would happen to normal code?