The 38th Annual IEEE/IFIP International Conference on Dependable Systems and Networks
Convicting Exploitable Software Vulnerabilities: An Efficient Input Provenance Based Approach
Zhiqiang Lin, Xiangyu Zhang, Dongyan Xu
Purdue University
June 27th, 2008
Motivation
• Vulnerability in software
  • Internet worms (CodeRed, Slammer)
  • Viruses, Trojan horses, bots (botnets)
  • Denial of service (DoS)
  • Accidental breaches in security
Related Work
• Dynamic analysis
  • Program shepherding (V. Kiriansky et al.), TaintCheck (J. Newsome et al.), Control Flow Integrity (M. Abadi et al.), Data Flow Integrity (M. Castro et al.), …
  • Drawbacks: run-time overhead, and detection has to wait for an attack
• Static analysis
  • BOON (D. Wagner et al.), Splint (D. Larochelle et al.), Archer (Y. Xie et al.), RATS, Flawfinder
  • Drawback: false positives
• Recent automated multi-path exploration
  • DART (P. Godefroid et al.), CUTE (K. Sen et al.), EXE (C. Cadar et al.), SAGE (P. Godefroid et al.)
  • Drawback: low efficiency
Problem Statement and Our Technique
• How can software vulnerabilities be discovered and convicted more efficiently?
• An efficient input provenance based approach
  • Conservative static analysis => suspects
  • Dynamic analysis => convicts the suspects and prunes false positives
  • Random mutation is avoided
  • No symbolic execution (can handle long executions)
• Key idea
  • Data lineage tracing (input provenance)
Basic Idea
• An image viewer: Zgv-5.8/readgif.c
• Input a.gif (256x128): xx ... 0x00 0x01 0x80 0x00 ...
• Input data labels (offsets) of these four bytes: 6 7 8 9
  231  fread(&imagehed,sizeof(imagehed),1,in);
       ...
  245  width=(imagehed.wide_lo+256*imagehed.wide_hi);
  246  height=(imagehed.high_lo+256*imagehed.high_hi);
       ...
  494  if((...(byte *)malloc(width*height))...)
  495  {
  496    fclose(in);
  497    return(_PICERR_NOMEM);
  498  }
       ...
• Integer overflow: malloc(width*height) at line 494
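The arithmetic behind this slide can be checked with a minimal sketch. It assumes a 32-bit signed int (as on the x86 target used in the deck) and a header buffer whose offsets 6 to 9 hold wide_lo, wide_hi, high_lo and high_hi. The helper names le16 and product_overflows_int32 are hypothetical and do not come from zgv.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 16-bit field, mirroring wide_lo + 256*wide_hi. */
int32_t le16(const uint8_t *buf, size_t off) {
    return (int32_t)buf[off] + 256 * (int32_t)buf[off + 1];
}

/* Returns 1 when width*height would not fit in a signed 32-bit int.
 * The product is computed in 64 bits so the check itself cannot overflow. */
int product_overflows_int32(const uint8_t *hdr, size_t len) {
    assert(hdr != NULL && len >= 10); /* offsets 6..9 must be present */
    int64_t width = le16(hdr, 6);
    int64_t height = le16(hdr, 8);
    return width * height > INT32_MAX;
}
```

With the bytes from the slide (0x00 0x01 0x80 0x00) the decoded size is 256x128 and the product fits. Setting all four bytes to 0xff gives 65535x65535, which exceeds INT32_MAX.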
Architecture (diagram)
• Components: Static front end, Input Lineage Tracer, Input Mutator, Run-time Detector
• Data: Program/binary, Program input, Input lineage, New input, Suspect, Evidence
• Diagram callout (next to Suspect and Evidence): a piece of instruction which is exploitable to trigger the vulnerability
Component 1. Input Lineage Tracer
• Label the input stream (using the offset)
• Track their propagation
  mov 0xfffffffc(%ebp),%eax
  mov %eax, 0xfffffff8(%ebp)
  add %eax, %ecx
  mov %ecx, %edx
Component 1. Input Lineage Tracer
• Key concept
• Data dependency (direct propagation)
  1. b=a;
  b=a:   mov 0xfffffffc(%ebp),%eax
         mov %eax,0xfffffff8(%ebp)
• Control dependency (indirect propagation)
  1. if (a==1)
  2.   b=1;
  3. else
  4.   c=0;
  a==1:  cmpl $0x1,0xfffffffc(%ebp)
         jne 804832d <main+0x25>
  b=1:   movl $0x1,0xfffffff8(%ebp)
         jmp 8048334 <main+0x2c>
  c=0:   movl $0x0,0xfffffff4(%ebp)
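The two propagation kinds can be sketched as one rule: a defined value inherits the lineage of the values it uses, plus the lineage of the predicate it is control dependent on. The sketch below is an illustration only. It assumes lineage sets are 64-bit masks (the deck uses roBDDs instead), and lineage_t, label_input and propagate are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Bit i set means "depends on input byte at offset i" (offsets 0..63 only). */
typedef uint64_t lineage_t;

/* Fresh label for the input byte at the given offset. */
lineage_t label_input(unsigned offset) {
    assert(offset < 64);
    return (lineage_t)1 << offset;
}

/* Lineage of a definition: union of the lineage of its operands (data
 * dependency) and of the predicate guarding it (control dependency). */
lineage_t propagate(lineage_t dl_uses, lineage_t dl_control) {
    return dl_uses | dl_control;
}
```

For b=a outside any branch, propagate(DL(a), 0) yields DL(a). For b=1 under if (a==1), the constant has no lineage of its own, so propagate(0, DL(a)) again yields DL(a), which is the indirect propagation on the slide.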
Component 1. Data Lineage Tracer
• Input data tracking (each input value is labeled with its offset in the input stream)
• DL(s_i) = DL(def@s_i)
• DL(def@s_i) = get_new_id()          if def is an input value
              = U_x DL(use_x@s_i)     otherwise
• DL representation: reduced ordered Binary Decision Diagram (roBDD)
Component 1. Data Lineage Tracer
• An example
  231  fread(&imagehed,sizeof(imagehed),1,in);
       ...
  245  width=(imagehed.wide_lo+256*imagehed.wide_hi);
  246  height=(imagehed.high_lo+256*imagehed.high_hi);
       ...
  494  if((...(byte *)malloc(width*height))...)
  495  {
  496    fclose(in);
  497    return(_PICERR_NOMEM);
  498  }
       ...
• READ(buf,size,...): for 0 <= i < size, DL(buf[i]@pc231) = get_new_id()
• DL(wide_lo@pc245) = DL(buf[6]@pc231) = {6}
• DL(wide_hi@pc245) = DL(buf[7]@pc231) = {7}
• DL(width@245) = DL(wide_hi@pc245) U DL(wide_lo@pc245) = {6; 7}
• DL(height@246) = DL(high_hi@pc246) U DL(high_lo@pc246) = {8; 9}
• DL((width*height)@494) = {6;7;8;9}
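The lineage computation in this example can be replayed with a small sketch. It is not the tracer from the deck: lineage sets are approximated by 64-bit masks rather than roBDDs, the fresh id of each byte is assumed to be its stream offset, and dl_read, dl_union, dl_contains and dl_malloc_arg are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lineage_t; /* bit i set: depends on input offset i */

/* READ rule: every byte read gets a fresh id, here its stream offset. */
void dl_read(lineage_t *shadow, size_t size) {
    assert(shadow != NULL && size <= 64);
    for (size_t i = 0; i < size; i++) {
        shadow[i] = (lineage_t)1 << i;
    }
}

/* Non-input rule: union of the lineage of the used values. */
lineage_t dl_union(lineage_t a, lineage_t b) {
    return a | b;
}

/* Membership test for one input offset. */
int dl_contains(lineage_t dl, unsigned offset) {
    return offset < 64 && ((dl >> offset) & 1u);
}

/* Replays lines 231, 245, 246 and 494 on the shadow state. */
lineage_t dl_malloc_arg(void) {
    lineage_t shadow[10];
    dl_read(shadow, 10);                               /* pc231 */
    lineage_t width = dl_union(shadow[6], shadow[7]);  /* pc245: {6;7} */
    lineage_t height = dl_union(shadow[8], shadow[9]); /* pc246: {8;9} */
    return dl_union(width, height);                    /* pc494 */
}
```

The returned mask has exactly bits 6, 7, 8 and 9 set, matching DL((width*height)@494) = {6;7;8;9}.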
Component 2. Input Mutator
• Diagram labels: Program input, Data lineage, Suspect, Evidence
• Heuristic #1: buffer overflow mutation (double the buffer size, …)
• Heuristic #2: format string mutation (replace %s in the format string argument)
• Heuristic #3: integer overflow mutation (boundary integer values: 0xffffffff, 0, 0x0fffffff)
• …
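A lineage-guided mutation step might look like the sketch below: only the input bytes named in the suspect's lineage are rewritten, so no random mutation of the rest of the input is needed. This is an assumption-laden illustration, not the deck's mutator. Lineage is a 64-bit mask, and mutate_lineage_bytes is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lineage_t; /* bit i set: suspect depends on input offset i */

/* Overwrite every input byte named in the lineage with `fill`.
 * Returns the number of bytes written; other bytes are left untouched. */
size_t mutate_lineage_bytes(uint8_t *input, size_t len, lineage_t dl,
                            uint8_t fill) {
    assert(input != NULL);
    size_t written = 0;
    for (size_t off = 0; off < len && off < 64; off++) {
        if ((dl >> off) & 1u) {
            input[off] = fill;
            written++;
        }
    }
    return written;
}
```

Filling a four-byte lineage with 0xff produces the 0xffffffff boundary value listed under heuristic #3, and filling it with 0x00 produces 0.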
Implementation
• Diablo (http://diablo.elis.ugent.be/)
  • Control flow graph
  • Statically generates control dependencies to facilitate Valgrind instrumentation
• Valgrind (http://valgrind.org/)
  • Lineage tracing
• roBDD (reduced ordered Binary Decision Diagram) to represent the data lineage
Evaluation - Effectiveness
• Static detector
• Known vulnerabilities (SO: stack overflow, HO: heap overflow, IO: integer overflow)
  • CVE-2001-1413 (ncompress 4.2.4, SO)
  • CVE-2001-1228 (gzip 1.2.4, SO)
  • CVE-2002-1496 (Nullhttpd 0.50, HO)
  • CVE-2002-1549 (lhttpd 0.1, SO)
  • CVE-2000-0573 (wu-ftpd-2.6.0, format string)
  • CVE-2001-0609 (cfingerd-1.4.3, format string)
  • CVE-2005-0226 (ngircd-0.8.2, format string)
  • CVE-2004-0904 (xzgv-0.8, IO & HO)
  • CVE-2006-3082 (GnuPG 1.4.3, IO & HO)
• Unknown vulnerabilities (RATS)
  • RATS extended to catch buffer overflows and integer overflows (ipgrab-0.99, epstool-3.3, dcraw-7.94)
Evaluation - CVE-2006-3082 (GnuPG 1.4.3)
• GnuPG Parse_User_ID Remote Buffer Overflow Vulnerability
• pktlen = in[2,3,4,5] = 0xff 0xff 0xff 0xff
Evaluation - CVE-2001-0609 (Cfingerd-1.4.3)
  syslog(LOG_NOTICE, "%s", (char *) syslog_str);
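The call on this slide passes syslog_str as data under an explicit "%s" format. The sketch below illustrates why that form neutralizes input-derived format directives, using snprintf as a stand-in for syslog so the result can be inspected. The function name format_as_data is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Copy an untrusted string through a fixed "%s" format. Any directives
 * inside `untrusted` (for example "%x" or "%n") are copied literally
 * because the string is an argument, not the format.
 * Returns what snprintf returns: the length the full output would have. */
int format_as_data(char *out, size_t out_len, const char *untrusted) {
    assert(out != NULL && untrusted != NULL && out_len > 0);
    return snprintf(out, out_len, "%s", untrusted);
}
```

A format string mutation in the style of heuristic #2 would inject directives into the input bytes that reach the format argument. When the string is instead passed as data, as here, those directives have no effect.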
Evaluation - Performance (Lineage Tracing)
• Platform: two 2.13 GHz Pentium processors and 2 GB of RAM, running Linux kernel 2.6.15
Summary
• An input lineage tracing and mutation system:
  • Capable of convicting known and unknown vulnerabilities.
  • Has reasonable overhead for the scenario of offline vulnerability conviction.
Q & A Thank you For more information: {zlin, xyzhang, dxu}@cs.purdue.edu