
Post-Attack Analysis of Unknown Vulnerabilities

Presentation Transcript


  1. Post-Attack Analysis of Unknown Vulnerabilities Peng Ning With Emre C. Sezer, Chongkyung Kil, and Jun Xu

  2. Motivation
     • Vulnerability analysis
       • Essential for
         • Patching
         • Vulnerability-based signature generation
       • Painstakingly slow
       • Depends on human effort
     • Existing approaches
       • Static analysis (e.g., [Chen et al. 04], [Feng et al. 04], [Larochelle & Evans 01])
         • False positives
       • Dynamic analysis (e.g., Minos [Crandall et al. 04], TaintCheck [Newsome & Song 05], DIRA [Smirnov & Chiueh 05])
         • Used for detection; inadequate vulnerability information
       • Symbolic execution (e.g., EXE [Cadar et al. 06], DACODA [Crandall et al. 05])
         • Scalability issues
       • Recovery (e.g., STEM [Sidiroglou et al. 05], SEAD [Locasto et al. 07])
         • Change of application semantics
     2007 GMU-CSA Workshop

  3. MemSherlock
     • MemSherlock is an automated debugger
       • Automated analysis of unknown memory corruption vulnerabilities
       • Appeared in ACM CCS ’07
     • MemSherlock provides
       • Statement that causes the memory corruption
       • Dynamic program slice leading to the corruption
       • Program variables involved in the vulnerability
       • All presented at the programming-language level
     • Implications
       • Generating vulnerability conditions
       • Improves signature or patch generation speed

  4. General Framework: Web Application Example
     [Diagram; component labels: Traffic, Light-weight IDS, Trigger, Program, Logger, Replayer, Instrumented Program, MemSherlock]

  5. MemSherlock Overview
     • Goal is to provide vulnerability information
       • Intuitive, easy to understand for the programmer
       • Not only the corruption point
         • Slice of program involved in the vulnerability
         • Effects of user inputs
         • Program variables involved
         • Variable relationships (e.g., pointer aliasing)
         • Type of vulnerability (e.g., stack buffer overflow)
     • MemSherlock performs two important tasks
       • Finding the corruption point
       • Tracking program state

  6. MemSherlock: Finding Corruption Point
     • Observation: A memory object is modified by a small set of statements (inspired by AccMon)
     • For a memory object m, the write set of m, WS(m), is the set of statements that legitimately modify m
     • Security condition: Memory object m should only be updated by statements in WS(m)
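
The write set and the security condition can be written compactly as follows. This is only a restatement of the slide; the predicate writes(s, m), meaning "statement s modifies memory object m at run time", is a name assumed here for illustration.

```latex
\[
  WS(m) = \{\, s \mid s \text{ is a statement that legitimately modifies } m \,\}
\]
\[
  \forall s,\ \forall m:\quad \mathrm{writes}(s, m) \implies s \in WS(m)
\]
```

Any observed write by a statement outside WS(m) violates the condition and is treated as a candidate corruption point.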

  7. MemSherlock: Assembly Line
     • Pre-Debugging Phase
       • Instruments the program for the debugging phase
       • Extracts program information via static analysis
       • Needs to be performed only once
     • Debugging Phase
       • Tracks program state
       • Monitors memory writes and checks for violations of the security condition
       • Tracks tainted data and its propagation

  8. MemSherlock Architecture

  9. Pre-debugging: Generating Write Sets
     • MemSherlock analyses source code to determine write sets
     • For a program variable v, WS(v) includes
       • Assignment statements (i.e., v=expr)
       • Library function calls where v is passed as an argument that can be modified (e.g., memcpy(&v, src, n))
     • MemSherlock treats DLLs as black boxes
       • Assumption: A DLL is internally secure, but externally insecure
         • e.g., no stack overflows in the library functions
         • Sound for common, well-tested libraries (e.g., libc)
       • Requires library specifications
         • For each DLL, a list of functions and the arguments they might modify
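
A library specification of the kind described above could be as simple as a table mapping each function to the arguments it may write through. The sketch below is an assumption about the shape of such a table, not MemSherlock's actual file format; `LibSpecEntry`, `LIBC_SPEC`, and `call_may_modify` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical specification entry: bit i of modifies_mask is set when the
 * function may write through its i-th argument (0-based). */
typedef struct {
    const char *function;
    unsigned    modifies_mask;
} LibSpecEntry;

/* Illustrative entries only; a real specification would cover the whole library. */
static const LibSpecEntry LIBC_SPEC[] = {
    { "memcpy", 1u << 0 },   /* memcpy(dest, src, n) writes dest        */
    { "memset", 1u << 0 },   /* memset(s, c, n) writes s                */
    { "strcpy", 1u << 0 },   /* strcpy(dest, src) writes dest           */
    { "recv",   1u << 1 },   /* recv(fd, buf, len, flags) writes buf    */
    { "strlen", 0u      },   /* strlen(s) writes nothing                */
};

/* Returns 1 if a call to `function` may modify argument `arg_index`,
 * 0 if it may not, and -1 if the function is missing from the specification. */
int call_may_modify(const char *function, unsigned arg_index)
{
    size_t n = sizeof(LIBC_SPEC) / sizeof(LIBC_SPEC[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(LIBC_SPEC[i].function, function) == 0) {
            if (arg_index >= 8 * sizeof(unsigned)) {
                return 0;    /* beyond what the mask can describe */
            }
            return (int)((LIBC_SPEC[i].modifies_mask >> arg_index) & 1u);
        }
    }
    return -1;
}
```

With such a table, a call like `memcpy(&v, src, n)` adds the calling statement to WS(v) because argument 0 is marked as modifiable, while `src` is left untouched.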

  10. Dealing with Pointers
      • For a pointer variable p, two write sets are kept
        • WS(p) – Statements that modify p
        • WS(ref(p)) – Statements that modify the referent (e.g., *p=5)
      • ref(p) is resolved at runtime (during debugging)
      • The same analysis is performed for pointer-type function arguments at function calls
        • Removes the requirement for inter-procedural static analysis
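
The distinction between the two write sets can be seen in a few lines of C. The annotations below are an illustrative reading of the slide, and `demo_pointer_write_sets` is a hypothetical function written for this purpose.

```c
/* Each statement is annotated with the write set it would belong to. */
int demo_pointer_write_sets(void)
{
    int a = 1;
    int b = 2;
    int *p;

    p = &a;    /* in WS(p): modifies the pointer itself              */
    *p = 5;    /* in WS(ref(p)): at run time ref(p) resolves to a    */
    p = &b;    /* in WS(p): from here on ref(p) resolves to b        */
    *p = 7;    /* in WS(ref(p)): same kind of write, different target */

    return a * 10 + b;
}
```

Static analysis only needs to know that `*p = ...` writes to whatever p points at; which variable that is gets decided while the program runs, each time p is updated.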

  11. Chained Dereferences
      • The technique on the previous slide handles only simple dereferences
      • Source code rewriting is used to convert all chained dereferences to simple dereferences
      • Any other dereference that is not simple is converted in the same manner
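
As a rough illustration of the rewriting, the two functions below behave identically, but the second one dereferences only one pointer variable per statement. The temporaries `tmp_node` and `tmp_value` are hypothetical; the code actually emitted by the rewriting step will look different.

```c
struct node {
    int *value;
};

/* Original form: several dereferences chained in one statement. */
void store_chained(struct node **pp, int x)
{
    *(*pp)->value = x;
}

/* Rewritten form: every statement performs a single, simple dereference,
 * so each temporary pointer gets its own WS(p) and WS(ref(p)). */
void store_simplified(struct node **pp, int x)
{
    struct node *tmp_node = *pp;          /* hypothetical temporary */
    int *tmp_value = tmp_node->value;     /* hypothetical temporary */
    *tmp_value = x;
}
```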

  12. Output of Pre-debugging Phase
      • Simplified program
        • Simplified pointer dereferences
        • Compiled with debugging options
      • Input file for the debugger
        • Program variables and their write sets
        • Addresses of global symbols
        • Frame pointer offsets of local variables
        • Other flags that help the debugger

  13. MemSherlock Architecture: Debugging

  14. Debugging: Dynamic Monitoring
      • Runtime monitoring
        • State Maintenance
          • Incorporates taint analysis from TaintCheck
          • Produces a dynamic slice of the program leading to the vulnerability
        • Write Checking
          • Monitors and validates memory writes
      • Write sets are file name and line number pairs <f,l>
        • Instruction pointer IP is translated into <f,l>
      • Write sets are associated with program variables
        • A destination address is translated into a program variable
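
Translating an instruction pointer into a <f,l> pair amounts to a lookup in a line table derived from debug information. The sketch below assumes such a table is already available as an array sorted by start address; `LineEntry` and `ip_to_location` are hypothetical names, and this is not how Valgrind exposes the information.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of a hypothetical line table: the statement at <file, line>
 * begins at instruction address `start`. */
typedef struct {
    uintptr_t   start;
    const char *file;
    int         line;
} LineEntry;

/* Returns the entry with the greatest start address <= ip, or NULL when ip
 * lies before the first entry. `table` must be sorted by ascending start. */
const LineEntry *ip_to_location(const LineEntry *table, size_t n, uintptr_t ip)
{
    size_t lo = 0;
    size_t hi = n;                       /* search window is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].start <= ip) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    /* lo is now the index of the first entry starting after ip. */
    return lo == 0 ? NULL : &table[lo - 1];
}
```

The resulting pair is what gets compared against the write set of the variable that owns the destination address.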

  15. Keeping Program State
      [Diagram: the virtual address space in Program State 1 and Program State 2. Each stack shows the stack base, main, and fnc A, followed by fnc B in one state and fnc C in the other; a memory write to 0xABABABAB appears in both.]
      • A given memory region may correspond to different program variables depending on program state
      • The dynamic monitor keeps track of the memory mapping

  16. Debugging: Key Data Structures
      • Keeps two lists of memory regions
        • ActiveMemoryRegions
          • Memory corresponding to program variables or their referent memory regions
        • NonWritableRegions
          • Saved registers, return addresses, metadata encapsulating dynamically allocated memory regions

  17. Debugging: State Maintenance
      • Function calls/returns (memory)
        • Local variable addresses are calculated and added to ActiveMemoryRegions
        • Locations of the return address and saved registers are added to NonWritableRegions
      • Heap memory (memory)
        • malloc/free calls are intercepted
        • Allocated memory is added to ActiveMemoryRegions
        • The metadata encapsulating the buffer is added to NonWritableRegions
      • Pointer value updates (write sets)
        • Searches ActiveMemoryRegions to find the referent and updates its WS
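
The bookkeeping for calls, returns, and heap operations can be sketched with a small region list. This is a simplified model under stated assumptions: a fixed-capacity array, hypothetical names (`RegionList`, `region_add`, `region_remove_range`, `region_find`), and no write-set handling.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 64

typedef struct {
    uintptr_t   start;
    size_t      size;
    const char *label;    /* variable name, or e.g. "return address" */
} Region;

typedef struct {
    Region items[MAX_REGIONS];
    size_t count;
} RegionList;

/* Called on function entry (locals, return address) and on malloc().
 * Returns 0 on success, -1 when the list is full. */
int region_add(RegionList *list, uintptr_t start, size_t size, const char *label)
{
    if (list->count == MAX_REGIONS) {
        return -1;
    }
    list->items[list->count++] = (Region){ start, size, label };
    return 0;
}

/* Called on function return (with the frame's address range) and on free().
 * Drops every region lying entirely inside [lo, hi); returns how many. */
size_t region_remove_range(RegionList *list, uintptr_t lo, uintptr_t hi)
{
    size_t kept = 0;
    size_t removed = 0;
    for (size_t i = 0; i < list->count; i++) {
        const Region *r = &list->items[i];
        if (r->start >= lo && r->start + r->size <= hi) {
            removed++;
        } else {
            list->items[kept++] = *r;
        }
    }
    list->count = kept;
    return removed;
}

/* Maps an address to the region that currently owns it, newest first. */
const Region *region_find(const RegionList *list, uintptr_t addr)
{
    for (size_t i = list->count; i > 0; i--) {
        const Region *r = &list->items[i - 1];
        if (addr >= r->start && addr - r->start < r->size) {
            return r;
        }
    }
    return NULL;
}
```

Because regions are added and removed as frames come and go, the same address can resolve to different variables at different times, which is the situation shown on the "Keeping Program State" slide.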

  18. Debugging: Write Checking
      • When the instruction at IP modifies memory m
        • If m is in ActiveMemoryRegions
          • Determines the variable v it belongs to
          • Converts IP into <f,l>
          • Checks if <f,l> is in WS(v)
        • If the memory write check fails or m is in NonWritableRegions
          • Marks the operation as a memory corruption
          • Displays the vulnerability information
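
The check described above can be sketched as a single function. The types and names here (`Stmt`, `Region`, `Verdict`, `check_write`) are assumptions for illustration, the writer is taken to be already translated from IP into <f,l>, and the `WRITE_UNTRACKED` outcome is an addition of this sketch: the slide does not say what happens when the destination is in neither list.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const char *file;
    int         line;
} Stmt;

typedef struct {
    uintptr_t   start;
    size_t      size;
    const Stmt *write_set;   /* unused for non-writable regions */
    size_t      ws_len;
} Region;

typedef enum { WRITE_OK, WRITE_CORRUPTION, WRITE_UNTRACKED } Verdict;

static const Region *find_region(const Region *list, size_t n, uintptr_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= list[i].start && addr - list[i].start < list[i].size) {
            return &list[i];
        }
    }
    return NULL;
}

static int in_write_set(const Region *r, Stmt s)
{
    for (size_t i = 0; i < r->ws_len; i++) {
        if (r->write_set[i].line == s.line &&
            strcmp(r->write_set[i].file, s.file) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Classifies one memory write to address `dest` performed by `writer`. */
Verdict check_write(const Region *active, size_t n_active,
                    const Region *non_writable, size_t n_non_writable,
                    uintptr_t dest, Stmt writer)
{
    /* Return addresses, saved registers, heap metadata: never writable. */
    if (find_region(non_writable, n_non_writable, dest) != NULL) {
        return WRITE_CORRUPTION;
    }
    const Region *r = find_region(active, n_active, dest);
    if (r == NULL) {
        return WRITE_UNTRACKED;   /* assumption: not covered by the slide */
    }
    return in_write_set(r, writer) ? WRITE_OK : WRITE_CORRUPTION;
}
```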

  19. Generating Vulnerability Information
      • The slice of program contributing to the vulnerability
        • Statements that have propagated tainted values
        • Statements that have modified related memory regions
      • Dependency between memory objects involved in the vulnerability
        • Points-to analysis shows memory regions and how they were accessed
      • Program state
        • Call stack information
        • Write set information

  20. Example Test Case: Null HTTP
      http.c
           91: void ReadPOSTData(int sid) {
               …
          100: conn[sid].PostData=calloc(conn[sid].dat->in_ContentLength+1024, sizeof(char));
          101: if (conn[sid].PostData==NULL) { ...
          107: do {
          108: rc=recv(conn[sid].socket, pPostData, 1024, 0);
          109: …
      Error Report:
          --20361-- Error type: Heap Buffer Overflow
          --20361-- Dest Addr: 3AB3E360
          --20361-- IP: 0x804E5C7: ReadPOSTData (http.c:108)
          --20361-- Dest address resolved to:
          --20361-- Global variable "heap var"
                    @ 3AB3E280 (size: 224)
          --20361--
          --20361-- Memory allocated by 0x804E531:
                    ReadPOSTData (http.c:100)
          --20361-- TAINTED destination 3AB3E360
          --20361-- Fully tainted from:
          --20361-- 0x804E5C7: ReadPOSTData (http.c:108)
          --20361--
          --20361-- TAINTED size used during allocation
          --20361-- Tainted from:
          --20361-- 0x804E456: ReadPOSTData (http.c:100)
          --20361-- 0x804FBB5: read_header (http.c:153)
          --20361-- 0x805121B: sgets (server.c:211)

  21. Vulnerability Analysis Example
      http.c
           91: void ReadPOSTData(int sid) {
           92: char *pPostData;
               ...
          100: conn[sid].PostData=calloc( conn[sid].dat->in_ContentLength+1024, sizeof(char));
               ...
          107: do {
          108: rc=recv(conn[sid].socket, pPostData, 1024, 0);
               ...
      [Diagram labels: Create, Heap Object]

  22. Vulnerability Analysis Example
      http.c
          119: int read_header(int sid) {
          121: char line[2048];
               ...
          127: do {
          128: memset(line, 0, sizeof(line));
          129: sgets(line, sizeof(line)-1, conn[sid].socket);
               ...
          153: conn[sid].dat->in_ContentLength=atoi((char *)&line+16);
               ...
          169: if (conn[sid].dat->in_ContentLength<MAX_POSTSIZE) {
          170: ReadPOSTData(sid);
      [Diagram labels: Object, Taint]
      http.c
           91: void ReadPOSTData(int sid) {
           92: char *pPostData;
               ...
          100: conn[sid].PostData=calloc( conn[sid].dat->in_ContentLength+1024, sizeof(char));
               ...
          107: do {
          108: rc=recv(conn[sid].socket, pPostData, 1024, 0);
               ...
      [Diagram labels: Object, Use]

  23. Vulnerability Analysis Example
      http.c
          119: int read_header(int sid) {
          121: char line[2048];
               ...
          127: do {
          128: memset(line, 0, sizeof(line));
          129: sgets(line, sizeof(line)-1, conn[sid].socket);
               ...
          153: conn[sid].dat->in_ContentLength=atoi((char *)&line+16);
               ...
          169: if (conn[sid].dat->in_ContentLength<MAX_POSTSIZE) {
          170: ReadPOSTData(sid);
      [Diagram label: Create]
      server.c
          202: int sgets(char *buffer, int max, int fd)
          203: {
               ...
          209: conn[sid].atime=time((time_t*)0);
          210: while (n<max) {
          211: if ((rc=recv(conn[sid].socket, buffer, 1, 0))<0) {
               ...
      [Diagram labels: Taint, Object, Taint, Object]
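
The numbers in the error report on slide 20 can be checked by hand. The sketch below uses only values from that report (buffer at 3AB3E280 of size 224, write destination 3AB3E360) and the allocation expression on http.c line 100. The conclusion that the received Content-Length was -800 is an inference from the reported size, not something the slides state, and the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Distance of a write destination from the start of the buffer it resolved to. */
size_t offset_in_buffer(uintptr_t dest, uintptr_t base)
{
    return (size_t)(dest - base);
}

/* A write that begins before the buffer or at/after its end is out of bounds. */
int write_starts_out_of_bounds(uintptr_t dest, uintptr_t base, size_t size)
{
    return dest < base || offset_in_buffer(dest, base) >= size;
}

/* Line 100 allocates in_ContentLength + 1024 bytes, so the Content-Length
 * that yields a given allocation size is alloc_size - 1024. */
long content_length_for_alloc(long alloc_size)
{
    return alloc_size - 1024;
}
```

Here 0x3AB3E360 - 0x3AB3E280 = 0xE0 = 224, so the flagged write lands on the first byte past the 224-byte buffer. A 224-byte allocation corresponds to a Content-Length of 224 - 1024 = -800, a value that would also pass the `in_ContentLength<MAX_POSTSIZE` test on line 169 for any positive MAX_POSTSIZE.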

  24. Implementation
      • Source code is rewritten using CIL (C Intermediate Language)
      • CodeSurfer was used to extract program variables and their write sets
        • A commercial static analysis tool
      • objdump and dwarfdump were used to extract global symbol information
      • Dynamic monitoring is implemented in Valgrind
        • An open source emulator

  25. Evaluation
      • Tested 11 real-world applications with known memory corruption vulnerabilities
      • Test cases included
        • Stack/heap buffer overflows and format string vulnerabilities
        • Both control-flow and non-control-data attacks
      • Testing methodology
        • Programs were run under MemSherlock
        • Exploit programs were used to attack the applications
        • Log and replay was not used

  26. Evaluation Results
      Type abbreviations: (S)tack overflow, (H)eap overflow, and (F)ormat string

  27. False Negatives
      • Prozilla:
        • memcpy uses a kernel function to manipulate page tables when copying entire pages
        • Valgrind cannot trace into the kernel
        • Can be prevented by function wrappers
      • Other false negatives are theoretically possible
        • structs within unions or arrays
          • Current implementation does not support unions
          • Currently does not differentiate between elements of an array
        • Memory corruption errors inside DLLs

  28. False Positives
      • Embedded assembly
      • Incomplete library specification
        • Library functions keeping internal state (e.g., strtok(NULL, delim))
        • Library functions that modify global variables as side effects (e.g., optarg, errno)
        • Pointers that point to hidden global structures (e.g., getdatetime() in time.h)
      • struct pointers
        • void pointers that are type-cast to modify struct variables
        • Since the pointer is not of type struct, MemSherlock fails to update accordingly

  29. Conclusion
      • Fully automated vulnerability analysis
        • The analysis output is intuitive and human-readable
      • Future challenges
        • Automated, long-term fix of vulnerabilities
          • Semantic consistency is a great challenge
        • Automated, temporary fix of vulnerabilities
        • Generating vulnerability conditions
        • Improving signature generation

  30. Thank You
