Process Coloring: An Information Flow-Preserving Approach to Malware Investigation
Eugene Spafford, Dongyan Xu, Ryan Riley
Department of Computer Science and Center for Education and Research in Information Assurance and Security (CERIAS), Purdue University
Xuxian Jiang
Department of Computer Science, George Mason University
NICIAR Site Visit, West Lafayette, IN, July 25, 2008
Outline
• Project overview and Heilmeier Q&A
• Quarterly update and demo
• “PC+DDFA” integration
• Administrative issues
Process Coloring (PC) Overview
• Key idea: propagating and logging application provenance information (“colors”) along OS-level information flows
• Existing tools only consider direct causality relations without preserving and exploiting application provenance information
[Figure labels: Virtual Machine; Log; Log Monitor; Text Editor, File Manager, Web Browser, Tax Express; Logger; Guest OS; Virtual Machine Monitor (VMM)]
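The key idea can be sketched in a few lines of Python. This is a toy illustration, not the PC prototype: the `ColorTracker` class, its method names, and the three-event model (fork, write, read) are assumptions made for this example.

```python
from collections import defaultdict


class ColorTracker:
    """Toy model of color diffusion along OS-level information flows (hypothetical)."""

    def __init__(self):
        # Maps a process or file name to the set of colors it carries.
        self.colors = defaultdict(set)
        self.log = []

    def assign(self, subject, color):
        """Initial coloring, e.g. one color per application or service."""
        self.colors[subject].add(color)

    def _flow(self, op, src, dst):
        # Information flows from src to dst, so dst inherits every color of src.
        self.colors[dst] |= self.colors[src]
        # Each log entry records the destination's colors after the flow.
        self.log.append((op, src, dst, frozenset(self.colors[dst])))

    def fork(self, parent, child):
        self._flow("fork", parent, child)

    def write(self, process, path):
        self._flow("write", process, path)

    def read(self, process, path):
        self._flow("read", path, process)


tracker = ColorTracker()
tracker.assign("firefox", "browser")
tracker.assign("turbotax", "tax")
tracker.fork("firefox", "helper")         # child inherits "browser"
tracker.write("helper", "download.bin")   # file is tainted by its writer
tracker.read("turbotax", "download.bin")  # reader picks up the file's colors
```

After these three events the `turbotax` process carries both the "tax" and "browser" colors, which is the kind of mixing a color-based policy can flag.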
PC Usage Scenario: Server-Side Malware Attack
• Capability 1: PC malware alert: “No shell process should have the color of Apache”
• Capability 2: Color-based identification of malware break-in point
• Capability 3: Color-based log partition for contamination analysis
[Figure labels: Initial coloring (init, rc, s30sendmail, s45named, s55sshd, s80httpd); Coloring diffusion (httpd, /bin/sh, wget, netcat, Rootkit); Local files, /etc/shadow, Confidential Info; Syscall Log]
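The three capabilities can be sketched over a colored log as follows. The log tuple layout, the `SHELLS` set, and the function names are assumptions for illustration and do not reflect the prototype's actual log format.

```python
# Hypothetical log entries: (sequence number, process, syscall, colors on the entry).
LOG = [
    (1, "httpd", "accept", {"httpd"}),
    (2, "sendmail", "read", {"sendmail"}),
    (3, "/bin/sh", "execve", {"httpd"}),
    (4, "wget", "connect", {"httpd"}),
    (5, "sshd", "accept", {"sshd"}),
    (6, "wget", "write", {"httpd"}),
]

SHELLS = {"/bin/sh", "/bin/bash"}


def shell_alerts(log, forbidden_color):
    """Capability 1: flag shell processes that carry a forbidden color."""
    return [e for e in log if e[1] in SHELLS and forbidden_color in e[3]]


def partition(log, color):
    """Capability 3: keep only the entries carrying the given color."""
    return [e for e in log if color in e[3]]


alerts = shell_alerts(LOG, "httpd")
# Capability 2: the colors on the alert entries point back to the service
# that was initially assigned that color, i.e. the likely break-in point.
break_in_colors = set().union(*(e[3] for e in alerts))
contaminated = partition(LOG, "httpd")
```

Here the single alert (entry 3) carries only the "httpd" color, and the partition keeps entries 1, 3, 4 and 6 for contamination analysis while discarding unrelated sendmail and sshd activity.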
PC Usage Scenario: Client-Side Malware Attack
• PC malware alert: “Web browser and tax colors should never mix”
[Figure labels: www.malicious.net; firefox (Web Browser), turbotax (Tax), warcraft (Games), notepad (Editor); Agobot; Tax files]
PC Usage Scenario: Client-Side Sensitive Data Protection
• PC data theft alert: “Tax file should never leave this computer”
• This is not as simple as it sounds!
[Figure labels: turbotax (Tax), warcraft (Games), notepad (Editor), outlook (Email); Tax files, Data files]
Heilmeier Question 1: What are you trying to do?
• Tracking and logging OS-level information flows
• Being extended to both OS and language levels (“PC+DDFA”)
• Tainting processes and data with application provenance information (“colors”) for
  • Detecting and investigating malware activities
  • Enforcing sensitive data protection policies
• Using virtualization for stronger tamper-resistance
• Taking logging and real-time detection outside the guest VM
Heilmeier Question 2: How is it done now?
• Information flow tracking at multiple levels
• OS level
  • Only considering direct causality in each system call
  • No provenance (“color”) tainting and propagation
• Language level
  • Only tracking information flow within a program
  • No information flow tracking across programs
• Instruction level
  • Difficult to understand attack semantics
  • Significant runtime performance overhead
Heilmeier Question 3: What’s new and why will it succeed?
• What’s new?
  • Color-based malware alert and sensitive data protection
  • Supporting both on-line detection and off-line forensics
  • Stronger tamper-resistance and non-stop VM operation
  • One of the first to combine OS and language-level information flows
• Why will it succeed?
  • Practical, deployable system based on classic theory
  • Running prototype showing effectiveness and practicality
  • Technical challenges identified and addressed
  • Attracting external interest (SWRI, Lockheed Martin)
Heilmeier Question 4: If successful, what difference will it make?
• An extensible, system-level framework for attack/violation detection, investigation and recovery
• Specification and enforcement of log and color-based policies for malware alert and data protection
• Lower false positive and false negative rates; more timely detection; higher investigation efficiency
• Ready for virtualization-based infrastructures (e.g. honeynets, enterprises and data centers)
Heilmeier Question 5: Your timeline, cost and success metrics?
• Timeline (axis: 6/2007, 12/07, 6/08, 12/08), milestones in slide order:
  - Basic PC prototype for server-side operation
  - PC prototype for client-side operation (“brown problem” solution)
  - Set up “living lab” VM for evaluation
  - Extensive evaluation
  - Design, prototyping and demonstration of “PC+DDFA” integration
  - Recovery and replay
  - PC across machines
  - Data lifetime analysis for data theft defense
• Cost: $xxx,xxx ($xxx,xxx subcontract)
• Success metrics
  • Accuracy, efficiency and timeliness (more later)
Summary of Achievements
• Improved sink insulation implementation
• Cleaned up log management and visualization
• Set up “living lab” client VM for evaluation
• Preliminary design for “PC+DDFA”
Color Saturation Mitigation (Brown Problem)
• Policy: “Data written by financial application should not be read by applications that can transmit it outside of the system.”
[Figure labels: colors Browser, Finance, Doc Edit; files notes.txt, .recently_used, Finances.pdf; False Alarm]
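A minimal sketch of how such a policy might be checked on each file read, and how a saturated file triggers a false alarm. The `NETWORK_CAPABLE` set, the `check_read` function, and the color assignments are hypothetical and chosen only to mirror the file names on the slide.

```python
# Hypothetical set of applications that can transmit data off the machine.
NETWORK_CAPABLE = {"firefox", "outlook"}


def check_read(reader, path, file_colors, protected="finance"):
    """Return an alert string if the read violates the policy, else None."""
    colors = file_colors.get(path, set())
    if reader in NETWORK_CAPABLE and protected in colors:
        return f"ALERT: {reader} read {path} carrying color '{protected}'"
    return None


file_colors = {
    "Finances.pdf": {"finance"},
    # A shared bookkeeping file touched by many applications ends up
    # carrying every color ("brown"), including the protected one.
    ".recently_used": {"finance", "doc_edit", "browser"},
    "notes.txt": {"doc_edit"},
}

true_alarm = check_read("firefox", "Finances.pdf", file_colors)
false_alarm = check_read("firefox", ".recently_used", file_colors)
no_alarm = check_read("firefox", "notes.txt", file_colors)
```

The second read raises an alert even though no financial data is involved, because the policy check sees only colors, not content.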
Zoom-in View of Sink File F1040.pdf
Sink File Insulation
• Some files become color sinks
• Examples:
  • .recently_used
  • .gnome2/accels/evince
  • .gnome2/accels/gedit
• Color propagated unnecessarily
• Simply “insulate” these sinks
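One plausible reading of insulation is that no color flows into or out of a listed sink file. The sketch below replays a short event trace with and without that rule. The slide does not spell out the exact semantics, so the rule, the `simulate` function, and the evince/gedit trace are all assumptions.

```python
SINKS = frozenset({".recently_used", ".gnome2/accels/evince", ".gnome2/accels/gedit"})


def simulate(events, initial, insulated=frozenset()):
    """Replay (op, process, path) events and return the final color sets."""
    colors = {name: set(cs) for name, cs in initial.items()}
    for op, process, path in events:
        if path in insulated:
            continue  # assumed semantics: no color flows into or out of a sink
        src, dst = (process, path) if op == "write" else (path, process)
        colors.setdefault(dst, set()).update(colors.get(src, set()))
    return colors


initial = {"evince": {"finance"}, "gedit": {"doc_edit"}}
events = [
    ("write", "evince", ".recently_used"),  # viewer updates the recent-files list
    ("read", "gedit", ".recently_used"),    # editor reads the same list
    ("write", "gedit", "notes.txt"),        # editor saves an unrelated note
]

plain = simulate(events, initial)
insulated = simulate(events, initial, insulated=SINKS)
```

Without insulation, `notes.txt` ends up carrying the "finance" color purely through the shared `.recently_used` file. With insulation it carries only "doc_edit".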
Zoom-in View F1040.pdf
“Living Lab” VM for Evaluation
• A Linux VM running on Xen
• System configuration:
  • 256MB RAM
  • 1.8GHz CPU
  • Connected to the Internet
• Applications:
  • Firefox
  • OpenOffice
  • Standard GNOME applications
• To be used daily by Ryan (more users in the Fall)
“Living Lab” VM: Demo A live Demo
Evaluation Metrics – Accuracy of Alerts
• False positive and false negative rates
• Living lab experiment
  • Specify malware detection and sensitive data protection policies
  • Analyze alerts raised (true or false)
• Attack injection experiments
  • Specify malware detection and sensitive data protection policies
  • Launch malware instances or sensitive data thefts
  • Count number of instances caught and missed
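The accuracy metrics can be computed from labeled trial outcomes as sketched below. The `alert_accuracy` function, its input format, and the sample counts are made up for illustration. The rate definitions used here (false positives over benign trials, false negatives over attack trials) are one common convention and are an assumption about how the project reports them.

```python
def alert_accuracy(outcomes):
    """Compute alert error rates from (attack_present, alert_raised) pairs."""
    tp = sum(1 for attack, alert in outcomes if attack and alert)
    fn = sum(1 for attack, alert in outcomes if attack and not alert)
    fp = sum(1 for attack, alert in outcomes if not attack and alert)
    tn = sum(1 for attack, alert in outcomes if not attack and not alert)
    return {
        # Share of benign trials that still raised an alert.
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        # Share of injected attacks that were missed.
        "false_negative_rate": fn / (fn + tp) if fn + tp else 0.0,
    }


# Illustrative numbers only: 10 injected attacks (8 caught, 2 missed)
# and 10 benign sessions (1 spurious alert).
outcomes = (
    [(True, True)] * 8
    + [(True, False)] * 2
    + [(False, True)] * 1
    + [(False, False)] * 9
)
rates = alert_accuracy(outcomes)
```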
Evaluation Metrics - Efficiency
• System runtime efficiency
  • Performance of LMBench, UNIXBench and ApacheBench
    • w/ process coloring
    • w/o process coloring
• Malware investigation efficiency
  • Number of colors in alert-raising log entry
  • Total number of colors in system
  • % of log entries w/ “problematic” color(s)
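The three investigation-efficiency numbers can be derived from a colored log as in this sketch. The log representation (one color set per entry), the `investigation_metrics` function, and the sample data are assumptions for illustration.

```python
def investigation_metrics(log, alert_index):
    """Summarize how much of the log an investigator must examine.

    log is a list of color sets, one per log entry; alert_index selects
    the entry that raised the alert.
    """
    problematic = log[alert_index]
    all_colors = set().union(*log)
    # An entry is relevant if it shares at least one color with the alert.
    hits = sum(1 for colors in log if colors & problematic)
    return {
        "colors_in_alert": len(problematic),
        "total_colors": len(all_colors),
        "pct_problematic_entries": 100.0 * hits / len(log),
    }


log = [
    {"httpd"}, {"sendmail"}, {"httpd"}, {"sshd"},
    {"named"}, {"httpd"}, {"sendmail"}, {"sshd"},
]
metrics = investigation_metrics(log, alert_index=2)
```

A low percentage of problematic entries means the color-based partition lets the investigator set aside most of the log.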
Evaluation Metrics – Timeliness of Alerts
• Measure the interval between
  • A malware attack or data protection breach
  • Its detection
• Duration of interval depending on malware behavior (in-action or dormant)
Technology Transfer
• Within NICECAP Program (ongoing)
  • “PC+DDFA” integration with SWRI/UTexas team
• To Lockheed Martin (ongoing)
  • Target environment: Virtual honeynet architecture with both server and client VMs
  • PC a good fit for attack detection, monitoring and investigation
  • Effort starting this summer
Summary of Integration Activities
• Held multiple meetings with SWRI/UTexas team
• Identified motivating usage scenarios
• Defined API between PC and DDFA
• Planned detailed design and implementation
A Motivating Scenario
• PC false alert: “Sensitive file should never leave this computer”
• Sink file insulation doesn’t help…
[Figure labels: turbotax (Tax), warcraft (Games), notepad (Editor), outlook (Email), File Manager; Sensitive Data files, Tax files, My photo]
PC or DDFA Alone Cannot Solve It
• PC
  • Process-level information flow treating processes as black boxes
  • Overly conservative color tainting
  • Color tainting across processes
• DDFA
  • Language-level information flow confined within one process
  • Not aware of colors across the system
  • Fine-grain data flow tracking within a process
Example: Without “PC+DDFA” Integration
[Figure labels: File 1, File 2, Process, New file]
Example: With “PC+DDFA” Integration
[Figure labels: File 1, File 2, Process (w/ DDFA), New file; fetch_color(file1), fetch_color(file2), push_color(new_file, ); Process Coloring (Operating System level)]
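The contrast between the two examples can be sketched as follows. Only the names `fetch_color` and `push_color` come from the slide. The slide leaves the second argument of `push_color` blank, so the `colors` parameter here is an assumption, as are the `ProcessColoring` class, the `write_output` helper, and the sample colors.

```python
class ProcessColoring:
    """In-memory stand-in for the OS-level PC side (hypothetical)."""

    def __init__(self, file_colors):
        self.file_colors = {p: set(c) for p, c in file_colors.items()}

    def fetch_color(self, path):
        # Return a copy so callers cannot mutate PC's state by accident.
        return set(self.file_colors.get(path, set()))

    def push_color(self, path, colors):
        # Assumed signature: the slide does not show the second argument.
        self.file_colors[path] = set(colors)


def write_output(pc, sources, output):
    """Color the output with the union of the colors of the given sources."""
    colors = set()
    for path in sources:
        colors |= pc.fetch_color(path)
    pc.push_color(output, colors)


pc = ProcessColoring({"file1": {"photo"}, "file2": {"tax"}})

# Without integration: the process is a black box, so everything it has
# read is treated as a source of everything it writes.
write_output(pc, ["file1", "file2"], "new_file_blackbox")

# With integration: DDFA reports that only file1's data reached the new file.
write_output(pc, ["file1"], "new_file_ddfa")
```

In the black-box case the new file is tainted with "tax" even though no tax data reached it. The DDFA-informed call avoids that over-tainting.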
Prototyping Plan
• SWRI+UTexas
  • Making DDFA color-aware
  • Instrumenting a real-world file manager, PCManFM, with DDFA capability
• Purdue
  • Implementing fetch_color() and push_color() in PC
  • Testing instrumented PCManFM in living lab VM
• To show a joint demo before end of project
1. Moving the Subcontract
• GMU subcontract PI Xuxian Jiang will move to North Carolina State University (NCSU) in August
• Remaining balance in subcontract: $xx,xxx
• Seeking approval to move the subcontract to NCSU
2. No-cost Extension
• Purdue balance as of 7/22/2008: $xxx,xxx.xx
• Not expected to run out by 12/06/2008
• Inquiring about possibility of no-cost extension
LSSD Process Coloring (PC) for Malware Alert and Investigation: An OS-level Information Flow Preserving Approach
• APPROACH
  • Track OS-level information flows
  • Taint processes/data based on their influence on each other
  • Record color(s) in log entries
  • Integrate with intra-process DDFA
• NEW CAPABILITIES
  • Color-based malware alert
  • Color-based malware break-in point identification
  • Color-based log partitioning
• PLAN / PROGRESS
  • Model process color diffusion in real OS (done)
  • Demonstrate PC prototype in a malware scenario
    • Includes both server (done) and client (done) side solutions
  • Mitigate color saturation effect in malware alert
    • Profiling and visualization (done)
    • Reducing false positives caused by legitimate color mixing (done)
  • Proof-of-concept demo of “PC+DDFA” (Dec. 08)
  • Evaluate PC in “living lab” VMs (July 08 to Dec. 08)
• APPLICATIONS
  • System monitoring and malware (e.g. bots) detection
  • Malware forensics
  • Sensitive data protection
Thank you!
For more information about the Process Coloring project:
http://friends.cs.purdue.edu/projects/pc
PC@cs.purdue.edu