This research investigates a process coloring technique for malware investigation, preserving and exploiting break-in provenance information to better understand and analyze malware. The research explores both server-side and client-side malware investigation and proposes potential solutions for color saturation issues.
Process Coloring: an Information Flow-Preserving Approach to Malware Investigation
NICIAR Local Site Visit, Annapolis Junction, MD, January 25, 2008
Eugene Spafford, Dongyan Xu, Ryan Riley (Department of Computer Science and Center for Education and Research in Information Assurance and Security (CERIAS), Purdue University)
Xuxian Jiang (Department of Computer Science, George Mason University)
Outline • Project overview • Results in 1st and 2nd quarters • Open problems and potential solutions • Plan for 3rd quarter
Motivation • Internet malware remains a top threat • Malware: viruses, worms, rootkits, spyware, bots…
Technical Approach: Process Coloring (PC) • Key idea: propagating and logging malware break-in provenance information (“colors”) along OS-level information flows • Existing tools only consider direct causality relations without preserving and exploiting break-in provenance information
[Figure: architecture diagram; labels: Virtual Machine, Guest OS, Apache, MySQL, DNS, Sendmail, Logger, Log, Log Monitor, Virtual Machine Monitor (VMM)]
New Capabilities Enabled by PC • Capability 1: Color-based malware warning • Capability 2: Color-based identification of malware break-in point • Capability 3: Color-based log partition for contamination analysis
[Figure: initial coloring and coloring diffusion; labels: init, rc, s30sendmail, s45named, s55sshd, s80httpd, httpd, /bin/sh, wget, Rootkit, netcat, Local files, /etc/shadow, Confidential Info, Syscall Log]
Task I: Color Diffusion Model Definition (Month 1-6)
Status: Color diffusion model instantiated in prototype.
• Color Diffusion Model (s = subject, o = object)
Operation | Abstract event | Diffusion syscalls | Color update
CREATE | create <s1, o1> | create, mkdir, link | color(o1) = color(s1)
CREATE | create <s1, s2> | fork, vfork, clone | color(s2) = color(s1)
READ | read <s1, o1> | read, readv, recv | color(s1) = color(s1) ∪ color(o1)
READ | read <s1, s2> | ptrace | color(s1) = color(s1) ∪ color(s2)
WRITE | write <s1, o1> | write, writev, send | color(o1) = color(s1) ∪ color(o1)
WRITE | write <s1, s2> | ptrace, wait, signal | color(s2) = color(s1) ∪ color(s2)
DESTROY | destroy <s1, o1> | unlink, rmdir, close | (no color update listed)
DESTROY | destroy <s1, s2> | exit, kill | (no color update listed)
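The CREATE, READ, and WRITE rules of the diffusion model can be sketched in a few lines of Python. This is only an illustration of the set-union rules listed on the slide, not the prototype's code; the `Entity` class, the function names, and the example scenario (a shell spawned by httpd dropping a file that sshd later reads) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A subject (process) or object (file, socket). Hypothetical type."""
    name: str
    colors: set = field(default_factory=set)


def create(s1, target):
    # CREATE: the new process or object inherits the creator's colors.
    target.colors = set(s1.colors)


def read(s1, source):
    # READ: the reader absorbs the colors of what it reads.
    s1.colors |= source.colors


def write(s1, target):
    # WRITE: the written-to process or object absorbs the writer's colors.
    target.colors |= s1.colors


# Hypothetical scenario using service colors named on the slides.
httpd = Entity("httpd", {"s80httpd"})
sshd = Entity("sshd", {"s55sshd"})
shell = Entity("/bin/sh")
dropped_file = Entity("dropped_file")

create(httpd, shell)          # httpd forks a shell
create(shell, dropped_file)   # the shell creates a file
write(shell, dropped_file)    # ...and writes to it
read(sshd, dropped_file)      # sshd later reads that file
```

After these calls the shell and the dropped file carry only the httpd color, while sshd carries both colors, which is the kind of mixing a color-based warning could flag.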
Task II: PC for Server Side Malware Investigation (Month 2-6)
Status: Server-side malware investigation supported by prototype.
• Server-side malware investigation • Consolidated server with independent server applications • “Clustered” information flows partitioned by server application colors • Color mixing highly unlikely between applications
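Partitioning a log by server application color could look roughly like the sketch below. The log entry layout (a dict with `pid`, `syscall`, and `colors` keys) and the function name are assumptions for illustration; the slides do not specify the prototype's log format.

```python
from collections import defaultdict


def partition_by_color(log):
    """Group log entries by each color they carry (hypothetical helper)."""
    partitions = defaultdict(list)
    for entry in log:
        for color in sorted(entry["colors"]):
            partitions[color].append(entry)
    return dict(partitions)


# Hypothetical syscall log entries tagged with service colors.
syscall_log = [
    {"pid": 812, "syscall": "read", "colors": {"s80httpd"}},
    {"pid": 640, "syscall": "write", "colors": {"s55sshd"}},
    {"pid": 813, "syscall": "write", "colors": {"s80httpd"}},
]

by_color = partition_by_color(syscall_log)
```

An entry carrying several colors would land in every matching partition, so on a server where colors rarely mix, each partition stays small and specific to one application.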
Task III: PC for Client Side Malware Investigation (Month 6-18)
Status: Causes of color saturation identified and being studied.
• Client-side malware investigation • Inter-dependent client applications (e.g., text editor → compiler; latex → dvips → ps2pdf) • More inter-application information flows • Color saturation a serious concern
PC Prototype • Basic release packaged • On Xen 3.0.4_1 • Targeting server apps • For testing: Good • For production: Bad • Still buggy and unoptimized
PC Prototype Challenge - Logging • Log data “trafficking” • Harder than you think • Not optimal • Problem in Xen community
PC Prototype Challenge - Logging • Details of our solution • Shared pages • Ring buffers • Host kernel buffers • 2 or 3 copies (to be optimized…)
Ring Buffer-based Solution
[Figure: log entries pass between DomU (Guest) and Dom0 (Host) through shared memory on the Xen VMM, in steps numbered 1 to 6]
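The ring-buffer idea behind the logging path can be illustrated with a minimal single-producer, single-consumer ring in Python. This is a conceptual sketch only: it does not use Xen's shared-page or ring interfaces, and the class name, capacity, and entry strings are assumptions for the example.

```python
class RingBuffer:
    """Fixed-size ring: a guest-side producer puts, a host-side consumer gets."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.capacity = capacity
        self.produced = 0  # total entries ever written
        self.consumed = 0  # total entries ever read

    def put(self, entry):
        # Refuse to overwrite unread entries when the ring is full.
        if self.produced - self.consumed == self.capacity:
            return False
        self.slots[self.produced % self.capacity] = entry
        self.produced += 1
        return True

    def get(self):
        # Return None when there is nothing to read.
        if self.consumed == self.produced:
            return None
        entry = self.slots[self.consumed % self.capacity]
        self.consumed += 1
        return entry


ring = RingBuffer(capacity=2)
results = [ring.put("entry-a"), ring.put("entry-b"), ring.put("entry-c")]
first = ring.get()
retry = ring.put("entry-c")
drained = [ring.get(), ring.get(), ring.get()]
```

The third `put` fails because the ring is full, which is the point where a real logger has to block, drop, or retry; the extra copies mentioned on the previous slide happen on either side of a structure like this.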
Prototype Status • First version completed w/ simple documentation • Initial experiments with • Malware (e.g., Lion worm) • Untrusted application (e.g., Skype) Demo!
Research Challenge: PC on Client Side • Focus of ongoing investigation • Much harder than server side • Work-in-progress • Observing client-side color mixing via experiments • Identifying root causes of color saturation • Investigating approaches to mitigation
Two Root Causes of Color Saturation • Important observation from experiments • It’s all about sinks: • Sink processes • Sink files • Guidance for mitigating color saturation
Sink Process – 1st Example: The “GUI” experiment
[Figure labels: GUI Manager, Firefox, Thumbnailer, Document.gif, Document.pdf]
Sink Process – 2nd Example: The “File Preview” experiment
[Figure labels: OpenOffice, Employees.doc, Finances.doc, Preview, Read, Write]
Sink File Example: The “Configuration File” experiment
[Figure labels: OpenOffice, Employees.doc, Finances.doc, Recently Used]
1st Possible Approach: Insulate the sink • Example – Sink process: cannot inherit colors; cannot pass on colors; children insulated until exec() • Issues: implicit trust; easier covert channels
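The insulation rules for a sink process can be sketched as follows. The `Process` class and its method names are hypothetical and exist only to illustrate the three rules on the slide: an insulated process does not inherit colors, does not pass colors on, and its children stay insulated until they exec.

```python
class Process:
    """Toy process with a color set and an insulation flag (hypothetical)."""

    def __init__(self, name, colors=(), insulated=False):
        self.name = name
        self.colors = set(colors)
        self.insulated = insulated

    def read_from(self, source_colors):
        # An insulated sink does not inherit colors from what it reads.
        if not self.insulated:
            self.colors |= set(source_colors)

    def colors_to_pass_on(self):
        # An insulated sink does not pass colors on when it writes.
        return set() if self.insulated else set(self.colors)

    def fork(self, child_name):
        # Children start with the parent's colors and insulation state.
        return Process(child_name, self.colors, self.insulated)

    def do_exec(self, new_name):
        # Insulation ends once the child executes a new program.
        self.name = new_name
        self.insulated = False


gui = Process("gui-manager", insulated=True)
gui.read_from({"firefox"})           # ignored: the sink is insulated
child = gui.fork("gui-child")
insulated_before_exec = child.insulated
child.do_exec("thumbnailer")
child.read_from({"firefox"})         # now diffuses normally
```

The sketch also makes the "implicit trust" issue visible: whatever the insulated process reads leaves no trace in its color set.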
1st Possible Approach – Covert Channels
• .recently-used
<?xml version="1.0"?>
<RecentFiles>
  <RecentItem>
    <URI>file:///DTO_Purdue_012508.ppt</URI>
    <Mime-Type>application/vnd.ms-powerpoint</Mime-Type>
    <Timestamp>1200410028</Timestamp>
    <Groups>
      <Group>openoffice.org</Group>
      <Group>staroffice</Group>
      <Group>starsuite</Group>
    </Groups>
  </RecentItem>
</RecentFiles>
2nd Possible Approach: Use program execution context information • Instead of always insulating a sink, diffuse color if [context predicate] holds • Program execution context: system call type and parameter; system call timing; call stack • Issues: requires execution context information; feasible? fake-able?
2nd Possible Approach - Call Stack • Example: • Imagine we require…main() -> terminate() -> write_config() -> write() • Can an attacker fake this call stack?
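A call-stack context predicate of this kind might be checked as in the sketch below. It assumes the call stack is available as a list of function names ordered from outermost to innermost frame, and it treats the leading "…" in the example as "any frames may precede main()"; both are assumptions, as is the function name.

```python
# Required innermost frames, taken from the slide's example.
REQUIRED_STACK = ["main", "terminate", "write_config", "write"]


def context_predicate(call_stack, required=REQUIRED_STACK):
    """True if the innermost frames of call_stack equal required, in order."""
    if len(call_stack) < len(required):
        return False
    return call_stack[-len(required):] == required


config_write = ["_start", "main", "terminate", "write_config", "write"]
document_write = ["_start", "main", "save_document", "write"]

diffuse_on_config_write = context_predicate(config_write)
diffuse_on_document_write = context_predicate(document_write)
```

The check itself is simple; the open question from the slide remains whether an attacker inside the process can arrange for a stack that satisfies it.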
3rd Possible Approach: Leverage program-level information flows • Available from peer NICIAR projects • Determine whether data is “really” used • Pass this information to PC • Diffuse color only if the process really uses the data • In talks with the SWRI+UT team • Issues: requires program source code
4th Possible Approach: Leverage application-level virtualization (e.g., MS SoftGrid, Trustware BufferZone) • Group, isolate, and confine related processes • Issues: handling valid interaction; usability
Discussion • Potential points for discussion • Universality? • Trust in application? • Features vs. information flow? • Will flow problems continue to increase?
Plan for 3rd Quarter • Detailed design of PC for client side • Implement some proposed approaches • Evaluation of each approach • Observing color mixing in a client • Experimenting with malware instances (e.g., bots)
Thank you! For more information about the ProcessColoring project: http://friends.cs.purdue.edu/projects/pc PC@cs.purdue.edu