Research in Next-Generation Digital Forensics. Golden G. Richard III, Ph.D. Associate Professor Dept. of Computer Science golden@cs.uno.edu http://www.cs.uno.edu/~golden. Digital Forensics Research Group. Fall 2006: Thursdays @ 1pm in NSSAL (Math 322) Primary Collaborators:
Research in Next-Generation Digital Forensics Golden G. Richard III, Ph.D. Associate Professor Dept. of Computer Science golden@cs.uno.edu http://www.cs.uno.edu/~golden
Digital Forensics Research Group • Fall 2006: • Thursdays @ 1pm in NSSAL (Math 322) • Primary Collaborators: • Vassil Roussev [UNO CS] • Vico Marziale [UNO Ph.D. student] • Frank Adelstein [ATC-NY]
Digital Forensics Definition: “Tools and techniques to recover, preserve, and examine digital evidence on or transmitted by digital devices.” Devices include computers, PDAs, cellular phones, videogame consoles, copy machines, printers, …
Examples of Digital Evidence • Threatening emails • Documents (e.g., in places they shouldn’t be) • Suicide notes • Bomb-making diagrams • Malicious Software • Viruses • Worms • … • Child pornography (contraband) • Evidence that network connections were made between machines • Cell phone SMS messages
Facts (or: Why Digital Forensics?) • Deleted files aren’t securely deleted • Recover deleted file + when it was deleted! • Renaming files to avoid detection is pointless • Formatting disks doesn’t delete much data • Web-based email can be (partially) recovered directly from a computer • Files transferred over a network can be reassembled and used as evidence
Facts (2) • Uninstalling applications is much more difficult than it might appear… • “Volatile” data hangs around for a long time (even across reboots) • Remnants from previously executed applications • Using encryption properly is difficult, because data isn’t useful unless decrypted • Anti-forensics (privacy-enhancing) software is mostly broken • “Big” magnets (generally) don’t work • Media mutilation (except in the extreme) doesn’t work • Basic enabler: Data is very hard to kill
Privacy Through Media Mutilation • degausser • or forensically-secure file deletion software (but make sure it works!)
Digital Forensics Process • Identification of potential digital evidence • Where might the evidence be? • Which devices did the suspect use? • Preservation and copying of evidence • On the crime scene… • First, stabilize evidence…prevent loss and contamination • If possible, make identical copies of evidence for examination • Careful examination of evidence • Presentation • “The FAT was fubared, but using a hex editor I changed the first byte of directory entry 13 from 0xEF to 0x08 to restore ‘HITLIST.DOC’…” • “The suspect attempted to hide the Microsoft Word document ‘HITLIST.DOC’ but I was able to recover it without tampering with the file contents.” • Legal: Balance of need to investigate vs. privacy
“Traditional” Digital Forensics • Pull the plug • “Image” (make bit-perfect copies of) hard drives, floppies, USB keys, etc. • Use forensics software to analyze copies of drives • Investigator typically uses a single computer to perform investigation in the lab • Present results to client, officer-in-charge, or court
Traditional: Where’s the evidence? • Undeleted files, expect some names to be incorrect • Deleted files • Windows registry • Print spool files • Hibernation files • Temp files (all those .TMP files!) • Slack space • Swap files • Browser caches • Alternate or “hidden” partitions • On a variety of removable media (floppies, ZIP, Jaz, tapes, …)
But Evidence is Also… • In RAM • “In” the network • On mission-critical machines • Can’t turn off without severe disruption • Can’t turn them ALL off just to see! • On huge storage devices • 1TB server: image entire machine and drag it back to the lab to see if it’s interesting? • 10TB?
Next Generation: Needs • Broad: • Better design, better software • Yes, some of it is engineering (and hacking) • Someone has to do it • Better vision, application of ‘real’ CS to problems • More specific: • Need for speed • Machine correlation • Machine profiling • Better auditing of investigative process • On-the-spot forensics: Triage • Live forensics • Network forensics • Specific tools for detection and remediation of malware • Phishing investigation • …
Next Generation: UNO • Better file carving • Forensic-aware OS components • In-place file carving • Forensic accountability • On-the-spot forensics • Distributed digital forensics
File Carving: Basic Idea • Figure labels: one sector; one cluster; interesting file; unrelated disk blocks • header, e.g., 0x474946383961 (GIF) • footer, e.g., 0x003B (GIF) • “milestones” or “anti-milestones”
File Carving: Fragmentation • header, e.g., 0x474946383961 (GIF) • footer, e.g., 0x003B (GIF) • “milestones” or “anti-milestones”
File Carving: Fragmentation • header, e.g., 0x474946383961 (GIF) • footer, e.g., 0x003B (GIF)
File Carving: Damaged Files • No footer • header, e.g., 0x474946383961 (GIF) • “milestones” or “anti-milestones”
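The header/footer idea on the carving slides can be sketched in a few lines of Python. This is a minimal illustration, not Scalpel's algorithm: the function name `carve_gifs` and the `max_size` parameter are hypothetical, the image is assumed to fit in memory, and files are assumed to be contiguous. The first footer after a header is taken at face value, so a 0x003B that happens to occur inside image data will truncate the carve; when no footer is found (the damaged-file case), the carve simply stops at `max_size`.

```python
GIF_HEADERS = (b"GIF87a", b"GIF89a")   # 0x474946383761 / 0x474946383961
GIF_FOOTER = b"\x00\x3b"               # footer bytes as given on the slide


def carve_gifs(image: bytes, max_size: int = 5_000_000):
    """Return (offset, data) for each candidate GIF in a raw image.

    Naive, contiguous-only carving; `max_size` is an assumed cap on
    carve length, used when no footer is found.
    """
    results = []
    pos = 0
    while True:
        # Earliest header of either GIF version at or after pos.
        hits = [i for i in (image.find(h, pos) for h in GIF_HEADERS) if i != -1]
        if not hits:
            break
        start = min(hits)
        limit = min(len(image), start + max_size)
        end = image.find(GIF_FOOTER, start + 6, limit)
        if end != -1:
            stop = end + len(GIF_FOOTER)   # header..footer inclusive
        else:
            stop = limit                   # no footer: carve up to the cap
        results.append((start, image[start:stop]))
        pos = start + 6                    # continue after this header
    return results
```

Fragmented files defeat this sketch, because the bytes between header and footer then include unrelated blocks.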
File Carving: Doing a Better Job • Better design • Faster • Distributed implementation • More flexible description of file types • Automatic generation of type descriptions • Patterns • Rule sets • Multiple-pass carving • Carve, “remove” validated files from block list, re-carve, hope that some fragmented files coalesce • Block-sniffing
File Carving: Block Sniffing • header, e.g., 0x474946383961 (GIF) • Do these blocks “smell” right? • N-gram analysis • entropy tests • parsing
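One of the "smell" tests listed above, the entropy test, can be sketched as follows. The function names and the 7.0 bits-per-byte threshold are assumptions chosen for illustration, not values from the slides; a real carver would tune the threshold per file type and combine it with the other tests.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in Counter(block).values())


def smells_compressed(block: bytes, threshold: float = 7.0) -> bool:
    """Guess whether a block looks like compressed or encrypted data.

    The threshold is an assumed value for illustration only.
    """
    return shannon_entropy(block) >= threshold
```

A block of zeros scores 0 bits per byte and a block in which every byte value appears equally often scores 8, so a candidate block that scores very low is unlikely to belong to the middle of a compressed image.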
Better Software: File Carving: Scalpel • Two-pass design • Minimizes: • Reads • Seeks • Writes • Data copying • Memory usage • Doesn’t yet incorporate all of the carving wizardry we have in mind G. G. Richard III, V. Roussev, "Scalpel: A Frugal, High Performance File Carver," Proceedings of the 2005 Digital Forensics Research Workshop (DFRWS 2005), New Orleans, LA.
Some Scalpel Results (1) Tread + 238,270,750,000 bytes Big targets, large carve sizes, huge improvement (over 5 hours faster)
Some Scalpel Results (2) Tread + 117,622,357,936 bytes Big targets, large carve sizes, huge improvement (over 7 hours faster)
OS Support for Digital Forensics • Export raw disk devices across network for processing • Others: network block device (NBD) • Us: optimization • “In-place” file carving • Us: Export results from file carving as a filesystem, w/ minimal extra storage • Better auditing of investigative process • Us: “digital evidence bag”-aware filesystems
FUSE (Filesystem in User Space) • Example command: dd if=/evidence/DEC/img.dd of=copy.dd • Diagram labels, user space: Filesystem Implementation, FUSE library, C library, read() • Diagram labels, kernel space: Linux Virtual File System Interface (VFS), FUSE, ext3, reiserFS
In-Place File Carving • Diagram labels: client applications, scalpel_fs, FUSE, preview database, Scalpel, nbd client, local drive, network, nbd server, remote drive • G. G. Richard III, V. Roussev, V. Marziale, “In-Place File Carving,” submitted to the Third Annual IFIP WG 11.9 International Conference on Digital Forensics, 2007.
Better Auditing • Want: Digital Evidence Bags • See: P. Turner, “Unification of Digital Evidence from Disparate Sources (Digital Evidence Bags),” DFRWS 2005 • See: Common Digital Evidence Storage Format (CDESF) working group, http://www.dfrws.org/CDESF/.
Better Auditing (2) • Diagram labels, applications (user space): dd, scalpel, TSK, FTK, … • Diagram labels, operating system (kernel): VFS Interface, Filesystem Data Access, Block-level Data Access, FDAM, FDAM Block Device • Diagram labels, Digital Evidence Container: Audit Log, Evidence Data, Import/Export DEC (DEB, AFF, Gfzip, …) • G. G. Richard III, V. Roussev, "Toward Secure, Audited Processing of Digital Evidence: Filesystem Support for Digital Evidence Bags," Research Advances in Digital Forensics, Springer, 2006.
Bluepipe: On the Spot Digital Forensics Bluepipe Patterns Y. Gao, G. G. Richard III, V. Roussev, “Bluepipe: An Architecture for On-the-Spot Digital Forensics,” International Journal of Digital Evidence (IJDE), 3(1), 2004.
<BLUEPIPE NAME="findcacti"> <!-- find illegal cacti pics using MD5 hash dictionary --> <DIR TARGET="/pics/" /> <FINDFILE USEHASHES=TRUE LOCALDIR="cactus" RECURSIVE=TRUE RETRIEVE=TRUE MSG="Found cactus %s with hash %h "> <FILE ID=3d1e79d11443498df78a1981652be454/> <FILE ID=6f5cd6182125fc4b9445aad18f412128/> <FILE ID=7de79a1ed753ac2980ee2f8e7afa5005/> <FILE ID=ab348734f7347a8a054aa2c774f7aae6/> <FILE ID=b57af575deef030baa709f5bf32ac1ed/> <FILE ID=7074c76fada0b4b419287ee28d705787/> <FILE ID=9de757840cc33d807307e1278f901d3a/> <FILE ID=b12fcf4144dc88cdb2927e91617842b0/> <FILE ID=e7183e5eec7d186f7b5d0ce38e7eaaad/> <FILE ID=808bac4a404911bf2facaa911651e051/> <FILE ID=fffbf594bbae2b3dd6af84e1af4be79c/> <FILE ID=b9776d04e384a10aef6d1c8258fdf054/> </FINDFILE> </BLUEPIPE>
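What the pattern above asks for, a recursive search for files whose MD5 hash appears in a dictionary, can be approximated in plain Python. This is a stand-in for illustration only and does not reproduce Bluepipe's pattern engine or its retrieval step; the function name `find_by_md5` and its parameters are hypothetical.

```python
import hashlib
from pathlib import Path


def find_by_md5(root, known_hashes):
    """Recursively list files under `root` whose MD5 is in `known_hashes`.

    Returns a list of (path, md5_hex) pairs. Hashes are compared
    case-insensitively; files are read in 1 MiB chunks.
    """
    known = {h.lower() for h in known_hashes}
    matches = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        md5 = hashlib.md5()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                md5.update(chunk)
        digest = md5.hexdigest()
        if digest in known:
            matches.append((str(path), digest))
    return matches
```

Calling `find_by_md5("/pics/", hashes)` with the twelve `FILE ID` values from the pattern would mirror the `DIR TARGET` and `RECURSIVE=TRUE` settings; reporting and retrieval are left out.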
Distributed Digital Forensics 300GB 300GB 750GB 750GB V. Roussev, G. G. Richard III, "Breaking the Performance Wall: The Case for Distributed Digital Forensics," Proceedings of the 2004 Digital Forensics Research Workshop (DFRWS 2004), Baltimore, MD
Distributed Digital Forensics • Scalable • Want to support at least IMAGE_SIZE / RAM_PER_NODE nodes • Platform independent • Want to be able to incorporate any (reasonable) machine that’s available • Lightweight • Horsepower is for forensics, not the framework; less fat • Highly interactive • Extensible • Allow incorporation of existing sequential tools • e.g., stegdetect, image thumbnailing, file classification, hashing, … • Robust • Must handle failed nodes smoothly
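The scalability target above is simple arithmetic: the number of nodes needed to hold a whole image in aggregate RAM is the image size divided by the RAM per node, rounded up. A small sketch, assuming (optimistically) that all of a node's RAM is available for caching; the function name `min_nodes` is hypothetical.

```python
def min_nodes(image_size_bytes: int, ram_per_node_bytes: int) -> int:
    """Smallest node count whose combined RAM can hold the image.

    Assumes every byte of node RAM is usable for caching, which
    overstates real capacity; treat the result as a lower bound.
    """
    if image_size_bytes < 0 or ram_per_node_bytes <= 0:
        raise ValueError("sizes must be non-negative and RAM must be positive")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return max(1, -(-image_size_bytes // ram_per_node_bytes))
```

For example, a 6GB image on nodes with 1GB of RAM each needs at least 6 nodes under this assumption, and one extra byte of image pushes that to 7.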
Distributed Digital Forensics (3) • File Server: CPU: 2x1.4GHz Xeon, RAM: 2GB, SCSI RAID: 504GB • Switch: 96-port, 10/100/1000 Mb, 24 Gb Backplane, 1Gb • Node: CPU: 2.4 GHz Pentium 4, RAM: 1 GB
DDF: Results (1) • Live string search: “Vassil Roussev” • Regular expression search: v[a-z]*i[a-z]*a[a-z]*g[a-z]*r[a-z]*a
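The regular expression on this slide matches the letters v, i, a, g, r, a in order, with any run of lowercase letters between them. A minimal sketch of applying it to raw image bytes with Python's `re` module follows; it runs on a single in-memory buffer and says nothing about how the distributed framework performs the search. The function name `search_image` is hypothetical.

```python
import re

# Bytes pattern taken from the slide; matching is case-sensitive as written.
PATTERN = re.compile(rb"v[a-z]*i[a-z]*a[a-z]*g[a-z]*r[a-z]*a")


def search_image(data: bytes):
    """Return (offset, matched_bytes) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in PATTERN.finditer(data)]
```

Because `[a-z]*` can absorb filler letters, the pattern also catches obfuscated spellings such as `vxixaxgxrxa`, but not ones that substitute digits or punctuation.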
DDF: Results (2) • Stego detection using Stegdetect 0.5 under RH9 Linux on the cluster • Traditional: • 6GB image mounted using loopback device • find /mnt/loop -exec ./stegdetect '{}' \; • 790 seconds == 13:10 minutes • Using the distributed framework • Stegdetect 0.5 code incorporated into framework • Detection against cached files • “STEGO” command (after IMAGE/CACHE) • 82 seconds == 1:22 minutes • 9.6X faster with 8 machines • CPU bound operation
DDF: To Do List • User interface! (unless you love PuTTY)
DDF: To Do (2) • Case persistence • Secure support for overlapping cases • Better fault tolerance • Intelligent caching schemes to support larger images • Collaboration with colleagues (you?) working in: • Image analysis/classification • Speech recognition • More stego • Other CPU horsepower-intensive, forensics-applicable stuff • We provide cycles…you provide…
Current: Live Forensics • Physical memory dumps • Hard to do when adversarial OS is present • Via USB hacking? • Firewire proof of concept developed by Maximillian Dornseif • Defeating process hiding techniques, e.g., FU “rootkit” • Check OS components from many angles • Remnants of applications (executed) past… • e.g., instant messenger fragments • e.g., recent invocations of process hiding • e.g., fingerprints of recently executed (or executing) malware
Conclusion: Lots of Work To Do • Benevolent hacking (engineering) meets science • Desperately need methods for pipelining investigative process • Live forensics critically important • volatile computing • whole disk encryption • hardware-based whole disk encryption! • nasty malware
Conclusion (2) • Arguably, almost any field in CS can collaborate • All media handling needs work • Algorithms for dealing with huge, partially-organized datasets • Attribution • Correlation • Profiling • Document similarity measures • Databases • High-performance computing • OS Internals
Random Bedside Reading… • http://www.dfrws.org (Digital Forensics Research Workshop) • http://www.ijde.org/ (International Journal of Digital Evidence) • F. Adelstein, “Live Forensics: Diagnosing Your System Without Killing it First,” Communications of the ACM, February 2006. • M. A. Caloyannides, Privacy Protection and Computer Forensics, Second Edition, 2004. • B. Carrier, File System Forensic Analysis, Addison-Wesley, 2005. • B. Carrier, “Risks of Live Digital Forensics Analysis,” Communications of the ACM, February 2006. • E. Casey, Digital Evidence and Computer Crime, Academic Press, 2004. • J. Chow, B. Pfaff, T. Garfinkel, M. Rosenblum, “Shredding Your Garbage: Reducing Data Lifetime Through Secure Deallocation,” 14th USENIX Security Symposium, 2005. • M. Geiger, “Evaluating Commercial Counter-Forensic Tools,” 5th Annual Digital Forensic Research Workshop (DFRWS 2005), New Orleans, 2005. • G. G. Richard III, V. Roussev, "Next Generation Digital Forensics," Communications of the ACM, February 2006. • G. G. Richard III, V. Roussev, “Digital Forensics Tools: The Next Generation,” invited chapter in Digital Crime and Forensic Science in Cyberspace, IDEA Group Publishing, 2005. • A. Schuster, “Searching for Processes and Threads in Microsoft Windows Memory Dumps,” 6th Annual Digital Forensic Research Workshop (DFRWS 2006), West Lafayette, IN, 2006. • S. Sparks, J. Butler, “Raising the Bar for Windows Rootkit Detection,” Phrack Issue # 63. • G. Hoglund, J. Butler, “Rootkits: Subverting the Windows Kernel,” Addison-Wesley, 2005.
? Presentation available: http://www.cs.uno.edu/~golden/teach.html golden@cs.uno.edu Security Lab (NSSAL): Math 322