Intrusion Analysis by Reconstructing System State • Ashvin Goel, University of Toronto • Joint work with Kenneth Po, Kamran Farhadi, Wu-chang Feng, and the Forensix group at PSU
Motivation • “Nothing is certain but death, taxes, and 0wned machines” • Exploits in software, security policies, policy enforcement • Compromised accounts • Employees gone bad • Sometimes, you need to quickly find out exactly what happened on a system • Current forensic techniques inadequate • Incomplete audit information • Reconstruction process is manual and error-prone
The Forensix approach • Record all system activity, automate replay • “Computer TiVo” • Enable fast and accurate forensic analysis of compromised machine • What about the costs? • Forensic investigator time is expensive • Computing and storage resources are cheap and plentiful • $40 ~ 6 month replay log (small web server) • 10-20% performance degradation • Cost proposition becomes more favorable every day
Issues • Auditing accuracy (Races and proper event attribution) • Page cache auditing to disambiguate write() races • Permeating attribution throughout kernel • Auditing overhead • Elimination of full read() logging • Batching and other kernel optimizations • Webstone benchmark => ~20% degradation • Reconstruction queries for intrusion analysis
Intrusion Analysis • Helps understand cause of attack • After intrusion detection phase • Helps minimize after-effects of intrusions • Allows accurate assessment of extent of damage • Retrieval of uncorrupted data • Retrieval of attack code • Replay of system activities related to attack • Restarting services as soon as possible • Helps determine attack signatures • Can improve intrusion detection process
Analysis Requirements • Complete - analysis of all intrusions • Predictable - analysis shouldn't disturb evidence • Flexible - comprehensive views of system state • Replay bug - reconstruct specific activities • Dependency - express relations between activities • Real-time - iterative process • Performance - low overhead
Complete Analysis • Capture system call activity • Host intrusions must manipulate processes, files • Requires making system calls • Assumptions • Kernel is not compromised • Disable writes to kernel memory
Forensix Architecture • [Figure: architecture diagram. Labels: Public network; Application Server; Target System; Operating System; Authenticated System-Call Logging Facility (provides complete, authenticated service); Private network; Logging Pinhole; Append-Only Files; Backend Storage System; Batched Record Processing; Database Backend; Forensic Analysis] • System-call data analyzed on backend system • Provides completeness and predictability
Flexibility? • System call data is too low level • Deals with kernel entities (FDs, PIDs) • Gives state change information • Humans are interested in user-visible system state • User-level entities (files, process names) • Need system state information at a given time/interval • Reconstruction is linear, complicated and slow • System semantics are complicated • Process identifier can have different names (e.g., execve) • File descriptor can have different names (e.g., close, dup) • Analysis tools are hard to write and slow
Example #1 • User query • List all processes that existed in the last hour • Query over raw audit data • Process all fork and wait audit events to determine lifetimes of each process on the system • Select those processes that existed in the last hour • Improvement • Time-indexed process table
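The time-indexed process table suggested above can be sketched with Python's built-in sqlite3 module. The table name `process_interval`, its columns, and the sample rows are assumptions made for illustration, not the Forensix schema; a NULL end time is assumed to mean the process is still running.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical interval table: one row per process lifetime.
conn.execute(
    "CREATE TABLE process_interval ("
    " pid INTEGER, name TEXT, begintime INTEGER, endtime INTEGER)"
)
conn.executemany(
    "INSERT INTO process_interval VALUES (?, ?, ?, ?)",
    [
        (100, "sshd", 0, None),      # still running
        (214, "bash", 1000, 2500),   # ended long ago
        (215, "ls", 5000, 5010),     # ended before the window
        (300, "httpd", 9000, None),  # started inside the window
    ],
)


def processes_alive_between(conn, start, finish):
    """Return (pid, name) for lifetimes that overlap [start, finish]."""
    rows = conn.execute(
        "SELECT pid, name FROM process_interval"
        " WHERE begintime <= ? AND (endtime IS NULL OR endtime >= ?)"
        " ORDER BY pid",
        (finish, start),
    )
    return rows.fetchall()


now = 10000  # made-up clock value, in seconds
last_hour = processes_alive_between(conn, now - 3600, now)
print(last_hour)
```

The query is a single overlap test per row, so no fork or wait events need to be replayed at analysis time.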
Example #2 • Suspected ptrace-execve race that created a new setuid binary yesterday • User query • Compare setuid root binaries of today to a few days ago • Find files with owner=O and permission=P at time=T
Example #2 • Query over raw audit data • Find all files owned by O at time T • For each file created (mkdir, mknod, create, symlink), find last event (chown) before T that set owner to O • Remove files that were deleted before T (rmdir, unlink) • Find all files with permission P at time T • For each file created (mkdir, mknod, create, symlink), find last event (chmod) before T that set permission to P • Remove files that were deleted before T (rmdir, unlink) • Return intersection of above two queries • Problem • All events must be examined (only last one matters)
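The ownership half of the raw-audit query can be sketched as a replay loop. The event tuple layout `(time, syscall, inode, owner)` and the sample log are assumptions for illustration; real audit records carry far more detail. The point of the sketch is that every event up to T has to be visited even though only the last one per file matters.

```python
CREATE_CALLS = {"mkdir", "mknod", "create", "symlink"}
REMOVE_CALLS = {"rmdir", "unlink"}


def files_owned_at(events, owner, t):
    """Replay events in time order and return inodes owned by `owner` at time t."""
    current_owner = {}  # inode -> owner, for files that currently exist
    for time, syscall, inode, arg in sorted(events, key=lambda e: e[0]):
        if time > t:
            break
        if syscall in CREATE_CALLS:
            current_owner[inode] = arg
        elif syscall == "chown" and inode in current_owner:
            current_owner[inode] = arg
        elif syscall in REMOVE_CALLS:
            current_owner.pop(inode, None)
    return sorted(i for i, o in current_owner.items() if o == owner)


# Hypothetical audit log: (time, syscall, inode, owner or None)
events = [
    (1, "create", 10, 0),
    (2, "create", 11, 500),
    (3, "chown", 11, 0),
    (4, "unlink", 10, None),
    (5, "chown", 11, 500),
    (6, "mkdir", 12, 0),
]

print(files_owned_at(events, owner=0, t=3))
print(files_owned_at(events, owner=0, t=6))
```

The permission half would follow the same pattern with chmod events, and the final answer is the intersection of the two result sets.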
Example #3 • Suspected rootkit (rkid.tar.gz) and local root exploit (xpl.tar.gz) packages installed on machine at some point in time • Unpack into directories named rkid and xpl • User query • Find the contents of directory=D at time=T • Query over raw audit data • Find each file created (mkdir, mknod, create, symlink, link), updated (rename), or removed (rmdir, unlink) from directory=D before time=T • Problem (same as Example #2, replay all events)
Other examples • Tracking modifications to /etc/passwd • Find the path name of a file whose inode=I at time=T • Return all modifications done on inode=I • Privilege escalations • Find processes whose effective user id=E between Ts and Te
System State Mappings • Map kernel entities to user-visible system state • Track changes to this mapping over time • Table of “object and attribute lifetimes” • Allows analysis tools to reuse reconstructed state • Mappings constructed upon audit insertion to backend database • Lifetimes stored in “interval tables”
Interval tables • Each table has ID, begin and end time • Complexity of system semantics interpreted when mappings are constructed • Analysis queries written in SQL • Without iteration or recursion • Easier optimization of queries
Constructing Mappings • Mappings are constructed for a time interval • Need at least two queries • New rows created with begin time • Update current rows with end time • Construction is idempotent • Allows overlapping construction, deletion, recreation • Reconstruction must be in time order
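The two-query construction and its idempotence can be sketched in SQLite through Python's sqlite3 module. This is an illustration under assumptions: the `event` table is flattened (path and inode stored directly on the event), only a few system calls are handled, and idempotence is obtained from a primary key on (inode, path, begintime) combined with INSERT OR IGNORE. None of this is claimed to match the Forensix backend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (id INTEGER PRIMARY KEY, time INTEGER, syscall TEXT,
                    inode INTEGER, path TEXT);
CREATE TABLE inode_mapping (inode INTEGER, path TEXT, begintime INTEGER,
                            endtime INTEGER,
                            PRIMARY KEY (inode, path, begintime));
""")
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, "mkdir", 7, "/tmp/rkid"),
        (2, 20, "mknod", 8, "/tmp/rkid/a"),
        (3, 30, "unlink", 8, "/tmp/rkid/a"),
        (4, 40, "mknod", 9, "/tmp/x"),
    ],
)


def construct_mappings(conn, start, finish):
    """Build mappings for events in [start, finish]; safe to rerun."""
    window = {"start": start, "finish": finish}
    # Query 1: create new rows and fill in the begin time.
    conn.execute("""
        INSERT OR IGNORE INTO inode_mapping (inode, path, begintime)
        SELECT inode, path, time FROM event
        WHERE syscall IN ('mknod', 'mkdir', 'symlink')
          AND time BETWEEN :start AND :finish
    """, window)
    # Query 2: close rows that are still open and have a removal event.
    conn.execute("""
        UPDATE inode_mapping
        SET endtime = (SELECT MIN(e.time) FROM event e
                       WHERE e.syscall IN ('unlink', 'rmdir')
                         AND e.inode = inode_mapping.inode
                         AND e.path = inode_mapping.path
                         AND e.time >= inode_mapping.begintime
                         AND e.time BETWEEN :start AND :finish)
        WHERE endtime IS NULL
          AND EXISTS (SELECT 1 FROM event e
                      WHERE e.syscall IN ('unlink', 'rmdir')
                        AND e.inode = inode_mapping.inode
                        AND e.path = inode_mapping.path
                        AND e.time >= inode_mapping.begintime
                        AND e.time BETWEEN :start AND :finish)
    """, window)


def current_mappings(conn):
    return conn.execute(
        "SELECT inode, path, begintime, endtime FROM inode_mapping ORDER BY inode"
    ).fetchall()


construct_mappings(conn, 0, 25)    # first window
construct_mappings(conn, 20, 50)   # overlapping window, processed in time order
mappings = current_mappings(conn)
print(mappings)
```

Running `construct_mappings(conn, 20, 50)` a second time leaves the table unchanged, which is the property that permits overlapping construction and recreation after a purge.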
Mapping Issues • Each kernel entity should be unique • PID, INODE have to be unique • PID • Add generation number during backend processing • Generation number initialized to current time • INODE • Persistent generation number available from file system • Generation number is incremented when inode is reused
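A minimal sketch of how a backend pass might make reused PIDs unique by attaching a generation number. The event layout `(time, syscall, pid)`, the convention that a fork event carries the child's PID, and the use of the fork timestamp as the generation are assumptions for illustration only.

```python
def assign_generations(events):
    """Tag each event with (pid, generation); events must be in time order."""
    generation = {}  # pid -> generation of the live incarnation
    tagged = []
    for time, syscall, pid in events:
        if syscall == "fork":
            # New incarnation of this pid: generation starts at creation time.
            generation[pid] = time
        tagged.append((time, syscall, (pid, generation.get(pid))))
        if syscall == "exit":
            # The pid is free for reuse after this point.
            generation.pop(pid, None)
    return tagged


# Hypothetical log in which pid 42 is reused after the first process exits.
events = [
    (100, "fork", 42),
    (110, "write", 42),
    (120, "exit", 42),
    (300, "fork", 42),
    (310, "write", 42),
]
tagged = assign_generations(events)
for row in tagged:
    print(row)
```

The two write events now carry different keys, (42, 100) and (42, 300), so interval-table joins cannot confuse the two processes.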
Example #2 revisited • Find files with owner=O and permission=P at time=T
SELECT f.inode
FROM file_owner f
WHERE f.owner = O AND f.permission = P AND T BETWEEN f.ts AND f.te
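The interval-table version of Example #2 can be run end to end in SQLite. The sample rows, the `FOREVER` sentinel used for rows with no end time, and the helper name `files_with` are assumptions for illustration; the table and column names follow the query on this slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_owner ("
    " inode INTEGER, owner INTEGER, permission INTEGER, ts INTEGER, te INTEGER)"
)
conn.executemany(
    "INSERT INTO file_owner VALUES (?, ?, ?, ?, ?)",
    [
        (10, 0, 0o4755, 0, None),   # long-standing setuid root binary
        (77, 500, 0o755, 100, 200), # ordinary user file ...
        (77, 0, 0o4755, 200, None), # ... that later became setuid root
    ],
)

FOREVER = 2**63 - 1  # stands in for "still valid" when te is NULL


def files_with(conn, owner, permission, t):
    """Inodes whose (owner, permission) interval contains time t."""
    cur = conn.execute(
        "SELECT inode FROM file_owner"
        " WHERE owner = ? AND permission = ?"
        " AND ? BETWEEN ts AND COALESCE(te, ?)"
        " ORDER BY inode",
        (owner, permission, t, FOREVER),
    )
    return [row[0] for row in cur]


before = files_with(conn, 0, 0o4755, 150)
after = files_with(conn, 0, 0o4755, 250)
new_setuid = sorted(set(after) - set(before))
print(new_setuid)
```

Comparing the two result sets answers the original user question (setuid root binaries today versus a few days ago) without replaying any events.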
Example #3 revisited • Find the contents of directory=D at time=T
SELECT i.file_name
FROM inode i
WHERE i.parent_inode = D AND T BETWEEN i.ts AND i.te
Types of Tools • File-Access Tracker • Shows files accessed or modified in a time interval • IO Tracker • Replays IO performed by processes • Reconstructs contents of files and directories • Dependency Tracker • Displays dependencies between processes and files
File-Access Tracker • General query to display access or modification times of files • Uses two queries • Calls that use paths (rename, unlink, etc.) • Calls that use file descriptors • Shows all names of accessed or modified files • Hard links, removes, renames, etc. • Filtering to limit results • Event type (e.g., create, open), time interval, last access, file names, file attributes, process names, process attributes • Implemented via a join of interval tables and underlying Forensix tables
SQL code for finding files modified via file descriptors
SELECT i.inode+, max(e.time)
FROM event e, fd_mapping f, inode_mapping i
WHERE e.syscall IN (read*, write*, fchown, fchmod, truncate*)
  AND f.pid+ = e.pid+
  AND f.fd = e.fd
  AND f.i_id = i.id
  AND e.time BETWEEN f.begintime AND f.endtime
  AND e.time BETWEEN starttime AND finishtime
GROUP BY i.inode+;
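A reduced, runnable version of the file-descriptor query, using SQLite. The schema is simplified for illustration (the fd interval row stores the inode directly, and plain `pid` stands in for pid+), and the sample data is made up. It shows why the join on the fd lifetime matters: the same descriptor number refers to different files at different times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (id INTEGER PRIMARY KEY, time INTEGER, syscall TEXT,
                    pid INTEGER, fd INTEGER);
CREATE TABLE fd_mapping (pid INTEGER, fd INTEGER, inode INTEGER,
                         begintime INTEGER, endtime INTEGER);
""")
# Process 5 reuses descriptor 3 for two different files.
conn.executemany(
    "INSERT INTO fd_mapping VALUES (?, ?, ?, ?, ?)",
    [(5, 3, 10, 0, 50), (5, 3, 20, 60, 100)],
)
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, "write", 5, 3),
        (2, 70, "write", 5, 3),
        (3, 80, "read", 5, 3),
        (4, 90, "write", 5, 3),
    ],
)


def last_modified(conn, starttime, finishtime):
    """Latest modifying call per inode within [starttime, finishtime]."""
    return conn.execute(
        "SELECT f.inode, MAX(e.time)"
        " FROM event e JOIN fd_mapping f"
        "   ON f.pid = e.pid AND f.fd = e.fd"
        "  AND e.time BETWEEN f.begintime AND f.endtime"
        " WHERE e.syscall IN ('write', 'fchmod', 'fchown')"
        "   AND e.time BETWEEN ? AND ?"
        " GROUP BY f.inode ORDER BY f.inode",
        (starttime, finishtime),
    ).fetchall()


result = last_modified(conn, 0, 100)
print(result)
```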
IO Tracker • Process IO tracker • List all I/O of a process • Useful for recreating shell session (w/ descendants) • Use process interval table to get PID+ given a name • Use inode interval table to get inode of terminal • File IO tracker • List all I/O for a file • Useful for reconstructing access to and modification of files • Use inode interval table to get inode of file
Process IO Tracker • Find descendants of PID to track user sessions
INSERT INTO tmp_event (id, time)
SELECT e.id, e.time
FROM event e, fd_mapping f, tmp_pid p, inode_mapping i
WHERE e.syscall IN (write*)
  AND f.pid+ = e.pid+
  AND f.pid+ = p.pid+
  AND f.fd = e.fd
  AND f.i_id = i.id
  AND e.time BETWEEN f.begintime AND f.endtime
  AND i.path = path; /* e.g., /dev/pts0 */
SELECT data
FROM io, tmp_event e
WHERE io.parent = e.id
ORDER BY e.time;
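The descendant search that would populate a table such as `tmp_pid` can be sketched as a breadth-first walk over fork relationships. The edge list format, with each process identified by a hypothetical `(pid, generation)` pair, is an assumption for illustration.

```python
from collections import deque


def descendants(fork_edges, root):
    """Return root plus every process transitively forked from it."""
    children = {}
    for parent, child in fork_edges:
        children.setdefault(parent, []).append(child)
    found = {root}
    queue = deque([root])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in found:
                found.add(child)
                queue.append(child)
    return found


# Hypothetical fork edges: (parent, child), each a (pid, generation) pair.
fork_edges = [
    ((100, 1), (214, 5)),  # sshd forks a login shell
    ((214, 5), (215, 6)),
    ((214, 5), (216, 7)),
    ((216, 7), (217, 8)),
    ((100, 1), (300, 9)),  # unrelated session
]
session = descendants(fork_edges, (214, 5))
print(sorted(session))
```

The resulting set restricts the I/O query to one shell session and its children, leaving the unrelated session out.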
File IO Tracker • File tracker • Similar to process IO tracker, but with inode instead of PID • Obtains inode at recreation time • Recreates opens, writes, seeks and truncates • Does not currently track memory-mapped writes
Directory Tracker • List paths in inode interval table with prefix that matches directory • See example #3
Dependency Tracker • Used to determine contamination of system by malicious activity • Process to process dependencies • Fork or execve • File to process dependencies • Process reads from file • Process to file dependencies • Process writes to file • File to file dependencies (bidirectional) • Two file names refer to same file (link, chroot)
Backward and Forward Tracking • Need one or more detection points • Backward tracking • Shows sequence of states that lead to a detection point • Forward tracking • Shows sequence of states affected by a detection point • Needs filters for additional pruning
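Backward and forward tracking can be sketched as time-aware traversals of a dependency graph. The edge format `(time, source, sink)`, meaning information could flow from source to sink at that time, and the sample edges are assumptions for illustration; this is not the Forensix implementation, and it applies no pruning filters.

```python
def backward_track(edges, detection, t_detect):
    """Objects that could have influenced `detection` by time t_detect.

    Returns {object: latest time at which its influence still counts}.
    """
    deadline = {detection: t_detect}
    work = [detection]
    while work:
        obj = work.pop()
        for t, src, dst in edges:
            if dst == obj and t <= deadline[obj] and deadline.get(src, float("-inf")) < t:
                deadline[src] = t
                work.append(src)
    return deadline


def forward_track(edges, detection, t_detect):
    """Objects that `detection` could have affected from t_detect onward.

    Returns {object: earliest time of possible contamination}.
    """
    tainted = {detection: t_detect}
    for t, src, dst in sorted(edges):
        if src in tainted and t >= tainted[src] and dst not in tainted:
            tainted[dst] = t
    return tainted


# Hypothetical dependency edges: (time, source, sink)
edges = [
    (10, "proc:ftpd", "proc:sh"),             # fork
    (20, "proc:sh", "file:/tmp/xpl"),         # write
    (25, "file:/etc/motd", "proc:cat"),       # unrelated read
    (30, "file:/tmp/xpl", "proc:xpl"),        # execve
    (40, "proc:xpl", "file:/etc/passwd"),     # write
    (50, "proc:sh", "file:/tmp/late"),        # write after the detection time
]

back = backward_track(edges, "file:/etc/passwd", 45)
fwd = forward_track(edges, "proc:sh", 10)
print(sorted(back))
print(sorted(fwd))
```

Backward tracking from the modified /etc/passwd reaches the ftpd process without touching the unrelated read, while forward tracking from the shell also picks up the later write to /tmp/late.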
Setup • Honeypot system • Red Hat 7.1 (Seawolf) distribution • Vulnerabilities • Httpd with SSL • Wu-ftpd • Sendmail • Ptrace • 2600+ AMD Athlon frontend • Intel Pentium 2.4 GHz backend, 512 MB, 120 GB
Analysis of FTP Intrusion #1 • Files modified by root, grouped by directory
/bin 1
  /bin/netstat 05-17 15:54:17
/etc 28
  /etc/passwd 05-17 15:52:04
  /etc/group 05-17 15:51:29
  ...
/ftp 1773
/home 6
/incoming 3
/root 1
  /root/.bash_history 05-17 16:43:25
/sbin 1
  /sbin/ifconfig 05-17 15:54:18
/spool 5
/tmp 328
/var 29
/usr 3
  /usr/bin/killall 05-17 15:54:22
  /usr/bin/chfn 05-17 16:04:14
  /usr/bin/chsh 05-17 16:06:04
Analysis of FTP Intrusion #2 • Files modified by root, grouped by directory
/bin 74
  /bin/kill 05-12 17:11:58
  /bin/ps 05-12 17:11:46
/dev 3
/etc 84
  /etc/passwd 05-12 17:11:20
/home 11
/lib 588
/root 3
  /root/.bash_history 05-12 18:40:32
/sbin 175
  /sbin/ldconfig 05-12 17:12:09
/tmp 26
/var 452
/usr 26
  /usr/bin/killall 05-12 17:11:46
Performance Under Heavy Load • Storage needs under heavy load • 8-10 GB per day • Mapping tables can be purged and recreated
Related Work • System call monitoring • USTAT: uses state transitions to detect intrusions • Tripwire, Coroner's Toolkit, Sleuth Kit • Detects file modifications • Recovers deleted files from unallocated disk blocks • Sebek • Captures write calls to replay attacker's keystrokes • ReVirt, Backtracker • LIDS • Secure front-end operation • Elephant file system • Provides file system snapshots
Conclusions • Empower the user when system is compromised • Provide a complete picture of the extent of damage • Retrieve uncorrupted data • Provide hints to harden system • Implement tools to allow analysis of large data • Create mappings • Between kernel entities and user-visible state • Simplifies tools • Allows intrusion analysis in near real-time
Current status • Project page • http://syn.cs.pdx.edu/projects/4N6 • Source code availability • http://forensix.sourceforge.net • Sample queries • Replay Shell (demo), Process Tree, Privilege Escalation
Future Work • Reduce mapping time • Reduce/filter the amount of data collected • Apply tools for intrusion detection
Future work • Incorporating functionality from other forensic tools • Full audit trail allows Forensix to superset other tools • Selective undo • “Back to the Future” • Automate system restoration
Example of Mapping Construction
/* Create new row. Fill begin time */
INSERT IGNORE INTO inode_mapping (inode+, path, type, begintime)
SELECT p.inode+, p.dst_path, p.type, e.time
FROM event e, path p
WHERE e.id = p.parent
  AND e.syscall IN (mknod, mkdir, link, rename, symlink)
  AND e.returncode >= 0
  AND e.time BETWEEN mapping_starttime AND mapping_finishtime;
/* Update end time */
UPDATE inode_mapping i, event e, path p
SET i.endtime = e.time
WHERE e.id = p.parent
  AND e.syscall IN (unlink, rename, rmdir)
  AND i.inode+ = p.inode+
  AND i.path = p.src_path
  AND i.endtime IS NULL
  AND e.returncode >= 0
  AND e.time BETWEEN mapping_starttime AND mapping_finishtime;
Issues With Constructing Mappings • Exit and close can be implicit • E.g., process killed by signal • Examine status of parent's wait system call • Allows building queries based on signal information • State of currently running processes not known • Open file descriptors • Files currently on system • Vfork seen after process starts executing • Inode number obtained before or after system call • Race condition • Remotely mounted files • Inode numbers are not unique
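Detecting an implicit exit from the parent's wait status can be sketched as follows. The bit layout decoded here is the conventional Linux wait-status encoding (termination signal in the low 7 bits, exit code in the next byte); whether Forensix records the status in exactly this form is an assumption, and the function name is hypothetical.

```python
def decode_wait_status(status):
    """Classify a raw Linux wait status as exited, killed, or stopped."""
    sig = status & 0x7F
    if sig == 0:
        # Normal termination: exit code lives in bits 8-15.
        return ("exited", (status >> 8) & 0xFF)
    if sig == 0x7F:
        # Child is stopped, not terminated; bits 8-15 hold the stop signal.
        return ("stopped", (status >> 8) & 0xFF)
    # Killed by a signal: the process never called exit itself,
    # so its end time has to be taken from the parent's wait event.
    return ("killed", sig)


print(decode_wait_status(0x0100))  # process called exit(1)
print(decode_wait_status(9))       # process killed by signal 9 (SIGKILL)
print(decode_wait_status(139))     # signal 11 with the core-dump flag set
```

A mapping pass could use the "killed" case to close the process and its open file descriptor intervals even though no exit or close call was ever logged.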