Real-Time RAT-based APT Detection
Our Focus • Design a detection mechanism that targets the key step (gaining foothold) in the APT life-cycle • [Figure: APT life-cycle between attacker and victim. Stages: Initial Compromise, Gaining Foothold, Lateral Movement, High Value Asset Acquisition. Labels: Phishing, Malicious Web, Exploit browser, Malware (e.g. RAT), Network scan, Exploit vulnerability, Malware propagation, Code Repo, Database. Behavior-based malware detection and provenance-based analytics are marked on the figure.]
APT Malware • Remote Access Trojan (RAT) • Based on a study of 300+ APT whitepapers, the RAT is a core component of an APT attack, and >90% are Windows based. • Allows an adversary to remotely control a system • A complex set of potentially harmful functions (PHFs) • E.g., keylogger, screengrab, remote desktop, remote shell, audiograb • A Windows RAT typically embodies 10~40 PHFs.
Issues with FAROS Kafka Topics • Kafka A and Kafka B not usable • Because the FAROS tool is unstable, TA5.1 suggests not consuming either Kafka A or Kafka B produced by FAROS • Stretch Goal Topic became available very late • Data errors were found and FAROS re-produced the topic on 10/7 • Despite those issues, we completed ingestion of the topic, submitted the initial report to TA5.1, and received positive feedback.
How to Figure Out the Attack Graph • Data Reduction • 71M records in the Stretch topic; 30 mins processing time • 529 processes in total; 22 processes (4%) identified as involved in malware activities • Three processes were reported by our RAT detector • Profile.exe (2) matched the remoteshell signature • Prodat.exe matched the screengrab signature • Perform backtracking • Based on the artifacts (network IP/port connected, files created) and the pid-ppid relationship, we identify all relevant processes.
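The backtracking step can be sketched roughly as a graph traversal. This is a minimal illustration, not the authors' tool: the function name, the input dictionaries, and the pids are hypothetical, and for simplicity it only follows child links and shared artifacts outward from the processes flagged by the detector.

```python
from collections import defaultdict, deque


def backtrack(seeds, parent_of, artifacts_of):
    """Collect pids related to the seed pids via child links or shared artifacts.

    parent_of:    {pid: ppid}           (hypothetical input format)
    artifacts_of: {pid: set of strings} (files created, ip:port connected)
    """
    children_of = defaultdict(set)
    for pid, ppid in parent_of.items():
        children_of[ppid].add(pid)

    # Index every artifact by the processes that touched it.
    pids_by_artifact = defaultdict(set)
    for pid, artifacts in artifacts_of.items():
        for artifact in artifacts:
            pids_by_artifact[artifact].add(pid)

    related = set(seeds)
    queue = deque(seeds)
    while queue:
        pid = queue.popleft()
        neighbours = set(children_of.get(pid, ()))
        for artifact in artifacts_of.get(pid, ()):
            neighbours |= pids_by_artifact[artifact]
        for other in neighbours - related:
            related.add(other)
            queue.append(other)
    return related


# Made-up pids: 100 = profile.exe, 101 = cmd.exe, 102 = prodat.exe,
# 300 = a process sharing the same remote endpoint, 200 = unrelated.
parent_of = {101: 100, 102: 101, 200: 1}
artifacts_of = {
    100: {"129.55.12.167:19985"},
    102: {"proout.png"},
    300: {"129.55.12.167:19985"},
    200: {"notes.txt"},
}
print(sorted(backtrack({100}, parent_of, artifacts_of)))
```

A real traversal would also need to walk upward to parents and to bound the search, since following parent links without a stop condition quickly reaches system processes shared by everything.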
Breakdown of the Attack (1) • The attack begins with triggering an executable "C:\Users\User\Downloads\profile.exe" at Sep. 27 18:12:06 GMT.
Breakdown of the Attack (2) • At 18:13:33, the malware "profile.exe" invoked "cmd.exe", which in turn invoked another malware "C:\Users\User\Downloads\prodat.exe" at 18:13:58. However, the current data traces do not allow us to determine how the malware gained foothold. • This malware mainly performed screengrab and saved the results in "C:\Users\User\Downloads\proout.png". • This file was then read and sent out by profile.exe to 129.55.12.167:19985.
Breakdown of the Attack (3) • At 18:16:58, the malware "profile.exe" invoked cmd.exe again to run hostname.exe, whoami.exe, and netstat.exe to collect sensitive information. The results were written to a log file "C:\Windows\Temp\1283.log".
Breakdown of the Attack (3) – Cont’d • At 18:19:44, the malware "profile.exe" invoked "cmd.exe" again, which in turn executed the malware "proup.exe". • "proup.exe" then initiated a TCP connection to the attacker machine 129.55.12.167:1050 and sent "1283.log" for data exfiltration. • At 18:21:42, "burnout.bat" was executed for the cleanup.
Breakdown of the Attack (4) • At 20:03:55, a Firefox process ("firefox.exe") was launched, which subsequently invoked another Firefox process. The latter Firefox process was probably compromised: at 20:06:41 it downloaded another malicious executable with the same name "profile.exe" from IP address 200.200.200.10:20480 (site lariat.world.net) and saved it as "C:\Users\User\Downloads\profile.exe".
Breakdown of the Attack (4) – Cont’d • The malware then started running at 20:09:02 and soon invoked cmd.exe • The cmd.exe executed both "systeminfo.exe" and "tasklist.exe" to collect system information and the list of currently running tasks. • The results were saved in the file named "rfeed.dat". • Then "profile.exe" sent the data file out to 129.55.12.167:19985. • Finally, at 20:17:33, "profile.exe" executed "burnout.bat" to perform the cleanup work.
Our Approach: Fine-Grained, Evasion-Resilient and Real-time RAT Detection
Our Work • What is going on • Implement a fine-grained, evasion-resilient and real-time detection system of RATs • Specifically, we detect if malicious functionalities are present in the system call traces of a process. • Why not provenance-based causality analysis • FAROS does not provide usable provenance information for now. • Data missing: provenance node, netflow object node, file object node. • What is next: • Design a system for both real-time APT malware detection and automatic causality analysis.
Overview • Observation • # of PHFs possibly embodied in a RAT is limited (10~40). • Core system calls and their orders required to exactly define a PHF are limited, and thus it is possible to identify all of them. • Core Idea • Fine-grained, evasion-resilient and real-time RAT detection • Determine if a program is a RAT by detecting its functionalities and examining its characteristics. Specifically, • Create signatures for each PHF possibly embodied in a RAT • Train a classifier based on the unique characteristics of RATs to discern between RATs and benign programs
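The idea that a PHF is defined by a small set of core system calls in a required order can be sketched as an ordered-subsequence check. This is only an illustration under assumptions: the two signatures below are made up from call names that appear in the example traces, and the function names are hypothetical, not the authors' implementation.

```python
def matches_signature(trace, signature):
    """Return True if every call in `signature` occurs in `trace`, in order (gaps allowed)."""
    remaining = iter(trace)
    # `call in remaining` consumes the iterator up to the first hit,
    # so each later call must appear after the earlier ones.
    return all(call in remaining for call in signature)


def detect_phfs(trace, signatures):
    """Return the names of all PHF signatures matched by the trace."""
    return [name for name, sig in signatures.items() if matches_signature(trace, sig)]


# Hypothetical signatures, for illustration only.
SIGNATURES = {
    "screengrab": ["NtGdiCreateCompatibleDC", "NtGdiBitBlt", "NtGdiDeleteObjectApp"],
    "keylogger": ["NtUserGetKeyboardState", "NtUserMapVirtualKeyEx"],
}

trace = [
    "NtProtectVirtualMemory",
    "NtGdiCreateCompatibleDC",
    "NtGdiCreateCompatibleBitmap",
    "NtGdiBitBlt",
    "NtGdiDeleteObjectApp",
    "NtGdiExtGetObjectW",
]

print(detect_phfs(trace, SIGNATURES))
```

Because the match names the PHF, a hit reports what activity is going on, not only that something looks malicious.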
Overview (Cont’d) • Advantages • Generated signatures are finer-grained and semantics-aware. • Identify what activity is going on while detecting a RAT • Hard to evade: attackers would have to find new ways of implementing PHFs, and would have to do so for at least several major PHFs
Our Approach: Design • Module 1: Trace-based signature generation system (offline) • Input: training data with ground truth, i.e., traces 1…n for each of PHF1 … PHFm, plus RAT traces and benign traces • Self-repeated gadget identification and correlation analysis over the PHF traces yield signatures for each PHF, used for determining the functionality • Characteristic analysis, feature generation & selection, and supervised learning over the RAT and benign traces yield classifier signatures for differentiating benign from malicious • Module 2: Real-time RAT detection system • Input: system call traces (e.g., NtGdiCreateCompatibleDC, NtGdiBitBlt, NtCreateSection, NtQueryInformationProcess, NtCreateThread, NtResumeThread) • Signature matching against PHF1 Sig … PHFn-1 Sig and the Classifier Sig produces Score 1 … Score n, which feed the malicious score
PHF Signature Generation • Observation 1: • Most malicious activities such as keylogger and screengrab require frequent probes of input devices to collect coherent and meaningful user inputs. • This characteristic shows up in the trace as small gadgets that repeat themselves multiple times. • Insight • Those gadgets can be automatically extracted from the traces and then potentially used for defining the malicious activities. • Example trace (the same gadget occurs three times): ⁞ NtUserGetKeyboardState NtUserMapVirtualKeyEx NtUserGetForegroundWindow ⁞ NtUserGetKeyboardState NtUserMapVirtualKeyEx NtUserGetForegroundWindow ⁞ NtUserGetKeyboardState NtUserMapVirtualKeyEx NtUserGetForegroundWindow ⁞
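Extracting self-repeated gadgets could be approximated by counting recurring n-grams of system calls. The sketch below is an assumption about one simple way to do it, not the authors' algorithm; the function name, window lengths, and repeat threshold are hypothetical, and the filler calls between gadget occurrences are invented for the example.

```python
from collections import Counter


def repeated_gadgets(trace, min_len=2, max_len=4, min_repeats=3):
    """Return {gadget: count} for every n-gram of calls seen at least `min_repeats` times."""
    found = {}
    for n in range(min_len, max_len + 1):
        # Slide a window of length n over the trace and count each distinct window.
        counts = Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))
        for gram, count in counts.items():
            if count >= min_repeats:
                found[gram] = count
    return found


GADGET = ["NtUserGetKeyboardState", "NtUserMapVirtualKeyEx", "NtUserGetForegroundWindow"]
# The gadget occurs three times, separated by unrelated (made-up) calls.
trace = GADGET + ["NtClose"] + GADGET + ["NtReadFile"] + GADGET

print(repeated_gadgets(trace, min_len=3, max_len=3))
```

With shorter windows the sub-sequences of the gadget are reported too, so a practical version would keep only the longest repeated gadgets before using them as signature candidates.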
PHF Signature Generation – cont’d • Observation 2: • Multiple RATs tend to implement a PHF in the same way at the system call level, and the ways to implement a PHF are quite limited. • Insight: • Leverage sequence alignment algorithms borrowed from bioinformatics to identify regions of similarity in system call sequences. • Such similarity regions typically correspond to the execution of similar code. • Build finite automata to model the similarity regions as our signatures • Example trace 1: ⁞ NtProtectVirtualMemory NtProtectVirtualMemory NtGdiCreateCompatibleDC NtGdiCreateCompatibleBitmap NtGdiBitBlt NtGdiDeleteObjectApp NtGdiExtGetObjectW NtProtectVirtualMemory NtProtectVirtualMemory ⁞ • Example trace 2: ⁞ NtDelayExecution NtDelayExecution NtGdiCreateCompatibleDC NtGdiCreateDIBSection NtGdiStretchBlt NtGdiDeleteObjectApp NtGdiExtGetObjectW NtDelayExecution NtDelayExecution ⁞
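To make the alignment idea concrete, here is a small Smith-Waterman local alignment over the two example traces. The slides do not say which alignment algorithm is used, so Smith-Waterman is an assumption, as are the scoring values (match 3, mismatch -1, gap -1) and the function name.

```python
def local_align(a, b, match=3, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment; returns (best score, list of aligned pairs)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    pairs = []
    while i > 0 and j > 0 and score[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + step:
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None))  # call present only in trace a
            i -= 1
        else:
            pairs.append((None, b[j - 1]))  # call present only in trace b
            j -= 1
    pairs.reverse()
    return best, pairs


trace1 = ["NtProtectVirtualMemory", "NtProtectVirtualMemory", "NtGdiCreateCompatibleDC",
          "NtGdiCreateCompatibleBitmap", "NtGdiBitBlt", "NtGdiDeleteObjectApp",
          "NtGdiExtGetObjectW", "NtProtectVirtualMemory", "NtProtectVirtualMemory"]
trace2 = ["NtDelayExecution", "NtDelayExecution", "NtGdiCreateCompatibleDC",
          "NtGdiCreateDIBSection", "NtGdiStretchBlt", "NtGdiDeleteObjectApp",
          "NtGdiExtGetObjectW", "NtDelayExecution", "NtDelayExecution"]

best_score, region = local_align(trace1, trace2)
for left, right in region:
    print(left, "<->", right)
```

With these scores the aligned region spans the five GDI calls in the middle of both traces. The two mismatched positions (NtGdiCreateCompatibleBitmap vs. NtGdiCreateDIBSection, NtGdiBitBlt vs. NtGdiStretchBlt) are the kind of variation a finite automaton signature would need to accept as alternatives.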
Classifier Signature Generation • Selected features (also unique characteristics of RATs) • Persistence • Modifies auto-execute functionality by setting/creating a value in the registry • Environment Awareness for Reconnaissance and Evasion • Reads the active computer name, or the machine identifier “MachineGuid” • Tries to evade analysis by sleeping many times and for a long time (>2min) • Spyware/Information Retrieval • Accesses potentially sensitive information from local browsers • Queries sensitive IE security settings • Anti-Detection and Being Stealthy • Sets the process error mode to suppress error box • Checks for the presence of an antivirus engine
Classifier Signature Generation – cont’d • Selected features – cont’d • System Destruction • Opens file with deletion access rights probably for cleanup after attack • Unusual Characteristics • Spawns a lot of processes • Creates/touches files in windows system directory and registry • Running in Background • No window, menu, or any visible components • No human interactions • Actions initiated remotely, rather than initiated locally All those features can be observed in system call traces (either system call name or argument).
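Since the features are observable from system call names and arguments, a feature extractor might look roughly like the sketch below. Everything about the record format is an assumption: each trace entry is taken to be a `(syscall_name, target)` pair in which the collector has already resolved the registry path, and the thresholds and feature names are hypothetical. Only four of the listed features are covered, and the sleep feature counts calls without checking the >2 min duration.

```python
def extract_features(trace, sleep_threshold=5, spawn_threshold=10):
    """Turn a list of (syscall_name, target) records into a binary feature dict."""
    sleeps = spawns = 0
    persistence = reads_guid = False
    for name, target in trace:
        target = target.lower()
        if name == "NtSetValueKey" and "\\currentversion\\run" in target:
            persistence = True   # writes an auto-execute registry value
        elif name == "NtQueryValueKey" and "machineguid" in target:
            reads_guid = True    # reads the machine identifier
        elif name == "NtDelayExecution":
            sleeps += 1
        elif name == "NtCreateUserProcess":
            spawns += 1
    return {
        "persistence": int(persistence),
        "reads_machine_guid": int(reads_guid),
        "sleeps_often": int(sleeps >= sleep_threshold),
        "spawns_many_processes": int(spawns >= spawn_threshold),
    }


# Made-up trace records for illustration.
trace = [
    ("NtSetValueKey", r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\updater"),
    ("NtQueryValueKey", r"HKLM\SOFTWARE\Microsoft\Cryptography\MachineGuid"),
] + [("NtDelayExecution", "")] * 6

print(extract_features(trace))
```

The resulting vector is the kind of input a supervised classifier could be trained on, with RAT traces and benign traces as the two classes.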
Classifier Signature Generation – cont’d • Training set and selected features • System call traces of RATs (Poison Ivy, Pandora, Darkcomet, …) • System call traces of popular benign applications (WinSCP, Skype, notepad++, quicktime player, …) • Output: classifiers for discerning between RATs and benign programs
Previous Malware Detection Methods • Main idea of the state-of-the-art work • Identify security-sensitive syscalls (e.g., those related to network connections) • Use data dependency to connect more syscalls, and hence construct a path ending at one security-sensitive syscall • Use such a path as the detection signature • E.g., in the figure, the graph is the generated signature graph and the red nodes denote security-relevant system calls. Whenever a path such as the blue one or the yellow one is matched, the system reports the unknown program as malware.
Previous Malware Detection Methods • Main problem 1: false positive • In the real world, RATs and benign programs share lots of similar behavior. • It is not reasonable to judge a program just based on a similar behavior (i.e., a matched path) without awareness of the semantics corresponding to that path. Either the blue path or the yellow one could represent benign behavior!
Previous Malware Detection Methods • Main problem 2: evadable by RATs • Procedure: extract data dependency between system calls, then build a signature graph for each malware sample based on the dependency. • Trace 1: NtUserGetDC NtGdiGetDeviceCaps … NtConnectPort NtRequestWaitReplyPort NtRequestWaitReplyPort ⁞ • Trace 2: NtGdiCreateCompatibleDC NtGdiBitBlt … NtCreateSection NtQueryInformationProcess NtCreateThread NtResumeThread ⁞ • [Figure: signature graph with nodes NtConnectPort, NtRequestWaitReplyPort, NtCreateSection, NtQueryInformationProcess, NtCreateThread, NtResumeThread] • The system calls marked in red are ignored since they are neither security-sensitive syscalls nor data-dependent on security-sensitive syscalls.
Previous Malware Detection Methods • Main problem 2: evadable by RATs (cont’d) • RATs often stay inactive for a long time before sending out the data already collected. That is, the data collection actions are not necessarily followed by security-relevant system calls corresponding to abnormal network connections. • In this case, the data collection behavior will not be identified by signatures generated from the security-related syscalls. Thus, the previous approaches could be evaded. • In fact, the ignored syscalls may be exactly the ones generated by the data collection behavior • E.g., the ignored syscalls actually represent part of the screengrab behavior (the right graph)
Conclusion • We proposed a fine-grained, evasion-resilient and real-time RAT detection approach. • Our approach was evaluated in Engagement 1 and worked well.