Internet Measurement Conference 2006
A Multifaceted Approach to Understanding the Botnet Phenomenon
Moheeb Abu Rajab, Jay Zarfoss, Fabian Monrose, Andreas Terzis
Computer Science Department, Johns Hopkins University
Outline
• Introduction
• How Botnets Work
• Measuring Botnets
• Results and Analysis
• Comments
Botnet
• Very little is known about the behavior of these distributed computing platforms.
  - This work models the botnet life cycle.
• The term botnet refers to a network of infected end-hosts, called bots, that are under the control of a human operator commonly known as the botmaster.
• While botnets recruit vulnerable machines using methods also used by other classes of malware, their defining characteristic is the use of command and control (C&C) channels.
Botnet (cont’d)
• C&C channels
  - IRC (Internet Relay Chat)
    - originally designed to form large social chat rooms
  - HTTP
  - P2P
• While other classes of malware were mostly used to demonstrate technical prowess among hackers, botnets are used for illegal activities.
• A multifaceted measurement approach captures the behavior and impact of botnets:
  - distributed malware collection (binaries)
  - IRC tracking (live botnets)
  - DNS cache probing
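Botnet Life Cycle
[Figure: botnet life cycle. Annotations recovered from the diagram, listed from infection to control:]
• Shellcode delivered by remotely exploiting software vulnerabilities or by social engineering
• Download of the actual bot binary
• Resolving the DNS name of the IRC server (instead of using a hard-coded IP)
• Joining the IRC server (authenticate)
• C&C channel, the defining characteristic (authenticate)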
Measurement Methodology
• Three distinct phases
  - Malware collection
    - Collect as many bot binaries as possible
  - Binary analysis via gray-box testing
    - Extract the features of suspicious binaries
  - Longitudinal tracking of IRC botnets
    - Through IRC and DNS trackers
    - Track how bots spread and how far they reach
Infrastructure Deployment
• 1 large local darknet
• 14 distributed nodes (PlanetLab testbed)
• 1 honeynet
• 1 download station
• 1 gateway
• 1 local IRC server
• IRC trackers (drones)
• DNS probers
• Darknet IP space located in 10 different class A (/8) networks
• Darknet: an allocated but unused portion of the IP address space
Malware Collection
• Nepenthes (on PlanetLab) mimics the replies generated by vulnerable services in order to collect the first-stage exploit.
• Nepenthes is a low-interaction honeypot
  - a framework for large-scale collection of information on self-replicating malware in the wild, emulating only the vulnerable parts of a service
• Modules in nepenthes
  - emulate vulnerabilities
  - download files (done by the Download Station)
  - submit the downloaded files
  - shellcode handler
Malware Collection (cont’d)
• A honeynet is also used alongside nepenthes
  - ensures that exploits missed by nepenthes are caught
  - These failures are most likely due to the responder’s inability to mimic unknown exploit sequences or to parse certain shellcodes.
• Honeypots run unpatched instances of Windows XP in a virtualized environment (VMware) with static private-space IPs.
• Each honeypot is allowed one infection and connections with unique IRC servers.
• Binaries (from nepenthes or the honeynet) are sent to the analysis engine for gray-box testing.
Malware Collection (cont’d)
• Gateway
  - Forwards traffic destined for 8 /24 prefixes, rotating daily to cover the whole darknet (NAT)
  - Firewall (Snort)
    - Prevents outbound attacks and self-infection by honeypots
    - Allows only 1 infection per honeypot
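The daily rotation on this slide can be sketched as follows. This is an illustration only, not the authors' gateway: the darknet prefix `10.0.0.0/16` and the helper name `prefixes_for_day` are assumptions made for the example. Only the figure of 8 /24 prefixes per day comes from the slide.

```python
import ipaddress

def prefixes_for_day(darknet: str, day: int, per_day: int = 8):
    """Return the /24 prefixes forwarded to the honeynet on a given day.

    Hypothetical helper: walks the darknet's /24 subnets in order and
    wraps around, so every /24 is eventually covered.
    """
    subnets = list(ipaddress.ip_network(darknet).subnets(new_prefix=24))
    start = (day * per_day) % len(subnets)
    return [subnets[(start + i) % len(subnets)] for i in range(per_day)]

# Assumed example darknet: a /16 holds 256 /24s, so a full pass takes 32 days.
today = prefixes_for_day("10.0.0.0/16", day=1)
```

With these assumed numbers, day 0 covers 10.0.0.0/24 through 10.0.7.0/24, day 1 starts at 10.0.8.0/24, and day 32 wraps back to the start.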
Binary Analysis (gray-box testing)
• Gray-box analysis extracts the features of suspicious binaries (regardless of the mechanism by which they were collected).
• Phase 1: Creation of a network fingerprint
  - fnet = <DNS, IPs, Ports, scan>
  - DNS requests, destination IPs, contact ports, contact protocols, default scanning behavior (e.g., n = 20 destinations per port per monitored period)
• Phase 2: Extraction of IRC-related features
  - firc = <PASS, NICK, USER, MODE, JOIN>
  - initial password, nickname and username, the particular modes set, and which IRC channels are joined (with associated channel passwords)
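As a rough illustration of Phase 2, the sketch below pulls the firc features out of the lines a bot sends to an IRC server. The function name `extract_firc` and the sample session are assumptions made for the example. The slides do not describe how the authors' analysis engine is implemented.

```python
# The five IRC commands that make up f_irc on the slide.
IRC_FEATURES = ("PASS", "NICK", "USER", "MODE", "JOIN")

def extract_firc(client_lines):
    """Collect the arguments of each f_irc command sent by the bot.

    Hypothetical helper: assumes one IRC message per line, sent by the
    client without a leading ":prefix".
    """
    firc = {name: [] for name in IRC_FEATURES}
    for line in client_lines:
        parts = line.strip().split(" ", 1)
        command = parts[0].upper()
        if command in firc:
            firc[command].append(parts[1] if len(parts) > 1 else "")
    return firc

# Made-up session captured from a sandboxed binary (illustrative values only).
session = [
    "PASS secret",
    "NICK bot-1234",
    "USER bot 0 * :bot",
    "MODE bot-1234 +i",
    "JOIN #chan chanpass",
    "PRIVMSG #chan :hello",
]
firc = extract_firc(session)
```

Here `firc["JOIN"]` holds the channel together with its password, which is the information needed later to join the same channel.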
Learn a botnet dialect
• Taken together, fnet and firc provide enough information to join a botnet in the wild.
  - But this is not enough to pass as a real bot: the tracker must also speak the botnet's "dialect".
• The bot is made to connect to a local IRC server, where a query engine plays the role of the botmaster.
• The query engine learns the botnet "dialect" by extracting command-response templates.
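One way to picture a command-response template is shown below. This is a sketch under stated assumptions, not the authors' query engine: the names `to_template` and `learn_dialect`, the placeholder syntax, and the fake bot replies are all invented for the example.

```python
import re

def to_template(response: str) -> str:
    """Generalize a bot reply by replacing variable parts with placeholders."""
    # Replace IPv4 addresses first so their digits are not split up below.
    response = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<IP>", response)
    return re.sub(r"\d+", "<N>", response)

def learn_dialect(commands, send_to_bot):
    """Send each command to the captive bot and record a reply template.

    `send_to_bot` is a hypothetical callable standing in for the local
    IRC server that relays a command and returns the bot's reply.
    """
    return {cmd: to_template(send_to_bot(cmd)) for cmd in commands}

# Made-up replies standing in for a sandboxed bot.
fake_replies = {
    "status": "uptime: 12d 4h",
    "scan": "scan started on 10.1.2.3 port 445",
}
templates = learn_dialect(fake_replies, fake_replies.get)
```

A tracker could later fill the placeholders with plausible values when answering the real botmaster.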
Longitudinal Tracking
• IRC tracker (drone)
  - Connects to a real IRC channel using fnet and firc.
  - Pretends to dutifully follow any commands from the botmaster, and provides realistic responses to her commands.
  - Needs to be intelligent enough to filter inappropriate information included in the template.
• DNS tracking
  - Bots issue DNS queries to resolve the IP addresses of their IRC servers.
  - Each DNS name of a newly detected IRC server is added to the list of servers to be probed.
  - The caches of ~800,000 name servers are probed, and any cache hits are recorded.
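A cache probe can be sketched with a non-recursive DNS query: with the RD (recursion desired) bit cleared, a name server should answer only from its cache. The slides do not say exactly how the probes were built, so treat this as an assumed illustration of the general technique. The function names and the example domain are hypothetical, and no packets are sent.

```python
import struct

def build_probe(name: str, txid: int = 0x1234) -> bytes:
    """Build a DNS A-record query with the RD bit cleared."""
    flags = 0x0000  # QR=0 (query), OPCODE=0, RD=0: do not recurse
    header = struct.pack("!HHHHHH", txid, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN

def is_cache_hit(response: bytes) -> bool:
    """Treat a successful response carrying at least one answer as a hit."""
    if len(response) < 12:
        return False
    _txid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    is_response = bool(flags & 0x8000)
    rcode = flags & 0x000F
    return is_response and rcode == 0 and ancount > 0

probe = build_probe("irc.example.com")  # placeholder name, not from the study
```

Sending `probe` over UDP port 53 to each name server and passing the reply to `is_cache_hit` would give one data point per server. A hit suggests that some client of that server recently resolved the name.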
Results and Analysis
• Collection period starts 1 Feb 2006
• Darknet traffic traces: more than 3 months
• IRC logs (honeynet, drones): more than 3 months
• More than 100 botnet IRC channels
• DNS cache hits from tracking 65 IRC servers for more than 45 days
• Captured 318 malicious binaries
Botnet Traffic Share
• ~27% of incoming SYN packets are contributed by known botnet spreaders
  - 76% go to the target ports (135, 139, 445, 3127)
• More than 70% succeed in sending shellcode
• Botnet spreader: any source that successfully completed an exploitation transaction and delivered a bot executable.
DNS Tracker Results
• 65 IRC servers identified in total.
• 11% of the name servers were involved in at least one botnet activity.
• 29% of the .com servers had at least 1 cache hit.
[Figure: results by top-level domain]
[Figure: Geographic location of the DNS cache hits for one of the tracked botnets. The star indicates the location of the IRC server.]
Bot Scan Method
• Type I (34 of 192 IRC bots, about 18%)
  - worm-like scanning
  - continuously scan certain ports following a specific target selection algorithm
• Type II (158 of 192 IRC bots, about 82%)
  - variable scanning behaviors
  - only scan after receiving a command over the C&C channel
Botnet Growth – DNS and IRC
• Different botnets show different growth patterns, which are visible in both the DNS and IRC views.
Botnet Structure
• Of 318 malicious binaries, 60% were IRC bots.
• 70% of the botnets have a single IRC server.
• Bridged: 30% (25% public servers)
  - Two servers: 50%
• Unrelated botnets had similar naming conventions, channel names, and user IDs.
  - In many cases, these botnets seem to belong to the same botmaster(s).
• In several instances, a selected group of bots was commanded to download an updated binary, which subsequently moved the bots to a different IRC server.
Size and Lifetime
• Bots generally do not stay long on the IRC channel.
• Based on IRC channels that broadcast join/leave information for members on the channel.
Botnet Software Taxonomy
[Table: botnet software taxonomy. Legend: AV = anti-virus, FW = firewall]
Comments
• A measurement methodology
  - How to capture a botnet’s binary?
  - How to find the characteristics of a binary?
• Builds a system on top of honeypots.
• Focuses only on RPC and DNS analysis.
• A lot of analysis is done after capturing the bots, but how is the methodology itself evaluated?