Wide-scale Botnet Detection and Characterization

This paper presents a wide-scale botnet detection and characterization approach using anomaly-based passive analysis algorithms. The system can detect IRC botnet controllers running on any random port without the need for known signatures or captured binaries.

Presentation Transcript


  1. Wide-scale Botnet Detection and Characterization Anestis Karasaridis, Brian Rexroad, David Hoeflin

  2. Introduction • The master host is the perpetrator's computer; it issues commands that are relayed to the bots via the controller (often an IRC server).

  3. Contributions • Development of an anomaly-based passive analysis algorithm that detects IRC botnet controllers. • A false-positive rate of less than 2%. • Detection of IRC botnet controllers running on any random port, without the need for known signatures or captured binaries.

  4. Data Collection • Transport layer flow summary data are used instead of packet-level analysis • Reduces some privacy protection concerns. • Reduces the amount of data to be processed. • Scalable to large networks, since all network devices can generate flow data. • Flow record data are collected from a large number of geographically and end-point diverse circuits to a central location.

  5. Detection of Botnet Controllers • Aggregation of trigger events, identification of hosts with suspicious behaviour, and selection of flows. • Identification of Candidate Controller Conversations. • Analysis of Candidate Controller Conversation records. • Validation of Controllers.

  6. Detection of Botnet Controllers • Aggregation of trigger events, identification of hosts with suspicious behaviour, and selection of flows. • Reports of suspicious host activities are generated by internal upstream systems. • Aggregate the trigger events. • Search and fetch the flow records in which the set of suspected hosts appears.

  7. Detection of Botnet Controllers • Identification of Candidate Controller Conversations. • Search flow records. • Identify connections to typical IRC ports (e.g., 6667, 6668). • Identify connections to hub servers/ports periodically. • Identify connections to servers whose traffic is similar to a flow model for IRC traffic that represents typical command and control activity.
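The first of these criteria, selecting flows from suspected hosts to typical IRC ports, can be sketched as a simple filter. This is a minimal illustration only: the `Flow` record, its field names, and the `candidate_conversations` function are hypothetical and are not taken from the presentation, and the periodicity and flow-model criteria are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Hypothetical transport-layer flow summary record."""
    src: str           # client (suspected bot) address
    dst: str           # remote server address
    dst_port: int      # remote server port
    packet_count: int
    byte_count: int


# Typical IRC ports named on the slide; a real deployment would likely use more.
IRC_PORTS = frozenset({6667, 6668})


def candidate_conversations(flows, suspects, irc_ports=IRC_PORTS):
    """Keep flows that a suspected host sent to a typical IRC port."""
    return [f for f in flows if f.src in suspects and f.dst_port in irc_ports]
```

A filter like this only catches controllers on well-known ports; the other two criteria on the slide are what allow detection on arbitrary ports.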

  8. Detection of Botnet Controllers • Analysis of Candidate Controller Conversation records. • (a) Calculation of the number of unique suspected bots for a given remote server address/port. • This allows the analysis to focus on the larger botnets. • (b) Calculation of the distances between the traffic to remote server ports and the model traffic, where N_s = 4 is the number of statistics, N_m = 3 is the number of metrics (flows-per-address, packets-per-flow and bytes-per-packet), X_ij is the observed traffic value of statistic j of metric i, and M_ij is the model traffic value of statistic j of metric i.
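Both calculations can be sketched in a few lines. The distance formula itself is not preserved in this transcript, so `relative_distance` below assumes a relative Euclidean distance over the N_m x N_s matrix of values; that choice, and both function names, are assumptions for illustration rather than the authors' definition.

```python
import math
from collections import defaultdict


def unique_bots_per_server(conversations):
    """Count distinct suspected bots per (server address, server port).

    conversations: iterable of (bot_ip, server_ip, server_port) tuples.
    """
    bots = defaultdict(set)
    for bot, server, port in conversations:
        bots[(server, port)].add(bot)
    return {key: len(members) for key, members in bots.items()}


def relative_distance(observed, model):
    """Distance between observed and model traffic.

    observed, model: N_m rows (metrics) by N_s columns (statistics).
    ASSUMPTION: relative Euclidean distance; the slide's formula is missing.
    Model values must be non-zero.
    """
    total = 0.0
    for x_row, m_row in zip(observed, model):
        for x, m in zip(x_row, m_row):
            total += ((x - m) / m) ** 2
    return math.sqrt(total)
```

Dividing each difference by the model value keeps metrics with very different scales (for example flows-per-address and bytes-per-packet) comparable, which is the reason for choosing a relative distance in this sketch.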

  9. Detection of Botnet Controllers • Analysis of Candidate Controller Conversation records. • Calculation of a heuristics score for the server address/port pairs that remain candidates after the previous conditions (a) and (b). • Idle clients generate flow records that have certain patterns (IRC Ping-Pong messages). • Server uses both TCP and UDP on the suspect port. • Server appears to be serving significant p2p traffic (i.e., it has multiple peers on multiple service ports).

  10. Detection of Botnet Controllers • Validation of Controllers • Correlation with other available data sources (e.g. honeypot based detection). • Coordination with a customer for validation and mitigation. • Validation of domain names associated with services.

  11. Characterization of Botnets • Objective: Classify the activities of the bots in the presence of background noise traffic • Select the hosts we want to classify. • Examine their traffic and calculate the number of flow records to application-bound ports. • Traffic profile of a host • A vector of application-bound ports ranked by the number of flows observed.
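A traffic profile as described here can be built by counting flows per port and ranking the ports by that count. The sketch below assumes the caller has already extracted the application-bound port of each flow; the function name and the `top_n` parameter are hypothetical.

```python
from collections import Counter


def traffic_profile(app_ports, top_n=None):
    """Return ports ranked by number of flows, most frequent first.

    app_ports: one application-bound port per observed flow record.
    top_n: optionally keep only the highest-ranked ports (assumption).
    """
    counts = Counter(app_ports)
    return [port for port, _ in counts.most_common(top_n)]
```

For example, `traffic_profile([80, 25, 80, 445, 80, 25])` ranks port 80 first, then 25, then 445.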

  12. Characterization of Botnets • Similarity of two hosts S(i,j) with vectors v_i and v_j • S(i,j) ∈ [0,1], S(i,j) = S(j,i) • Similarity increases if a port number exists in both vectors • Similarity is a strictly decreasing function of the port rank
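The slide lists the properties of S(i,j) but not its formula. The following sketch is one function that satisfies those properties, assuming a weight of 1/rank for each port and a normalisation that makes identical profiles score 1; it is an illustration, not the authors' definition.

```python
def similarity(profile_a, profile_b):
    """Similarity of two ranked port vectors, in [0, 1] and symmetric.

    Each profile is a list of distinct ports, index 0 being rank 1.
    ASSUMPTION: a shared port contributes (1/rank_a) * (1/rank_b), so the
    contribution shrinks strictly as the port's rank gets worse.
    """
    longest = max(len(profile_a), len(profile_b))
    if longest == 0:
        return 0.0
    rank_b = {port: rank for rank, port in enumerate(profile_b, start=1)}
    shared = sum(
        (1.0 / rank_a) * (1.0 / rank_b[port])
        for rank_a, port in enumerate(profile_a, start=1)
        if port in rank_b
    )
    # Largest value 'shared' can reach for profiles of this length.
    norm = sum((1.0 / rank) ** 2 for rank in range(1, longest + 1))
    return shared / norm
```

With this choice, two hosts that agree on their top-ranked ports score much higher than two hosts that share only low-ranked ports, which matches the intent of weighting by rank.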

  13. Characterization of Botnets • Classification algorithm, given a set of hosts • Calculate the similarity for each pair of hosts and rank the pairs in descending order. • Pass only the pairs whose similarity exceeds a threshold to the next step. • For each pair of hosts, check whether either of them is already grouped. • If neither host in the pair is grouped, create a new group and calculate its traffic profile. • If one of the hosts is already grouped, add the other host to that group. • As new hosts are identified, calculate their similarity to all of the existing groups and allocate each to the group with the highest similarity above the threshold.
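The pairwise grouping steps can be sketched as a greedy pass over the ranked pairs. The function name and signature are hypothetical, and the similarity function is passed in as a parameter. The sketch leaves out two things the slide does not specify: how a group's traffic profile is computed, and therefore how later hosts are matched against existing groups. A pair whose hosts are both already grouped is skipped, which is also an assumption.

```python
from itertools import combinations


def group_hosts(profiles, similarity, threshold):
    """Greedily group hosts whose pairwise similarity exceeds a threshold.

    profiles: dict mapping host name -> traffic profile.
    similarity: callable taking two profiles and returning a float.
    Returns a list of sets of host names.
    """
    # Score every pair and rank the pairs from most to least similar.
    pairs = sorted(
        ((similarity(profiles[a], profiles[b]), a, b)
         for a, b in combinations(sorted(profiles), 2)),
        key=lambda item: item[0],
        reverse=True,
    )
    group_of = {}   # host -> index into groups
    groups = []
    for score, a, b in pairs:
        if score <= threshold:
            break   # remaining pairs are at or below the threshold
        a_grouped, b_grouped = a in group_of, b in group_of
        if not a_grouped and not b_grouped:
            groups.append({a, b})
            group_of[a] = group_of[b] = len(groups) - 1
        elif a_grouped != b_grouped:
            member, newcomer = (a, b) if a_grouped else (b, a)
            index = group_of[member]
            groups[index].add(newcomer)
            group_of[newcomer] = index
        # Both already grouped: skipped (assumption, see lead-in).
    return groups
```

Processing pairs in descending order means each host joins the group of its most similar already-grouped partner, so the result depends on the threshold but not on the order in which hosts were supplied.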

  14. Quantitative Results • 376 unique controller IP addresses were detected between 8/2006 and 2/2007. • Only 5 of these addresses were false positives (5/376, roughly 1.3%). • 6 million unique IP addresses participating in malicious botnets were discovered between 11/2005 and 5/2006. • Since then, about 1 million new bots per month have been discovered. • Observed botnets are very dynamic in nature. • The average bot stays 2-3 days on the same botnet controller.

  15. Conclusions • Advantages of this approach • Entirely passive and invisible to operators • Scales to the largest of networks • Based on flow data analysis, which limits privacy issues • Has a false positive rate of less than 2% • Helps identify botnets that are most affecting real users (and customers) • Can detect botnets that use encrypted communications • Helps quantify size of botnets, identify and characterize their activities without joining the botnet
