Detecting Malicious Flux Service Networks through Passive Analysis of Recursive DNS Traces Roberto Perdisci, Igino Corona, David Dagon, Wenke Lee ACSAC (Dec. 2009)
Agenda • Introduction • Objective • Detecting Malicious Flux Networks • Experiments • Conclusion
Introduction • Fast-Flux? • Fast-flux domain names • Malicious fast-flux networks (2007)
Malicious flux service networks • Can be viewed as illegitimate content-delivery networks (CDNs) • The nodes of a malicious flux service network are called flux agents • Commonly used to host phishing websites and illegal adult content, or to serve as malware propagation vectors
Related Work • Earlier work focused on detecting fast-flux domain names • It characterized fast-flux domains and detailed the classification algorithms • It was mainly limited to fast-flux domains advertised through spam email
Approach • Novel and passive • Deploy sensors in front of the recursive DNS (RDNS) servers • Monitor the DNS queries and responses between the users and the RDNS, and selectively store information about potential fast-flux domains in a central DNS data collector
Agenda • Introduction • Objective • Detecting Malicious Flux Networks • Experiments • Conclusion
Objective • Focus on detecting malicious flux networks in the wild • Passive detection can benefit the accuracy of spam filtering applications
Agenda • Introduction • Objective • Detecting Malicious Flux Networks • Experiments • Conclusion
Characteristics of Flux Domain Names • Short time-to-live (TTL) • The set of resolved IPs (i.e., the flux agents) returned at each query changes rapidly, usually after every TTL • The overall set of resolved IPs obtained by querying the same domain name over time is often very large • The resolved IPs are scattered across many different networks
Traffic Volume Reduction (F1) (1) • q(d) = (t_i, T(d), P(d)) • A DNS query performed by a user at time t_i to resolve the set of IP addresses owned by domain name d • T(d) • The time-to-live (TTL) of the DNS response • P(d) • The set of resolved IPs returned by the RDNS server
Traffic Volume Reduction (F1) (2) • Three constraints in F1 • F1-a) T(d) ≤ 10800 seconds (i.e., 3 hours) • F1-b) • F1-c)
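A minimal sketch of what a per-query filter in the spirit of F1 might look like. Only the 3-hour TTL bound is taken from the slide. The other two checks (a minimum number of resolved IPs and a /16-prefix diversity ratio), their thresholds, and all names such as `Query` and `passes_f1` are illustrative assumptions, not the authors' rules.

```python
from dataclasses import dataclass
from typing import FrozenSet

MAX_TTL = 3 * 60 * 60      # F1-a: 10800 seconds (3 hours), from the slide
# The two thresholds below are placeholders chosen for illustration only.
MIN_IPS = 3                # assumed lower bound on |P(d)|
MIN_DIVERSITY = 1 / 3      # assumed lower bound on prefix diversity


@dataclass(frozen=True)
class Query:
    """One observed response, q(d) = (t_i, T(d), P(d))."""
    domain: str
    t: float                 # time t_i of the query
    ttl: int                 # T(d), in seconds
    ips: FrozenSet[str]      # P(d), dotted-quad IPv4 strings


def prefix_diversity(ips):
    """Distinct /16 prefixes divided by the number of IPs (assumed measure)."""
    if not ips:
        return 0.0
    prefixes = {".".join(ip.split(".")[:2]) for ip in ips}
    return len(prefixes) / len(ips)


def passes_f1(q):
    """Keep a query only if it looks like a potential flux response."""
    return (q.ttl <= MAX_TTL
            and len(q.ips) >= MIN_IPS
            and prefix_diversity(q.ips) >= MIN_DIVERSITY)
```

A filter of this shape runs once per DNS response, so it has to be cheap: everything here is computed from the single response, with no stored state.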
Periodic List Pruning (F2) (1) • Candidate flux domain name d • d = (t_i, Q_i, T_i, R_i, G_i) • t_i: the time when the last DNS query for d was observed • Q_i: the total number of DNS queries related to d seen up to t_i • T_i: the maximum TTL ever observed for d • R_i: the cumulative set of all the resolved IPs seen for d up to time t_i • G_i: a sequence of pairs
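The per-domain record described above can be sketched as a small data structure that is updated each time a new query for d is observed. The class and field names are hypothetical, and the content of the growth pairs (time, number of newly seen IPs) is an assumption, since the slide does not preserve their definition.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Candidate:
    """Running summary kept for one candidate flux domain d."""
    domain: str
    last_seen: float = 0.0                      # time of the last observed query
    n_queries: int = 0                          # total queries seen so far
    max_ttl: int = 0                            # maximum TTL ever observed
    ips: Set[str] = field(default_factory=set)  # cumulative resolved IPs
    # Assumed layout: (query time, number of IPs first seen at that time).
    growth: List[Tuple[float, int]] = field(default_factory=list)

    def update(self, t, ttl, resolved):
        """Fold one observed response (t, ttl, resolved IPs) into the record."""
        new_ips = set(resolved) - self.ips
        self.last_seen = t
        self.n_queries += 1
        self.max_ttl = max(self.max_ttl, ttl)
        self.ips |= new_ips
        self.growth.append((t, len(new_ips)))
```

Keeping only this summary, rather than every raw response, is what lets a periodic pruning pass inspect each candidate in constant time.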
Domain Clustering (1) • D = {d_1, d_2, ..., d_n}: the set of candidate flux domains • A similarity (or proximity) matrix P = {s_ij}, i, j = 1..n, holds the pairwise similarities s_ij = sim(d_i, d_j)
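As an illustration of how such a matrix could be built, the sketch below uses the Jaccard index of the domains' resolved IP sets as sim. This choice is an assumption made for the example; the slide does not state which similarity function is used.

```python
def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(ip_sets):
    """Return P as a list of rows, with P[i][j] = sim(d_i, d_j).

    ip_sets[i] is the set of resolved IPs observed for domain d_i.
    """
    n = len(ip_sets)
    return [[jaccard(ip_sets[i], ip_sets[j]) for j in range(n)]
            for i in range(n)]
```

With this definition, two domains served by the same pool of flux agents score close to 1, while unrelated domains score 0.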
Domain Clustering (2) • The hierarchical clustering algorithm takes P as input and produces a dendrogram as output, i.e., a tree-like data structure in which the leaves represent the original domains in D
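A hedged sketch of how clusters can be read off without building the full tree, assuming single-linkage clustering. Cutting a single-linkage dendrogram at a similarity threshold gives the same groups as taking the connected components of the graph that links every pair with similarity at or above that threshold, so a union-find pass over P is enough. The function name and the threshold parameter are hypothetical.

```python
def single_linkage_clusters(P, threshold):
    """Group indices 0..n-1 so that any pair with P[i][j] >= threshold
    ends up (directly or through a chain) in the same cluster."""
    n = len(P)
    parent = list(range(n))

    def find(x):
        # Find the representative of x, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if P[i][j] >= threshold:
                parent[find(i)] = find(j)   # merge the two clusters

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Lowering the threshold corresponds to cutting the dendrogram closer to the root, which merges more domains into fewer, larger clusters.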
Service Classifier (1) • A set of features is used to distinguish malicious flux services from legitimate/non-flux services • Both passive and active features • Passive: extracted directly from the information collected by passively monitoring the DNS queries • Ex: number of resolved IPs • Active: need some external information to be computed • Ex: country code diversity
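The split between passive and active features can be sketched as follows. The feature names, the /16-prefix count, and the use of a distinct-country count as a stand-in for country code diversity are assumptions made for illustration; `country_of` is a hypothetical lookup table standing in for whatever external geolocation source is available.

```python
def cluster_features(domains, ips, country_of):
    """Compute a few per-cluster features.

    domains    -- domain names in the cluster
    ips        -- set of IPv4 strings resolved by those domains
    country_of -- dict mapping IP -> country code (external data, hypothetical)
    """
    prefixes = {".".join(ip.split(".")[:2]) for ip in ips}
    countries = {country_of[ip] for ip in ips if ip in country_of}
    return {
        # Passive: available from the monitored DNS traffic alone.
        "n_domains": len(domains),
        "n_resolved_ips": len(ips),
        "n_prefix16": len(prefixes),
        # Active: requires the external lookup.
        "n_country_codes": len(countries),
    }
```

IPs missing from the lookup table are simply skipped, so the active feature degrades gracefully when external data is incomplete.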
Service Classifier (2) • Employ the popular C4.5 decision-tree classifier to automatically classify each cluster C_i as either a malicious flux service or a legitimate/non-flux service
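C4.5 chooses its splits by gain ratio, i.e., information gain divided by the split information. The sketch below computes that criterion for one numeric feature and one candidate threshold. It covers the split-selection idea only, not a full C4.5 implementation (no tree growth or pruning), and the function names and toy values are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split value <= threshold vs. value > threshold."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0                      # degenerate split, nothing gained
    n = len(labels)
    wl, wr = len(left) / n, len(right) / n
    gain = entropy(labels) - wl * entropy(left) - wr * entropy(right)
    split_info = -(wl * math.log2(wl) + wr * math.log2(wr))
    return gain / split_info
```

For example, with a feature such as the number of resolved IPs per cluster, a threshold that separates all flux clusters from all legitimate ones scores higher than one that leaves the classes mixed.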
Agenda • Introduction • Objective • Detecting Malicious Flux Networks • Experiments • Conclusion
Collecting Recursive DNS Traffic • Two sensors in front of two different RDNS servers of a large North American ISP • Between March 1 and April 14, 2009 • More than 4 million users • Monitored 2.5 billion DNS queries per day • Set the epoch E to one day
Clustering Candidate Flux Domains (1) • Apply a single-linkage hierarchical clustering algorithm to group together domains that belong to the same network • Takes 30 to 40 minutes per day per sensor • Obtained 4000 domain clusters per day
Clustering Candidate Flux Domains (2) • Manually verified the quality of the results for a subset of the clusters obtained every day • With the help of a graphical interface • Ex: NTP server pool in Europe, North America, Oceania, etc.
Evaluation of the Service Classifier (1) • Statistical supervised learning approach • Label the clustered domains according to: • network prefix diversity • cumulative number of distinct resolved IPs • the IP growth ratio, etc.
Evaluation of the Service Classifier (2) • Classify malicious flux networks vs. non-malicious networks • Features: • Avg. TTL per domain • Number of domains per network • IP growth ratio
Can this Contribute to Spam Filtering? (1) • Check the intersection between the set of domain names found in spam emails and the domains of the malicious flux networks identified by the detection system
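The check above amounts to a set intersection once both lists are normalized. A minimal sketch, with hypothetical function names and a simple normalization (lowercasing, dropping a trailing dot) assumed for the example:

```python
def normalize(domain):
    """Lowercase a domain name and drop surrounding spaces and a trailing dot."""
    return domain.strip().lower().rstrip(".")


def flux_domains_in_spam(spam_domains, flux_domains):
    """Return the domains that appear both in spam and in detected flux networks."""
    spam = {normalize(d) for d in spam_domains}
    flux = {normalize(d) for d in flux_domains}
    return spam & flux
```

The size of the returned set, relative to the size of the spam set, indicates how many spam-advertised domains the passive detector would already have flagged.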
Agenda • Introduction • Objective • Detecting Malicious Flux Networks • Experiments • Conclusion
Conclusion • The detection system is based on passive analysis of recursive DNS (RDNS) traffic traces • It is not limited to the analysis of suspicious domain names extracted from spam emails or precompiled domain blacklists • It can benefit spam filtering applications