All Your iFRAMEs Point to Us
Niels Provos, Panayiotis Mavrommatis (Google Inc., {niels, panayiotis}@google.com)
Moheeb Abu Rajab, Fabian Monrose (Johns Hopkins University, {moheeb, fabian}@cs.jhu.edu)
Introduction
• What is a drive-by download, and why should we care?
• Large-scale attacks aimed at overwhelming computing resources are becoming less prevalent
• Exploiting the web and its services may therefore be the easier route
• Malware delivery:
  • Social engineering
  • Browser vulnerabilities, with users lured into connecting to malicious servers
Injection techniques
• Exploit web servers via vulnerable scripting applications (e.g. phpBB2 and InvisionBoard)
• Gain direct access to the OS
• Inject content into compromised websites (URLs or zero-pixel iFrames)
• Use forums, blogs, or advertisements to inject the exploit URL (we focus on this)
Main phases of the interaction
• Visiting the website downloads the initial exploit script
• The script targets the browser and its plugins; a successful exploit automatically starts executing malware binaries
• The result is a drive-by download
Evasion techniques: obfuscated JavaScript, or a chain of redirection steps
Infrastructure and Methodology
Useful terms:
• Malicious URL: a URL that initiates a drive-by download when visited
• Landing site: malicious URLs grouped according to their top-level domain names
• Distribution site: the host that serves the malicious payload
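The landing-site definition above amounts to a grouping step. The sketch below is one possible reading, not the authors' procedure: it assumes that "top-level domain name" can be approximated by the last two labels of the host name, which is a simplification (it is wrong for suffixes such as `co.uk`). The function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def landing_site_key(url):
    """Approximate the site a URL belongs to by the last two host labels.

    Assumption: this is only a stand-in for real domain grouping and
    mishandles multi-label public suffixes such as "co.uk".
    """
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def group_by_landing_site(urls):
    """Map each approximate site key to the malicious URLs seen under it."""
    groups = defaultdict(list)
    for url in urls:
        groups[landing_site_key(url)].append(url)
    return dict(groups)
```

A production version would consult a public-suffix list instead of counting labels.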
Preprocessing phase
• Web repository maintained by Google
• For each website, extract:
  • Out-of-place iFrames
  • Obfuscated JavaScript
  • iFrames pointing to known distribution sites
• Pages that proceed to the more expensive verification process:
  • Those labeled as suspicious by the procedure above (1 million / day)
  • A random selection of several hundred thousand URLs
  • URLs reported by users
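As an illustration of one such feature, the sketch below flags zero-pixel iFrames in a page using only Python's standard library. It is a minimal example under stated assumptions, not the classifier used in the study: the class and function names are hypothetical, and "zero pixel" is taken to mean a `width` or `height` attribute of `0` or `0px`.

```python
from html.parser import HTMLParser


def is_zero(size):
    """True for attribute values "0" or "0px"; a missing attribute is not zero."""
    return size is not None and size.strip().lower() in {"0", "0px"}


class HiddenIframeFinder(HTMLParser):
    """Collect the src of every iframe whose width or height is zero."""

    def __init__(self):
        super().__init__()
        self.hidden_sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands over tag and attribute names in lower case.
        if tag != "iframe":
            return
        attributes = dict(attrs)
        if is_zero(attributes.get("width")) or is_zero(attributes.get("height")):
            self.hidden_sources.append(attributes.get("src"))


def find_hidden_iframes(html):
    """Return the src values of all zero-sized iframes in an HTML string."""
    finder = HiddenIframeFinder()
    finder.feed(html)
    return finder.hidden_sources
```

iFrames hidden through CSS or inserted by script at run time would not be caught by a static check like this one.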
Verification phase
• Large-scale honeynets (Windows VMs)
• For each URL, the VM monitors:
  • File system changes
  • Newly created processes
  • Changes to the system registry
• In addition, a virus scan of the response packets
• A URL that meets these 4 criteria is marked as malicious
• Very few false positives
• 1 million pages / day processed => ~25,000 are malicious URLs
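The decision rule on this slide can be written down directly. The sketch below takes the slide literally (all four signals must be present) and also works out the daily rate: 25,000 malicious URLs out of 1 million processed pages is 2.5%. The class and field names are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class VisitObservation:
    """Signals recorded while a honeypot VM visits one URL (hypothetical names)."""
    file_system_changed: bool
    new_process_created: bool
    registry_changed: bool
    virus_scan_hit: bool


def is_malicious(observation: VisitObservation) -> bool:
    """Mark a URL as malicious only when all four signals are present."""
    return all((
        observation.file_system_changed,
        observation.new_process_created,
        observation.registry_changed,
        observation.virus_scan_hit,
    ))


def malicious_share(malicious_per_day: int, processed_per_day: int) -> float:
    """Fraction of processed pages that end up marked as malicious."""
    return malicious_per_day / processed_per_day
```

Requiring every signal at once trades recall for precision, which is consistent with the slide's remark that there are very few false positives.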
Constructing the Malware Distribution Network
• Malware distribution network => set of malware delivery trees from the landing sites (leaves and internal nodes) to the distribution site (root)
• Use the ‘Referer’ header from requests
• Interpret HTML & JavaScript
• Extract URLs
• Match them with HTTP fetches (for randomly generated strings, a heuristic based on edit distance identifies the URL's parent)
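The edit-distance matching mentioned above could look like the following sketch. It uses plain Levenshtein distance and a fixed threshold; the slides do not state which distance variant or threshold was used, so `max_distance`, `find_parent`, and the input layout are assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def find_parent(fetched_url, extracted_urls_by_page, max_distance=8):
    """Return the page whose extracted URL is closest to the fetched URL.

    extracted_urls_by_page maps a page URL to the URLs found in its HTML
    and JavaScript. Returns None when nothing is within max_distance.
    """
    best_page, best_distance = None, max_distance + 1
    for page, urls in extracted_urls_by_page.items():
        for url in urls:
            distance = edit_distance(fetched_url, url)
            if distance < best_distance:
                best_page, best_distance = page, distance
    return best_page
```

For example, a fetch of `http://hop.test/in.php?id=cd34` would be attributed to a page that contained `http://hop.test/in.php?id=ab12`, since only four characters differ.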
Prevalence of drive-by downloads
• 1.3% of the overall incoming search queries to Google return at least one malicious result
• Of the top 1 million URLs appearing in Google search results, about 6,000 belong to sites verified as malicious (the most popular landing page had a rank of 1,588)
Geographic locality of web-based malware
• Evidence of poor security practices by administrators (outdated and/or unpatched versions of web server software)
• Correlation between the locations of distribution sites and landing sites
Impact of browsing habits
• DMOZ used as the knowledge base
• A random selection of 7.2 million URLs, each mapped to its corresponding DMOZ category
Malicious Content Injection: Web Server Software
• Examined (where possible) the web server software of the landing site by collecting:
  • ‘Server’ header: the name of the server
  • ‘X-Powered-By’ header: the PHP version
Malicious Content Injection: Drive-by Downloads via Ads
• A web page is only as secure as its weakest component!
• Insecure ad content poses a risk
• A frequent pattern:
  • An advertiser sells advertising space to another advertising company, which sells it to yet another company, and so on. Somewhere along the chain something can go wrong
Malicious Content Injection: Drive-by Downloads via Ads
• Create malware delivery trees from the detected malicious URLs
• Examine every node for membership in a set of 2,000 known advertising networks
• 12% of the search results that returned landing sites were malicious due to unsafe ads
• These cases narrow down to 55 unique ad networks
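The membership check on each node can be sketched as a set lookup on the host name of every URL along a delivery chain. The function name and the flat-list representation of a chain are assumptions for this example; the actual set of 2,000 networks is not reproduced here.

```python
from urllib.parse import urlsplit


def ad_networks_in_chain(chain, known_ad_hosts):
    """Return the known ad-network hosts on a delivery chain, in order seen.

    chain is a list of URLs from the landing page to the distribution site;
    known_ad_hosts is a set of host names of known advertising networks.
    """
    found = []
    for url in chain:
        host = urlsplit(url).hostname
        if host in known_ad_hosts and host not in found:
            found.append(host)
    return found
```

A chain with a non-empty result would count as malware delivered via ads, and the number of hosts returned gives a rough idea of how many syndication levels were involved.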
Malicious Content Injection: Drive-by Downloads via Ads
• Malware delivered via ads appears in short-lived spikes
• The ads appear on several sites simultaneously
Malicious Content Injection: Drive-by Downloads via Ads
• 75% of the landing sites that deliver malware via ads do so through multiple levels of ad networks
• In 50% of all cases there were more than 6 redirection steps
Malware Distribution Infrastructure
• Evaluate the distribution network size (number of landing sites pointing to a distribution site)
  • 45% of the distribution sites have 1 landing site
  • Others grow to more than 21,000 landing sites
• Network location of the distribution sites:
  • 70% of the sites are within 58.*.*.* - 61.*.*.* and 209.*.*.* - 221.*.*.*
  • The landing sites map to 2,517 ASes
  • 95% of the sites fall within 500 ASes
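The network-size metric defined on this slide can be computed from (landing site, distribution site) pairs. The sketch below assumes such pairs have already been extracted from the delivery trees; the function names and input format are hypothetical.

```python
from collections import defaultdict


def network_sizes(edges):
    """Count the distinct landing sites that point to each distribution site.

    edges is an iterable of (landing_site, distribution_site) pairs;
    repeated pairs are counted once.
    """
    landing_by_distribution = defaultdict(set)
    for landing, distribution in edges:
        landing_by_distribution[distribution].add(landing)
    return {site: len(sites) for site, sites in landing_by_distribution.items()}


def share_with_single_landing_site(sizes):
    """Fraction of distribution sites reached from exactly one landing site."""
    if not sizes:
        return 0.0
    return sum(1 for count in sizes.values() if count == 1) / len(sizes)
```

Applied to the full data set, the second function would yield the kind of figure quoted above (45% of distribution sites with a single landing site).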
Malware Distribution Infrastructure
• 42% of the distribution sites delivered a single malware binary
• 3% of the servers hosted more than 100 binaries
Relationships Among Networks
• Malware hosting infrastructure:
  • Detected 9,430 malware distribution sites (90% of them were hosted on a single IP)
  • Overlapping landing sites
  • Some IPs were found to host up to 210 distribution sites (user accounts on public hosting servers or blogs, and DNS aliases)
• Content replication across malware distribution sites
Post-Infection Impact
Overview of the collective changes to the system:
• A large number of executables are downloaded (8 on average, up to 60 in extreme cases)
• The number of running processes increases (in 80% of the cases, at most 20 processes were created)
Post-Infection Impact
• Registry changes:
• Increased network activity (confirmed the connection between web malware and botnets)
Anti-virus Engine Detection Rates
• For each URL whose visit created at least 1 process, we extract the downloaded binaries
• The detection rate of 3 different anti-virus engines is evaluated against the suspected malware from the step above
=> The best of the 3 engines detected 70% (the second about 55% and the third about 25%)
False Positives
• Assuming that all suspicious binaries will eventually be discovered by the anti-virus vendors:
  • We rescan the binaries extracted in the previous step after 2 months
  • Result: a false positive rate of less than 10%
  • Creating a whitelist decreases the rate
• In direct feedback, anti-virus vendors reported 6% false positives
Conclusion
• Malicious URLs that initiate drive-by downloads are widespread
• This study attempts to fill the gaps in what is known about them
• Search data shows the scale of the problem (1.3% of Google search queries return at least one malicious result)
• There is a relation between ads and malware distribution networks