All Your iFRAMEs Point to Us 鲍由之 12/27/2010
Review • "The Ghost in the Browser: Analysis of Web-based Malware"
Overview • Section 2: background information • Section 3: overview of our data collection infrastructure • Section 4: prevalence of malicious web sites on the Internet. • Section 5: Mechanisms used to inject malicious content into web pages. • Section 6: aspects of the web malware distribution networks • Section 7: impact of the installed malware on the infected system. • Section 8: Implications of our results • Section 9: Related work. • Section 10: Conclusion
2 Background • Objective: installing malware on a user’s computer • 1: remotely exploit vulnerable network services • Less successful • 2: lure web users to connect to (compromised) malicious servers that subsequently deliver exploits targeting vulnerabilities of web browsers or their plugins
Content is injected into benign sites via vulnerable scripting applications • e.g. phpBB2 or InvisionBoard • or by posting to forums or blogs
3 Infrastructure and Methodology • Objective: identify malicious web sites • Landing pages = malicious URLs • Landing sites = landing pages grouped by top-level domain name
Pre-processing Phase • Extract several features and translate them into a likelihood score using a machine-learning framework • MapReduce • 5-fold cross-validation • The URLs are randomly sampled from popular URLs as well as from the global index. URLs reported by users are also processed. • 1 billion -> 1 million
Verification Process • Equipment: a large-scale web honeynet running Microsoft Windows images with an unpatched version of Internet Explorer in virtual machines. • Method: execution-based heuristics and anti-virus engines • Heuristic score: the number of created processes; the number of observed registry changes; the number of file system changes • Met threshold: suspicious • Met threshold and marked as malicious by at least one anti-virus engine: malicious • 1 million -> 25,000
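The two-stage verification rule on this slide can be sketched in a few lines of Python. This is only an illustration: the slide does not say how the three counts are combined or what the threshold is, so the plain sum, the value of `SCORE_THRESHOLD`, the `VmObservation` type, and the "unflagged" label for URLs below the threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class VmObservation:
    """State changes seen in the virtual machine after visiting one URL."""
    new_processes: int
    registry_changes: int
    file_system_changes: int
    av_detections: int  # number of anti-virus engines that flagged a download


# Hypothetical value: the slides do not state the real threshold.
SCORE_THRESHOLD = 3


def heuristic_score(obs: VmObservation) -> int:
    # Assumption: the score is the plain sum of the three counts.
    return obs.new_processes + obs.registry_changes + obs.file_system_changes


def classify(obs: VmObservation, threshold: int = SCORE_THRESHOLD) -> str:
    if heuristic_score(obs) < threshold:
        return "unflagged"   # label chosen here; not named on the slide
    if obs.av_detections >= 1:
        return "malicious"   # met threshold AND at least one AV engine agrees
    return "suspicious"      # met threshold only
```

For example, `classify(VmObservation(2, 3, 1, 0))` yields "suspicious", and the same observation with one anti-virus detection yields "malicious".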
Constructing the Malware Distribution Networks • A set of malware delivery trees, each consisting of landing sites (leaves), hop points, and a distribution site (root) • Edges are taken from the Referer headers in HTTP requests • Redirection with no Referer header set: extract the URLs from the fetched pages and match them against the HTTP fetches that the browser subsequently makes • URLs containing randomly generated strings: apply heuristics based on edit distance to identify the most probable parent of the URL
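The edit-distance heuristic for URLs with randomly generated parts might look like the sketch below. It assumes plain Levenshtein distance and picks the parent page whose extracted URL is closest to the URL actually fetched; the slide does not specify the exact heuristic, and the function names and example URLs are made up for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one dynamic-programming row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def most_probable_parent(fetched_url, candidates):
    """candidates maps a parent page URL to the URLs extracted from that page.

    Returns the parent whose extracted URL is closest to fetched_url,
    or None if there are no candidates.
    """
    best_parent, best_dist = None, None
    for parent, extracted in candidates.items():
        for url in extracted:
            dist = edit_distance(fetched_url, url)
            if best_dist is None or dist < best_dist:
                best_parent, best_dist = parent, dist
    return best_parent


# Made-up example: the fetched URL differs from the extracted one only in
# its random token, so page "a" is chosen as the parent.
candidates = {
    "http://a.example/": ["http://x.example/load?id=abc123"],
    "http://b.example/": ["http://y.example/index.html"],
}
parent = most_probable_parent("http://x.example/load?id=zzz999", candidates)
```

A real system would likely also cap the accepted distance so that unrelated URLs are never linked; that cutoff is omitted here.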
4 Prevalence of Drive-by Downloads • Jan 2007 - Oct 2007 • 6000 in top 1 million, uniformly distributed
4.1 Impact of browsing habits • Malicious websites are generally present in all website categories we observed. • “safe browsing” does not provide an effective safeguard against exploitation.
5 Malicious Content Injection • Two categories: web server compromise and third-party contributed content • 5.1 Web server compromise: web server software • Server and X-Powered-By headers
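As a rough illustration of how the Server and X-Powered-By response headers can be used to tally the software running on landing sites, consider the sketch below. The helper names and the sample headers are hypothetical, and reducing each Server value to its product name is a simplification chosen here.

```python
from collections import Counter


def server_fingerprint(headers):
    """Return (Server, X-Powered-By) from a mapping of HTTP response headers.

    Header names are matched case-insensitively; missing headers
    are reported as "unknown".
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("server", "unknown"), lowered.get("x-powered-by", "unknown")


def tally_server_software(responses):
    """Count landing sites per server product, ignoring version details."""
    counts = Counter()
    for headers in responses:
        server, _ = server_fingerprint(headers)
        # Keep only the product name: "Apache/1.3.37 (Unix)" -> "Apache"
        counts[server.split("/")[0].split(" ")[0]] += 1
    return counts


# Made-up sample responses for illustration only.
responses = [
    {"Server": "Apache/1.3.37 (Unix)", "X-Powered-By": "PHP/4.4.4"},
    {"server": "Microsoft-IIS/6.0"},
    {},
]
counts = tally_server_software(responses)
```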
5.2 Drive-by Downloads via Ads • For each tree, we examine every intermediary node for membership in a set of 2,000 well-known advertising networks. If any of the nodes qualifies, we count the landing site as being infectious via Ads. • We weight the landing sites associated with Ads based on the frequency of their appearance in Google search results compared to that of all landing sites.
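The membership test and the search-result weighting described on this slide could be sketched as follows. The `AD_NETWORKS` set is a two-entry stand-in for the list of about 2,000 networks, matching on the exact hostname is an assumption, and all URLs and counts are invented.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the set of well-known advertising networks.
AD_NETWORKS = {"ads.example.com", "adnet.example.net"}


def delivered_via_ads(chain, ad_networks=AD_NETWORKS):
    """chain lists URLs from the landing site (first) to the distribution site (last).

    True if any intermediary node belongs to an advertising network.
    """
    intermediaries = chain[1:-1]
    return any(urlparse(url).hostname in ad_networks for url in intermediaries)


def weighted_ad_share(sites, ad_networks=AD_NETWORKS):
    """sites is a list of (chain, search_result_appearances) pairs.

    Returns the share of search-result appearances that belong to
    landing sites delivering malware via Ads.
    """
    total = sum(count for _, count in sites)
    via_ads = sum(count for chain, count in sites
                  if delivered_via_ads(chain, ad_networks))
    return via_ads / total if total else 0.0


# Made-up chains: the first passes through an ad network, the second does not.
sites = [
    (["http://news.example/", "http://ads.example.com/banner",
      "http://dist.example/payload"], 30),
    (["http://blog.example/", "http://hop.example/r",
      "http://dist.example/payload"], 10),
]
share = weighted_ad_share(sites)
```

The weighting explains how a small fraction of landing sites can account for a much larger fraction of search results: a landing site that appears often in results counts proportionally more.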
2% of the landing sites were delivering malware via advertisements. • On average, 12% of the overall search results that returned landing pages were associated with malicious content due to unsafe Ads.
Malware delivered via Ads exhibits longer delivery chains: in 50% of all cases, more than 6 redirection steps were required before receiving the malware payload.
the advertising network appears at the beginning of the delivery chain. • advertising networks appear frequently in the middle of the delivery chains. • the advertising network is directly delivering malware
6 Malware Distribution Infrastructure • 45% of the detected malware distribution sites used only a single landing site at a time. • 70% of the malware distribution sites have IP addresses within 58.* -- 61.* and 209.* -- 221.* network ranges.
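Checking whether a distribution site falls inside the two address ranges quoted on this slide only needs the first octet of its IPv4 address. The sketch below assumes IPv4 input; the function names and sample addresses are illustrative.

```python
import ipaddress

# First-octet ranges quoted on the slide: 58.* -- 61.* and 209.* -- 221.*
CONCENTRATED_RANGES = [(58, 61), (209, 221)]


def in_concentrated_ranges(ip: str) -> bool:
    """True if the IPv4 address starts with an octet in one of the ranges."""
    first_octet = ipaddress.IPv4Address(ip).packed[0]
    return any(low <= first_octet <= high for low, high in CONCENTRATED_RANGES)


def share_in_ranges(ips) -> float:
    """Fraction of the given addresses that fall inside the ranges."""
    ips = list(ips)
    if not ips:
        return 0.0
    return sum(in_concentrated_ranges(ip) for ip in ips) / len(ips)
```

`ipaddress.IPv4Address` raises `AddressValueError` on malformed input, which is useful for catching bad rows in a data set before they skew the share.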
All the malware distribution sites’ IP addresses fall into a relatively small set of ASes: only 500 as of this writing. • Approximately 42% of the distribution sites delivered a single malware binary. • The multiple payloads reflect deliberate obfuscation attempts to evade detection.
6.1 Relationships Among Networks • Malware hosting infrastructure • 10% of sites are hosted on IP addresses that host multiple malware distribution sites. • Closer inspection revealed that these addresses refer to public hosting servers that allow users to create their own accounts.
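Finding distribution sites that share an IP address is a simple grouping step. The sketch below assumes a mapping from site name to resolved IP address; the names and the documentation-range addresses are placeholders.

```python
from collections import defaultdict


def shared_hosting_addresses(site_to_ip):
    """Return {ip: sites} for every IP address hosting more than one site."""
    sites_by_ip = defaultdict(set)
    for site, ip in site_to_ip.items():
        sites_by_ip[ip].add(site)
    return {ip: sites for ip, sites in sites_by_ip.items() if len(sites) > 1}


def share_on_shared_addresses(site_to_ip):
    """Fraction of sites hosted on an address that also hosts another site."""
    if not site_to_ip:
        return 0.0
    shared = shared_hosting_addresses(site_to_ip)
    return sum(len(sites) for sites in shared.values()) / len(site_to_ip)


# Placeholder data: two of the four sites share one address.
site_to_ip = {
    "a.example": "192.0.2.1",
    "b.example": "192.0.2.1",
    "c.example": "192.0.2.2",
    "d.example": "192.0.2.3",
}
shared = shared_hosting_addresses(site_to_ip)
```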
Overlapping landing sites • 80% of the distribution networks share at least one landing page • Many landing sites are shared among multiple distribution networks • Content replication across malware distribution sites • In 25% of the malware distribution sites, at least one binary is shared between a pair of sites
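Both overlap measurements on this slide reduce to pairwise set intersection: landing sites per distribution network in one case, binaries per distribution site in the other. A minimal sketch, assuming each network is represented as a set of identifiers (for binaries, a content hash would be a natural choice, though the slide does not say how binaries were compared):

```python
from itertools import combinations


def members_with_overlap(items_by_owner):
    """items_by_owner maps an owner (e.g. a distribution site) to a set of items
    (e.g. its landing sites or the hashes of its binaries).

    Returns the owners that share at least one item with some other owner.
    """
    overlapping = set()
    for (a, items_a), (b, items_b) in combinations(items_by_owner.items(), 2):
        if items_a & items_b:
            overlapping.update((a, b))
    return overlapping


# Made-up example: d1 and d2 share landing site l2, d3 shares nothing.
networks = {
    "d1": {"l1", "l2"},
    "d2": {"l2", "l3"},
    "d3": {"l4"},
}
overlapping = members_with_overlap(networks)
overlap_share = len(overlapping) / len(networks)
```

The pairwise comparison is quadratic in the number of networks; for a large data set, an inverted index from item to owners would give the same answer more cheaply.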
7 Post Infection Impact • Objective: give an overview of the collective changes that happen to the system state after visiting a malicious URL • Downloaded executables • Newly created processes • Registry changes
7.1 Anti-virus engine detection rates • The detection capability of the anti-virus engines is lacking, with an average detection rate of 70% for the best one. • Problem: false positives • Optimistically assume that all suspicious binaries will eventually be discovered by the anti-virus vendors. • Less than 10% • A white-list of benign downloads
8 Discussion • We believe the pervasive nature of the results in this study elucidates the state of the malware problem today and, hopefully, serves to educate users, web masters, and other researchers about the security challenges ahead.
9 Related Work • Moshchuk et al., finding links to executables labeled spyware by an adware scanner • Not deep enough • This paper differs from all of these works in that it offers a far more comprehensive analysis of the different aspects of the problem posed by web-based malware, including an examination of its prevalence, the structure of the distribution networks, and the major driving forces.
10 Conclusion • We attempt to fill in the gaps about this growing phenomenon by providing a comprehensive look at the problem from several perspectives, using a large-scale data set. • Our analysis reveals several forms of relations between some distribution sites and networks. • We show that merely avoiding the dark corners of the Internet does not limit exposure to malware.