
A Crawler-based Study of Spyware on the Web

A. Moshchuk, T. Bragin, S. D. Gribble, H. M. Levy. NDSS, 2006. Presented by Justin Miller on 3/6/07.


  1. A Crawler-based Study of Spyware on the Web • A. Moshchuk, T. Bragin, S. D. Gribble, H. M. Levy • NDSS, 2006 • Presented by Justin Miller on 3/6/07

  2. A Quick Joke… “I caught a little of that computer virus that’s been going around… I haven’t been myself since” www.CartoonStock.com

  3. Overview • User visits website • Web spyware infects computer • Computer is unhappy

  4. Background • Prior spyware study • Spyware found on computers of 80% of AOL users • 93 spyware components (known) • Goals • Locate spyware on the Internet • Gather Internet spyware statistics • Quantitative analysis of spyware-laden content on the Web

  5. Outline • What is spyware? • Crawling the web • Web executables • Drive-by downloads • Results • Improvements

  6. Definition • Spyware – software that collects personal information about users • No user knowledge • Spyware techniques: • Log keystrokes • Collect web history • Scan documents on hard disk

  7. Types of Spyware • Spyware-infected executables • Content-type header • URL extension • Drive-by downloads • Malicious web content • Produce event triggers

  8. Part I: Executable files • Finding executables • Content-type (HTTP header) indicates an executable (e.g., application/octet-stream) • URL contains .exe, .cab, or .msi • Hidden executables • Embedded file (.zip) • URL hidden in JavaScript • Missed executables • Hidden URL on dynamic page
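
The two "finding executables" checks on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' crawler code: the function name `looks_like_executable` and the set of Content-Type values are assumptions, and the sketch tests the end of the URL path rather than searching the whole URL for the extension.

```python
from urllib.parse import urlparse

# Extensions named on the slide.
EXECUTABLE_EXTENSIONS = (".exe", ".cab", ".msi")
# Assumed example Content-Type values; they are not listed on the slide.
EXECUTABLE_CONTENT_TYPES = {"application/octet-stream", "application/x-msdownload"}


def looks_like_executable(url: str, content_type: str = "") -> bool:
    """Return True if the URL path or the Content-Type header suggests an executable."""
    path = urlparse(url).path.lower()
    if path.endswith(EXECUTABLE_EXTENSIONS):
        return True
    # Drop parameters such as "; charset=..." before comparing the media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in EXECUTABLE_CONTENT_TYPES
```

A check like this would still miss the "hidden" and "missed" cases on the slide, such as executables inside a .zip archive or URLs assembled by JavaScript.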

  9. Part I: Executable files • Download, install, and run in a clean VM • Tool automates the installer framework • EULA agreements • Radio buttons and check boxes • Analyze file • Ad-Aware software • Log identifies spyware program

  10. Web Crawling • Heritrix public domain Web crawler • Search 2,500+ web sites • c|net’s download.com for DL executables • Randomly selected web sites • Google keyword search • Depth of 3 links • Find .exe hosted on separate Web servers
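
The "depth of 3 links" limit can be illustrated with a small breadth-first crawl. This is a sketch under stated assumptions, not Heritrix itself: `crawl` and `get_links` are hypothetical names, and `get_links` stands in for whatever fetches a page and extracts its outgoing links.

```python
from collections import deque


def crawl(seeds, get_links, max_depth=3):
    """Breadth-first crawl that follows links at most max_depth hops from any seed."""
    seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    seen = set(seeds)
    queue = deque((url, 0) for url in seeds)
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

With an in-memory link graph, `crawl(["a"], lambda u: graph.get(u, []))` visits the seed and every page reachable within three hops, and stops there.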

  11. Changing Spyware Environment • 2 separate program crawls • May, October 2005 • Generated list of crawling seeds • Most recent anti-spyware program used • October crawl detects more vulnerabilities

  12. Executable Results • 2 separate program crawls • May 2005 – 18 million URLs • Oct 2005 – 22 million URLs • No appreciable change in spyware • One site dropped # of infected executables

  13. Executable Results • Overall spyware • 3.8% in May 2005 • 4.4% in Oct 2005 • Individual programs • 82 in May 2005 • 89 in Oct 2005

  14. Infected Executables • May 2005 • October 2005

  15. Web Categories • Web categories infected with spyware

  16. Spyware Functions • Spyware-infected executables • Contain various spyware functions • Executables may have multiple functions

  17. Spyware Upgrades • Spyware-infected executables • May have multiple spyware functions • 1,294 infected .exe found in Oct 2005 • 880 detected • 414 variants

  18. Blacklisting Spyware • Block clients from accessing listed sites • Done by firewall or proxy • Blacklisting is ineffective

  19. Part II: Drive-by Downloads • Spyware from visiting a web page • JavaScript embedded in HTML • Modifies files • System/registry • Render web pages with unmodified browser

  20. Event Triggers for DB-DLs • Event occurs that matches a trigger • Trigger Conditions • Process creation • File activity (creation) • Suspicious process (file modification) • Registry file modified • Browser/OS crash
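
The trigger idea on this slide can be sketched as a filter over events observed while a page is rendered. This is only an illustration: the event kind names, the `Event` class, and both functions are hypothetical and are not taken from the paper's monitoring tool.

```python
from dataclasses import dataclass

# Event kinds mirroring the trigger conditions on the slide; the names are hypothetical.
TRIGGER_KINDS = {
    "process_created",
    "file_created",
    "file_modified",
    "registry_modified",
    "crash",
}


@dataclass(frozen=True)
class Event:
    kind: str    # what happened, e.g. "file_created"
    detail: str  # free-form description, e.g. a path or process name


def fired_triggers(events):
    """Return the events whose kind matches one of the trigger conditions."""
    return [e for e in events if e.kind in TRIGGER_KINDS]


def is_suspicious(events):
    """A page visit is flagged if at least one trigger fired."""
    return bool(fired_triggers(events))
```

A visit that only produces a harmless event such as `Event("page_rendered", "index.html")` is not flagged, while one that also produces `Event("registry_modified", "Run key")` is.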

  21. Complex Web Content • “Time Bomb” attack • Speed up virtual time of guest OS • JavaScript when page closes • Fetch a clean URL before closing • Pop-up windows • Allow all to open before closing

  22. IE Browser Configuration • Security-related IE dialog boxes

  23. Drive-by Results • 3 web crawls • May 2005 – 45K URLs • Oct 2005 – Same URLs • Oct 2005 – New URLs • Decrease in infectious URLs • Increase in unique spyware programs

  24. Drive-by Results

  25. Origin of Drive-by DLs • Top 6 web categories (IE): • Pirate sites • Celebrity • Music • Adult • Games • Wallpaper

  26. Spyware Top 10 • May 2005 • October 2005

  27. Spyware Top 10 • May 2005 • October 2005

  28. Spyware Trends • Decline in total # of spyware programs • Increase of anti-spyware tools • Automated patch installations • Lawsuits against spyware distributors

  29. IE vs Firefox Security • Internet Explorer v6 • 186 - cfg_y (security prompts answered yes) • 92 - cfg_n (security prompts answered no) • Firefox v1.0.6 • 36 - cfg_y • 0 - cfg_n

  30. Drive-by Summary • Performed 3 URL crawls • Reduction in % of domains hosting DB-DLs • Small # of domains host majority of infectious links • Drive-by DLs attempted in 0.4% of URLs • Drive-by attacks in 0.2% of URLs

  31. Strengths • Analysis method • Studies density of spyware on the Web • Produces spyware trends over time • Calculated frequency of spyware on web • Distinguished security prompts (y/n) • Found 14% of spyware is malicious • Density of spyware is substantial

  32. Weaknesses • Missed executables • URL hidden in JavaScript, dynamic page • Limited by what Ad-Aware is able to detect • Method weakness • Different anti-spyware programs (May/Oct) • Did not crawl entire web • Cannot relate density of spyware on the Web and the presence of threats on desktops

  33. Improvements • Test multiple browsers • Additional anti-spyware programs • Crawl more URLs • Find geographic patterns of hosts

  34. Questions? • Ask me! • Reasons to ask questions: • Class discussion is 20% of your grade • You can’t leave until 5:45 anyway • Of the two of us, I’m probably the only one that read the entire paper (except Dr. Zou)
