Phishing Information Recycling from Spam Mails • 許富皓 • Department of Computer Science and Information Engineering, National Central University
OUTLINE • Introduction • System Overview • Evaluation • Discussion and Conclusion
Statistics of Spam Mails • The global spam rate for Q3 2009 was 88.1%, equating to around 151 billion e-mails a day. • The major purpose of these spam mails is advertising. • In Q3 2009, 1 in 368.6 e-mails was a phishing attempt.
Phishing • Aims to steal sensitive information from users. • A phishing attack usually comprises two steps: • Prepare a forged web page • Send spoofed e-mails
Phishing E-mail Example • Phishing e-mails trick users into visiting a forged web page. • An example of a phishing e-mail.
Phishing Web Page Example • A phishing web page looks like the web page of a real service. • An example of a phishing page.
Anti-Phishing Methods • E-mail-level solutions • Filters and content analysis • Browser-integrated solutions • SpoofGuard • PwdHash • AntiPhish • Keeps track of sensitive information • DOMAntiPhish • Compares the DOMs of the pages
Most Popular Phishing Solutions • The most popular and widely deployed solutions are based on blacklists. • IE 7 browser • Google Safe Browsing • Netcraft toolbar • eBay toolbar • etc.
Drawbacks of Current Solutions • APWG detected more than 40,000 unique phishing URLs in Aug. 2009. • On average, a phishing domain lasts 3 days. • Many e-mail receivers trust the e-mails that have passed the examination of an e-mail filter.
Why Phishing Works? • "Why Phishing Works", Proc. CHI (2006) • SMTP does not contain any authentication mechanism. • 23% of users base their trust only on page content. • None of the solutions is foolproof. • About five million U.S. consumers gave information to spoofed websites, resulting in direct losses of $1.7 billion (2008).
Observation • A phishing domain lasts about 3 days, so the phishing mails that contain this domain must be sent within that period. • Legitimate server hosts usually generate a lot of network traffic, whereas phishing hosts usually have only a small amount of network traffic.
Our Method - Shark • Actively counterattack phishers, not just passively defend. • The goal is to overload phishing web sites with large amounts of forged data. • Collect phishing information from spam mails. • Detect botnets from spam mails.
System Components • Agent Host • Collects phishing URLs from spam mails • Sends large amounts of forged data to forged websites • SQL Server • Handles the suspect URLs • Camouflage Router • Allows the agent host to use various IP addresses to establish TCP connections.
Information Recycling Components • Agent host • Simply sniffs the URLs in e-mails that pass through our camouflage router. • SQL server • Collects those URLs • Records their arrival time
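The collection step above can be sketched as follows. This is only an illustration, assuming the mail body is already available as text; the function name `extract_url_records`, the regular expression, and the record layout `(url, domain, arrival_time)` are hypothetical and not taken from the slides.

```python
import re
from datetime import datetime, timezone
from urllib.parse import urlsplit

# Hypothetical, deliberately simple pattern: http(s) URLs up to the next
# whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def extract_url_records(mail_text, arrival_time):
    """Return one (url, domain, arrival_time) record per URL found in the mail."""
    records = []
    for match in URL_PATTERN.finditer(mail_text):
        # Drop punctuation that usually belongs to the sentence, not the URL.
        url = match.group(0).rstrip(".,;:!?)")
        domain = (urlsplit(url).hostname or "").lower()
        if domain:
            records.append((url, domain, arrival_time))
    return records


if __name__ == "__main__":
    body = "Please verify your account at http://login.example.test/verify.php?id=1."
    for record in extract_url_records(body, datetime.now(timezone.utc)):
        print(record)
```

In a deployment along the lines of the slides, these records would be written to the SQL server rather than printed.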
Information Recycling • Classify URLs according to their domains. • Record the number of URLs appearing in each domain. • Collect suspect URLs • A URL whose domain accumulates more URLs than a threshold within a short period (normally 3 days) is deemed a phishing URL.
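A minimal sketch of the threshold rule above, assuming records of the form `(url, domain, arrival_time)`. The function name `find_suspect_urls` and the choice to count every sighting of a URL (rather than only distinct URLs) are assumptions; the slides do not give the threshold value, so it is left as a parameter.

```python
from collections import Counter
from datetime import datetime, timedelta


def find_suspect_urls(records, now, threshold, window=timedelta(days=3)):
    """Return URLs whose domain was seen more than `threshold` times in the window.

    records: iterable of (url, domain, arrival_time) tuples.
    """
    # Keep only sightings that arrived inside the observation window.
    recent = [(url, domain) for url, domain, seen in records
              if now - window <= seen <= now]
    # Count sightings per domain (assumption: repeats of one URL count too).
    per_domain = Counter(domain for _, domain in recent)
    return sorted({url for url, domain in recent if per_domain[domain] > threshold})


if __name__ == "__main__":
    now = datetime(2009, 9, 10)
    records = [
        ("http://a.example.test/1", "a.example.test", now - timedelta(days=1)),
        ("http://a.example.test/2", "a.example.test", now - timedelta(days=2)),
        ("http://a.example.test/3", "a.example.test", now - timedelta(hours=3)),
        ("http://b.example.test/x", "b.example.test", now - timedelta(days=1)),
    ]
    print(find_suspect_urls(records, now, threshold=2))
```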
Recognize Phishing Web Sites • Suspect web site • Parse the HTML content • Check the form tag, input tag… • type=password • Combine with the Google API • Check whether the website has enough traffic flow • Could be combined with other phishing detection methods
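The HTML check above might look roughly like this, using only Python's standard `html.parser`. The class `LoginFormDetector` and the helper `looks_like_login_page` are hypothetical names, and the traffic check through the Google API is not covered here because the slides do not say which API call is used.

```python
from html.parser import HTMLParser


class LoginFormDetector(HTMLParser):
    """Records whether a page contains a form tag and a password input."""

    def __init__(self):
        super().__init__()
        self.form_actions = []          # action attribute of every form seen
        self.has_password_field = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "form":
            self.form_actions.append(attributes.get("action") or "")
        elif tag == "input":
            # Attribute values may be None when written without a value.
            if (attributes.get("type") or "").lower() == "password":
                self.has_password_field = True


def looks_like_login_page(html):
    """True if the page has at least one form and an input of type=password."""
    detector = LoginFormDetector()
    detector.feed(html)
    return bool(detector.form_actions) and detector.has_password_field


if __name__ == "__main__":
    page = '<form action="/collect.php"><input type="text" name="u">' \
           '<input type="PASSWORD" name="p"></form>'
    print(looks_like_login_page(page))
```

A page that passes this check is only a candidate; the traffic-based check from the slide still has to separate low-traffic phishing hosts from legitimate login pages.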
Counterattack • Agent host • Initiates TCP connections to the phishing sites • Finds the form tags that can be used to submit data to the phishing sites • Sends forged data to the phishing sites • Limits the number of TCP connections an agent host can establish with a phishing host (based on the number of phishing URLs) • Camouflage router • Randomly chooses an IP address belonging to its domain and provides it to the agent host to establish a new TCP connection with a phishing host
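Two bookkeeping pieces of this slide can be sketched without any network code: picking a random source address from the router's address block, and capping connections per phishing host. The slides only say the cap depends on the number of phishing URLs, so the linear rule with a hard cap below is an assumption, as are the function names and the documentation-only address range `192.0.2.0/24`.

```python
import ipaddress
import random


def pick_source_address(network_cidr, rng=random):
    """Pick a random usable host address from the router's address block.

    Materialising the host list is fine for small blocks such as a /24.
    """
    network = ipaddress.ip_network(network_cidr)
    return rng.choice(list(network.hosts()))


def connection_limit(num_phishing_urls, per_url=5, hard_cap=100):
    """Hypothetical rule: allow `per_url` connections per observed phishing URL,
    never exceeding `hard_cap` connections to a single phishing host."""
    return min(max(num_phishing_urls, 0) * per_url, hard_cap)


if __name__ == "__main__":
    print(pick_source_address("192.0.2.0/24"))
    print(connection_limit(3))
```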
Effects of Counterattack • Phishers would not be able to distinguish victim data from forged data. • Login pages of legitimate web sites can record the IPs of hosts that use bait (forged) data to log in. • Hosts that send phishing e-mails or use bait data to log in are usually bots of some botnet.
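The bait-data idea on the legitimate site's side can be illustrated with a small registry. This is a sketch under assumptions: the class `BaitRegistry`, its methods, and the credential format are hypothetical, and a real login page would persist this state rather than keep it in memory.

```python
import secrets
import string

_ALPHABET = string.ascii_lowercase + string.digits


class BaitRegistry:
    """Tracks forged credentials and the hosts that later try to use them."""

    def __init__(self):
        self._bait = set()        # (username, password) pairs handed out as bait
        self.flagged_ips = set()  # source IPs that attempted a bait login

    def new_bait(self):
        """Create and remember one forged credential pair."""
        username = "user" + "".join(secrets.choice(_ALPHABET) for _ in range(8))
        password = "".join(secrets.choice(_ALPHABET) for _ in range(12))
        self._bait.add((username, password))
        return username, password

    def check_login(self, username, password, source_ip):
        """Return True and record the source IP if the login uses bait data."""
        if (username, password) in self._bait:
            self.flagged_ips.add(source_ip)
            return True
        return False


if __name__ == "__main__":
    registry = BaitRegistry()
    user, pwd = registry.new_bait()
    print(registry.check_login(user, pwd, "198.51.100.7"), registry.flagged_ips)
```

The flagged addresses are the candidates for the botnet detection mentioned on the slide.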
Evaluation • False negatives • 2,543 phishing websites in PhishTank • False positives • 5,000 legitimate websites in Alexa • 0 false positives
Evaluation • Phishing websites in PhishTank (Total 2,543)
Contribution • A novel counterattack solution for phishing • Confuses the phishers with large amounts of forged data • Protects users even if they have been tricked into leaking their private information to phishers • Botnet detection
Future Work • JavaScript • win32com.client
Thank You • Q&A