Anti-Phishing Approaches
Lifeng Hu
lh2342@columbia.edu
What is Phishing?
• A social engineering attack
• An attempt to trick individuals into revealing personal credentials (user name, password, credit card information, etc.)
• Based on fake emails and websites
• A threat to Internet users
• Damages
  - 73 million US adults received more than 50 phishing emails a year
  - $2.8 billion in losses a year
Phishing Methods
• Set up websites whose interface or URL resembles that of a well-known website
• Set up fraudulent websites to collect users' personal information
• Place a transparent website between the original website and its users
• Send emails containing malicious URLs
• Send emails containing embedded malicious Flash or image files to evade the text checks of anti-phishing tools
False Positive/Negative Rates of Anti-Phishing Approaches
• False negative rate: the fraction of all phishing websites that are classified as legitimate
• False positive rate: the fraction of all legitimate websites that are classified as phishing
• The lower both rates are, the better the anti-phishing approach
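The two definitions above can be sketched in a few lines of Python. The function name `false_rates` and the boolean encoding (True means "phishing") are assumptions made for illustration only.

```python
def false_rates(labels, predictions):
    """Return (false_negative_rate, false_positive_rate).

    labels      -- ground truth, True if the site really is phishing
    predictions -- detector output, True if the site was flagged as phishing
    """
    pairs = list(zip(labels, predictions))
    # Phishing sites that the detector let through as legitimate.
    false_negatives = sum(1 for truth, flagged in pairs if truth and not flagged)
    # Legitimate sites that the detector wrongly flagged.
    false_positives = sum(1 for truth, flagged in pairs if not truth and flagged)
    n_phishing = sum(1 for truth, _ in pairs if truth)
    n_legitimate = len(pairs) - n_phishing
    fnr = false_negatives / n_phishing if n_phishing else 0.0
    fpr = false_positives / n_legitimate if n_legitimate else 0.0
    return fnr, fpr
```

For example, if a detector misses 1 of 4 phishing sites and flags 1 of 5 legitimate sites, the function returns a false negative rate of 0.25 and a false positive rate of 0.2.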
Anti-Phishing Approaches for Specific Websites
• Typically designed by the company that operates the website
• Example: the SiteKey mechanism of Bank of America online banking
• Pro:
  - False negative rate is low
  - False positive rate can be zero
• Con:
  - Not applicable to phishing emails
Anti-Phishing Approaches Based on Databases
• Anti-phishing firewall: Kaspersky
• Anti-phishing toolbar: Netcraft
• Both rely on an online database
• The toolbar can provide URL statistics in advance
• Pro:
  - Applicable to both websites and emails
  - False negative rate can be low
  - False positive rate is low
• Con:
  - Needs frequent updates
  - Relatively hard to implement
  - False negative rate increases when the database is out of date
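A minimal sketch of the database idea, assuming the blacklist is a simple set of host names. `BLACKLISTED_HOSTS`, `host_of`, and `is_blacklisted` are hypothetical names; they are not the Kaspersky or Netcraft interfaces, and a real tool would keep the set in sync with an online database.

```python
from urllib.parse import urlsplit

# Hypothetical local snapshot of an online blacklist (assumption for illustration).
BLACKLISTED_HOSTS = {"phishingsite.com"}

def host_of(url):
    """Extract the lower-cased host name, tolerating URLs without a scheme."""
    if "://" not in url:
        url = "http://" + url
    return (urlsplit(url).hostname or "").lower()

def is_blacklisted(url, blacklist=BLACKLISTED_HOSTS):
    """True if the host or any of its parent domains is in the blacklist."""
    parts = host_of(url).split(".")
    return any(".".join(parts[i:]) in blacklist for i in range(len(parts)))
```

A lookup such as `is_blacklisted("http://login.phishingsite.com/paypal")` matches through the parent domain, while a phishing host that has not yet been added to the set is reported as clean. That is the stale-database false negative listed under Con.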
Anti-Phishing Approaches Based on Content
• PILFER: phishing email detection based on machine learning, combining 10 filters:
  - IP-based URL: 192.168.0.1/paypal.cgi?fix=account
  - Domain age from whois.net
  - Non-matching URL: <a href="phishingsite.com">paypal.com</a>
  - HTML email: hidden URLs
  - Malicious JavaScript
  - <More>…
• Pro:
  - In practice, false positive and false negative rates are relatively low
  - Machine learning makes it possible to improve accuracy over time
  - No constant updates are needed
• Con:
  - Training data and filters still need updating to adapt to new styles of phishing emails
  - Network cost is a problem
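Two of the filters listed above (IP-based URL and non-matching URL) can be sketched with the Python standard library. This is an illustration of the idea only, not PILFER's implementation; the function names are assumptions, and the link check compares host names exactly, so `www.paypal.com` and `paypal.com` count as a mismatch here.

```python
import ipaddress
from html.parser import HTMLParser
from urllib.parse import urlsplit

def host_of(url):
    """Extract the lower-cased host name, tolerating URLs without a scheme."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url
    return (urlsplit(url).hostname or "").lower()

def has_ip_based_url(url):
    """Filter: the URL points at a raw IP address instead of a domain name."""
    try:
        ipaddress.ip_address(host_of(url))
        return True
    except ValueError:
        return False

class _LinkCollector(HTMLParser):
    """Collects (href, visible text) pairs for every <a> element."""
    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def has_non_matching_url(html):
    """Filter: a link whose visible text names one host but whose href names another."""
    collector = _LinkCollector()
    collector.feed(html)
    collector.close()
    for href, text in collector.links:
        # Only compare when the visible text itself looks like a host or URL.
        if "." in text and " " not in text and host_of(text) != host_of(href):
            return True
    return False
```

In a PILFER-style pipeline, booleans like these would be combined with the other filters into a feature vector and passed to a trained classifier. That step is omitted here.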
Anti-Phishing Approaches Based on Content (cont.)
• CANTINA: phishing website detection based on TF-IDF weights
  - TF: the number of times a given term appears in a specific document
  - IDF: a measure of how rare, and therefore how distinctive, the term is across all documents
  - TF-IDF = TF × IDF; it is high for terms that are frequent in the given document but rare elsewhere
  - Submit the five terms of the current web page with the highest TF-IDF weights to a search engine such as Google
  - The current web page must appear in the top N (30) search results to be considered legitimate
• CANTINA also uses filters similar to PILFER's to reduce false positives
• Pro:
  - False positive and false negative rates are very low
  - No constant updates are needed
  - Search engine ranking is relatively hard to manipulate
• Con:
  - Network cost is a problem
  - Too many searches for phishing websites may affect those websites' ranking
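The TF-IDF step can be sketched as follows. This is a rough illustration, not CANTINA's code: the smoothed IDF formula log((1 + N) / (1 + df)) is one common variant chosen here as an assumption, the reference corpus is supplied by the caller, and `search` is a hypothetical callable that returns a ranked list of result domains for a query string.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def top_tfidf_terms(page_text, corpus_texts, k=5):
    """Return the k terms of page_text with the highest TF x IDF weight."""
    tf = Counter(tokenize(page_text))
    docs = [set(tokenize(text)) for text in corpus_texts]
    n_docs = len(docs)
    scores = {}
    for term, count in tf.items():
        df = sum(1 for doc in docs if term in doc)
        # Smoothed IDF (assumed variant): terms found in every document score 0.
        idf = math.log((1 + n_docs) / (1 + df))
        scores[term] = count * idf
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]

def looks_legitimate(page_domain, page_text, corpus_texts, search, n=30):
    """Query with the top terms and check that the page's own domain
    appears among the first n results returned by `search` (hypothetical)."""
    query = " ".join(top_tfidf_terms(page_text, corpus_texts))
    return page_domain in search(query)[:n]
```

A page that copies the content of a well-known site tends to produce the same top terms, but the search results point to the real site's domain rather than to the copy, so the copy fails the check.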