Accurately Detect Parked Domain Typo-squatting Attacks Mishari Almishari and Xiaowei Yang University of California, Irvine Donald Bren School of Information and Computer Sciences Computer Science Department malmisha, xwy@ics.uci.edu
Introduction • Typo-squatting refers to the act of registering domain names that are typographical errors of other popular domain names (target domains) in order to hijack the traffic intended for those popular domain names • Hijacking for malicious purposes • Hijacking for financial purposes
Goals & Contributions • Accurately identify typo-squatting domains • Measure the amount of traffic hijacked by squatters • Build a system that would reduce the amount of traffic to such domains
Methodology • Identifying Typos • Use edit distance of 1 as our typo definition • Less controversial in terms of typo definition • Users are more prone to make a single error than 2 or more • A study shows that 90-95% of spelling errors consist of a single mistake • Nevertheless, extending the typo definition is worth exploring.
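The edit-distance-1 typo definition above can be illustrated with a minimal Python sketch. It assumes plain Levenshtein edits (one insertion, one deletion, or one substitution); the slides do not say whether swapping two adjacent characters also counts as a single edit, so this sketch does not treat it as one. The function name `is_single_edit_typo` is hypothetical.

```python
def is_single_edit_typo(candidate: str, target: str) -> bool:
    """Return True if candidate is exactly one edit away from target.

    Assumption: an "edit" is one character insertion, deletion, or
    substitution. Transpositions are NOT counted as a single edit here.
    """
    if candidate == target:
        return False
    len_c, len_t = len(candidate), len(target)
    if abs(len_c - len_t) > 1:
        return False

    # Skip the prefix the two names have in common.
    i = 0
    while i < min(len_c, len_t) and candidate[i] == target[i]:
        i += 1

    if len_c == len_t:
        # Substitution: everything after the mismatch must agree.
        return candidate[i + 1:] == target[i + 1:]
    if len_c > len_t:
        # Insertion: candidate has one extra character at position i.
        return candidate[i + 1:] == target[i:]
    # Deletion: candidate is missing the character at position i.
    return candidate[i:] == target[i + 1:]
```

For example, `is_single_edit_typo("wwwfacebook.com", "www.facebook.com")` returns `True` (the dot is missing), while a two-character swap such as `"faecbook.com"` against `"facebook.com"` returns `False` under this definition.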
Methodology • Identifying hijacking attempts • Is being a typo domain enough? • No, 55% are not squatting • What are the common hijacking indicators? • Parked Domain / Ads Listing (88.5%) • Offensive Adult Content (3.1%) • Domain For Sale (2.1%) • Forwarding To Another Domain (8.3%) • How to identify Parked Domain? • Use Machine Learning Classifier (96%) (100%)
Experiment • Measure amount of hijacked traffic • UCI DNS traces of 8 months • 500 popular domains from Alexa Website • Steps • Pre-processing of DNS queries • Finding Typo Domains • Finding Typo Squatting Domains
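The first two steps above (pre-processing the queries and finding typo domains) might look like the following sketch. It is not the authors' code: the normalization rules, the allowed character set, and the names `single_edit_variants` and `find_typo_hits` are assumptions made for illustration. It only finds names that are one edit away from a popular domain; deciding whether such a name is actually squatting is a separate step.

```python
import string
from collections import Counter

# Assumption: characters considered when generating typo variants.
ALPHABET = string.ascii_lowercase + string.digits + "-."


def single_edit_variants(domain: str) -> set[str]:
    """All strings one insertion, deletion, or substitution away from domain."""
    variants = set()
    for i in range(len(domain) + 1):
        for ch in ALPHABET:
            variants.add(domain[:i] + ch + domain[i:])          # insertion
        if i < len(domain):
            variants.add(domain[:i] + domain[i + 1:])           # deletion
            for ch in ALPHABET:
                variants.add(domain[:i] + ch + domain[i + 1:])  # substitution
    variants.discard(domain)  # a substitution can reproduce the original
    return variants


def find_typo_hits(queried_names, popular_domains) -> Counter:
    """Count queries for names that are single-edit typos of a popular domain."""
    popular = {d.lower() for d in popular_domains}
    typo_to_target = {}
    for target in sorted(popular):  # sorted so ties resolve deterministically
        for variant in single_edit_variants(target):
            if variant not in popular:  # a popular domain is not a typo
                typo_to_target.setdefault(variant, target)

    hits = Counter()
    for name in queried_names:
        # Hypothetical pre-processing: trim, lowercase, drop the trailing root dot.
        name = name.strip().lower().rstrip(".")
        if name in typo_to_target:
            hits[(name, typo_to_target[name])] += 1
    return hits
```

Generating the variants of each popular domain once and looking queries up in a dictionary avoids comparing every query against every popular domain, which matters when the trace covers months of DNS traffic.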
Measurement Results • Typo-squatting Hits • Total of 23,989 • Ranges from 1,675 to 3,621 • Typo-squatting Domains • Total of 1,786 domains • Ranges from 347 to 530 domains
Measurement Results • Maximum Hits to Typo-squatting Domains • Could reach up to 649 hits for one domain in one month • Average Hijack Ratio • Low • 0.33% to 1%
Measurement Results • Maximum Hijack Ratio • From 82% to 100% • Most squatted Domains • Most hijacked is www.facebook.com • 2nd Most hijacked is www.youtube.com
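The slides report average and maximum hijack ratios without giving the formula. One plausible reading, used in the sketch below purely as an assumption, is the share of a target's combined traffic that lands on its typo-squatting domains; averaging is likewise assumed to be a plain mean over targets. The function names and the sample counts in the usage note are made up for illustration.

```python
def hijack_ratio(target_hits: int, squatting_hits: int) -> float:
    """Assumed definition: squatting hits / (target hits + squatting hits)."""
    total = target_hits + squatting_hits
    if total == 0:
        return 0.0  # no traffic at all, so nothing was hijacked
    return squatting_hits / total


def summarize_hijack_ratios(counts: dict[str, tuple[int, int]]) -> tuple[float, float]:
    """Return (average, maximum) ratio over targets.

    counts maps a target domain to (target_hits, squatting_hits).
    Assumption: the average is an unweighted mean of per-target ratios.
    """
    ratios = [hijack_ratio(t, s) for t, s in counts.values()]
    if not ratios:
        return 0.0, 0.0
    return sum(ratios) / len(ratios), max(ratios)
```

With invented counts such as `{"a.example": (990, 10), "b.example": (0, 5)}`, the first target has a ratio of 1% and the second 100%, which shows how a low average and a very high maximum can coexist in the same month.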
Measurement Results • Typo Characterization • 14% of Cat 1 is missing dot • 66% of Cat 2 is from neighbor keys • 26% of Cat 2 is the same as one before or after • 42% is from neighbor keys
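A characterization like the one above could be computed along the following lines. This is a sketch under several assumptions: the slides do not define their categories, so the labels below (deletion, insertion, substitution) are this example's own; the keyboard is assumed to be QWERTY; and "neighbor" is approximated as keys at most one row and one column apart, which ignores the physical stagger between rows. All names are hypothetical.

```python
QWERTY_ROWS = ("qwertyuiop", "asdfghjkl", "zxcvbnm")
# Map each letter to its (row, column) position on the assumed layout.
KEY_POS = {ch: (r, c) for r, row in enumerate(QWERTY_ROWS) for c, ch in enumerate(row)}


def are_neighbor_keys(a: str, b: str) -> bool:
    """Approximate adjacency: at most one row and one column apart."""
    if a == b or a not in KEY_POS or b not in KEY_POS:
        return False
    (ra, ca), (rb, cb) = KEY_POS[a], KEY_POS[b]
    return abs(ra - rb) <= 1 and abs(ca - cb) <= 1


def describe_typo(typo: str, target: str) -> str:
    """Label a single-edit typo of target with an illustrative category."""
    # Find the first position where the two names differ.
    i = 0
    while i < min(len(typo), len(target)) and typo[i] == target[i]:
        i += 1

    # Deletion: the typo is missing target[i].
    if len(typo) + 1 == len(target) and typo[i:] == target[i + 1:]:
        return "deletion: missing dot" if target[i] == "." else "deletion: other"

    # Insertion: the typo has an extra character typo[i].
    if len(typo) == len(target) + 1 and typo[i + 1:] == target[i:]:
        inserted = typo[i]
        before = target[i - 1] if i > 0 else ""
        after = target[i] if i < len(target) else ""
        if inserted in (before, after):
            return "insertion: repeated character"
        if are_neighbor_keys(inserted, before) or are_neighbor_keys(inserted, after):
            return "insertion: neighbor key"
        return "insertion: other"

    # Substitution: same length, exactly one differing position.
    if len(typo) == len(target) and typo != target and typo[i + 1:] == target[i + 1:]:
        if are_neighbor_keys(typo[i], target[i]):
            return "substitution: neighbor key"
        return "substitution: other"

    return "not a single-edit typo"
```

Counting these labels over all observed typo domains would yield percentages of the kind shown on the slide, although the exact figures depend on how adjacency and the categories are defined.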
Comparison With Other Typo-correctors • Google & Yahoo typo-correction web services • 15% (12%) missed by Google (Yahoo) • 99.6% (98%) of what is missed are real parked domains • 23% (31%) forward to the same target domain
System Implementation • Successfully integrated our methodology with the Mozilla Firefox browser • Second set, 94% <= 167 ms • Non-typo domains, 10 ms on average and 25 ms at most
Classifier • Data set of 2,800 samples • 700 are parked domains and 2,100 are general-purpose domains from Yahoo Directory • Identify distinguishing features • Compute distribution for verification • Use the WEKA library to try different classification algorithms; Random Forest was the best
Conclusion • Defined and implemented an accurate identification methodology • Performed measurements that show typo-squatters are moderately successful • Integrated the methodology with a Firefox browser to detect typo-squatting domains on the fly