Web Spambot Detection Based on Web Navigation Behaviour
Pedram Hayati, Vidyasagar Potdar, Kevin Chai, Alex Talevski
Anti-Spam Research Lab (ASRL), Digital Ecosystem and Business Intelligence Institute
Curtin University, Perth, Western Australia
Introduction
• Junk, unrelated, unwelcome, anonymous content ==> spam.
• Spam now spreads not only through email but also through Web 2.0.
• This new trend of spamming is called Spam 2.0.
Examples of Spam 2.0
• Hosting spam content in Web applications on legitimate websites¹.
¹ P. Hayati, V. Potdar, A. Talevski, N. Firoozeh, S. Sarenche, E. A. Yeganeh. Spam 2.0 Definition, New Spamming Boom. DEST 2010, Dubai, UAE, April 2010.
Web Spambot
• A tool used by spammers to distribute Spam 2.0.
• Uses the idea of Web robots.
• Mimics human user behaviour.
• Wastes useful resources.
To counter Spam 2.0, we can concentrate on detecting Web spambots, the source of the Spam 2.0 problem.
Countermeasures
• Mostly focused on email spam detection.
  • Content-based, meta-content-based.
  • Some are applicable to the Web environment, such as link-based detection.
• CAPTCHA
  • Possible to bypass using ML.
  • Machines can be better than humans at deciphering them.
  • Inconveniences human users.
Problem
• Existing countermeasures are not suitable for Web 2.0 platforms.
• Spam is hosted on legitimate websites.
  • Parasitic nature.
  • We cannot blacklist a whole website because of spam posts.
Our Solution
• Study Web spambot behaviour in order to stop Spam 2.0.
• Fundamental assumption:
  • Spambot behaviour is intrinsically different from that of humans.
• Use Web usage data.
  • Contains information about user navigation through a website.
  • Can be gathered implicitly.
• Convert Web usage data into a format that is
  • Extendible
  • Discriminative
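As a rough illustration of what "converting Web usage data" could involve, the sketch below groups raw request records into per-user navigation sessions. The record layout (user id, timestamp, URL), the 30-minute inactivity timeout, and the name `build_sessions` are assumptions made for this example; the slides do not specify how the data was prepared.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a new session starts after 30 minutes of inactivity.
SESSION_TIMEOUT = timedelta(minutes=30)

def build_sessions(records):
    """Group (user_id, ISO timestamp, url) records into navigation sessions."""
    by_user = defaultdict(list)
    for user_id, timestamp, url in records:
        by_user[user_id].append((datetime.fromisoformat(timestamp), url))

    sessions = []
    for user_id, hits in by_user.items():
        hits.sort(key=lambda hit: hit[0])  # order each user's requests by time
        current = [hits[0][1]]
        for (prev_time, _), (time, url) in zip(hits, hits[1:]):
            if time - prev_time > SESSION_TIMEOUT:
                # Long gap: close the current session and start a new one.
                sessions.append((user_id, current))
                current = []
            current.append(url)
        sessions.append((user_id, current))
    return sessions

# Hypothetical usage records, for illustration only.
records = [
    ("u1", "2010-04-01T10:00:00", "/index"),
    ("u1", "2010-04-01T10:00:05", "/board/3"),
    ("u1", "2010-04-01T11:30:00", "/index"),
    ("u2", "2010-04-01T10:01:00", "/register"),
]
sessions = build_sessions(records)
```

Each resulting session is an ordered list of requested pages, which is the kind of input a navigation-based feature set can be computed from.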
Our Solution
• Propose a new feature set called Action.
  • An action is a set of user-requested webpages that achieves a certain goal.
• Example
  • In an online forum, a user navigates to a specific board and then goes to the New Thread page to start a new topic.
  • This navigation can be formulated as a "submitting new content" action.
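The forum example above can be sketched in code. In this minimal illustration, each action is modelled as a contiguous sequence of page types, and a session is turned into a vector of action counts. The action names, the page-type labels, and the contiguous-match rule are all assumptions for the example, not the authors' definitions.

```python
# Hypothetical action definitions: action name -> sequence of page types.
ACTIONS = {
    "submit_new_content": ("board", "new_thread"),
    "register": ("register_form", "register_submit"),
}

def count_actions(pages, actions=ACTIONS):
    """Count how often each action's page sequence occurs contiguously."""
    counts = {}
    for name, pattern in actions.items():
        n = len(pattern)
        counts[name] = sum(
            1
            for i in range(len(pages) - n + 1)
            if tuple(pages[i:i + n]) == pattern
        )
    return counts

# A made-up session: the user starts two new threads.
session = ["index", "board", "new_thread", "board", "new_thread"]
features = count_actions(session)
```

Here `features` holds one count per action, so every session maps to a fixed-length vector that a classifier could consume, and new actions can be added by extending the dictionary.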
Dataset
• 60-day study of Web spambot behaviour on a live discussion board (HoneySpam 2.0 Project).
• 1-month study of human user behaviour.
Performance Measurement
• Matthews Correlation Coefficient (MCC)
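For reference, MCC is computed from the four confusion-matrix counts as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)), and ranges from −1 (total disagreement) to +1 (perfect prediction). The short sketch below implements this standard formula; returning 0 when the denominator is zero is a common convention adopted here as an assumption, and the counts in the example calls are made up.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined cases are reported as 0.
        return 0.0
    return (tp * tn - fp * fn) / denominator

# Illustrative counts only (not results from the study).
perfect = mcc(tp=50, tn=50, fp=0, fn=0)
chance = mcc(tp=25, tn=25, fp=25, fn=25)
inverted = mcc(tp=0, tn=0, fp=50, fn=50)
```

Unlike plain accuracy, MCC takes all four counts into account, which makes it informative when the spambot and human classes differ in size.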
Conclusion
• We proposed an innovative idea: focusing on spambot identification to manage spam rather than analysing spam content.
• We proposed a novel framework to detect spambots inside Web 2.0 applications, which leads to Spam 2.0 detection.
• We proposed a new feature set, action navigations, to detect spambots.
• We validated our framework against an online forum and achieved 96.24% accuracy using the MCC method.
Thank You!
Web Spambot Detection Based on Web Navigation Behaviour
• Pedram Hayati – p.hayati@curtin.edu.au
• Vidyasagar Potdar – v.potdar@curtin.edu.au
• Kevin Chai – k.chai@curtin.edu.au
• Alex Talevski – a.talevski@curtin.edu.au
• Anti-Spam Research Lab (ASRL)
• Digital Ecosystem and Business Intelligence Institute
• Curtin University, Perth, Western Australia
• www.antispamresearchlab.com