
Web Spambot Detection Based on Web Navigation Behaviour

Pedram Hayati, Vidyasagar Potdar, Kevin Chai, Alex Talevski. Anti-Spam Research Lab (ASRL), Digital Ecosystem and Business Intelligence Institute, Curtin University, Perth, Western Australia.



  1. Web Spambot Detection Based on Web Navigation Behaviour • Pedram Hayati, Vidyasagar Potdar, Kevin Chai, Alex Talevski • Anti-Spam Research Lab (ASRL) • Digital Ecosystem and Business Intelligence Institute • Curtin University, Perth, Western Australia

  2. Introduction • Junk, unrelated, unwelcome, anonymous content is spam. • Spam now spreads not only through email but also through Web 2.0 applications. • This new trend of spamming is called Spam 2.0.

  3. Examples of Spam 2.0 • Spam content hosted in Web applications on legitimate websites¹. ¹ P. Hayati, V. Potdar, A. Talevski, N. Firoozeh, S. Sarenche, E. A. Yeganeh. Spam 2.0 Definition, New Spamming Boom. DEST 2010, Dubai, UAE, April 2010.

  4. Web Spambot • A tool used by spammers to distribute Spam 2.0. • Builds on the idea of Web robots. • Mimics human user behaviour. • Wastes useful resources. • To counter Spam 2.0, we can concentrate on detecting Web spambots, the source of the Spam 2.0 problem.

  5. Spam 2.0

  6. Countermeasures • Mostly focused on email spam detection. • Content-based and meta-content-based. • Some are applicable to the Web environment, such as link-based detection. • CAPTCHA • Can be bypassed using machine learning. • Machines can be better than humans at deciphering them. • Inconveniences human users.

  7. Problem • Existing countermeasures are not suitable for Web 2.0 platforms. • Spam is hosted on legitimate websites. • Parasitic nature. • A whole website cannot be blacklisted because of its spam posts.

  8. Our Solution • Study Web spambot behaviour in order to stop Spam 2.0. • Fundamental assumption: • Spambot behaviour is intrinsically different from that of humans. • Use Web usage data. • Contains information about user navigation through a website. • Can be gathered implicitly. • Convert Web usage data into a format that is • Extendible • Discriminative
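
As a rough illustration of what "Web usage data" could look like once gathered, the sketch below groups raw request records into per-user navigation sessions. The record layout, the page paths, and the 30-minute inactivity timeout are all assumptions made for this example; the slides do not say how the authors stored or sessionized their data.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical usage records: (user id, timestamp, requested page).
RECORDS = [
    ("u1", "2010-01-05 10:00:00", "/index"),
    ("u1", "2010-01-05 10:00:40", "/board/3"),
    ("u2", "2010-01-05 10:01:00", "/register"),
    ("u1", "2010-01-05 11:30:00", "/index"),
]

SESSION_GAP = timedelta(minutes=30)  # assumed inactivity timeout


def sessionize(records, gap=SESSION_GAP):
    """Group each user's requests into sessions split by inactivity gaps."""
    per_user = defaultdict(list)
    for user, stamp, page in records:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        per_user[user].append((when, page))

    sessions = []
    for user, hits in per_user.items():
        hits.sort()  # order each user's requests by time
        current = [hits[0][1]]
        for (prev_time, _), (time, page) in zip(hits, hits[1:]):
            if time - prev_time > gap:
                # Long pause: close the current session and start a new one.
                sessions.append((user, current))
                current = []
            current.append(page)
        sessions.append((user, current))
    return sessions


SESSIONS = sessionize(RECORDS)
```

Each session is an ordered list of requested pages, which is the kind of navigation trace that later steps could turn into features.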

  9. Our Solution • Propose a new feature set called Action. • An action is a set of webpages a user requests to achieve a certain goal. • Example • In an online forum, a user navigates to a specific board and then goes to the New Thread page to start a new topic. • This navigation can be formulated as a "submitting new content" action.
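
The forum example above can be sketched as a simple pattern match over a session's page sequence. This is only an illustrative sketch: the page-type labels, the action names, and the contiguous-match rule are assumptions, and it is not the extraction algorithm shown on the later slides, whose content is not included in this transcript.

```python
# Hypothetical action definitions: action name -> ordered page types.
ACTIONS = {
    "submit_new_content": ("board", "new_thread"),
    "register": ("register_form", "register_submit"),
}


def extract_actions(pages, actions=ACTIONS):
    """Emit each action whose page pattern appears contiguously in the session."""
    found = []
    i = 0
    while i < len(pages):
        for name, pattern in actions.items():
            if tuple(pages[i:i + len(pattern)]) == pattern:
                found.append(name)
                i += len(pattern)  # skip past the matched pages
                break
        else:
            i += 1  # no action starts here; move to the next page
    return found


session = ["index", "board", "new_thread", "index",
           "register_form", "register_submit"]
actions_found = extract_actions(session)
```

Here the board and New Thread requests collapse into one "submit_new_content" action, so a session is described by what the visitor did rather than by individual page hits.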

  10. Framework

  11. Action Extraction

  12. Algorithm

  13. Dataset • A 60-day study of Web spambot behaviour on a live discussion board (HoneySpam 2.0 Project). • A 1-month study of human user behaviour.

  14. Action Frequency of Humans and Spambots
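
One plausible way to compare action frequencies is to turn each session's actions into a normalised frequency vector. The sketch below assumes a fixed, hypothetical list of action names, and the two example sessions are made up for illustration; they are not data from the study.

```python
from collections import Counter

# Hypothetical action vocabulary; the real action set is not listed in the slides.
ACTION_NAMES = ["register", "submit_new_content", "view_topic"]


def action_frequency(actions, names=ACTION_NAMES):
    """Return the relative frequency of each known action in a session."""
    counts = Counter(actions)
    total = sum(counts[name] for name in names)
    if total == 0:
        return [0.0] * len(names)
    return [counts[name] / total for name in names]


# Made-up sessions, for illustration only.
spambot_like = action_frequency(
    ["register", "submit_new_content", "submit_new_content", "submit_new_content"]
)
human_like = action_frequency(
    ["view_topic", "view_topic", "view_topic", "submit_new_content"]
)
```

Vectors of this shape could then be fed to a classifier; which classifier the authors used is not stated in this transcript.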

  15. Performance Measurement • Matthews Correlation Coefficient (MCC)
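
For reference, MCC is computed from the confusion-matrix counts as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)) and ranges from −1 (total disagreement) to +1 (perfect prediction). The sketch below implements that standard formula; returning 0 when the denominator is zero is a common convention assumed here, and the counts passed in are hypothetical, not results from the study.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


perfect = mcc(tp=50, tn=50, fp=0, fn=0)
chance = mcc(tp=25, tn=25, fp=25, fn=25)
inverted = mcc(tp=0, tn=0, fp=50, fn=50)
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, which makes it informative when the spambot and human classes are of different sizes.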

  16. Results

  17. Conclusion • We proposed an innovative idea: managing spam by identifying spambots rather than analysing spam content. • We proposed a novel framework to detect spambots inside Web 2.0 applications, which in turn leads to Spam 2.0 detection. • We proposed a new feature set, action navigations, to detect spambots. • We validated our framework against an online forum and achieved 96.24% accuracy using the MCC method.

  18. Thank You! • Web Spambot Detection Based on Web Navigation Behaviour • Pedram Hayati – p.hayati@curtin.edu.au • Vidyasagar Potdar – v.potdar@curtin.edu.au • Kevin Chai – k.chai@curtin.edu.au • Alex Talevski – a.talevski@curtin.edu.au • Anti-Spam Research Lab (ASRL) • Digital Ecosystem and Business Intelligence Institute • Curtin University, Perth, Western Australia • www.antispamresearchlab.com
