
LOBO–Evaluation of Generalization Deficiencies in Twitter Bot Classifiers

Juan Echeverria, Emiliano De Cristofaro, Nicolas Kourtellis, Ilias Leontiadis, Gianluca Stringhini, and Shi Zhou (University College London, Telefonica Research, Boston University)


Presentation Transcript


  1. LOBO–Evaluation of Generalization Deficiencies in Twitter Bot Classifiers. Juan Echeverria, Emiliano De Cristofaro, Nicolas Kourtellis, Ilias Leontiadis, Gianluca Stringhini, and Shi Zhou (University College London, Telefonica Research, Boston University)

  2. Introduction
     • What are bots? A web robot, or bot, is a software application that runs automated tasks.
     • What are Twitter bots and what is their purpose? They are automated accounts used to disrupt the regular flow of discussion, attack regular users and their posts, spam them with irrelevant or offensive content, and manipulate the popularity of messages and accounts.
     • What are bot classifiers and how do they work? They are methods for detecting bots online, trained to recognize account behavior that appears suspicious or malicious.

  3. Introduction Cont.
     • Need for generalized bot detection.
     • Evaluating bot classifiers on unseen bots.
     • Failure of existing bot classifiers to detect new bot behavior.
     • Results of over 97% accuracy when trained on varied data.

  4. Problems with Existing Methodologies
     • Classifiers are trained on only one type of botnet.
     • A bot operator who learns the defense mechanism can evade detection with a small modification.
     • Newer botnets fall outside the trained category: the classifier never saw their features during training, so it does not flag them as botnets.

  5. Data Collection
     • Common characteristics of bots?
     • Tweets related to a common subject
     • No followers
     • Tweets created immediately after account creation
     • Fake follower services
     • Paid apps. Retweets of a particular person/group
     • Honeypot accounts

  6. Data Collection

  7. Data Collection
     • Claimed to be one of the largest and most varied datasets.

  8. Methodology
     • Collect a dataset with more than 20 different botnet classes, most of them used in previous work.
     • Produce a generalized bot detection algorithm.
     • Training and testing are done using LOBO (Leave One Botnet Out).
     • In this case, the classifier is trained on 19 botnet classes and its accuracy is tested on the 20th.
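The Leave One Botnet Out procedure on this slide can be sketched in plain Python. This is only an illustration, not the authors' code: the names `lobo_splits`, `lobo_recall`, and `fit` are hypothetical, and the sketch assumes the test set holds only the held-out botnet and that the score is the fraction of those unseen bots flagged as bots.

```python
from collections import defaultdict


def lobo_splits(bots, users):
    """Yield (held_out_class, train, test) for Leave One Botnet Out.

    bots:  list of (features, botnet_class) pairs
    users: list of feature records for genuine accounts
    """
    by_class = defaultdict(list)
    for features, botnet_class in bots:
        by_class[botnet_class].append(features)

    for held_out in sorted(by_class):
        # Train on genuine users (label 0) plus every other botnet (label 1).
        train = [(u, 0) for u in users]
        for name, members in by_class.items():
            if name != held_out:
                train.extend((f, 1) for f in members)
        # Assumption: test only on the botnet the classifier has never seen.
        test = [(f, 1) for f in by_class[held_out]]
        yield held_out, train, test


def lobo_recall(bots, users, fit):
    """Fraction of each unseen botnet that a freshly trained model flags as bot.

    fit: hypothetical callable that takes the training pairs and returns
         a prediction function mapping features to 0 (user) or 1 (bot).
    """
    scores = {}
    for held_out, train, test in lobo_splits(bots, users):
        predict = fit(train)
        hits = sum(predict(f) for f, _ in test)
        scores[held_out] = hits / len(test)
    return scores
```

With 20 botnet classes, the generator produces 20 splits, each training on 19 classes and testing on the remaining one. A low score for a class means the classifier does not generalize to that botnet.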

  9. Features Classification
     • User Features
     • Tweet Features

  10. Sampling Data
     • C30K
       • Take 30k bots from each of the first 3 bot classes.
       • Take all bots from the remaining classes.
       • Take 105,000 users to balance.
     • C500
       • Take 500 bot samples from each class that has more than 500.
       • Take 500 users.
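The two sampling schemes on this slide can be expressed as one capped-sampling helper. This is a minimal sketch under stated assumptions: `sample_dataset` and its parameters are hypothetical names, sampling is uniform at random without replacement, and for C500 any class with 500 or fewer bots is kept whole, which the slide does not specify.

```python
import random


def sample_dataset(bots_by_class, users, cap, n_users, capped=None, seed=0):
    """Downsample bot classes to `cap` accounts and draw `n_users` genuine users.

    bots_by_class: dict mapping botnet class name to a list of accounts
    capped: class names to downsample; None means every class larger than `cap`
    """
    rng = random.Random(seed)
    sampled = {}
    for name, members in bots_by_class.items():
        should_cap = (name in capped) if capped is not None else True
        if should_cap and len(members) > cap:
            sampled[name] = rng.sample(members, cap)
        else:
            # Assumption: classes not capped (or already small) are kept whole.
            sampled[name] = list(members)
    sampled_users = rng.sample(users, min(n_users, len(users)))
    return sampled, sampled_users
```

Under these assumptions, C30K would correspond to `sample_dataset(bots_by_class, users, cap=30_000, n_users=105_000, capped=first_three)`, where `first_three` is a set holding the names of the first 3 bot classes, and C500 to `sample_dataset(bots_by_class, users, cap=500, n_users=500)`.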

  11. Results

  12. Thank You
