1 / 12

Discovering the Fake Followers in the Micro-blogging via Machine Learning

Discovering the Fake Followers in the Micro-blogging via Machine Learning. Yi Shen Jianjun Yu October 16, 2013. Computer Network Information Center, Chinese Academy of Sciences. Micro-blogging. 2. The Celebrities in Twitter. 3. Purchasing Fake Followers. Fake followers Markets.

jane-mcgee
Download Presentation

Discovering the Fake Followers in the Micro-blogging via Machine Learning

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Discovering the Fake Followers in the Micro-blogging via Machine Learning Yi Shen Jianjun Yu October 16, 2013 Computer Network Information Center, Chinese Academy of Sciences

  2. Micro-blogging 2

  3. The Celebrities in Twitter 3

  4. Purchasing Fake Followers

  5. Fake followers Markets

  6. A Data Report 39% of @facebook followers are fake 34% of @ladygaga followers are fake  31% of @justinbieber followers are fake  32% of @katyperry followers are fake  32% of @espn followers are fake 33% of @britneyspears followers are fake  27% of @youtube followers are fake  http://www.socialsellingu.com/fake-twitter-profiles-infographic/

  7. Troubles Caused by Fake Followers • Noise for Social Network analysis • Privacy and Security Problem • Spam Problem

  8. Method of Detection • Binary Classification Problem • Extract discriminative features • Voting-SVM as the classifier

  9. How to get ground-truth data? • Purchase from different merchants • Keep tracking them for a long period

  10. The Features for Classification • The Ratio of Followee Count and Follower Count (RFF) • The Percentage of Bidirectional Friends (PBF) • Average Repost Frequency of the Posts (ARF) • Ratio of the Original Posts (ROP) • Proportion of Nighttime Posts (PNP) • Topic Diversity • The standard deviation of post-count(σpost). • The general slope of post-count(gpost). • The standard deviation of followee-count(σ followee) • The decrease frequency of followee-count(DFfollowee). • The standard deviation of follower-count (σfollower).

  11. Result

  12. Thank you!

More Related