Social Turing Tests: Crowdsourcing Sybil Detection

Social Turing Tests: Crowdsourcing Sybil Detection. Gang Wang (gangw@cs.ucsb.edu), Manish Mohanlal, Christo Wilson, Xiao Wang, Miriam Metzger, Haitao Zheng, and Ben Y. Zhao. Computer Science Department, UC Santa Barbara.

Sybils in Online Social Networks (OSNs) • Sybil (sɪbəl): fake identities controlled by attackers • Friendship is a precursor to other malicious activities • Does not include benign fakes (secondary accounts) • Research has identified malicious Sybils on OSNs • Twitter [CCS 2010] • Facebook [IMC 2010] • Renren [IMC 2011], Tuenti [NSDI 2012]

Real-world Impact of Sybil (Twitter) • Russian political protests on Twitter (2011) • 25,000 Sybils sent 440,000 tweets • Drown out the genuine tweets from protesters [Figure: follower count (700K to 900K) from Jul-4 to Aug-1, annotated "4,000 new followers/day" and "July 21st: 100,000 new followers in 1 day"]

Security Threats of Sybil (Facebook) • Large Sybil population on Facebook • August 2012: 83 million (8.7%) • Sybils are used to: • Share or send spam • Steal users' personal information • Generate fake likes and click fraud • Spread malicious URLs [Figure annotation: "50 likes per dollar"]

Community-based Sybil Detectors • Prior work on Sybil detectors • SybilGuard [SIGCOMM '06], SybilLimit [Oakland '08], SybilInfer [NDSS '09] • Key assumption: Sybils form tight-knit communities • Sybils have difficulty "friending" normal users?

Do Sybils Form Sybil Communities? • Measurement study on Sybils in the wild [IMC '11] • Study Sybils in Renren (Chinese Facebook) • Ground-truth data on 560K Sybils collected over 3 years • Sybil components: sub-graphs of connected Sybils • Sybil components are internally sparse • Not amenable to community detection • New Sybil detection system is needed

Detect Sybils without Graphs • Anecdotal evidence that people can spot Sybil profiles • 75% of friend requests from Sybils are rejected • Human intuition detects even slight inconsistencies in Sybil profiles • Idea: build a crowdsourced Sybil detector • Focus on user profiles • Leverage human intelligence and intuition • Open Questions • How accurate are users? • What factors affect detection accuracy? • How can we make crowdsourced Sybil detection cost effective?

Outline • Introduction • User Study • Feasibility Experiment • Accuracy Analysis • Factors Impacting User Accuracy • Scalable Sybil Detection System • Conclusion • Details in Paper

User Study Setup* • User study with 2 groups of testers on 3 datasets • 2 groups of users • Experts – Our friends (CS professors and graduate students) • Turkers – Crowdworkers from online crowdsourcing systems • 3 ground-truth datasets of full user profiles • Renren – given to us by Renren Inc. • Facebook US and India – crawled • Sybil profiles – profiles banned by Facebook • Legitimate profiles – 2 hops from our own profiles • Data collection details in paper *IRB Approved

Classifying Profiles [Figure: screenshot of the survey interface, with labels "Real or fake?", "Why?", "Browsing Profiles", "Navigation Buttons", and "Screenshot of Profile (Links Cannot be Clicked)"]

Experiment Overview [Slide content not recovered; annotation: "More Profiles per Expert"]

Individual Tester Accuracy • Experts prove that humans can be accurate • Turkers need extra help… [Figure annotations: "Excellent! 80% of experts have >80% accuracy!" and "Much Lower Accuracy"]

Wisdom of the Crowd • Is wisdom of the crowd enough? • Majority voting • Treat each classification by each tester as a vote • Majority vote determines final decision of the crowd • Results after majority voting (20 votes) • Both Experts and Turkers have almost zero false positives • Turkers' false negatives are still high • US (19%), India (50%), China (60%) • False positive rates are excellent • What can be done to improve turker accuracy?
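The majority-voting rule on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the "sybil"/"legit" labels, and the tie-breaking rule (ties count as legitimate, which favors low false positives) are all assumptions.

```python
from collections import Counter

def majority_vote(votes):
    """Return the crowd's decision for one profile.

    votes: list of "sybil" / "legit" labels, one per tester.
    Assumption: a tie is resolved as "legit".
    """
    counts = Counter(votes)
    return "sybil" if counts["sybil"] > counts["legit"] else "legit"

def crowd_decisions(classifications):
    """Apply majority voting to every profile.

    classifications: dict mapping a (hypothetical) profile id to its votes.
    """
    return {pid: majority_vote(votes) for pid, votes in classifications.items()}

if __name__ == "__main__":
    example = {
        "profile_a": ["sybil"] * 12 + ["legit"] * 8,   # 20 votes, majority sybil
        "profile_b": ["sybil"] * 3 + ["legit"] * 17,   # 20 votes, majority legit
    }
    print(crowd_decisions(example))
```

A false negative under this rule is a Sybil profile for which fewer than half of the testers voted "sybil", which is why individually inaccurate testers can still drag the crowd's decision down.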

Eliminating Inaccurate Turkers • Removing inaccurate turkers can effectively reduce false negatives! [Figure annotation: "Dramatic Improvement"]

Outline • Introduction • User Study • Scalable Sybil Detection System • System Design • Trace-driven Simulation • Conclusion

A Practical Sybil Detection System • Scalability • Must scale to millions of users • High accuracy with low costs • Preserve user privacy when giving data to turkers • Details in Paper • Key insight in designing our system • Accuracy in the turker population is highly skewed • Only 10% of turkers are > 90% accurate [Figure: CDF (%) of turkers vs. accuracy (%)]

System Architecture • Continuous Quality Control • Locate Malicious Workers [Figure: architecture diagram with labels: Social Network (Heuristics, User Reports, Flag Suspicious Users, Suspicious Profiles); Crowdsourcing Layer (Turker Selection, All Turkers, Accurate Turkers, Very Accurate Turkers, "Rejected!"); Sybils; OSN Employees. Annotation: "Maximize Utility of High Accuracy Turkers"]

Trace Driven Simulations • Simulation on 2000 profiles • Error rates drawn from survey data • Calibrate 4 parameters to: • Minimize false positives & false negatives • Minimize votes per profile (minimize cost) • Results (Details in Paper) • Average 6 votes per profile • <1% false positives • <1% false negatives • Results++ • Average 8 votes per profile • <0.1% false positives • <0.1% false negatives [Figure labels: "Very Accurate Turkers", "Accurate Turkers"]
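A simulation of this kind can be sketched as follows. Everything specific here is an assumption made for illustration: the two-layer escalation rule, the parameter names (lower_votes, upper_votes, low, high), and above all the error rates, which are placeholders and not the survey data used in the talk. The slide mentions four calibrated parameters but does not name them, so the ones below should not be read as the authors' parameters.

```python
import random

def cast_votes(is_sybil, n_votes, fp_rate, fn_rate, rng):
    """Count "sybil" votes from n_votes testers with the given error rates."""
    p_sybil_vote = (1 - fn_rate) if is_sybil else fp_rate
    return sum(rng.random() < p_sybil_vote for _ in range(n_votes))

def classify(is_sybil, rng, lower_votes=5, upper_votes=5, low=0.2, high=0.8):
    """Two-layer decision (assumed design): clear cases are decided by the
    lower layer; controversial ones are escalated to more accurate testers.
    Returns (decided_sybil, votes_used)."""
    # Placeholder error rates for the lower layer ("accurate turkers").
    s = cast_votes(is_sybil, lower_votes, fp_rate=0.05, fn_rate=0.30, rng=rng)
    frac = s / lower_votes
    if frac <= low:
        return False, lower_votes
    if frac >= high:
        return True, lower_votes
    # Placeholder error rates for the upper layer ("very accurate turkers").
    s = cast_votes(is_sybil, upper_votes, fp_rate=0.01, fn_rate=0.05, rng=rng)
    return s > upper_votes / 2, lower_votes + upper_votes

def simulate(n_profiles=2000, sybil_fraction=0.5, seed=0):
    """Run the simulation and report error rates and average votes per profile."""
    rng = random.Random(seed)
    fp = fn = votes = n_sybil = n_legit = 0
    for _ in range(n_profiles):
        is_sybil = rng.random() < sybil_fraction
        decided_sybil, used = classify(is_sybil, rng)
        votes += used
        if is_sybil:
            n_sybil += 1
            fn += not decided_sybil      # Sybil judged legitimate
        else:
            n_legit += 1
            fp += decided_sybil          # legitimate profile judged Sybil
    return {
        "fp_rate": fp / max(n_legit, 1),
        "fn_rate": fn / max(n_sybil, 1),
        "avg_votes": votes / n_profiles,
    }

if __name__ == "__main__":
    print(simulate())
```

Calibration in this setting means sweeping the parameters and keeping the setting that gives the lowest error rates for the fewest votes per profile, which is the trade-off the two result sets on the slide report.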

Estimating Cost • Estimated cost in a real-world social network: Tuenti • 12,000 profiles to verify daily • 14 full-time employees • Annual salary 30,000 EUR* (~$20 per hour): $2240 per day • Crowdsourced Sybil Detection • 20 sec/profile, 8-hour day: 50 turkers • Facebook wage ($1 per hour): $400 per day • Cost with malicious turkers • 25% of turkers are malicious • $504 per day • Augment existing automated systems *http://www.glassdoor.com/Salary/Tuenti-Salaries-E245751.htm
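The daily cost figures on this slide follow from simple arithmetic, reproduced below using only the numbers on the slide. The variable names are illustrative. The $504 figure is not recomputed, because the slide does not spell out how malicious turkers are accounted for.

```python
PROFILES_PER_DAY = 12_000
HOURS_PER_DAY = 8

# Current approach: 14 full-time employees at roughly $20 per hour.
employee_cost = 14 * 20 * HOURS_PER_DAY
print(employee_cost)        # 2240 dollars per day

# Crowdsourced approach: 50 turkers at $1 per hour, 20 seconds per profile.
turkers = 50
seconds_per_profile = 20
turker_hours = turkers * HOURS_PER_DAY
crowd_cost = turker_hours * 1
print(crowd_cost)           # 400 dollars per day

# Votes the crowd can supply in a day, and how many that is per profile.
capacity = turker_hours * 3600 // seconds_per_profile
votes_per_profile = capacity / PROFILES_PER_DAY
print(capacity, votes_per_profile)   # 72000 6.0
```

Fifty turkers working eight hours supply 72,000 classifications a day, or 6 votes for each of the 12,000 profiles, which appears consistent with the average of 6 votes per profile reported for the simulation.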

Conclusion • Designed a crowdsourced Sybil detection system • False positives and negatives <1% • Resistant to infiltration by malicious workers • Low cost • Currently exploring prototypes in real-world OSNs

Questions? Thank you!
