Fake News on Facebook Deep Learning Research & Application Center 04 December 2017 Claire Li
Content • Popular Facebook sites connected to well-known fact-checking websites with ground-truth labels • Snopes.com • https://www.facebook.com/snopes/, 588,623 followers • https://twitter.com/snopes • TruthorFiction.com • https://www.facebook.com/TRUTHORFICTIONCOM-336495334750/, 9,274 followers • Examples of fake news (rumors) on Facebook • Dynamic tree structure of news propagation • Computational features in online media • Related works
trustWorthinessScore • The task of fake news detection is defined here as predicting the chance that a particular news article (news report, editorial, exposé, etc.) is intentionally deceptive (fake, fabricated, staged news, or a hoax) [1] • The prediction is expressed as one of multiple rating labels • True, false, mostly-true, mostly-false, unverified
https://www.facebook.com/snopes/ • Did President Donald Trump Reverse an Insecticide Ban After Receiving $1 Million from Dow Chemicals? • Posted at 7:19pm on 28/11/2017 • Rating from Snopes.com: mostly true • # of shares: 1,511 within 11 hours, 2,182 at 9am on 29/11 • Stance labels: positive (true/mostly true) and negative (false/mostly false)
Rating: False • On average, this happens in one month each year, but the meme claims it happens only once every 823 years • Interestingly, the variant shared in the photo above was correct in July 2011 but not in 2013 • Shared 1,259,642 times and received 174,728 comments, 908 of which linked to Snopes.com [4]
Structure of shares/comments of events on Twitter/Facebook • A dynamic tree structure of fake news propagation with timelines [9] • Observations on fake news (rumor) propagation • Initiated by a low-impact user • Boosted by influential users/communities • Cue terms, e.g., ‘false’, ‘debunk’, ‘not true’, etc. • Commonly occur in rumors but not in non-rumors
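The cue-term observation above can be turned into a simple feature. The following is a minimal sketch, not a method from the cited work: the term list is only the three examples from the slide, and the function names `count_cue_terms` and `cue_term_ratio` are hypothetical.

```python
import re

# Cue terms taken from the slide; illustrative, not an exhaustive lexicon.
CUE_TERMS = ("false", "debunk", "not true")

# Assumption: match at a word start only, so "debunked" and "debunking" also count.
_CUE_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in CUE_TERMS) + r")",
    re.IGNORECASE,
)


def count_cue_terms(text):
    """Count cue-term occurrences in a single post or comment."""
    return len(_CUE_PATTERN.findall(text))


def cue_term_ratio(posts):
    """Fraction of posts in an event that contain at least one cue term."""
    if not posts:
        return 0.0
    hits = sum(1 for post in posts if count_cue_terms(post) > 0)
    return hits / len(posts)
```

A higher `cue_term_ratio` over the responses to an event would, under this assumption, hint that readers are questioning the claim; it is a signal to combine with other features, not a classifier on its own.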
Computational features in online media • Lexical/linguistic based features • Message contents • Average length, positive/negative words, # of hashtags, contained URLs, question/exclamation marks, n-grams, etc. • User profiles • Personal description/picture, background (what kind of social circles, i.e. source credibility), location, # of followers/friends, activity level, post (share, comment, like) history, account age • Social behavior based features • Events (initial posts/articles) • Histogram of # of likes, histogram of # of response posts (maximum, minimum, average) • Diffusion patterns of the posts • Time (am|pm|intervals), time duration, diffusion frequency • Propagation temporal patterns [1, IJCAI-16] • Post volume [3, 7] • Analyzing comments containing links to rumor debunking websites (Snopes.com) [4] • Receiving a Snopes comment makes a reshare of a false rumor 4.4 times more likely to be deleted • A rumor can accumulate up to hundreds of Snopes comments during propagation • Propagation structures of networks [2, ACL 2017]
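The message-content features listed above are straightforward to compute. The sketch below is illustrative only: the two word lists are tiny hypothetical stand-ins for a real sentiment lexicon, and `message_features` and `event_features` are names introduced here, not taken from the cited papers.

```python
import re
from statistics import mean

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")

# Hypothetical mini-lexicons; a real system would load a full sentiment lexicon.
POSITIVE_WORDS = {"true", "great", "confirmed"}
NEGATIVE_WORDS = {"fake", "hoax", "lie"}


def message_features(text):
    """Lexical features of one post: length, tags, URLs, punctuation, word counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        "length": len(text),
        "n_hashtags": len(HASHTAG_RE.findall(text)),
        "n_urls": len(URL_RE.findall(text)),
        "n_question": text.count("?"),
        "n_exclaim": text.count("!"),
        "n_positive": sum(tok in POSITIVE_WORDS for tok in tokens),
        "n_negative": sum(tok in NEGATIVE_WORDS for tok in tokens),
    }


def event_features(posts):
    """Average each per-post feature over all posts belonging to one event."""
    if not posts:
        raise ValueError("an event needs at least one post")
    rows = [message_features(post) for post in posts]
    return {name: mean(row[name] for row in rows) for name in rows[0]}
```

User-profile and diffusion features would need account and timestamp metadata, which this sketch leaves out.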
Content-based features [8, ACM 2015] • Measured in # of hours since the initial tweets • Differences between the two cases are neither clear-cut nor strong enough for feature engineering [1, IJCAI-16] • Rumors have a much longer lifetime than non-rumors [7, AAAI 2014]
Related works 1. Detecting Rumors from Microblogs with Recurrent Neural Networks, IJCAI-16 (binary classifier) • Ground-truth labels cover more than 5,000 claims in total, which scale to five million relevant microblog posts • Prior works classify the veracity of spreading memes using information other than the text content • The number of retweets or replies of the post • The features relevant to determining a user’s credibility • The RNN-based method disregards this completely • Batch posts into time intervals and treat each batch as a single unit in a time series that is then modeled using an RNN (needed because an event can have tens of thousands of posts) • Use the tf*idf values of the vocabulary terms from the posts in each interval as input • The RNN-based method detects rumors more quickly and accurately than existing techniques, including the leading online rumor debunking services
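The batching and tf*idf steps described above can be sketched as follows. This is a simplified illustration under stated assumptions: intervals have a fixed width (the slide does not say how intervals are chosen), tokenization is plain whitespace splitting, and the RNN itself is not shown. The function names are hypothetical.

```python
import math
from collections import Counter


def batch_by_interval(posts, interval_seconds):
    """Group (timestamp_seconds, text) posts into fixed-width time intervals.

    Returns one token list per interval, in time order; empty intervals are kept
    so the positions in the sequence still reflect elapsed time.
    """
    if not posts:
        return []
    start = min(ts for ts, _ in posts)
    end = max(ts for ts, _ in posts)
    n_intervals = int((end - start) // interval_seconds) + 1
    batches = [[] for _ in range(n_intervals)]
    for ts, text in posts:
        index = int((ts - start) // interval_seconds)
        batches[index].extend(text.lower().split())
    return batches


def tfidf_sequence(batches):
    """Treat each interval as one document; return a {term: tf*idf} dict per interval."""
    n_docs = len(batches)
    doc_freq = Counter(term for batch in batches for term in set(batch))
    sequence = []
    for batch in batches:
        term_freq = Counter(batch)
        sequence.append({
            term: (count / len(batch)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return sequence
```

Each dict in the returned sequence would be converted to a fixed-length vocabulary vector and fed to the RNN as one time step.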
2. Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning, ACL 2017 (multi-class classifier) • Based on the propagation structure of rumors, using a Propagation Tree Kernel (PTK) • Train a kernel-based SVM classifier • Given the tree of a source tweet, predict its label by estimating its similarity to all training instances in the training space • Structural, linguistic and temporal features are embedded into the PTK as node (post) vectors of • Creator of node vi (# of followers/friends, # of history posts, verification status, etc.) • Text content: unigrams/bigrams • Time lag from source tweet r to vi • An edge from vi to vj means that node vj retweets/replies to vi
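The node and edge description above maps naturally onto a small tree data structure. The sketch below only models the propagation tree and its per-node features; it does not implement the PTK similarity or the SVM, and the class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PostNode:
    """One post v_i in a propagation tree rooted at the source tweet r."""
    post_id: str
    followers: int        # creator feature: # of followers
    friends: int          # creator feature: # of friends
    history_posts: int    # creator feature: # of history posts
    verified: bool        # creator feature: verification status
    text: str
    time_lag: float       # seconds elapsed since the source tweet r
    children: List["PostNode"] = field(default_factory=list)

    def ngrams(self):
        """Unigrams and bigrams of the post text (whitespace tokenization)."""
        tokens = self.text.lower().split()
        bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
        return tokens + bigrams

    def add_response(self, child):
        """Add an edge self -> child, meaning child retweets or replies to self."""
        self.children.append(child)
        return child


def tree_depth(node):
    """Number of nodes on the longest path from this node down to a leaf."""
    return 1 + max((tree_depth(c) for c in node.children), default=0)


def tree_size(node):
    """Total number of posts in the subtree rooted at this node."""
    return 1 + sum(tree_size(c) for c in node.children)
```

A kernel such as the PTK would then compare two such trees node by node; that comparison is left out here.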
Related works
3. Prominent features of rumor propagation in online social media. In Proceedings of ICDM, 2013
4. Rumor cascades. In Proceedings of ICWSM, 2014
5. Detect rumors using time series of social context information on microblogging websites. In Proceedings of CIKM, 2015
6. Crawling Facebook for Social Network Analysis Purposes, ACM 2011
7. Modeling Bursty Temporal Pattern of Rumors, AAAI 2014
8. Detect Rumors Using Time Series of Social Context Information on Microblogging Websites, ACM 2015
9. Automatic Detection and Verification of Rumors on Twitter, PhD Thesis
Influential Facebook sites • Discover Hong Kong: 3,604,378 followers (HK) • Appledaily: 2,139,422 followers (HK) • Gateway Pundit: 600,289 followers • Reddit: 1,122,501 followers • Donald J. Trump: 24M followers