Evaluating Event Credibility on Twitter Manish Gupta, Peixiang Zhao, Jiawei Han SIAM Data Mining 2012, Anaheim, CA 26th April 2012
Abstract • Twitter = sensor network/news source. • Rumors spread via Twitter cause considerable damage. • Problem: Automatically assess the credibility of events. • Our solution • Credibility propagation on a network of users, tweets and events. • Enhance event credibility by graph regularization over a network of events only. • Using ~500 events extracted from two tweet feed datasets, we show that our methods are significantly more accurate (86%) than the feature-based classifier approach (72%).
Growing Twitter Influence • Twitter was launched in 2006. • In 2011, the average tweet rate was 1620 tweets/sec. • Whenever a new event takes place, Twitter reaches new peaks of traffic (8868 tweets/sec). • The volume of Twitter mentions on blogs, message boards and forums has reached the same level as Facebook. • User base growing rapidly. Users have been quite active, and the user base spans different age groups and multiple countries.
Importance of Twitter Analysis • 4% of the content on Twitter relates to news (Pear Analytics report). • The Library of Congress archives every tweet. • Tweets can be summarized, and opinions can impact decisions. • Popular events on Twitter keep users aware of what is happening all across the world. • Twitter users act as a massive sensor force pushing real-time updates.
Twitter Hoaxes (Clearly incredible events) • Fake celebrity death news. • Fake events related to celebrities. • Fake events related to particular places. • Partly “spiced up” rumors. • Old news suddenly becomes popular. • Identity theft. • Erroneous claims made by politicians.
Does all the content look credible? (Seemingly incredible events) • Informally conveyed news • "thr is now a website where students can bet on grades. I love dis idea, I would bet on myself to get all C's and make a killing!!!" • (Lack/presence of) supportive evidence • "Yahoo is up more than 6% in after-hours trading after ousting CEO Carol Bartz http://ow.ly/6nP6X" • Absence of any credibility-conveying words like news, breaking, latest, report. • Absence of retweets. • Lots of tweets written with first- or second-person pronouns.
Problem definition • Credibility of a Twitter event e is the degree to which a human can believe in the event by browsing a random sample of tweets related to it, i.e., the expected credibility of the tweets that belong to event e. • Credibility of a Twitter user u is the expected credibility of the tweets he provides. • Credibility of a tweet t is a function of the credibility of related events and users, and of the credibility of other tweets that make similar (supporting/opposing) assertions to the ones made by this tweet. • Problem: Given a set of events along with their related tweets and users, find which of the events can be deemed credible. • Hypothesis: Both the content and the network provide a lot of signals for the task.
Three Approaches • Classifier-based Approach • Basic Credibility Analyzer • Exploits a network of users, tweets and events • Event Optimization-based Credibility Analyzer • Exploits a network of events.
Classifier: User Features • User has many friends and followers. • User has linked his Twitter profile to his Facebook profile. • User is a verified user. • User registered on Twitter long ago. • User has made a lot of posts. • A description, URL, profile image, and location are attached to the user's profile.
Classifier: Tweet Features • It is complete. A more complete tweet gives a more complete picture of the truth. • A professionally written tweet with no slang words, question marks, exclamation marks, full-uppercase words, or smileys is more credible. • Number of words with first-, second-, and third-person pronouns. • Presence of supportive evidence in the form of external URLs. • A tweet may be regarded as more credible if it is from the most frequent location related to the event.
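To make the tweet-level features concrete, here is a minimal sketch of how a few of them might be computed. The function name, the regexes, and the exact feature subset are illustrative assumptions, not the paper's implementation:

```python
import re

FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}
SECOND_PERSON = {"you", "your", "yours"}

def tweet_features(text):
    """Illustrative subset of per-tweet credibility features."""
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "length_chars": len(text),                           # proxy for completeness
        "num_urls": len(re.findall(r"https?://\S+", text)),  # supportive evidence
        "num_question_marks": text.count("?"),
        "num_exclamation_marks": text.count("!"),
        "num_uppercase_words": sum(w.isupper() and len(w) > 1 for w in words),
        "num_first_person": sum(w.lower() in FIRST_PERSON for w in words),
        "num_second_person": sum(w.lower() in SECOND_PERSON for w in words),
        "has_smiley": bool(re.search(r"[:;]-?[)(DP]", text)),
    }
```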
Classifier: Event Features • Number of tweets and retweets related to the event. • Number of distinct URLs, domains, hashtags, user mentions, users, and locations related to the event. • Number of hours for which the event has been popular. • Percentage of tweets related to the event on the day when the event reached its peak popularity.
Classifier-based Approach • Previous work [4] suggested decision trees supported by user, tweet, event and propagation features. • The classifier approach (focused on content) does not consider network information such as: • With high probability, credible users provide credible tweets. • Average credibility of tweets is higher for credible events than for non-credible events. • Events sharing many common words and topics should have similar credibility scores. • It is also not entity-aware: features that belong to users and tweets are averaged across an event and used as event features.
Fact Finding on Twitter: BasicCA • A tri-partite network of users, tweets and events. [Figure: users linked to their tweets, and tweets linked to their events (Event1, Event2)]
Inter-dependency between credibility of entities • With high probability, credible users provide credible tweets. • With high probability, the average credibility of tweets related to a credible event is higher than that of tweets related to a non-credible event. • Tweets can derive credibility from related users and events.
Tweet Implications • We would like to capture influences when computing the credibility of tweets. • t1 = "Bull jumps into crowd in Spain: A bull jumped into the stands at a bullring in northern Spain injuring at least 30... http://bbc.in/bLC2qx". • t2 = "Damn! Check out that bull that jumps the fence into the crowd. #awesome". Event words are shown in bold. • imp(t1 → t2) = 3/6 • imp(t2 → t1) = 3/3
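The scores above are consistent with a simple overlap ratio, imp(a → b) = |E_a ∩ E_b| / |E_a|, where E_x is the set of event words in tweet x. A minimal sketch under that reconstruction (the paper's exact normalization may differ, and the particular six bolded words of t1 are assumed):

```python
def implication(event_words_a, event_words_b):
    """imp(a -> b): fraction of a's event words that b also asserts.
    Reconstructed from the worked example; an assumption, not the
    paper's verbatim definition."""
    if not event_words_a:
        return 0.0
    return len(event_words_a & event_words_b) / len(event_words_a)

# The slide's example (assumed sets of bolded event words, lowercased):
t1 = {"bull", "jumps", "crowd", "spain", "stands", "bullring"}  # 6 event words
t2 = {"bull", "jumps", "crowd"}                                 # 3 event words
print(implication(t1, t2))  # 3/6 = 0.5
print(implication(t2, t1))  # 3/3 = 1.0
```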
Fact Finding on Twitter: BasicCA (Weights and Event Implications) • Weights are associated to make sure that the "comparison is done among equals". • Event Implications [equation] • Event Credibility [equation]
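The update equations themselves are not reproduced here, so the following is only a sketch of a BasicCA-style propagation loop under assumed averaging rules; the mixing weights alpha, beta, gamma and the normalization are hypothetical, not the paper's formulas:

```python
import numpy as np

def basic_ca(tweet_user, tweet_event, imp, init_tweet_cred, iters=10,
             alpha=0.4, beta=0.3, gamma=0.3):
    """Hypothetical BasicCA-style propagation.
    tweet_user[i], tweet_event[i]: user/event index of tweet i.
    imp[j, i]: implication weight imp(t_j -> t_i).
    init_tweet_cred: content-based initialization (next slide)."""
    tweet_user = np.asarray(tweet_user)
    tweet_event = np.asarray(tweet_event)
    imp = np.asarray(imp, dtype=float)
    t_cred = np.asarray(init_tweet_cred, dtype=float)
    n_users, n_events = tweet_user.max() + 1, tweet_event.max() + 1
    for _ in range(iters):
        # Users and events inherit the average credibility of their tweets.
        u_cred = np.array([t_cred[tweet_user == u].mean() for u in range(n_users)])
        e_cred = np.array([t_cred[tweet_event == e].mean() for e in range(n_events)])
        # Each tweet blends user, event, and implication-weighted peer scores.
        peer = imp.T @ t_cred / np.maximum(imp.sum(axis=0), 1e-9)
        t_cred = alpha * u_cred[tweet_user] + beta * e_cred[tweet_event] + gamma * peer
    return t_cred, u_cred, e_cred
```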
Credibility Initialization to Incorporate Signals from Content • Initial credibility scores are derived from the SVM weight SVM_f learned for each user/tweet feature f. [equation]
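The slide gives only the per-feature SVM weights SVM_f. One plausible way to turn them into initial scores, shown purely as an assumption, is to form the SVM decision value and squash it into (0, 1):

```python
import numpy as np

def init_credibility(feature_matrix, svm_weights, bias=0.0):
    """Assumed initialization: credibility = sigmoid of the SVM score
    sum_f SVM_f * feature_f + bias. The sigmoid squashing is our
    reading of the slide, not a confirmed formula."""
    scores = np.asarray(feature_matrix) @ np.asarray(svm_weights) + bias
    return 1.0 / (1.0 + np.exp(-scores))
```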
Fact Finding on Twitter: EventOptCA • Events can be considered similar if they share a lot of common words or topics. • Similar events should get the same credibility label. [Figure: a subgraph of our D2011 event graph]
EventOptCA: How to find topics? • An event is represented as E = R_E + O_E, where R_E = core words and O_E = subordinate words. • T_E = set of tweets that contain R_E. • For each tweet t ∈ T_E, compute S_t = maximal subset of words from O_E also present in t. • The most frequent S_t's are the topics for the event.
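A minimal sketch of this topic-finding step, assuming whitespace tokenization and that "most frequent" means a top-k cutoff (the parameter k is hypothetical):

```python
from collections import Counter

def event_topics(tweets, core_words, subordinate_words, k=5):
    """Topics of an event: the most frequent maximal subsets S_t of
    subordinate words O_E among tweets containing all core words R_E."""
    core, sub = set(core_words), set(subordinate_words)
    counts = Counter()
    for t in tweets:
        words = set(t.lower().split())
        if core <= words:                 # t is in T_E
            s_t = frozenset(words & sub)  # maximal subset of O_E present in t
            if s_t:
                counts[s_t] += 1
    return [set(s) for s, _ in counts.most_common(k)]
```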
Fact Finding on Twitter: EventOptCA • Similar events should have similar credibility scores. • The change in the event credibility vector should be as small as possible. • Solved with the Trust-Region-Reflective Algorithm implementation in MATLAB (quadprog). • 2(D-W)+I is positive definite (D-W is a graph Laplacian, hence positive semi-definite), so the problem has a unique global minimizer.
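The objective is not spelled out on this slide, but the stated Hessian 2(D-W)+I is consistent with minimizing c'(D-W)c + 0.5*||c - c0||^2, where W is the event-similarity matrix, D its degree matrix, and c0 the incoming event credibility vector. Under that assumption, and ignoring any bound constraints on credibility (presumably the reason quadprog is used), the unique minimizer has a closed form:

```python
import numpy as np

def event_opt_ca(W, c0):
    """Assumed EventOptCA objective: min_c c'(D - W)c + 0.5*||c - c0||^2.
    Setting the gradient 2(D - W)c + (c - c0) to zero gives the linear
    system (2(D - W) + I)c = c0, solvable directly since the matrix is
    positive definite (D - W is a graph Laplacian)."""
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))
    return np.linalg.solve(2.0 * (D - W) + np.eye(len(c0)),
                           np.asarray(c0, dtype=float))
```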
Fact Finding on Twitter: EventOptCA • Time complexity: O(IT²/E + En)
Datasets • D2010 • Events supplied by Castillo [4], identified using Twitter Monitor over Apr-Sep 2010. • 288 events were labeled as credible or not credible. • Removing events with fewer than 10 tweets leaves 207 events. • News obtained using Google News Archives. • D2011 • Events identified from a March 2011 tweet feed of 34M tweets. • Events are Twitter Trends for March 2011. We removed events with <100 tweets and selected 250 news events. • News obtained from Google News RSS feeds. • Sentiment analysis: General Inquirer dictionaries. • Labeling (via Google News archives) • Step 1: Check if the event occurred in news headlines. • Step 2: Check if 10 random tweets look credible.
Accuracy Comparison • The classifier-based approach provides ~72% accuracy. • Our simple fact-finder-based approach provides some boost. • Updating the event vector using the event-graph-based optimization provides a significant boost in accuracy (14% over the baseline). [Figure: accuracy while varying the parameter]
Important Features [Tables: important features for D2010; important features for D2011]
Effect of removing implications • Without considering implications, we notice a significant loss in accuracy.
Accuracy variation (w.r.t. #iterations) • Accuracy improves as the number of iterations increases. After 3-4 iterations, the accuracy drops a little and then becomes stable.
Case Study 1: Most Credible Users • Most credible users often include "news-reporting Twitter users". • Sometimes, celebrity users or automated users that tweet a lot also turn up at the top.
Case Study 2: Incredible Event • Event: 24pp • Tweets • #24pp had enough - I can take small doses of all the people on the show but not that much. • #24pp i love David Frost • #24pp is the best way to spend a saturday :) • #24pp Sir David Frost is hilarious!! • Andy Parsons is being fucking brilliant on this #24PP edition of #MockTheWeek ... shame no-one else is really ... • back link habits http://emap.ws/7/9Chvm4 DMZ #24pp Russell Tovey • @benjamin_cook and @mrtinoforever are SO visible in the audience from the back of their heads alone! #24pp • @brihonee I just setup my comptuer to my TV so I can watch #24pp until the end! • #buzzcocks this is turning into a very funny episode #24pp • David Walliams has a cheeky smile. #24pp • The classifier marks the event as credible, but BasicCA marks it as not credible, essentially because the tweet implications are very low.
Case Study 3: Event Similarity • EventOptCA exploits similarity between events. • Consider the three events: • wembley • ghana • englandvsghana • BasicCA correctly marks "ghana" and "englandvsghana" as credible but marks "wembley" incorrectly. • However, all three events share a lot of common words and aspects, and so all are marked as credible by EventOptCA.
Drawbacks of our methods • Deep NLP techniques could be used to obtain more accurate tweet implications. • Evidence that an event is a rumor may be present in the tweets themselves. • However, deep semantic parsing of tweets may be quite inefficient. • Entertainment events tend to look less credible, as such news is often written in a colloquial way. • Predicting credibility for entertainment events may need to be studied separately.
Related Work • Credibility Analysis of Online Social Content • Credibility Analysis for Twitter • Fact Finding Approaches • Graph Regularization Approaches
Related Work • Credibility Analysis of Online Social Content • Credibility perception depends on user background (Internet-savvy, politically interested, expert, etc.), author gender, content-attribution elements like the author's image, type of content (blog, tweet, news website), and the way the content is presented (e.g., presence of ads, visual design). • On average, 13% of Wikipedia articles contain mistakes. [5] • In general, news > blogs > Twitter. [27] • Credibility Analysis for Twitter • Twitter is perceived as not credible because (1) friends/followers are unknown, (2) tracing the origin is difficult, and (3) tampering with the original message while retweeting is easy. [21] • Truthy service from Indiana University (crowdsourcing). [19] • Warranting is impossible. [22] • Classifier approach to establish credibility. [4]
Related Work • Fact Finding Approaches • TruthFinder [30] • Different forms of credibility propagation [17] • Extensions such as hardness of claims and probabilistic claims [2,3,8,9,29] • Graph Regularization Approaches • Mainly studied in semi-supervised settings. • Has been used for many applications such as webpage categorization, large-scale semidefinite programming, and color image processing. [12, 28, 32]
Conclusions • We explored the possibility of detecting and recommending credible events from Twitter feeds using fact finders. • We exploit • Tweet-feed content-based classifier results. • Traditional credibility propagation using a simple network of tweets, users and events. • Event-graph-based optimization to assign similar scores to similar events. • Using two real datasets, we showed that the credibility analysis approach with event-graph optimization works better than the basic credibility analysis approach or the classifier approach.
References [1] R. Abdulla, B. Garrison, M. Salwen, and P. Driscoll. The Credibility of Newspapers, Television News, and Online News. Association for Education in Journalism and Mass Communication, Jan 2002. [2] R. Balakrishnan. SourceRank: Relevance and Trust Assessment for Deep Web Sources based on Inter-Source Agreement. In International Conference on World Wide Web (WWW), pages 227–236. ACM, 2011. [3] L. Berti-Equille, A. D. Sarma, X. Dong, A. Marian, and D. Srivastava. Sailing the Information Ocean with Awareness of Currents: Discovery and Application of Source Dependence. In Conference on Innovative Data Systems Research (CIDR). www.cidrdb.org, 2009. [4] C. Castillo, M. Mendoza, and B. Poblete. Information Credibility on Twitter. In International Conference on World Wide Web (WWW), pages 675–684. ACM, 2011. [5] T. Chesney. An Empirical Examination of Wikipedia’s Credibility. First Monday, 11(11), 2006. [6] I. Dagan and O. Glickman. Probabilistic Textual Entailment: Generic Applied Modeling of Language Variability. In Learning Methods for Text Understanding and Mining, Jan. 2004. [7] I. Dagan, O. Glickman, and B. Magnini. The PASCAL Recognising Textual Entailment Challenge. In Machine Learning Challenges, volume 3944, pages 177–190. Springer, 2006. [8] A. Galland, S. Abiteboul, A. Marian, and P. Senellart. Corroborating Information from Disagreeing Views. In ACM International Conference on Web Search and Data Mining (WSDM), pages 131–140. ACM, 2010. [9] M. Gupta, Y. Sun, and J. Han. Trust Analysis with Clustering. In International Conference on World Wide Web (WWW), pages 53–54, New York, NY, USA, 2011. ACM. [10] A. Hickl, J. Williams, J. Bensley, K. Roberts, B. Rink, and Y. Shi. Recognizing Textual Entailment with LCC’s GROUNDHOG System. In PASCAL Challenges Workshop on Recognising Textual Entailment, 2006. [11] E. Kim, S. Gilbert, M. J. Edwards, and E. Graeff. Detecting Sadness in 140 Characters: Sentiment Analysis of Mourning Michael Jackson on Twitter. 2009.
References [12] O. Lezoray, A. Elmoataz, and S. Bougleux. Graph Regularization for Color Image Processing. Computer Vision and Image Understanding, 107(1–2):38–55, 2007. [13] M. Mathioudakis and N. Koudas. TwitterMonitor: Trend Detection over the Twitter Stream. International Conference on Management of Data (SIGMOD), pages 1155–1157, 2010. [14] MATLAB. version 7.9.0.529 (R2009b). The MathWorks Inc., Natick, Massachusetts, 2009. [15] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. In International Conference on World Wide Web (WWW), pages 161–172. ACM, 1998. [16] B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up?: Sentiment Classification using Machine Learning Techniques. In Empirical Methods in Natural Language Processing (EMNLP), pages 79–86. ACL, 2002. [17] J. Pasternack and D. Roth. Knowing What to Believe (When You Already Know Something). In International Conference on Computational Linguistics (COLING). Tsinghua University Press, 2010. [18] A. M. Popescu and M. Pennacchiotti. Detecting Controversial Events From Twitter. In ACM International Conference on Information and Knowledge Management (CIKM), pages 1873–1876. ACM, 2010. [19] J. Ratkiewicz, M. Conover, M. Meiss, B. Gonçalves, S. Patil, A. Flammini, and F. Menczer. Detecting and Tracking the Spread of Astroturf Memes in Microblog Streams. arXiv, Nov 2010. [20] T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake Shakes Twitter Users: Real-Time Event Detection by Social Sensors. In International Conference on World Wide Web (WWW), pages 851–860. ACM, 2010. [21] M. Schmierbach and A. Oeldorf-Hirsch. A Little Bird Told Me, so I didn’t Believe It: Twitter, Credibility, and Issue Perceptions. In Association for Education in Journalism and Mass Communication. AEJMC, Aug 2010. [22] A. Schrock. Are You What You Tweet? Warranting Trustworthiness on Twitter. In Association for Education in Journalism and Mass Communication. AEJMC, Aug 2010.
References [23] S. Soderland. Learning Information Extraction Rules for Semi-Structured and Free Text. Machine Learning, 34:233–272, 1999. [24] A. Tumasjan, T. O. Sprenger, P. G. Sandner, and I. M. Welpe. Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment. In International AAAI Conference on Weblogs and Social Media (ICWSM). The AAAI Press, 2010. [25] P. Viola and M. Narasimhan. Learning to Extract Information from Semi-Structured Text using a Discriminative Context Free Grammar. In International ACM Conference on Research and Development in Information Retrieval (SIGIR), pages 330–337. ACM, 2005. [26] V. V. Vydiswaran, C. Zhai, and D. Roth. Content-Driven Trust Propagation Framework. In International Conference on Knowledge Discovery and Data Mining (SIGKDD), pages 974–982. ACM, 2011. [27] C. R. WebWatch. Leap of Faith: Using the Internet Despite the Dangers, Oct 2005. [28] K. Q. Weinberger, F. Sha, Q. Zhu, and L. K. Saul. Graph Laplacian Regularization for Large-Scale Semidefinite Programming. In Advances in Neural Information Processing Systems (NIPS), 2007. [29] X. Yin and W. Tan. Semi-Supervised Truth Discovery. In International Conference on World Wide Web (WWW), pages 217–226. ACM, 2011. [30] X. Yin, P. S. Yu, and J. Han. Truth Discovery with Multiple Conflicting Information Providers on the Web. IEEE Transactions on Knowledge and Data Engineering (TKDE), 20(6):796–808, 2008. [31] G. Zeng and W. Wang. An Evidence-Based Iterative Content Trust Algorithm for the Credibility of Online News. Concurrency and Computation: Practice and Experience (CCPE), 21:1857–1881, Oct 2009. [32] T. Zhang, A. Popescul, and B. Dom. Linear Prediction Models with Graph Regularization for Web-page Categorization. In International Conference on Knowledge Discovery and Data Mining (SIGKDD), pages 821–826. ACM, 2006.