Evaluating Event Credibility on Twitter Manish Gupta, Peixiang Zhao, Jiawei Han SIAM Data Mining 2012, Anaheim, CA 26th April 2012
Abstract • Twitter = sensor network/news source. • Rumors spread via Twitter cause considerable damage. • Problem: Automatically assess the credibility of events. • Our solution • Credibility propagation on a network of users, tweets and events. • Enhance event credibility by graph regularization over a network of events only. • Using ~500 events extracted from two tweet feed datasets, we show that our methods are significantly more accurate (86%) than the feature-based classifier approach (72%).
Growing Twitter Influence • Twitter was launched in 2006. • In 2011, the average tweet rate was 1620 tweets/sec. • Whenever a new event takes place, Twitter reaches new peaks of traffic (8868 tweets/sec). • The volume of Twitter mentions on blogs, message boards and forums has reached the same level as Facebook. • User base growing rapidly. Users have been quite active, and the user base spans different age groups and multiple countries.
Importance of Twitter Analysis • 4% of the content on Twitter relates to news (Pear Analytics report). • The Library of Congress archives every tweet. • Tweets can be summarized, and opinions can impact decisions. • Popular events on Twitter keep users aware of what is happening all across the world. • Twitter users act as a massive sensor force pushing real-time updates.
Twitter Hoaxes (Clearly incredible events) • Fake celebrity death news. • Fake events related to celebrities. • Fake events related to particular places. • Partly “spiced up” rumors. • Old news suddenly becomes popular. • Identity theft. • Erroneous claims made by politicians.
Does all the content look credible? (Seemingly incredible events) • Informally conveyed news • "thr is now a website where students can bet on grades. I love dis idea, I would bet on myself to get all C's and make a killing!!!" • (Lack/presence of) supportive evidence • "Yahoo is up more than 6% in after-hours trading after ousting CEO Carol Bartz http://ow.ly/6nP6X" • Absence of any credibility-conveying words like news, breaking, latest, report. • Absence of retweets. • Lots of tweets written with first- or second-person pronouns.
Problem definition • Credibility of a Twitter event e is the degree to which a human can believe in the event by browsing a random sample of tweets related to it, i.e., the expected credibility of the tweets that belong to event e. • Credibility of a Twitter user u is the expected credibility of the tweets he provides. • Credibility of a tweet t is a function of the credibility of related events and users, and of the credibility of other tweets that make similar (supporting/opposing) assertions to the ones made by this tweet. • Problem: Given a set of events along with their related tweets and users, find which of the events can be deemed credible. • Hypothesis: Both the content and the network provide a lot of signals for the task.
Three Approaches • Classifier-based Approach • Basic Credibility Analyzer • Exploits a network of users, tweets and events • Event Optimization-based Credibility Analyzer • Exploits a network of events.
Classifier: User Features • User has many friends and followers. • User has linked his Twitter profile to his Facebook profile. • User is a verified user. • User registered on Twitter long ago. • User has made a lot of posts. • A description, URL, profile image, and location are attached to the user's profile.
Classifier: Tweet Features • It is complete. A more complete tweet gives a more complete picture of the truth. • A professionally written tweet with no slang words, question marks, exclamation marks, full-uppercase words, or smileys is more credible. • Number of words with first-, second-, and third-person pronouns. • Presence of supportive evidence in the form of external URLs. • A tweet may be regarded as more credible if it is from the most frequent location related to the event.
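To make the tweet-level features concrete, here is a minimal sketch of how a few of them might be computed. The function name, the regexes, and the exact feature subset are illustrative assumptions, not the paper's implementation:

```python
import re

FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}
SECOND_PERSON = {"you", "your", "yours"}

def tweet_features(text):
    """Illustrative subset of per-tweet credibility features."""
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "length_chars": len(text),                           # proxy for completeness
        "num_urls": len(re.findall(r"https?://\S+", text)),  # supportive evidence
        "num_question_marks": text.count("?"),
        "num_exclamation_marks": text.count("!"),
        "num_uppercase_words": sum(w.isupper() and len(w) > 1 for w in words),
        "num_first_person": sum(w.lower() in FIRST_PERSON for w in words),
        "num_second_person": sum(w.lower() in SECOND_PERSON for w in words),
        "has_smiley": bool(re.search(r"[:;]-?[)(DP]", text)),
    }
```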
Classifier: Event Features • Number of tweets and retweets related to the event. • Number of distinct URLs, domains, hashtags, user mentions, users, and locations related to the event. • Number of hours for which the event has been popular. • Percentage of tweets related to the event on the day when the event reached its peak popularity.
Classifier-based Approach • Previous work [4] suggested decision trees supported by user, tweet, event and propagation features. • The classifier approach (focused on content) does not consider network information such as: • With high probability, credible users provide credible tweets. • Average credibility of tweets is higher for credible events than for non-credible events. • Events sharing many common words and topics should have similar credibility scores. • It is also not entity-aware: features that belong to users and tweets are averaged across an event and used as event features.
Fact Finding on Twitter: BasicCA • A tri-partite network of users, tweets and events. [Figure: users linked to their tweets, and tweets linked to their events (Event1, Event2)]
Inter-dependency between credibility of entities • With high probability, credible users provide credible tweets. • With high probability, the average credibility of tweets related to a credible event is higher than that of tweets related to a non-credible event. • Tweets can derive credibility from related users and events.
Tweet Implications • We would like to capture influences when computing the credibility of tweets. • t1 = "Bull jumps into crowd in Spain: A bull jumped into the stands at a bullring in northern Spain injuring at least 30... http://bbc.in/bLC2qx". • t2 = "Damn! Check out that bull that jumps the fence into the crowd. #awesome". Event words are shown in bold. • imp(t1 → t2) = 3/6 • imp(t2 → t1) = 3/3
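The scores above are consistent with a simple overlap ratio, imp(a → b) = |E_a ∩ E_b| / |E_a|, where E_x is the set of event words in tweet x. A minimal sketch under that reconstruction (the paper's exact normalization may differ, and the particular six bolded words of t1 are assumed):

```python
def implication(event_words_a, event_words_b):
    """imp(a -> b): fraction of a's event words that b also asserts.
    Reconstructed from the worked example; an assumption, not the
    paper's verbatim definition."""
    if not event_words_a:
        return 0.0
    return len(event_words_a & event_words_b) / len(event_words_a)

# The slide's example (assumed sets of bolded event words, lowercased):
t1 = {"bull", "jumps", "crowd", "spain", "stands", "bullring"}  # 6 event words
t2 = {"bull", "jumps", "crowd"}                                 # 3 event words
print(implication(t1, t2))  # 3/6 = 0.5
print(implication(t2, t1))  # 3/3 = 1.0
```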
Fact Finding on Twitter: BasicCA (Weights and Event Implications) • Weights are associated to make sure that the "comparison is done among equals". • Event Implications [equation] • Event Credibility [equation]
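The update equations themselves are not reproduced here, so the following is only a sketch of a BasicCA-style propagation loop under assumed averaging rules; the mixing weights alpha, beta, gamma and the normalization are hypothetical, not the paper's formulas:

```python
import numpy as np

def basic_ca(tweet_user, tweet_event, imp, init_tweet_cred, iters=10,
             alpha=0.4, beta=0.3, gamma=0.3):
    """Hypothetical BasicCA-style propagation.
    tweet_user[i], tweet_event[i]: user/event index of tweet i.
    imp[j, i]: implication weight imp(t_j -> t_i).
    init_tweet_cred: content-based initialization (next slide)."""
    tweet_user = np.asarray(tweet_user)
    tweet_event = np.asarray(tweet_event)
    imp = np.asarray(imp, dtype=float)
    t_cred = np.asarray(init_tweet_cred, dtype=float)
    n_users, n_events = tweet_user.max() + 1, tweet_event.max() + 1
    for _ in range(iters):
        # Users and events inherit the average credibility of their tweets.
        u_cred = np.array([t_cred[tweet_user == u].mean() for u in range(n_users)])
        e_cred = np.array([t_cred[tweet_event == e].mean() for e in range(n_events)])
        # Each tweet blends user, event, and implication-weighted peer scores.
        peer = imp.T @ t_cred / np.maximum(imp.sum(axis=0), 1e-9)
        t_cred = alpha * u_cred[tweet_user] + beta * e_cred[tweet_event] + gamma * peer
    return t_cred, u_cred, e_cred
```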
Credibility Initialization to Incorporate Signals from Content • Initial credibility scores are derived from the SVM weight SVM_f learned for each user/tweet feature f. [equation]
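The slide gives only the per-feature SVM weights SVM_f. One plausible way to turn them into initial scores, shown purely as an assumption, is to form the SVM decision value and squash it into (0, 1):

```python
import numpy as np

def init_credibility(feature_matrix, svm_weights, bias=0.0):
    """Assumed initialization: credibility = sigmoid of the SVM score
    sum_f SVM_f * feature_f + bias. The sigmoid squashing is our
    reading of the slide, not a confirmed formula."""
    scores = np.asarray(feature_matrix) @ np.asarray(svm_weights) + bias
    return 1.0 / (1.0 + np.exp(-scores))
```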
Fact Finding on Twitter: EventOptCA • Events can be considered similar if they share a lot of common words or topics. • Similar events should get the same credibility label. [Figure: a subgraph of our D2011 event graph]
EventOptCA: How to find topics? • An event is represented as E = R_E + O_E, where R_E = core words and O_E = subordinate words. • T_E = set of tweets that contain R_E. • For each tweet t ∈ T_E, compute S_t = maximal subset of words from O_E also present in t. • The most frequent S_t's are the topics for the event.
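A minimal sketch of this topic-finding step, assuming whitespace tokenization and that "most frequent" means a top-k cutoff (the parameter k is hypothetical):

```python
from collections import Counter

def event_topics(tweets, core_words, subordinate_words, k=5):
    """Topics of an event: the most frequent maximal subsets S_t of
    subordinate words O_E among tweets containing all core words R_E."""
    core, sub = set(core_words), set(subordinate_words)
    counts = Counter()
    for t in tweets:
        words = set(t.lower().split())
        if core <= words:                 # t is in T_E
            s_t = frozenset(words & sub)  # maximal subset of O_E present in t
            if s_t:
                counts[s_t] += 1
    return [set(s) for s, _ in counts.most_common(k)]
```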
Fact Finding on Twitter: EventOptCA • Similar events should have similar credibility scores. • The change in the event credibility vector should be as small as possible. • Solved with the Trust-Region-Reflective Algorithm implementation in MATLAB (quadprog). • 2(D-W)+I is positive definite (D-W is a graph Laplacian, hence positive semi-definite), so the problem has a unique global minimizer.
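The objective is not spelled out on this slide, but the stated Hessian 2(D-W)+I is consistent with minimizing c'(D-W)c + 0.5*||c - c0||^2, where W is the event-similarity matrix, D its degree matrix, and c0 the incoming event credibility vector. Under that assumption, and ignoring any bound constraints on credibility (presumably the reason quadprog is used), the unique minimizer has a closed form:

```python
import numpy as np

def event_opt_ca(W, c0):
    """Assumed EventOptCA objective: min_c c'(D - W)c + 0.5*||c - c0||^2.
    Setting the gradient 2(D - W)c + (c - c0) to zero gives the linear
    system (2(D - W) + I)c = c0, solvable directly since the matrix is
    positive definite (D - W is a graph Laplacian)."""
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))
    return np.linalg.solve(2.0 * (D - W) + np.eye(len(c0)),
                           np.asarray(c0, dtype=float))
```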
Fact Finding on Twitter: EventOptCA • Time complexity: O(IT²/E + En)
Datasets • D2010 • Events supplied by Castillo [4], identified using Twitter Monitor over Apr-Sep 2010. • 288 events were labeled as credible or not credible. • Removing events with fewer than 10 tweets leaves 207 events. • News obtained using Google News Archives. • D2011 • Events identified from a March 2011 tweet feed of 34M tweets. • Events are Twitter Trends for March 2011. We removed events with <100 tweets and selected 250 news events. • News obtained from Google News RSS feeds. • Sentiment analysis: General Inquirer dictionaries. • Labeling (via Google News archives) • Step 1: Check if the event occurred in news headlines. • Step 2: Check if 10 random tweets look credible.
Accuracy Comparison • The classifier-based approach provides ~72% accuracy. • Our simple fact-finder-based approach provides some boost. • Updating the event vector using the event-graph-based optimization provides a significant boost in accuracy (14% over the baseline). [Figure: accuracy while varying the parameter]
Important Features [Tables: important features for D2010; important features for D2011]
Effect of removing implications • Without considering implications, we notice a significant loss in accuracy.
Accuracy variation (w.r.t. #iterations) • Accuracy improves as the number of iterations increases. After 3-4 iterations, the accuracy drops a little and then becomes stable.
Case Study 1: Most Credible Users • Most credible users often include "news-reporting Twitter users". • Sometimes, celebrity users or automated users that tweet a lot also turn up at the top.
Case Study 2: Incredible Event • Event: 24pp • Tweets • #24pp had enough - I can take small doses of all the people on the show but not that much. • #24pp i love David Frost • #24pp is the best way to spend a saturday :) • #24pp Sir David Frost is hilarious!! • Andy Parsons is being fucking brilliant on this #24PP edition of #MockTheWeek ... shame no-one else is really ... • back link habits http://emap.ws/7/9Chvm4 DMZ #24pp Russell Tovey • @benjamin_cook and @mrtinoforever are SO visible in the audience from the back of their heads alone! #24pp • @brihonee I just setup my comptuer to my TV so I can watch #24pp until the end! • #buzzcocks this is turning into a very funny episode #24pp • David Walliams has a cheeky smile. #24pp • The classifier marks the event as credible, but BasicCA marks it as not credible, essentially because the tweet implications are very low.
Case Study 3: Event Similarity • EventOptCA exploits similarity between events. • Consider the three events: • wembley • ghana • englandvsghana • BasicCA correctly marks "ghana" and "englandvsghana" as credible but marks "wembley" incorrectly. • However, all three events share a lot of common words and aspects, and so all are marked as credible by EventOptCA.
Drawbacks of our methods • Deep NLP techniques could be used to obtain more accurate tweet implications. • Evidence that an event is a rumor may be present in the tweets themselves. • However, deep semantic parsing of tweets may be quite inefficient. • Entertainment events tend to look less credible, as such news is often written in a colloquial way. • Predicting credibility for entertainment events may need to be studied separately.
Related Work • Credibility Analysis of Online Social Content • Credibility Analysis for Twitter • Fact Finding Approaches • Graph Regularization Approaches
Related Work • Credibility Analysis of Online Social Content • Credibility perception depends on user background (Internet-savvy, politically interested, expert, etc.), author gender, content-attribution elements like the author's image, type of content (blog, tweet, news website), and the way the content is presented (e.g., presence of ads, visual design). • On average, 13% of Wikipedia articles contain mistakes. [5] • In general, news > blogs > Twitter. [27] • Credibility Analysis for Twitter • Twitter is perceived as not credible because (1) friends/followers are unknown, (2) tracing the origin is difficult, and (3) tampering with the original message while retweeting is easy. [21] • Truthy service from Indiana University (crowdsourcing). [19] • Warranting is impossible. [22] • Classifier approach to establish credibility. [4]
Related Work • Fact Finding Approaches • TruthFinder [30] • Different forms of credibility propagation [17] • Extensions such as hardness of claims and probabilistic claims [2,3,8,9,29] • Graph Regularization Approaches • Mainly studied in semi-supervised settings. • Has been used for many applications such as webpage categorization, large-scale semidefinite programming, and color image processing. [12, 28, 32]
Conclusions • We explored the possibility of detecting and recommending credible events from Twitter feeds using fact finders. • We exploit • Tweet-feed content-based classifier results. • Traditional credibility propagation using a simple network of tweets, users and events. • Event-graph-based optimization to assign similar scores to similar events. • Using two real datasets, we showed that the credibility analysis approach with event-graph optimization works better than the basic credibility analysis approach or the classifier approach.
References [1] R. Abdulla, B. Garrison, M. Salwen, and P. Driscoll. The Credibility of Newspapers, Television News, and Online News. Association for Education in Journalism and Mass Communication, Jan 2002. [2] R. Balakrishnan. SourceRank: Relevance and Trust Assessment for Deep Web Sources based on Inter-Source Agreement. In International Conference on World Wide Web (WWW), pages 227–236. ACM, 2011. [3] L. Berti-Equille, A. D. Sarma, X. Dong, A. Marian, and D. Srivastava. Sailing the Information Ocean with Awareness of Currents: Discovery and Application of Source Dependence. In Conference on Innovative Data Systems Research (CIDR). www.cidrdb.org, 2009. [4] C. Castillo, M. Mendoza, and B. Poblete. Information Credibility on Twitter. In International Conference on World Wide Web (WWW), pages 675–684. ACM, 2011. [5] T. Chesney. An Empirical Examination of Wikipedia’s Credibility. First Monday, 11(11), 2006. [6] I. Dagan and O. Glickman. Probabilistic Textual Entailment: Generic Applied Modeling of Language Variability. In Learning Methods for Text Understanding and Mining, Jan. 2004. [7] I. Dagan, O. Glickman, and B. Magnini. The PASCAL Recognising Textual Entailment Challenge. In Machine Learning Challenges, volume 3944, pages 177–190. Springer, 2006. [8] A. Galland, S. Abiteboul, A. Marian, and P. Senellart. Corroborating Information from Disagreeing Views. In ACM International Conference on Web Search and Data Mining (WSDM), pages 131–140. ACM, 2010. [9] M. Gupta, Y. Sun, and J. Han. Trust Analysis with Clustering. In International Conference on World Wide Web (WWW), pages 53–54, New York, NY, USA, 2011. ACM. [10] A. Hickl, J. Williams, J. Bensley, K. Roberts, B. Rink, and Y. Shi. Recognizing Textual Entailment with LCC’s GROUNDHOG System. In PASCAL Challenges Workshop on Recognising Textual Entailment, 2006. [11] E. Kim, S. Gilbert, M. J. Edwards, and E. Graeff. Detecting Sadness in 140 Characters: Sentiment Analysis of Mourning Michael Jackson on Twitter. 2009.
References [12] O. Lezoray, A. Elmoataz, and S. Bougleux. Graph Regularization for Color Image Processing. Computer Vision and Image Understanding, 107(1–2):38–55, 2007. [13] M. Mathioudakis and N. Koudas. TwitterMonitor: Trend Detection over the Twitter Stream. International Conference on Management of Data (SIGMOD), pages 1155–1157, 2010. [14] MATLAB. version 7.9.0.529 (R2009b). The MathWorks Inc., Natick, Massachusetts, 2009. [15] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. In International Conference on World Wide Web (WWW), pages 161–172. ACM, 1998. [16] B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up?: Sentiment Classification using Machine Learning Techniques. In Empirical Methods in Natural Language Processing (EMNLP), pages 79–86. ACL, 2002. [17] J. Pasternack and D. Roth. Knowing What to Believe (When You Already Know Something). In International Conference on Computational Linguistics (COLING). Tsinghua University Press, 2010. [18] A. M. Popescu and M. Pennacchiotti. Detecting Controversial Events From Twitter. In ACM International Conference on Information and Knowledge Management (CIKM), pages 1873–1876. ACM, 2010. [19] J. Ratkiewicz, M. Conover, M. Meiss, B. Gonçalves, S. Patil, A. Flammini, and F. Menczer. Detecting and Tracking the Spread of Astroturf Memes in Microblog Streams. arXiv, Nov 2010. [20] T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake Shakes Twitter Users: Real-Time Event Detection by Social Sensors. In International Conference on World Wide Web (WWW), pages 851–860. ACM, 2010. [21] M. Schmierbach and A. Oeldorf-Hirsch. A Little Bird Told Me, so I didn’t Believe It: Twitter, Credibility, and Issue Perceptions. In Association for Education in Journalism and Mass Communication. AEJMC, Aug 2010. [22] A. Schrock. Are You What You Tweet? Warranting Trustworthiness on Twitter. In Association for Education in Journalism and Mass Communication. AEJMC, Aug 2010.
References [23] S. Soderland. Learning Information Extraction Rules for Semi-Structured and Free Text. Machine Learning, 34:233–272, 1999. [24] A. Tumasjan, T. O. Sprenger, P. G. Sandner, and I. M. Welpe. Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment. In International AAAI Conference on Weblogs and Social Media (ICWSM). The AAAI Press, 2010. [25] P. Viola and M. Narasimhan. Learning to Extract Information from Semi-Structured Text using a Discriminative Context Free Grammar. In International ACM Conference on Research and Development in Information Retrieval (SIGIR), pages 330–337. ACM, 2005. [26] V. V. Vydiswaran, C. Zhai, and D. Roth. Content-Driven Trust Propagation Framework. In International Conference on Knowledge Discovery and Data Mining (SIGKDD), pages 974–982. ACM, 2011. [27] C. R. WebWatch. Leap of Faith: Using the Internet Despite the Dangers, Oct 2005. [28] K. Q. Weinberger, F. Sha, Q. Zhu, and L. K. Saul. Graph Laplacian Regularization for Large-Scale Semidefinite Programming. In Advances in Neural Information Processing Systems (NIPS), 2007. [29] X. Yin and W. Tan. Semi-Supervised Truth Discovery. In International Conference on World Wide Web (WWW), pages 217–226. ACM, 2011. [30] X. Yin, P. S. Yu, and J. Han. Truth Discovery with Multiple Conflicting Information Providers on the Web. IEEE Transactions on Knowledge and Data Engineering (TKDE), 20(6):796–808, 2008. [31] G. Zeng and W. Wang. An Evidence-Based Iterative Content Trust Algorithm for the Credibility of Online News. Concurrency and Computation: Practice and Experience (CCPE), 21:1857–1881, Oct 2009. [32] T. Zhang, A. Popescul, and B. Dom. Linear Prediction Models with Graph Regularization for Web-page Categorization. In International Conference on Knowledge Discovery and Data Mining (SIGKDD), pages 821–826. ACM, 2006.