
TWinner: Understanding News Queries with Geo-content using Twitter

TWinner: Understanding News Queries with Geo-content using Twitter. Satyen Abrol, Latifur Khan, University of Texas at Dallas, Department of Computer Science. GIR ’10. 29 April, 2011. Sengyu Rim. Outline: Introduction, Related Work, Twitter as News-wire, Determining News Intent





Presentation Transcript


  1. TWinner: Understanding News Queries with Geo-content using Twitter • Satyen Abrol, Latifur Khan • University of Texas at Dallas, Department of Computer Science • GIR ’10 • 29 April, 2011 • Sengyu Rim

  2. Outline • Introduction • Related Work • Twitter as News-wire • Determining News Intent • Assigning Weights to Tweets • Experiments and Results • Conclusion

  3. Introduction • Motivations • Users find news through search engines • The results returned by common search engines differ from what the user expects • Non-critical information • Unorganized content • Search engines need to understand the intent of the user query

  4. Introduction • Motivation • E.g., what event in Korea attracted the most attention in 2002? • A naïve user searches the news with the keyword “korea” on 2002-06-18 • Food: Kimchi • Map: Korea • News: Korea vs. Italy 2:1 • Wiki: Korea

  5. Introduction • Analyze the content of a popular social networking site, Twitter, to determine the intent of the user query • Twitter provides popular news topics • Twitter provides keywords that may enhance the user query • TWinner makes two novel contributions to the field of geographic information retrieval • Identifying the intent of the user query • Adding additional keywords to the query

  6. Introduction • The architecture of the news intent system TWinner

  7. Outline • Introduction • Related Work • Twitter as News-wire • Determining News Intent • Assigning Weights to Tweets • Experiments and Results • Conclusion

  8. Related Work • To identify and disambiguate the locations of users • Natural Language Processing • Data Mining • To establish the relationship between the location of the news and news content • A model using NLP techniques

  9. Outline • Introduction • Related Work • Twitter as News-wire • Determining News Intent • Assigning Weights to Tweets • Experiments and Results • Conclusion

  10. Twitter as News-wire • Twitter • Free social networking • Micro-blogging service • Medium for news updates

  11. Outline • Introduction • Related Work • Twitter as News-wire • Determining News Intent • Assigning Weights to Tweets • Experiments and Results • Conclusion

  12. Determining News Intent • Identification of Location • Geo-tags the query to a location with a certain confidence • Frequency-Population Ratio (FPR) • In the absence of a news-making event, FPR remains constant irrespective of the location • Used to assign a news intent confidence to the query • FPR = (α + β) * Nt • α: the population density factor • β: the location type constant • Nt: the number of tweets per minute at that instant
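
The FPR formula on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name and the sample values for α, β, and Nt are hypothetical, and the slides do not say how an FPR value is converted into a confidence score, so the sketch only computes the ratio and compares a quiet period with a busy one.

```python
def frequency_population_ratio(alpha: float, beta: float,
                               tweets_per_minute: float) -> float:
    """Compute FPR = (alpha + beta) * Nt as written on the slide.

    alpha: population density factor
    beta: location type constant
    tweets_per_minute: Nt, tweets per minute at the time of the query
    """
    return (alpha + beta) * tweets_per_minute


# Hypothetical values chosen only for illustration.
quiet_fpr = frequency_population_ratio(alpha=0.5, beta=0.25,
                                       tweets_per_minute=8.0)
busy_fpr = frequency_population_ratio(alpha=0.5, beta=0.25,
                                      tweets_per_minute=40.0)

# With alpha and beta fixed for a location, FPR scales with tweet volume.
print(quiet_fpr, busy_fpr)
```

Because α and β are fixed for a given location, any change in FPR for that location comes from the tweet rate Nt.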

  13. Determining News Intent • Experiments on determining the effect of geo-type and population density

  14. Determining News Intent • The drawback of FPR • Fails to take into account the geographical relatedness of features • Modified FPR • FPR = Σi δi (αi + βi) * Nt • δi: the factor by which geo-location i is related to the primary search query
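
The modified FPR sums the per-location terms before scaling by the tweet rate. The sketch below is an illustration under stated assumptions: the function name, the tuple layout `(delta, alpha, beta)`, and all numeric values are hypothetical, and Nt is treated as a single shared rate because the slide gives it no index.

```python
from typing import Iterable, Tuple


def modified_fpr(locations: Iterable[Tuple[float, float, float]],
                 tweets_per_minute: float) -> float:
    """Compute FPR = sum_i delta_i * (alpha_i + beta_i) * Nt.

    Each entry of `locations` is (delta_i, alpha_i, beta_i):
    relatedness to the primary query, population density factor,
    and location type constant for geo-location i.
    """
    weighted = sum(delta * (alpha + beta) for delta, alpha, beta in locations)
    return weighted * tweets_per_minute


# Hypothetical example: the primary location (delta = 1.0) plus one
# related location that contributes at half strength (delta = 0.5).
related = [(1.0, 0.5, 0.25), (0.5, 0.25, 0.25)]
print(modified_fpr(related, tweets_per_minute=8.0))
```

With a single location and δ = 1, this reduces to the original FPR formula.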

  15. Outline • Introduction • Related Work • Twitter as News-wire • Determining News Intent • Assigning Weights to Tweets • Experiments and Results • Conclusion

  16. Assigning Weights to Tweets • Detecting Spam Messages • Spam messages carry little or no relevant information • Nature of spam messages • A formula tags, with a certain level of confidence, whether a message is spam or not • Np: the number of followers • Nq: the number of people the user is following • μ: an arbitrary constant • Nr: the ratio of the number of tweets containing a reply to the total number of tweets

  17. Assigning Weights to Tweets • On the basis of user location • An experiment was conducted to understand the relation between Twitter messages and the location of the user

  18. Assigning Weights to Tweets • Using Hyperlinks Mentioned in Tweets • 30-50% of general Twitter messages contain a hyperlink to an external website • For news-related Twitter messages, this percentage increases to 70-80% • We also make use of this pointer to assign weights to tweets

  19. Assigning Weights to Tweets • Semantic Similarity • Summarize the Twitter messages into a couple of keywords • The naïve approach picks k keywords, ignoring semantic similarity • The definition of semantic similarity • M: the total number of articles searched in the New York Times Corpus • f(x): the number of articles for term x • f(y): the number of articles for term y

  20. Assigning Weights to Tweets • Reassigns the weight of all keywords on the basis of the following formula • Wi* = Wi + Σj Sij * Wj • Wi*: the new weight of keyword i • Wi: the weight without semantic similarity • Sij: the semantic similarity derived from the semantic similarity formula • Wj: the initial weight of each of the other words being considered • Identifies k keywords that are semantically dissimilar but together contribute maximum weight • Spq < Sthreshold: the similarity between any two words p and q belonging to the set of k is less than a threshold • W1 + W2 + W3 + … + Wk is maximum over all groups satisfying the above condition
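
The reweighting and selection steps on this slide can be sketched as follows. This is a rough illustration rather than the authors' method: all names and numbers are hypothetical, similarities are supplied as a precomputed table (missing pairs are assumed to be 0), the selection is a brute-force search over all k-subsets, and the sketch assumes the sum being maximized uses the reassigned weights, which the slide does not state explicitly.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Optional, Tuple

Similarity = Dict[FrozenSet[str], float]


def pair_similarity(sim: Similarity, a: str, b: str) -> float:
    """Look up S_ab; pairs missing from the table are assumed to be 0."""
    return sim.get(frozenset((a, b)), 0.0)


def reweight(weights: Dict[str, float], sim: Similarity) -> Dict[str, float]:
    """Apply Wi* = Wi + sum_j Sij * Wj over the other keywords j != i."""
    return {
        i: w_i + sum(pair_similarity(sim, i, j) * w_j
                     for j, w_j in weights.items() if j != i)
        for i, w_i in weights.items()
    }


def select_keywords(weights: Dict[str, float], sim: Similarity,
                    k: int, threshold: float) -> Optional[Tuple[str, ...]]:
    """Return the k keywords with maximum total weight such that every
    pair has similarity below the threshold, or None if no group fits."""
    best, best_total = None, float("-inf")
    for group in combinations(weights, k):
        if any(pair_similarity(sim, p, q) >= threshold
               for p, q in combinations(group, 2)):
            continue  # group contains a pair that is too similar
        total = sum(weights[w] for w in group)
        if total > best_total:
            best, best_total = group, total
    return best


# Hypothetical keyword weights and one similar pair.
initial = {"earthquake": 4.0, "quake": 3.0, "relief": 2.0}
similar = {frozenset(("earthquake", "quake")): 0.5}

new_weights = reweight(initial, similar)
print(new_weights)
print(select_keywords(new_weights, similar, k=2, threshold=0.25))
```

In this toy example the near-synonyms boost each other's weight, but the threshold keeps them from both being selected. The exhaustive search is exponential in k, which is acceptable only for the small keyword sets assumed here.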

  21. Outline • Introduction • Related Work • Twitter as News-wire • Determining News Intent • Assigning Weights to Tweets • Experiments and Results • Conclusion

  22. Experiment and Results • Experiments to test the validity of the hypothesis • First: a naïve user is looking for the latest happenings in the context of the Fort Hood incident on 12th November 2009 • Second: a naïve user is looking for the latest happenings in the context of ‘Russia’ on 5th December 2009 • Third: a naïve user is looking for the latest happenings in the context of ‘Haiti’ on 18th January 2010

  23. Experiment and Results • Results

  24. Experiment and Results • Results: the contrast between the search results produced by the original query and those produced after adding keywords obtained by TWinner

  25. Outline • Introduction • Related Work • Twitter as News-wire • Determining News Intent • Assigning Weights to Tweets • Experiments and Results • Conclusion

  26. Conclusion • We present a system to predict a user’s news intent • Takes the location mentioned and the time of the query into consideration • Makes use of the social networking site Twitter to understand the relationship between geo-information and the news intent of the query • Future work • Understanding the content of social media messages • Sentiment conveyed by the messages • Enhancing the accuracy of the system

  27. Thank you!
