
Ranking Tweets Considering Trust and Relevance

Srijith Ravikumar, Raju Balakrishnan, and Subbarao Kambhampati, Arizona State University.


Presentation Transcript


  1. Ranking Tweets Considering Trust and Relevance • Srijith Ravikumar, Raju Balakrishnan, and Subbarao Kambhampati • Arizona State University

  2. One of the most prominent micro-blogging services • Twitter has over 140 million active users, generates over 340 million tweets daily, and handles over 1.6 billion search queries per day. • Users access tweets by following other users and by using the search function.

  3. Twitter Search Results for the Query: “Britney Spears” • Results are sorted in reverse chronological order. • The single most retweeted tweet is selected as the top tweet. • No relevance metric is applied. • Results contain spam and untrustworthy tweets.

  4. TweetRank • Acts as a mediator between the user and Twitter: the user's query is passed on to Twitter, TweetRank receives the top K results, and the user receives the top N results. • K is much higher than N, which lets TweetRank eliminate untrustworthy results.
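The mediator flow on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: `rerank_top_n`, `fake_search`, `overlap_votes`, and `CORPUS` are hypothetical names, the search backend is an in-memory stand-in rather than the Twitter API, and the scorer is a placeholder token-overlap vote.

```python
from typing import Callable, List

# Hypothetical in-memory corpus standing in for a real search backend.
CORPUS = [
    "britney spears announces new tour dates",
    "new tour dates announced by britney spears",
    "britney spears tour dates are out",
    "win a free phone britney spears click here",
]


def fake_search(query: str, k: int) -> List[str]:
    """Stand-in search: return up to k tweets containing the query string."""
    return [t for t in CORPUS if query.lower() in t.lower()][:k]


def overlap_votes(tweets: List[str]) -> List[float]:
    """Placeholder scorer: shared-token count with every other candidate."""
    token_sets = [set(t.lower().split()) for t in tweets]
    return [
        float(sum(len(a & b) for j, b in enumerate(token_sets) if j != i))
        for i, a in enumerate(token_sets)
    ]


def rerank_top_n(
    query: str,
    search: Callable[[str, int], List[str]],
    score_all: Callable[[List[str]], List[float]],
    k: int = 200,
    n: int = 10,
) -> List[str]:
    """Fetch K candidates, score them, and return the N best."""
    if n > k:
        raise ValueError("n should not exceed k")
    candidates = search(query, k)
    scores = score_all(candidates)  # one score per candidate
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [candidates[i] for i in order[:n]]
```

Calling `rerank_top_n("britney spears", fake_search, overlap_votes, k=4, n=2)` fetches four candidates and keeps two. Because K is larger than N, low-scoring candidates such as the "click here" tweet can be dropped before anything reaches the user.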

  5. Need for Relevance and Trust • The spread of false facts on Twitter has become an everyday event. • Retweets and users can be bought, so relying on them as signals of trustworthiness does not work.

  6. Getting Relevant & Trustworthy Results • Manual curation is out of the question (unless you are the Government of China :-) ). • How many people would it take to clean up a micro-blog with 140 million active users? • Automated analysis? • PageRank uses the explicit links between web pages to evaluate trust and relevance. But what are the links between tweets?

  7. Links in Twitter Space • Retweet: explicit links between tweets. • Agreement: implicit links between tweets that contain the same fact.

  8. Agreement • Agreement between two tweets is defined as the amount of similarity in their content. • Retweets are not counted as agreement because retweets are unverified endorsements. • How does agreement capture relevance and trust? • A tweet that is agreed upon by a large number of other tweets is likely to be popular, and popular tweets are more likely to be relevant. • Since agreement does not include retweets, the most agreed-upon tweet has the largest number of independent users asserting the same fact, and is therefore more likely to be trustworthy.

  9. Agreement Computation • Computing agreement properly requires understanding the meaning of each tweet, which needs natural language processing. • As a preliminary approach, we compute agreement using Soft TF-IDF with Jaro-Winkler similarity. • Soft TF-IDF is similar to TF-IDF, except that in addition to exactly matching terms it also counts similar tokens in the two compared document vectors.
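A minimal sketch of Soft TF-IDF with Jaro-Winkler token matching, under stated assumptions. The slides give no formula, so this follows one common reading of the measure: each token in the first tweet is matched to its closest token in the second, and the pair contributes only if the similarity reaches a threshold. The threshold `theta=0.9`, the smoothed IDF weighting, whitespace tokenization, and all function names are assumptions for illustration.

```python
import math
from collections import Counter


def jaro(s, t):
    """Jaro similarity between two strings, in [0, 1]."""
    if s == t:
        return 1.0
    ls, lt = len(s), len(t)
    if ls == 0 or lt == 0:
        return 0.0
    window = max(max(ls, lt) // 2 - 1, 0)
    s_hit, t_hit = [False] * ls, [False] * lt
    matches = 0
    for i, ch in enumerate(s):
        for j in range(max(0, i - window), min(lt, i + window + 1)):
            if not t_hit[j] and t[j] == ch:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    s_seq = [s[i] for i in range(ls) if s_hit[i]]
    t_seq = [t[j] for j in range(lt) if t_hit[j]]
    transpositions = sum(a != b for a, b in zip(s_seq, t_seq)) // 2
    return (matches / ls + matches / lt
            + (matches - transpositions) / matches) / 3.0


def jaro_winkler(s, t, p=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s, t)
    prefix = 0
    for a, b in zip(s[:4], t[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * p * (1.0 - base)


def tfidf_vectors(docs):
    """L2-normalised TF-IDF weights per tokenised document (smoothed IDF, assumed)."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        raw = {tok: math.log(1 + c) * math.log(1 + n / df[tok])
               for tok, c in Counter(doc).items()}
        norm = math.sqrt(sum(w * w for w in raw.values())) or 1.0
        vectors.append({tok: w / norm for tok, w in raw.items()})
    return vectors


def soft_tfidf(vec_s, vec_t, theta=0.9):
    """Sum weight products over tokens of S whose closest token in T is >= theta."""
    score = 0.0
    for w, weight_s in vec_s.items():
        best_v, best_sim = None, 0.0
        for v in vec_t:
            sim = jaro_winkler(w, v)
            if sim > best_sim:
                best_v, best_sim = v, sim
        if best_v is not None and best_sim >= theta:
            score += weight_s * vec_t[best_v] * best_sim
    return score
```

Note that this formulation is not symmetric in general, since the closest-token match is taken from the first vector's point of view. A symmetric agreement score could average the two directions; that choice is not specified in the slides.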

  10. Computing Ranked Results • A simple voting technique is used to compute the ranked results. • The agreement score of a tweet is the sum of its agreement with all other tweets. • The tweets are sorted by agreement score and the top N results are sent to the user. • (Figure: example agreement graph over tweets 1, 2, and 3; values shown: 1.3, 1.0, 0.6, 0.7, 0.4, 0.0.)
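The voting step can be sketched as below. This is an illustration only: `jaccard` is a simple token-overlap stand-in for the agreement measure (not the Soft TF-IDF measure named in the slides), the measure is assumed to be symmetric, and the input is assumed to have retweets removed already. Function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def jaccard(a: str, b: str) -> float:
    """Stand-in agreement measure: token-set overlap in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def rank_by_agreement(
    tweets: Sequence[str],
    agreement: Callable[[str, str], float] = jaccard,
    top_n: int = 10,
) -> List[Tuple[str, float]]:
    """Score each tweet by its summed agreement with all others; return the top N."""
    totals = [0.0] * len(tweets)
    for i in range(len(tweets)):
        for j in range(i + 1, len(tweets)):
            a = agreement(tweets[i], tweets[j])  # symmetric measure assumed
            totals[i] += a
            totals[j] += a
    ranked = sorted(zip(tweets, totals), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]
```

A tweet that shares no content with the rest of the result set collects no votes and falls to the bottom of the ranking, which is the behaviour the slide relies on to push isolated spam out of the top N.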

  11. Results: Britney Spears

  12. Evaluation - Relevance • Top N results were manually labelled as follows:

  13. Evaluation - Trust • Top N results were manually labelled as follows:

  14. Ranking Cost • The time increases quadratically with the number of tweets. • Since agreement is computed pairwise, it can be easily parallelized using MapReduce.
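To make the cost and the parallelization argument concrete, here is a single-process sketch of the map and reduce steps. It uses no MapReduce framework API; `map_pair` and `reduce_scores` are hypothetical names, and `token_overlap` is a stand-in agreement measure. The point is that each pair is scored independently, so the map calls could be spread across workers and only the per-tweet sums need to be combined.

```python
from collections import defaultdict
from itertools import combinations


def token_overlap(a, b):
    """Stand-in agreement measure: Jaccard overlap of token sets."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def pair_count(n):
    """Number of unordered tweet pairs, which grows quadratically in n."""
    return n * (n - 1) // 2


def map_pair(pair):
    """Map step: score one pair and emit the score under both tweet indices."""
    (i, a), (j, b) = pair
    score = token_overlap(a, b)
    return [(i, score), (j, score)]


def reduce_scores(emitted):
    """Reduce step: sum the emitted scores per tweet index."""
    totals = defaultdict(float)
    for key, value in emitted:
        totals[key] += value
    return dict(totals)


def agreement_totals(tweets):
    """Run map over all pairs, then reduce; no state is shared between map calls."""
    pairs = combinations(enumerate(tweets), 2)
    emitted = (kv for pair in pairs for kv in map_pair(pair))
    return reduce_scores(emitted)
```

For example, `pair_count(1000)` is 499500, so a result set of 1,000 tweets already needs roughly half a million pairwise comparisons.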

  15. Twitter Eco-System • (Diagram showing the link types: Tweeted URL, Tweeted By, Followers, Hyperlinks.)

  16. Summary • Micro-blog spamming is becoming increasingly lucrative and problematic. • We are working on a ranking that is sensitive to the trustworthiness and relevance of micro-blogs. • We model the tweet space as a tri-layer graph containing a tweet layer, a user layer, and a web-page layer. • Ranking is derived from users, tweets, and the prestige of the referenced web pages.
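One possible way to hold the tri-layer graph in memory is sketched below. The edge types come from the slides (agreement and retweet links between tweets; Tweeted By, Tweeted URL, Followers, Hyperlinks), but the class name, field names, and methods are hypothetical. The slides do not give the ranking formula, so none is implemented here.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import DefaultDict, Dict, Iterable, Set, Tuple


def _set_map() -> DefaultDict[str, Set[str]]:
    return defaultdict(set)


@dataclass
class TriLayerGraph:
    # Tweet layer: implicit agreement edges (weighted) and explicit retweet edges.
    agreement: Dict[Tuple[str, str], float] = field(default_factory=dict)
    retweet_of: Dict[str, str] = field(default_factory=dict)
    # Cross-layer edges: tweet -> author, tweet -> URLs it mentions.
    tweeted_by: Dict[str, str] = field(default_factory=dict)
    tweeted_url: DefaultDict[str, Set[str]] = field(default_factory=_set_map)
    # User layer: user -> set of that user's followers.
    followers: DefaultDict[str, Set[str]] = field(default_factory=_set_map)
    # Web-page layer: page -> pages it links to.
    hyperlinks: DefaultDict[str, Set[str]] = field(default_factory=_set_map)

    def add_tweet(self, tweet_id: str, user_id: str, urls: Iterable[str] = ()) -> None:
        """Register a tweet with its author and any URLs it contains."""
        self.tweeted_by[tweet_id] = user_id
        self.tweeted_url[tweet_id].update(urls)

    def add_agreement(self, t1: str, t2: str, weight: float) -> None:
        """Store an undirected agreement edge under a canonical key order."""
        self.agreement[tuple(sorted((t1, t2)))] = weight

    def follow(self, follower: str, followee: str) -> None:
        self.followers[followee].add(follower)

    def link(self, src_page: str, dst_page: str) -> None:
        self.hyperlinks[src_page].add(dst_page)
```

Keeping each edge type in its own mapping makes it straightforward to compute a score per layer and combine them later, whichever combination rule is eventually chosen.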
