Ranking Tweets Considering Trust and Relevance

Ranking Tweets Considering Trust and Relevance • Srijith Ravikumar, Raju Balakrishnan, and Subbarao Kambhampati • Arizona State University 1

One of the most prominent micro-blogging services. • Twitter has over 140 million active users, generates over 340 million tweets daily, and handles over 1.6 billion search queries per day. • Users access tweets by following other users and by using the search function. 2

Twitter Search Results for the Query: “Britney Spears” • Sorted in reverse chronological order. • Selects the single most-retweeted tweet as the Top Tweet. • Does not apply any relevance metrics. • Contains spam and untrustworthy tweets. 3

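TweetRank • [Diagram: the user sends a query to TweetRank, which passes the query to Twitter; Twitter returns the top K results and TweetRank returns the top N results to the user.] • Acts as a mediator between the user and Twitter. • K is much higher than N, which lets TweetRank eliminate untrustworthy results. 4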
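The mediator pattern on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `fetch_top_k` and `score` are hypothetical callables standing in for the upstream Twitter search and for whatever trust/relevance score is used, and the default values of `k` and `n` are assumptions chosen only for the example.

```python
from typing import Callable, List


def mediate(query: str,
            fetch_top_k: Callable[[str, int], List[str]],
            score: Callable[[str, List[str]], float],
            k: int = 200,
            n: int = 10) -> List[str]:
    """Fetch k candidates upstream, re-rank them, and return the best n.

    fetch_top_k and score are hypothetical placeholders supplied by the caller.
    """
    candidates = fetch_top_k(query, k)
    # Score each candidate in the context of the whole candidate set,
    # then keep only the n highest-scoring tweets.
    ranked = sorted(candidates,
                    key=lambda tweet: score(tweet, candidates),
                    reverse=True)
    return ranked[:n]


# Toy stand-ins so the sketch runs without any network access.
def fake_search(query: str, k: int) -> List[str]:
    pool = ["spam spam spam", "britney spears tour", "britney spears tour dates"]
    return pool[:k]


def fake_score(tweet: str, candidates: List[str]) -> float:
    # Toy score: number of words the tweet shares with the query terms.
    return float(len(set(tweet.split()) & {"britney", "spears"}))


top = mediate("britney spears", fake_search, fake_score, k=3, n=2)
```

Because K is larger than N, the low-scoring candidates simply never reach the user.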

Need for Relevance and Trust • The spread of false facts on Twitter has become an everyday event. • Retweets and users can be bought. • Relying on them as signals of trustworthiness therefore does not work. 5

Getting Relevant & Trustworthy Results • Manual curation is out of the question (unless you are the Government of China :-) ). • How many people would it take to clean up a micro-blog with 140 million active users? • Automated analysis? • PageRank uses the explicit links between web pages to evaluate trust and relevance. But what are the links between tweets? 6

Links in Twitter Space • Retweet: explicit links between tweets. • Agreement: implicit links between tweets that contain the same fact. 7

Agreement • Agreement between two tweets is defined as the amount of similarity in their content. • Retweets are not counted as agreement because retweets are unverified endorsements. • How does agreement capture relevance and trust? • A tweet that is agreed upon by a large number of other tweets is likely to be popular, and popular tweets are more likely to be relevant. • Since agreement does not include retweets, the most agreed-upon tweet has the largest number of independent users agreeing on the same fact and is hence more trustworthy. 8

Agreement Computation • For efficient computation of agreement, we need to understand the meaning of each tweet. This needs natural language processing. • As a preliminary idea, we compute agreement using Soft TF-IDF with Jaro-Winkler similarity. • Soft TF-IDF is similar to TF-IDF, except that it also considers similar tokens in the two compared document vectors in addition to exactly matching terms. 9
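To make the Soft TF-IDF idea concrete, here is a simplified sketch under stated assumptions. The Jaro-Winkler function follows the standard definition (prefix scale 0.1, prefix capped at four characters). The TF-IDF weighting, the smoothed IDF, and the 0.9 token-similarity threshold are illustrative choices, not values taken from the slides, and the names `tfidf_vectors` and `soft_tfidf` are hypothetical.

```python
import math
from collections import Counter


def jaro(s1, s2):
    """Standard Jaro similarity between two strings, in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1, matched2 = [False] * len1, [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, p=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def tfidf_vectors(docs):
    """docs: list of token lists. Returns one L2-normalised weight dict per doc."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        # Assumed weighting: log-scaled TF times a smoothed IDF.
        raw = {t: math.log(1 + c) * (math.log((1 + n) / (1 + df[t])) + 1)
               for t, c in Counter(doc).items()}
        norm = math.sqrt(sum(w * w for w in raw.values())) or 1.0
        vectors.append({t: w / norm for t, w in raw.items()})
    return vectors


def soft_tfidf(vec_a, vec_b, threshold=0.9):
    """Sum weight products over tokens of vec_a whose closest token in vec_b
    has Jaro-Winkler similarity >= threshold (threshold is an assumption)."""
    score = 0.0
    for tok_a, w_a in vec_a.items():
        best_tok, best_sim = None, 0.0
        for tok_b in vec_b:
            sim = jaro_winkler(tok_a, tok_b)
            if sim > best_sim:
                best_tok, best_sim = tok_b, sim
        if best_tok is not None and best_sim >= threshold:
            score += w_a * vec_b[best_tok] * best_sim
    return score


docs = [["britney", "spears", "concert"],
        ["britny", "spears", "concert"],   # misspelling still counts as close
        ["stock", "market", "news"]]
vecs = tfidf_vectors(docs)
close = soft_tfidf(vecs[0], vecs[1])
far = soft_tfidf(vecs[0], vecs[2])
```

In this toy corpus the misspelled "britny" still contributes to the score, which plain TF-IDF would miss. Note that this sketch is not guaranteed to be symmetric in its two arguments.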

Computing Ranked Results • A simple voting technique is used to compute the ranked results. • The agreement score of a tweet is the sum of its agreement with all other tweets. • The tweets are sorted by agreement score and the top N results are sent to the user. • [Figure: example agreement graph over three tweets, with agreement weights on the edges and a total score per tweet.] 10
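The voting step described on this slide can be sketched as below. To keep the example self-contained, a simple Jaccard token overlap stands in for the Soft TF-IDF agreement measure; that substitution, the function names, and the sample tweets are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Stand-in agreement measure: token overlap between two tweets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def rank_by_agreement(tweets, similarity, top_n=10):
    """Score each tweet by its summed agreement with all other tweets,
    then return the top_n (tweet, score) pairs, highest score first."""
    scores = []
    for i, a in enumerate(tweets):
        total = sum(similarity(a, b) for j, b in enumerate(tweets) if j != i)
        scores.append(total)
    order = sorted(range(len(tweets)), key=lambda i: scores[i], reverse=True)
    return [(tweets[i], scores[i]) for i in order[:top_n]]


sample = [
    "britney spears new album out",
    "new britney spears album out now",
    "win free ipad click here",
]
ranked = rank_by_agreement(sample, jaccard, top_n=2)
```

The unrelated third tweet agrees with nothing else in the set, so it receives a score of zero and falls outside the top two.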

Results: Britney Spears 11

Evaluation - Relevance • Top N results were manually labelled as follows: 12

Evaluation - Trust • Top N results were manually labelled as follows: 13

Ranking Cost • The ranking time increases quadratically with the number of tweets. • Since the computation of agreement is pairwise, it can be easily parallelized using MapReduce. 14
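The map and reduce phases of such a pairwise computation might look like the sketch below. It runs sequentially in plain Python rather than on an actual MapReduce cluster, and it assumes a symmetric similarity (a Jaccard token overlap is used as a stand-in). All function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Stand-in symmetric similarity: token overlap between two tweets."""
    ta, tb = set(a.split()), set(b.split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def map_pair(pair):
    """Map phase: one independent task per unordered pair of tweets.
    Emits the pair's similarity once for each of its two tweets."""
    (i, a), (j, b) = pair
    sim = jaccard(a, b)
    return [(i, sim), (j, sim)]


def reduce_scores(emitted):
    """Reduce phase: sum the emitted similarities per tweet index."""
    totals = defaultdict(float)
    for records in emitted:
        for idx, sim in records:
            totals[idx] += sim
    return dict(totals)


def agreement_scores(tweets, mapper=map):
    """n tweets produce n * (n - 1) / 2 pairs, hence the quadratic cost.
    mapper can be swapped for a parallel map with the same call shape."""
    pairs = combinations(enumerate(tweets), 2)
    return reduce_scores(mapper(map_pair, pairs))


scores = agreement_scores(["a b c", "a b d", "x y z"])
pair_count = len(list(combinations(range(4), 2)))  # 4 tweets -> 6 pairs
```

Since each `map_pair` call touches only its own pair, the map phase needs no shared state, which is what makes the work easy to distribute.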

Twitter Eco-System • [Diagram: tweets, users, and web pages connected by "Tweeted URL", "Tweeted By", "Followers", and "Hyperlinks" links.] 15

Summary • Micro-blog spamming is becoming increasingly lucrative and problematic. • We are working on a ranking that is sensitive to the trustworthiness and relevance of micro-blogs. • We model the tweet space as a tri-layer graph containing a tweet layer, a user layer, and a web-page layer. • The ranking is derived from the users, the tweets, and the prestige of the referenced web pages. 16
