
Challenges in Mining Social Media: Sparsity and Quality

Challenges in Mining Social Media: Sparsity and Quality. Dagstuhl Seminar 11171, Challenges in Document Mining. Thomas Gottron, gottron@uni-koblenz.de.


Presentation Transcript


  1. Challenges in Mining Social Media: Sparsity and Quality • Dagstuhl Seminar 11171, Challenges in Document Mining • Thomas Gottron, gottron@uni-koblenz.de

  2. Social Media • Definition from Wikipedia: Social media are media for social interaction, using highly accessible and scalable communication techniques. Social media is the use of web-based and mobile technologies to turn communication into interactive dialogue. … Social authority is developed when an individual or organization establishes themselves as an "expert" in their given field or area, thereby becoming an influencer in that field or area. • Prominent examples: • Media sharing, collaboration, reviews, communication • Microblogging • Twitter & Co. (also Facebook)

  3. Microblogging – Twitter • Tweet: @janedoe My dear @johndoe had troubles to wake up this #morning • Retweet (to followers): RT @janedoe: My dear @johndoe had troubles to wake up this #morning

  4. Microblogging – Sparsity • Twitter: 140 characters, few terms • 85% of all tweets do not contain any term more than once
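
The sparsity figure on this slide is a statement about term frequencies inside single tweets. A minimal sketch of how such a share could be measured is given below. The tokenizer (a simple regular expression), the function names, and the four sample tweets are assumptions for illustration only, not the procedure or data behind the 85% figure.

```python
import re
from collections import Counter


def has_repeated_term(text):
    """Return True if any lowercased token occurs more than once in the text."""
    # Keep a leading # or @ so hashtags and mentions count as distinct terms.
    tokens = re.findall(r"[#@]?\w+", text.lower())
    return any(count > 1 for count in Counter(tokens).values())


def share_without_repeats(tweets):
    """Fraction of tweets in which every term occurs exactly once."""
    if not tweets:
        return 0.0
    unique_only = sum(1 for t in tweets if not has_repeated_term(t))
    return unique_only / len(tweets)


# Made-up sample, only to show the call.
sample = [
    "My dear @johndoe had troubles to wake up this #morning",
    "beer beer and more beer",
    "im on the phone rite now",
    "Which is the best Online RSS Reader?",
]
print(share_without_repeats(sample))  # 0.75 for this sample
```

When almost every term occurs once, within-document term frequency carries little signal, which is the practical meaning of sparsity here.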

  5. Microblogging – Quality • Facets/aspects of quality: • Purpose (interaction, news propagation, etc.) • Presentation (humor, irony, etc.) • Language (writing style) • Interestingness • Example tweets: • Question: Which is the best Online RSS Reader? I need some recommendations, cheers everyone :) • My kitten is pretending to be a laptop • im on the phone rite now • Interesting timeline of major events in the history of information retrieval http://tinyurl.com/ya7rcqt

  6. Measuring Quality? • (Social) Network measures • PageRank • Clustering coefficients • Centrality measures • Quality of people, not messages!
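
To make the "people, not messages" point concrete, here is a minimal sketch of PageRank computed by power iteration on a follower graph. The graph, the account names, the damping factor of 0.85, and the iteration count are assumptions chosen for illustration. The score is attached to an account, so every message from that account would inherit the same value regardless of its content.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {node: [nodes it links to]}."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the uniform "teleport" share first.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical follower graph: an edge u -> v means "u follows v".
follows = {
    "alice": ["janedoe"],
    "bob": ["janedoe"],
    "johndoe": ["janedoe", "alice"],
    "janedoe": ["johndoe"],
}
scores = pagerank(follows)
for user in sorted(scores, key=scores.get, reverse=True):
    print(user, round(scores[user], 3))
```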

  7. Retweets • Sign of quality • Interesting for wider audience • Depends on • Content • Social network • Number of followers • Activity of followers • Content-based retweet prediction • Odds of retweet as sign of quality
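
The slide does not spell out a formula for the odds. The standard definition of odds is written out below as a reading aid, with t (a tweet) and P(RT | t) (the probability that t is retweeted) as assumed notation.

```latex
\[
\mathrm{odds}(\mathrm{RT} \mid t) \;=\; \frac{P(\mathrm{RT} \mid t)}{1 - P(\mathrm{RT} \mid t)}
\]
```

For example, a tweet with a retweet probability of 0.8 has odds of 0.8 / 0.2 = 4, and odds above 1 mean a retweet is more likely than not.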

  8. Retweets – Features

  9. Retweets – Prediction Model • Logistic regression • Model parameters learned on training data
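
A minimal sketch of this kind of model follows: logistic regression fitted by batch gradient descent, with the retweet odds read off the fitted weights. The three binary features (has_url, has_hashtag, is_reply), the toy training data, the learning rate, and the epoch count are all hypothetical. The actual feature set and training procedure from the talk are not in this transcript.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear_score(w, x):
    """Bias w[0] plus the dot product of the remaining weights with x."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(linear_score(w, xi)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def retweet_odds(w, x):
    """Odds p / (1 - p); under logistic regression this equals exp(score)."""
    return math.exp(linear_score(w, x))


# Hypothetical features per tweet: [has_url, has_hashtag, is_reply].
X = [[1, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
     [0, 0, 0], [1, 0, 1], [0, 1, 1], [1, 1, 0]]
y = [1, 1, 1, 0, 0, 0, 0, 1]  # 1 = was retweeted (made-up labels)

w = train_logistic_regression(X, y)
print(retweet_odds(w, [1, 1, 0]), retweet_odds(w, [0, 0, 1]))
```

Because the log-odds are linear in the features, each learned weight can be read as that feature's contribution to the log-odds of a retweet.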

  10. Feature Weights

  11. Feature Weights – Topics

  12. Application: Tweet retrieval • Query: "beer"

  13. Application: Tweet retrieval • Rerank top-100 according to retweet-odds
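
The reranking step can be sketched as below, assuming the retrieval system returns a ranked list and that some function gives the retweet odds for each tweet. The function name, the tweet identifiers, and the odds values are hypothetical.

```python
def rerank_by_retweet_odds(ranked_tweets, odds_fn, k=100):
    """Reorder the first k results by descending odds; leave the tail as is."""
    # sorted() is stable, so tweets with equal odds keep their retrieval order.
    head = sorted(ranked_tweets[:k], key=odds_fn, reverse=True)
    return head + ranked_tweets[k:]


# Toy example with k=3 instead of 100 so the effect is visible.
results = ["t1", "t2", "t3", "t4"]
toy_odds = {"t1": 0.2, "t2": 1.5, "t3": 0.2, "t4": 9.0}
print(rerank_by_retweet_odds(results, toy_odds.get, k=3))
# ['t2', 't1', 't3', 't4']
```

Only the top k results are reordered, so the initial retrieval still decides which tweets are candidates and the odds only decide their order.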

  14. Application: Tweet retrieval – Evaluation

  15. Summary & Outlook • Microblogging • Data sparsity in short messages • Quality is an issue • Interestingness: (one) sign of quality • Use Retweet odds for better ranking • Notion of content quality • Influence / Potential influence of users.

  16. Thank you! Contact: WeST – Institute for Web Science and Technologies Universität Koblenz-Landau gottron@uni-koblenz.de
