Production and Evaluation in Social Media


  1. Production and Evaluation in social media

  2. Social media: Knowledge Sharing • Online encyclopedia • Online question-answer forum • Crowdsourcing

  3. Questions • User contributed content • E.g. How does the coverage and content of Wikipedia grow? • What prompts the choice of the next articles to be written? • What strategies do users use in choosing which crowdsourced tasks to contribute to? • User evaluations • What are the mechanisms behind user evaluations? • Are there possibly simpler explanations of user-user or user-content evaluations? • Design questions • What are some principles on which such sites could be better designed? • Ex: How to reward user-user evaluation or user-content evaluation?

  4. Intro - Wikipedia • Free multilingual encyclopedia launched in 2001 • Operated by the non-profit Wikimedia Foundation • Contains 2,610,291 articles in English and 10 million in total • 236 active language editions • Content written by volunteers • No formal peer-review and changes take effect immediately • New articles are created by registered users but can be edited by anyone

  5. Intro - Wikipedia • [Chart: Wikipedia article count, January 2001 to September 2007 (source: Wikipedia)]

  6. The Collaborative Organization of Knowledge • Studies a dump of Wikipedia revisions spanning 6 years • Inflationary/Deflationary hypothesis • Does the number of links to nonexistent articles increase at a higher rate than new article creation? • Inflationary → may become unusable at some point • Deflationary → growth stops at some point • How do existing entries foster the development of new entries? • Growth models: • http://en.wikipedia.org/wiki/Wikipedia:Modelling_Wikipedia's_growth

  7. Wikipedia growth • *Incomplete articles include nonexistent articles and stubs • Wikipedia sits at a midpoint between the two scenarios (thin coverage vs. a decline in growth rate)

  8. References lead to Definitions • Most articles are written within the first month of being referenced • The mean number of references to a nonexistent article rises exponentially until the article is created • Once the article is created, references rise linearly or level off

  9. Other findings • Growth of Wikipedia is partly attributed to the splitting of articles (depth in articles translates into breadth) • Deeply collaborative • In only 3% of cases is the reference creator the same as the article writer • Growth is limited by the number of contributors, not by individual contributors! • Hypothesis: • Articles are more likely to be written because they are popular (have many references leading to them) than because a contributor is interested • What kind of coverage does this growth pattern lead to? • Vandalism and reverts • 4% of article revisions were reverts • Average time to revert a vandalized page is 13 hours • 11% of pages that were reverted at least once had been vandalized at least once

  10. Strategic users

  11. Taskcn.com • 1.7 million registered users • In less than 2 years, nearly 3,100 tasks were requested and 543,000 solutions proposed • A user offers a monetary award for a question or task, and other users provide solutions to compete for the award • The website plays the role of a third party, collecting the money from the requester and distributing the award to the winner(s), who are decided by the requester • The website takes a small portion of the award as a service fee • Socially stable: a core group of users repeatedly propose and win

  12. User strategies develop over time • Some strategies used by winners • Submitting later, although they cannot see other submissions • Choosing less popular tasks • Choosing tasks with higher winning odds • Participating in tasks of increasing expected reward

  13. Selecting less popular tasks

  14. Summary • Users (“winners”) seem to be learning strategic behavior • No significant change in winning rate for the “average” user • However, there is a core of winners who improve their chance of winning with more participation, i.e., their wins come in quickening succession

  15. User evaluations

  16. Evaluations • Evaluating items • Movie & product reviews • Evaluating other users • Epinions, Wikipedia voting • Evaluating items created by other users • Q&A websites, helpfulness votes on reviews

  17. Wikipedia voting • Admission to admin status in Wikipedia is through voting • What drives user-user evaluations? • Status: whether the voter V has higher/lower status than the target T • Level of recognition, merit, reputation in the community • Wikipedia: #edits, #barnstars • Similarity: whether the users are similar • Overlapping topical interests of V and T • Wikipedia: similarity of the articles edited (one simple overlap measure is sketched below)
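As one concrete (and simplified) way to operationalize edit-based similarity, the sketch below computes the Jaccard overlap between the sets of articles two users have edited; the measure, the article names, and the helper function are illustrative assumptions, not necessarily the exact metric used in the study.

```python
def edit_similarity(voter_articles, target_articles):
    """Jaccard overlap of the article sets two users have edited (illustrative proxy)."""
    if not voter_articles or not target_articles:
        return 0.0
    shared = len(voter_articles & target_articles)
    total = len(voter_articles | target_articles)
    return shared / total

# Hypothetical voter V and target T with partially overlapping topical interests.
v_edits = {"Graph theory", "PageRank", "Social network"}
t_edits = {"PageRank", "Social network", "Wikipedia:Barnstar"}
print(edit_similarity(v_edits, t_edits))   # 0.5
```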

  18. Difference in status is more important than target status • When target status is high, we get a flat line • Different lines correspond to different Δ (Δ = difference in status)

  19. Effect of similarity • Prior interaction/similarity boosts positive voting

  20. Status vs. similarity • Status is used as a “proxy” when users do not know about each other

  21. Summary • Evaluations are an important part of social media • Important aspects • Status & similarity • Similarity breeds positive feedback • Status controls who shows up to give feedback as well as the feedback itself • So much so that the result of a Wikipedia admin election can be predicted simply by who shows up! • How can we make user evaluations truthful?

  22. Eliciting feedback: Challenges • No reporting • “inconvenience” cost of contributing • Dishonesty • niceness, fear of retaliation • conflicting motivations

  23. Eliciting feedback: Challenges • No reporting • “inconvenience” cost of contributing • Dishonesty • niceness, fear of retaliation • conflicting motivations • Reward systems • motivate participation, honest feedback • monetary (prestige, privilege, pure competition)

  24. Overcoming Dishonesty • Need to distinguish “good” from “bad” reports • explicit reward systems require objective outcome, public knowledge • stock, weather forecasting • But what if … • subjective? (product quality/taste) • private? (breakdown frequency, seller reputability)

  25. Overcoming Dishonesty • Need to distinguish “good” from “bad” reports • explicit reward systems require objective outcome, public knowledge • stock, weather forecasting • But what if … • subjective? (product quality/taste) • private? (breakdown frequency, seller reputability) • Naive solution: reward peer agreement • Information cascade, herding

  26. Peer Prediction: basic idea • A report determines a probability distribution over other reports • Reward is based on the “predictive power” of a user’s report for a reference rater’s report • By taking advantage of proper scoring rules, honest reporting becomes a Nash equilibrium

  27. Information Flow - Model • [Diagram: a PRODUCT of type t sends a signal S to the rater; the rater sends an announcement a to the CENTER; the center pays a transfer τ(a)]

  28. Information Flow - Model • [Diagram, second build: the same information flow shown for additional raters, each receiving a signal S, announcing a, and receiving a transfer τ(a)]

  29. Information Flow - Example • [Diagram: PLUMBER, type ∈ {H, L}, signal ∈ {h (high), l (low)}; three raters observe signals (h, h, l), announce (h, h, l), and receive transfers τ(a) = ($1, $1, $0) under an “agreement” rule]

  30. Assumptions - Model • PRODUCT of type t ∈ {1, …, T} generates signals with distribution f(s | t) • common prior: distribution p(t) • common knowledge: distribution f(s | t) • linear utility • stochastic relevance • fixed type • finite T

  31. Stochastic Relevance • Informally • same product, so signals dependent • certain observation (realization) should change posterior on type p(t), and thus on signal distribution f(s | t) • Rolex v. Faux-lex • generically satisfied if different types yield different signal distributions

  32. Stochastic Relevance • Informally • same product, so signals are dependent • a certain observation (realization) should change the posterior on the type p(t), and thus on the signal distribution f(s | t) • Rolex v. Faux-lex • generically satisfied if different types yield different signal distributions • Formally • Si is stochastically relevant for Sj iff: • the distribution of Sj conditional on Si is different for different realizations of Si • i.e., there is some sj such that Pr[sj | si] ≠ Pr[sj | si′] for two realizations si ≠ si′

  33. Assumptions - Example • finite T: plumber is either H or L • fixed type: plumber quality does not change • common prior: p(t) • p(H) = p(L) = .5 • stochastic relevance: a good plumber’s signal distribution must differ from a bad plumber’s • common knowledge: f(s|t) • p(h | H) = .85, p(h | L) = .45 • note this, together with the prior, gives p(h) = .65 and p(l) = .35 (computed in the sketch below)
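As a minimal sketch assuming only the numbers on this slide (variable and function names are illustrative), the snippet below derives p(h) and p(l) from the prior, and checks stochastic relevance by comparing the distribution of a peer's signal after observing h versus l.

```python
# Minimal sketch of the plumber example's probabilities (names are illustrative).
p_type = {"H": 0.5, "L": 0.5}                 # common prior p(t)
p_h_given_type = {"H": 0.85, "L": 0.45}       # f(h | t); f(l | t) = 1 - f(h | t)

# Marginal signal probabilities implied by the prior: p(h) = 0.65, p(l) = 0.35
p_h = sum(p_type[t] * p_h_given_type[t] for t in p_type)
print(round(p_h, 2), round(1 - p_h, 2))       # 0.65 0.35

def posterior_type(signal):
    """p(t | observed signal) via Bayes' rule."""
    like = {t: p_h_given_type[t] if signal == "h" else 1 - p_h_given_type[t]
            for t in p_type}
    z = sum(p_type[t] * like[t] for t in p_type)
    return {t: p_type[t] * like[t] / z for t in p_type}

def p_peer_h(signal):
    """Pr[another rater's signal is h | my signal], through the type posterior."""
    post = posterior_type(signal)
    return sum(post[t] * p_h_given_type[t] for t in p_type)

# Stochastic relevance: my own signal changes what I expect the peer to observe.
print(round(p_peer_h("h"), 2), round(p_peer_h("l"), 2))   # 0.71 vs 0.54
```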

  34. Definitions - Example • [Diagram: PLUMBER, type ∈ {H, L}, signal ∈ {h, l}; signals (h, h, l), announcements (h, h, l), transfers τ(a) = ($1, $1, $0)] • 2 types, 2 signals, 3 raters • signals: S = (h, h, l) • announcements: a = (h, h, l) • transfers: τ(a) = (τ1(a), τ2(a), τ3(a)) • announcement strategy for player 2: a2 = (a2h, a2l) • total set of strategies: (h, h), (h, l), (l, h), (l, l)

  35. Best Responses - Example • [Diagram: Player 1 with S1 = h chooses to announce h or l; Player 2 with S2 = ? plays strategy a2; the center pays τ(a)] • Player 1 receives signal h • Player 2’s strategy is to report a2 • Player 1 reporting signal h is a best response if E[τ1(h, a2) | S1 = h] ≥ E[τ1(l, a2) | S1 = h] • Nash equilibrium if this holds for all users

  36. Peer Prediction • Find reward mechanism that induces honest reporting • where ai = Si for all i is a Nash equilibrium • Will need Proper Scoring Rules

  37. Proper Scoring Rules • Definition: • a scoring rule assigns to a probability vector S a score for each realization a: R(S | a) • Expected payoff is maximized if S equals the true probabilities of {a} • Ensures truth-telling • What if there’s no public signal?

  38. Applying Scoring Rules • Definition: • a scoring rule assigns to a probability vector S a score for each realization a: R(S | a) • Expected payoff is maximized if S equals the true probabilities of {a} • Ensures truth-telling • What if there’s no public signal? Use other peers • Now: predictive peers • Si = my signal, Sj = your signal, ai = my report • R(your report | my report) (a log-score sketch follows below)
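To make the "proper" property concrete, here is a minimal sketch assuming a binary outcome and the logarithmic rule (function names and the 0.54 example probability are illustrative): the expected log score is maximized exactly when the reported probability equals the true one.

```python
import math

def log_score(reported_p_h, outcome):
    """Logarithmic scoring rule: log of the probability assigned to the realized outcome."""
    return math.log(reported_p_h if outcome == "h" else 1.0 - reported_p_h)

def expected_score(reported_p_h, true_p_h):
    """Expected score when the outcome is really drawn with Pr[h] = true_p_h."""
    return (true_p_h * log_score(reported_p_h, "h")
            + (1.0 - true_p_h) * log_score(reported_p_h, "l"))

true_p_h = 0.54   # e.g. Pr[peer reports h | I saw l] in the plumber example
best_score, best_report = max((expected_score(q / 100, true_p_h), q / 100)
                              for q in range(1, 100))
print(best_report)  # 0.54 -- truth-telling maximizes the expected score
```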

  39. How it Works • For each rater i, we choose a different reference rater r(i) • Rater i is rewarded for predicting rater r(i)’s announcement • τ*i(ai, ar(i)) = R(ar(i) | ai) • based on updated beliefs about r(i)’s announcement given i’s announcement • Claim: for any strictly proper scoring rule R, a reward system with payments τ*i makes truthful reporting a strict Nash equilibrium

  40. Peer Prediction Example • Player 1 observes low and must decide a1 ∈ {h, l} • Using logarithmic scoring • τ1(a1, a2) = R(a2 | a1) = log[ p(a2 | a1) ] • What announcement maximizes the expected payoff? • Note that rewarding peer agreement would incentivize dishonesty (announcing h) • [Setup: PLUMBER with p(H) = p(L) = .5, p(h | H) = .85, p(h | L) = .45; S1 = l, S2 = ?; posteriors Pr[h | l] = 0.54, Pr[l | l] = 0.46]

  41. Peer Prediction Example • Player 1 observes low and must decide a1 ∈ {h, l} • Assume logarithmic scoring • τ1(a1, a2) = R(a2 | a1) = log[ p(a2 | a1) ] • [Setup: PLUMBER with p(H) = p(L) = .5, p(h | H) = .85, p(h | L) = .45; S1 = l, S2 = ?] • a1 = l (honest) yields expected transfer: -.69 • a1 = h (false) yields expected transfer: -.75 • (reproduced in the sketch below)
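A minimal sketch of the computation behind these numbers, assuming player 2 reports honestly (a2 = S2) and the center pays log p(a2 | a1); it gives about -0.69 for the honest report and about -0.76 for the false one (the slide rounds this to -.75), so honesty maximizes player 1's expected transfer. Function names are illustrative.

```python
import math

p_type = {"H": 0.5, "L": 0.5}
p_h_given_type = {"H": 0.85, "L": 0.45}

def p_peer_h(my_signal):
    """Pr[player 2's signal is h | player 1's signal], via the type posterior."""
    like = {t: p_h_given_type[t] if my_signal == "h" else 1 - p_h_given_type[t]
            for t in p_type}
    z = sum(p_type[t] * like[t] for t in p_type)
    return sum((p_type[t] * like[t] / z) * p_h_given_type[t] for t in p_type)

def expected_transfer(announcement, true_signal):
    """E[ log p(a2 | a1) ] over player 2's honest report, given player 1's true signal."""
    q_true = p_peer_h(true_signal)      # what player 2 reports, in expectation
    q_scored = p_peer_h(announcement)   # distribution the center scores player 1 against
    return q_true * math.log(q_scored) + (1 - q_true) * math.log(1 - q_scored)

print(round(expected_transfer("l", "l"), 2))  # honest: about -0.69
print(round(expected_transfer("h", "l"), 2))  # false:  about -0.76, strictly worse
```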

  42. Things to Note • Players don’t have to perform complicated Bayesian reasoning if they: • trust the center to accurately compute posteriors • believe other players will report honestly • Not a unique equilibrium • there could be other (mixed-strategy) equilibria • collusion

  43. Primary Practical Concerns • Examples • inducing effort: fixed cost c > 0 of reporting • better information: users seek multiple samples • budget balancing • Basic idea: • an affine rescaling (a·x + b) of the transfers to overcome each obstacle • preserves the honesty incentive (see the sketch below) • follow-up work tries to balance the budget as much as possible
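A minimal sketch of the affine-rescaling idea, reusing the expected transfers from the example above; alpha, beta, and the reporting cost are illustrative numbers, not from the original work. A positive affine transform cannot reorder the announcements, so the honesty incentive is preserved, and the constant term can be sized to cover a fixed reporting cost c.

```python
tau_honest, tau_false = -0.69, -0.76    # expected transfers before rescaling (from the example)
alpha, beta, cost = 2.0, 1.5, 0.1       # rescaled transfer = alpha * tau + beta, with alpha > 0

rescaled_honest = alpha * tau_honest + beta
rescaled_false = alpha * tau_false + beta

# Positive affine rescaling never reorders announcements, so honesty stays a strict best response.
assert (rescaled_honest > rescaled_false) == (tau_honest > tau_false)

# beta can be chosen so the honest expected payoff covers the fixed cost c of reporting.
print(round(rescaled_honest - cost, 2) > 0)   # True for these illustrative numbers
```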

  44. Limitations • Collusion • could a subset of raters gain higher transfers? higher balanced transfers? • can such strategies: • overcome random pairings • avoid suspicious patterns • Understanding/trust in the system • complicated Bayesian reasoning, payoff rules • rely on experts to ensure public confidence

  45. Discussion • Is the common priors assumption reasonable? • How might we relax it and keep some positive results? • What are the most serious challenges to implementation? • Can you envision a(n online) system that rewards feedback? • How would the dynamics differ from a reward-less system? • How would we extend such formalization to more general feedback systems? • creating + feedback • Can we incorporate observed user behaviors?

  46. Thanks

  47. Common Prior Assumption • Practical concern: how do we know p(t)? • Theoretical concern: are p(t), f(s|t) public? • raters trust the center to compute appropriate posterior distributions for the reference rater’s signal • a rater with private information has no guarantee that the center’s posteriors reflect his true posterior beliefs • such a rater might skew his report to induce the appropriate posteriors • remedy: report both the private information and the announcement • two scoring mechanisms, one for the distribution implied by private priors, another for the distribution implied by the announcement

  48. Best Responses - Model • Each player decides an announcement strategy ai • ai is a best response to the other strategies a−i if, for every signal realization sm: E[τi(ai(sm), a−i(S−i)) | Si = sm] ≥ E[τi(a′, a−i(S−i)) | Si = sm] for every alternative announcement a′ • i.e., a best-response strategy maximizes the rater’s expected transfer with respect to the other raters’ signals, conditional on Si = sm • Nash equilibrium if the condition holds for all i

  49. Definitions - Model • T types, M signals, N raters • signals: S = (S1, …, SN), where Si ∈ {s1, …, sM} • announcements: a = (a1, …, aN), where ai ∈ {s1, …, sM} • transfers: τ(a) = (τ1(a), …, τN(a)) • announcement strategy for player i: ai = (ai1, …, aiM)
