Reputation Systems for Open Collaboration

Reputation Systems for Open Collaboration, CACM 2010. Bo Adler, Luca de Alfaro, Ashutosh Kulshreshtha, Ian Pye. Reviewed by: Minghao Yan

Reputation Systems Introduction • Open Collaboration: • Egalitarian, meritocratic, self-organizing • Efficient, but with challenges • quality: spam, vandalism • trust: how much can you rely on the content? • Reputation Systems: • compute reputation scores for objects within a domain, based on the objects’ own content or on external ratings • help stem abuse • offer indications of content quality • regulate people’s interactions in open collaboration • Relevance to our course content • recommendation systems • PageRank and HITS are “page” reputation systems

Reputation Systems Content-driven vs. User-driven

Reputation Systems WikiTrust • a reputation system for wiki authors and content • goals: • incentivize users to make lasting contributions • help increase the quality of content and spot vandalism • offer a guide to the quality of content • consists of: • user reputation system • gain reputation: when a user’s edits are preserved in later revisions • lose reputation: when a user’s edits are undone by other users in later revisions • content reputation system • gain reputation: when revised by a high-reputation user • lose reputation: when disturbed by edits

Reputation Systems User Reputation System • assumptions: • content evolves as a sequence of revisions made by different authors • it is possible to compare two revisions and measure the difference between them • it is possible to track unchanged content across revisions • user reputation: • based on the quality and quantity of the contributions users make • contribution quality: • good quality: the change is preserved in subsequent revisions • bad quality: the change is rolled back in subsequent revisions • how do we measure how good a contribution is?

Reputation Systems Contribution Quality • relies on an edit distance function d: • d(r, r’) = how many words have been deleted, inserted, replaced, or displaced going from r to r’ • language independent • revisions: • a: a past revision • b: the current revision • c: a future revision • quality q(b | a, c) of revision b: • -1 <= q(b | a, c) <= 1 • q(b | a, c) = 1: revision b fully preserved • q(b | a, c) = -1: revision b fully reverted • limitation: unable to judge newly created revisions, since no future revision c exists yet
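The slide gives only the range and the endpoints of q. A minimal sketch follows, assuming the definition used in the WikiTrust paper, q(b | a, c) = (d(a, c) - d(b, c)) / d(a, b). The function names are illustrative, and the distance here is a simplified word-level diff built on Python's difflib: displaced text is counted as a deletion plus an insertion, which is cruder than the edit distance the paper describes.

```python
from difflib import SequenceMatcher


def edit_distance(r1: str, r2: str) -> int:
    """Simplified word-level distance between two revisions.

    Counts words inserted, deleted, or replaced. Assumption: moved text
    is not tracked separately, so it shows up as delete + insert.
    """
    words1, words2 = r1.split(), r2.split()
    matcher = SequenceMatcher(None, words1, words2, autojunk=False)
    dist = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            dist += max(i2 - i1, j2 - j1)
    return dist


def quality(a: str, b: str, c: str) -> float:
    """q(b | a, c): how much of the change a -> b survives in c."""
    d_ab = edit_distance(a, b)
    if d_ab == 0:
        return 0.0  # b changed nothing, so there is nothing to judge
    q = (edit_distance(a, c) - edit_distance(b, c)) / d_ab
    # Clamp, because this simplified distance need not obey the
    # triangle inequality that keeps q inside [-1, 1].
    return max(-1.0, min(1.0, q))


past = "the cat sat"
current = "the cat sat on the mat"
print(quality(past, current, current))  # future keeps the edit
print(quality(past, current, past))     # future reverts the edit
```

In the first call the future revision equals b, so the edit is fully preserved; in the second the future revision equals a, so the edit is fully reverted.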

Reputation Systems User Reputation • only consider non-negative reputation values • a new user is assigned a reputation close to 0 • revisions used when judging a revision: • 5 subsequent, 5 preceding, 2 previous by high-reputation authors, and 2 previous with high average text reputation • why? – to make the system difficult to subvert • calculating user reputation: • r(B) = k * d(a,b) * q(b | a,c) * log(r(C)) • r(B) is the reputation increment of author B of revision b • r(C) is the reputation of author C of revision c • why use a logarithm? – it balances the influence that users of different reputation have on the increment
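The increment formula can be sketched as below. Two points are assumptions rather than something the slide states: the logarithm is taken as log(1 + r(C)) so that it stays defined and non-negative for a judge whose reputation is at or near 0, and the result is floored at 0 to respect the non-negative reputation rule. The names and the value of k are hypothetical.

```python
import math


def reputation_increment(k: float, d_ab: float, q: float, rep_judge: float) -> float:
    """Increment for author B, judged by author C of a later revision.

    Assumption: log(1 + r(C)) is used instead of log(r(C)) so that a
    judge with reputation 0 contributes exactly nothing.
    """
    return k * d_ab * q * math.log(1.0 + rep_judge)


def apply_increment(rep_author: float, increment: float) -> float:
    """Reputation never drops below zero."""
    return max(0.0, rep_author + increment)


# A 10-word edit that is fully preserved (q = 1), judged by a reputable user.
gain = reputation_increment(k=0.1, d_ab=10, q=1.0, rep_judge=100.0)
# The same edit fully reverted (q = -1) by the same judge.
loss = reputation_increment(k=0.1, d_ab=10, q=-1.0, rep_judge=100.0)
print(apply_increment(2.0, gain), apply_increment(2.0, loss))
```

Because the judge's weight grows only logarithmically, a judge with reputation 100 counts for a few times more than a judge with reputation 1, not a hundred times more.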

Reputation Systems User Reputation • resistant to manipulation • the only way to damage a user’s reputation is to revert their revisions • maintains fairness, resistant to Sybil attacks • increase the reputation of B only if C has higher reputation • Sybil attack – creating fake identities to gain reputation • evaluation • ability to use user reputation to predict the quality of future contributions • recall is high: high-reputation users are unlikely to be reverted • precision is low: many novice authors make good contributions
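To make the evaluation terms concrete, here is a small sketch of how precision and recall could be computed when low reputation is used to predict reverts. The data is a made-up toy example, not taken from the paper, and the threshold and function name are hypothetical.

```python
def precision_recall(edits, threshold):
    """edits: list of (author_reputation, was_reverted) pairs.

    An edit is flagged as a predicted revert when its author's
    reputation is below the threshold.
    """
    flagged = [reverted for rep, reverted in edits if rep < threshold]
    reverted_total = sum(1 for _, reverted in edits if reverted)
    true_positives = sum(1 for reverted in flagged if reverted)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / reverted_total if reverted_total else 0.0
    return precision, recall


# Toy data: four novice edits (one reverted), two expert edits (none reverted).
toy_edits = [(0.1, True), (0.2, False), (0.3, False), (0.4, False),
             (5.0, False), (8.0, False)]
print(precision_recall(toy_edits, threshold=1.0))
```

In this toy set every reverted edit comes from a low-reputation author, so recall is perfect, while most flagged novice edits were in fact good, so precision is low. That is the same pattern the slide describes.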

Reputation Systems Content Reputation • informative, robust, explainable • how? – according to the extent to which the content has been revised, and the reputation of the author of the revision • edited part – assigned a small fraction of the author’s reputation • unchanged part – gains reputation • tweaks • deleting or re-arranging text leaves a low-reputation mark • raise reputation only up to the author’s own reputation • associate each word with the last few editing authors who raised the text’s reputation • block moves • adopting edit distance weight
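A minimal per-word sketch of the first two rules and the "only up to the author's own reputation" tweak is shown below. It is an illustration under stated assumptions, not WikiTrust's actual update: the parameters new_frac and gain are hypothetical, and the other tweaks (deletion marks, per-word author tracking, block moves) are left out.

```python
def update_text_trust(prev_trust, author_rep, new_frac=0.2, gain=0.5):
    """Return per-word trust after one revision.

    prev_trust: one entry per word of the new revision; None marks a
    word the author inserted or changed, a float is the old trust of a
    word the author left untouched.
    """
    updated = []
    for trust in prev_trust:
        if trust is None:
            # Edited text starts with a small fraction of the author's reputation.
            updated.append(new_frac * author_rep)
        elif trust < author_rep:
            # Untouched text moves toward, but never above, the author's reputation.
            updated.append(trust + gain * (author_rep - trust))
        else:
            # Text already trusted more than the author is left as is.
            updated.append(trust)
    return updated


# One new word, one low-trust word, one word trusted above the author.
print(update_text_trust([None, 2.0, 9.0], author_rep=8.0))
```

The cap matters for robustness: a low-reputation author cannot raise the trust of text simply by saving many revisions that leave it unchanged.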

Reputation Systems Crowdsensus • a reputation system to analyze user edits to Google Maps • goals • measure accuracy of users contributing information • reconstruct possible correct listing information • design space • relies on the existence of ground truth • user reputation is not visible • identity notion is stronger • global computation is possible

Reputation Systems Crowdsensus • input • triple (u, a, v) – user u asserts that attribute a has value v • structure – fixpoint graph algorithm • vertices are users and attributes • for each (u, a, v), insert an edge labeled v from u to a and back • each user vertex is associated with a truthfulness value q_u • iterations • all q_u are initialized to an a-priori default • each user vertex sends (q_u, v) pairs to the attribute vertices • an attribute inference algorithm derives the probability distribution over (v1, v2, ..., vn) • each attribute vertex sends back to the user vertex the probability that its vi is correct • a truthfulness inference algorithm estimates the truthfulness of users • go for another iteration
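The iteration can be sketched as follows. The two inference steps are stand-ins chosen for brevity and are assumptions of this sketch: attribute inference is a truthfulness-weighted vote, and truthfulness inference is the mean probability that a user's asserted values are correct. Neither is the algorithm Crowdsensus actually uses, and the function name, default prior, and iteration count are hypothetical.

```python
from collections import defaultdict


def crowdsensus_sketch(triples, prior=0.7, iterations=10):
    """triples: list of (user, attribute, value) assertions."""
    users = {u for u, _, _ in triples}
    q = {u: prior for u in users}  # a-priori truthfulness for every user
    belief = {}
    for _ in range(iterations):
        # Users -> attributes: each assertion votes with weight q_u.
        weight = defaultdict(lambda: defaultdict(float))
        for u, a, v in triples:
            weight[a][v] += q[u]
        # Stand-in attribute inference: normalize votes into a distribution.
        belief = {
            a: {v: w / sum(votes.values()) for v, w in votes.items()}
            for a, votes in weight.items()
        }
        # Attributes -> users: probability that each asserted value is correct.
        total, count = defaultdict(float), defaultdict(int)
        for u, a, v in triples:
            total[u] += belief[a][v]
            count[u] += 1
        # Stand-in truthfulness inference: average of those probabilities.
        q = {u: total[u] / count[u] for u in users}
    return q, belief


triples = [
    ("u1", "phone", "555-0100"), ("u2", "phone", "555-0100"), ("u3", "phone", "555-0199"),
    ("u1", "hours", "9-5"), ("u2", "hours", "9-5"), ("u3", "hours", "24h"),
]
q, belief = crowdsensus_sketch(triples)
print(q)
print(belief["phone"])
```

Even with this crude stand-in, the feedback loop is visible: users who agree with the emerging consensus gain truthfulness, which in turn gives their assertions more weight in the next round.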

Reputation Systems Crowdsensus • heart of Crowdsensus – the attribute inference algorithm • standard algorithm – Bayesian inference • works poorly in real cases • pieces of information are not independent • business attributes have different characteristics • complete system • handles attributes with multiple correct values • deals with spam • protects the system from abuse • integrated with other data pipeline components

Reputation Systems Design Space • content-driven vs. user-driven • reputation system visible to users? • weak identity vs. strong identity • existence of ground truth • affects which algorithm is used • chronological vs. global reputation updates • a global model can utilize information in the graph topology (PageRank, HITS) • a chronological model can leverage past and future to prevent attacks (e.g., Sybil attacks)

Reputation Systems Design Space

Reputation Systems Conclusion • reputation systems are the on-line equivalent of the body of laws that regulates real-world interactions between people • reputation systems provide ways for users to evaluate content and improve the level of trust • the design of reputation systems should take different aspects of the design space into account • reputation systems should be robust and resistant to attacks (or there is no trust) • reputation systems with a population-dynamic approach • reputation systems with multiple goals

Reputation Systems Pros • well-defined characteristics and goals of reputation systems • discussion of design aspects and their influence on reputation systems • detailed WikiTrust implementation tweaks for protecting the system from abuse and attacks • the comparison of two content-driven systems illustrates and supports the discussion of system design considerations • provides good evaluation measures of system accuracy on real wiki data

Reputation Systems Cons • lack of deeper explanation of the algorithms in Crowdsensus • lack of evidence on real data that the Crowdsensus algorithm performs better than standard Bayesian inference • lack of comparison between the performance of user-driven and content-driven models, and of how the two can work together
