Collusion-Resistance Misbehaving User Detection Schemes Speaker: Jing-Kai Lou
Outline • Introduction • What’s the problem • Does it matter • Previous work: What have I done … • Community-based scheme • Current Analysis: What am I doing … • HITS • Random walk scheme
The Rise of User Generated Content • Most of the fastest-growing sites on the internet now are based on user-generated content (UGC). Customer Reviews Increase Web Sales --- eMarketer
Inappropriate UGC • Misbehaving users post inappropriate UGC • Hiring many official moderators is the typical solution • But the resulting labor cost is a great burden on the service provider • There is another choice …
Social Moderation System • User-assisted moderation • Every user is a reviewer • You report what you see while viewing (blog, album, video) • The official moderator inspects what you see
Social Moderation Effect • Advantages of a social moderation system: • Fewer official moderators • Inappropriate content is detected quickly • The number of reports is still large: if 1% of the photos uploaded to Flickr are problematic, there are still about 43,200 reports each day • An automated scheme is needed to filter the reports
Automated Filter for Reports • Sort the reported photos by their number of accusations • Photos reported more than N times (N = 20) • Photos reported no more than N times (N = 20) • [Figure: example photos with 37, 3, and 47 accusations]
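A minimal sketch of the count-based filter described on this slide, assuming reports arrive as a mapping from an item to its number of accusations. The function name and the photo IDs are hypothetical; only the threshold N = 20 and the counts 37, 3, and 47 come from the slide.

```python
def count_based_filter(accusation_counts, threshold=20):
    """Split items by whether they were reported more than `threshold` times."""
    flagged = {item for item, n in accusation_counts.items() if n > threshold}
    passed = set(accusation_counts) - flagged
    return flagged, passed


# Hypothetical photo IDs; the counts are the ones shown on the slide.
counts = {"photo_a": 37, "photo_b": 3, "photo_c": 47}
flagged, passed = count_based_filter(counts, threshold=20)
```

Because this filter looks only at the raw count, a group of colluders can push any item over the threshold, which is the weakness the following slides address.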
Not All Users Are Trustworthy • While most users report responsibly, colluders submit fake reports to gain some benefit
The Objective • To develop a collusion-resistant scheme that can automatically infer whether accusations are fair or malicious • The scheme can therefore distinguish misbehaving users from victims
Our Work: Graph Theory Approach • Using the report (accusation) relation only • Previous work: Community-based Scheme • Submitted to the 3rd ACM Workshop on Scalable Trusted Computing (STC 2008) • Extended work: • Proposing new schemes • Analyzing new schemes …
Community-based Scheme • Achieves an accuracy rate higher than 90% • Protects at least 90% of victims from collusion attacks
Idea of Community-based Scheme • [Figure: an accusation relation and the corresponding accusing graph]
Ideal Patterns • [Figure legend: colluder, normal user, victim, misbehaving user]
Accusing Community • Users with similar accusing behavior tend to be in the same community • [Figure label: inter-community edge]
Designing Features for Each User • To find accusations NOT from colluders • Based on the communities, we design two features • Incoming Accusation, e.g. IA(k) = 2 • Outgoing Accusation, e.g. OA(k) = 5 (for user k in the figure)
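The slide does not spell out how IA and OA are computed. The sketch below assumes they count only inter-community accusations: IA(k) is the number of accusations against k from users outside k's community, and OA(k) is the number of accusations k makes against users outside its community. That definition, the function name, and the toy data are all assumptions, and the community partition is taken as given.

```python
from collections import defaultdict


def accusation_features(accusations, community_of):
    """Return {user: (IA, OA)}, counting only inter-community accusations.

    accusations: iterable of (accuser, accused) pairs.
    community_of: dict mapping each user to a community label (assumed given).
    """
    ia = defaultdict(int)
    oa = defaultdict(int)
    for accuser, accused in accusations:
        # Assumed rule: accusations inside one community are ignored.
        if community_of[accuser] != community_of[accused]:
            oa[accuser] += 1
            ia[accused] += 1
    return {user: (ia[user], oa[user]) for user in community_of}


# Hypothetical toy data: users a, b in community 0; users k, c in community 1.
community_of = {"a": 0, "b": 0, "k": 1, "c": 1}
accusations = [("a", "k"), ("b", "k"), ("c", "k"), ("k", "a")]
features = accusation_features(accusations, community_of)
```

In this toy example the accusation from c against k is dropped because both users sit in the same community.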
Community-based Algorithm • Partition the accusing graph into communities • Compute the feature pair (IA, OA) of each user • Cluster users by their (IA, OA) pairs, and label the users in the cluster with large (IA, OA) as misbehaving users
Evaluation Metric • What we care about is the false negative: misidentifying victims as misbehaving users • Collusion resistance
Effect of #(Misbehaving users) • [Plot: our method vs. count-based method]
Effect of #(Colluders) • [Plot: our method vs. count-based method]
Effect of Accusation Density • [Plot: our method vs. count-based method]
Weakness of Community-based Scheme • In our simulation, the colluders accuse only the victims • Realistically, colluders may sometimes also vote against some misbehaving users • We should consider the smart colluder
Smart Colluder Behavior • Behavior := the probability (in percent) that a colluder votes against misbehaving users, ranging from 0 to 100 • [Figure: Behavior scale from 0 to 100, showing the normal user, the naïve colluder, and the smart colluder]
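One way to simulate the Behavior parameter is sketched below. It assumes the probability applies independently to each vote a colluder casts; the slide does not say whether it is per vote or per colluder. The function name and user IDs are hypothetical.

```python
import random


def colluder_targets(victims, misbehaving, behavior, n_votes, rng):
    """Pick accusation targets for one colluder.

    behavior: percentage in [0, 100]. Each vote goes to a misbehaving user
    with probability behavior / 100, otherwise to a victim (assumed per vote).
    """
    targets = []
    for _ in range(n_votes):
        # rng.random() is in [0, 1), so 0 never fires and 100 always fires.
        if rng.random() * 100 < behavior:
            targets.append(rng.choice(misbehaving))
        else:
            targets.append(rng.choice(victims))
    return targets


rng = random.Random(0)  # fixed seed so the run is repeatable
naive = colluder_targets(["v1", "v2"], ["m1", "m2"], 0, 10, rng)
fully_honest_looking = colluder_targets(["v1", "v2"], ["m1", "m2"], 100, 10, rng)
```

With Behavior = 0 the colluder reproduces the earlier simulation and accuses victims only; with Behavior = 100 every vote goes to a misbehaving user.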
Inspiration • HITS (Hyperlink-Induced Topic Search): a link analysis algorithm that rates Web pages, developed by Jon Kleinberg • It determines two values for a page: • its authority, which estimates the value of the content of the page, • and its hub value, which estimates the value of its links to other pages
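The authority and hub values can be computed by power iteration. The sketch below is a plain-Python version of the standard HITS update, not the speaker's implementation. Applying it to accusations, with an edge from each accuser to each accused, is an assumption, and the node names are hypothetical.

```python
def hits(edges, iterations=50):
    """Power-iteration HITS on directed edges (source, target).

    Returns (authority, hub) dicts, each L2-normalized.
    """
    nodes = {n for edge in edges for n in edge}
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a node: sum of hub scores of the nodes pointing at it.
        new_auth = {n: 0.0 for n in nodes}
        for source, target in edges:
            new_auth[target] += hub[source]
        norm = sum(v * v for v in new_auth.values()) ** 0.5 or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}
        # Hub value of a node: sum of authority scores of the nodes it points at.
        new_hub = {n: 0.0 for n in nodes}
        for source, target in edges:
            new_hub[source] += auth[target]
        norm = sum(v * v for v in new_hub.values()) ** 0.5 or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}
    return auth, hub


# Hypothetical accusations: three users accuse "v", one user accuses "m".
edges = [("c1", "v"), ("c2", "v"), ("c3", "v"), ("u", "m")]
auth, hub = hits(edges)
```

In this toy graph the node accused by the tightly grouped accusers ends up with the highest authority, and those accusers end up with the highest hub values.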
Ideal • Authority → Victim • Hub value → Colluder • For example: • Number of users = 150 • Misbehaving user ratio = 10%, i.e., 15 • Colluder ratio = 20%, i.e., 30 • Behavior = 20%
When Behavior Increases • Parameters: • Number of users = 150 • Misbehaving user ratio = 10%, i.e., 15 • Colluder ratio = 20%, i.e., 30 • Behavior = 50%
Main Idea • Focus on content accused by many reviewers • Create an undirected graph C to describe the accused and their relations • Shape C into a directed graph D that satisfies the goal • Goal: if many walkers each take several steps on D, most of them should end up on “victims”
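The goal can be checked numerically by tracking the fraction of walkers on each node instead of simulating individual walkers. The sketch below assumes D is given as a table of transition probabilities. The two-node graph, its probabilities, and its self-loops are hypothetical and only illustrate the desired tendency.

```python
def walk_distribution(transition, start, steps):
    """Propagate a distribution of walkers over a directed graph.

    transition: {node: {neighbor: probability}}; each row sums to 1 and
        every neighbor also appears as a key.
    start: {node: fraction of walkers initially on that node}.
    """
    dist = dict(start)
    for _ in range(steps):
        nxt = {node: 0.0 for node in transition}
        for node, mass in dist.items():
            for neighbor, p in transition[node].items():
                nxt[neighbor] += mass * p
        dist = nxt
    return dist


# Hypothetical D: walkers on M mostly move to V, walkers on V mostly stay.
transition = {
    "M": {"M": 0.2, "V": 0.8},
    "V": {"M": 0.1, "V": 0.9},
}
dist = walk_distribution(transition, {"M": 0.5, "V": 0.5}, steps=5)
```

After a few steps most of the mass sits on V, which is the behavior the goal asks for.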
Co-Voter Graph, C • Define a co-voter graph C(V, E) to describe the relations among all the accused • V(C): the accused • E(C): (i, j) is in E(C) if the sets of accusers against accused i and accused j (vertices i and j) intersect • Weight: w(i, j) = #(accusers shared by i and j)
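The definition above can be sketched directly, assuming each accused item comes with the set of users who accused it. The function name and the example sets are hypothetical.

```python
from itertools import combinations


def co_voter_graph(accusers_of):
    """Build the weighted undirected co-voter graph.

    accusers_of: {accused: set of accusers}.
    Returns {(i, j): weight} with i < j, for pairs sharing at least one accuser.
    """
    edges = {}
    for i, j in combinations(sorted(accusers_of), 2):
        shared = len(accusers_of[i] & accusers_of[j])
        if shared > 0:
            edges[(i, j)] = shared
    return edges


# Hypothetical accuser sets: A and B share accusers 2 and 3; C shares none.
graph = co_voter_graph({"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {9}})
```

Node C has no edge here because none of its accusers also accused A or B.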
A Snapshot of the Co-Voter Graph • [Figure: nodes A, B, C, D, E, F; accuser sets shown: {1, 12, 13, 14}, {1, 2, 3, 4, 5, 6, 7, 8}, {5, 6, 7, 8}, {1, 2, 4, 8, 9, 10}, {5, 6, 7}, {5, 7, 8}]
Making Ideal Tendency (Be Directed) • [Figure: key nodes M and V with neighbors M’ and V’, joined by strong and weak edges] • Goal: for M, 2 > 1; for V, 3 > 2
Goal 1: Intersection Ratio • [Figure: nodes M, V, and M’, with edges labeled “Prob. to V” and “Prob. to M”]
Goal 2: Alpha of Target • Hopefully, Alpha(M) < Alpha(V) • From node b: Prob. to M = Alpha(M), Prob. to V = Alpha(V)
What Should Alpha Be? • [Version N(eighborhood)]: Alpha(T) := the number of co-voters between b and all its neighbors • Colluders tend to share more co-voters with their collusion group … • [Version H(ub)]: Alpha(T) := sum of the hub scores of T’s voters
Weight Formula Options • Directed weight formula: w(a, b) = Alpha(b) × |a ∩ b| / |a ∪ b| • Then we set each node's leaving probabilities by normalizing its outgoing weights • Example: node X has outgoing weights 0.8 (to A), 0.4 (to B), and 0.8 (to C), so Pr(X → A) = 0.4, Pr(X → B) = 0.2, Pr(X → C) = 0.4
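A small sketch of the weight formula and the normalization step. It assumes that a and b in the formula stand for the accuser sets of the two accused nodes, and that Alpha(b) is supplied by the caller; both function names are hypothetical. The example weights 0.8, 0.4, and 0.8 are the ones in the slide's figure.

```python
def directed_weight(accusers_a, accusers_b, alpha_b):
    """w(a, b) = Alpha(b) * |a ∩ b| / |a ∪ b| for accuser sets a and b."""
    union = accusers_a | accusers_b
    if not union:
        return 0.0
    return alpha_b * len(accusers_a & accusers_b) / len(union)


def leaving_probabilities(out_weights):
    """Normalize one node's outgoing weights into transition probabilities."""
    total = sum(out_weights.values())
    if total == 0:
        return {}
    return {target: w / total for target, w in out_weights.items()}


# Outgoing weights of node X, taken from the slide's example.
probs = leaving_probabilities({"A": 0.8, "B": 0.4, "C": 0.8})

# Hypothetical accuser sets sharing 2 of 4 distinct accusers, with Alpha(b) = 0.5.
w = directed_weight({1, 2, 3}, {2, 3, 4}, alpha_b=0.5)
```

The intersection-over-union ratio is symmetric, so the direction of the edge comes entirely from the Alpha factor of the target.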
Evaluation • Parameters: • Number of users = 250 • Misbehaving user ratio = 10%, i.e., 25 • Colluder ratio = 20%, i.e., 50 • Behavior = 50%
Conclusion • Are there any new factors we should consider? • Any ideas to improve the random walk scheme or the HITS scheme? • Any NEW ideas?