The Sociology of Sybils: Understanding Social Network-based Sybil Defenses. Krishna P. Gummadi, Networked Systems Research Group, MPI-SWS.
Sybil attack • A fundamental problem in distributed systems • Attacker creates many fake/Sybil identities • Many cases of real-world attacks: Digg, YouTube • Automated Sybil attack on YouTube for $147!
Sybil defense • Using a trusted central authority • Tie identities to actual human beings • Not always desirable • Can be hard to find such an authority • Sensitive info may scare away users • Potential bottleneck and target of attack • Hard without a trusted central authority • Impossible unless using special assumptions [Douceur ’02] • Resource challenges using CPU, bandwidth, or memory are not sufficient • Adversary can have many more resources than a typical user • Need some resource that is hard to obtain in abundance • Links in a social network?
Leveraging social networks: Basic insight • Resource constraint • Bound on number of trust relationships between attackers and honest nodes • Attacker cannot create arbitrarily large # of edges between honest nodes and Sybil identities • Assumption: edges represent mutual trust • E.g., colleagues, relatives in real-world • Not online friends! [Figure: honest nodes and Sybil nodes]
Several proposals to leverage social nets • All rely on detecting the topological features resulting from the resource constraint • SybilGuard [Sigcomm ’06] • SybilLimit [Oakland S&P ’08] • Ostra [NSDI ’08] • SybilInfer [NDSS ’09] • SumUp [NSDI ’09] • Whanau [NSDI ’10] • MobId [INFOCOM ’10]
Example: SybilGuard • The sub-graph of honest nodes is fast mixing • Disproportionately small cut separating honest and Sybil nodes • Cannot search for such a cut using brute force [Figure: honest nodes and Sybil nodes]
How SybilGuard works: Random walk intersection • Verifier accepts a suspect if the two routes intersect • W.h.p., verifier’s route stays within honest region • W.h.p., routes from two honest nodes intersect • # of accepted Sybils < g*w • g: # of attack edges • w: random walk length [Figure: verifier and suspect routes through honest nodes and Sybil nodes]
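The intersection test on this slide can be illustrated with a short sketch. It is a simplification under stated assumptions: it uses plain random walks on a hypothetical toy graph (an honest clique and a Sybil clique joined by one attack edge), not SybilGuard's actual route construction, and all function names are made up for the illustration.

```python
import random

def random_walk(graph, start, length, rng):
    """Return the nodes visited by a walk of `length` steps from `start`."""
    path = [start]
    node = start
    for _ in range(length):
        node = rng.choice(graph[node])
        path.append(node)
    return path

def verifier_accepts(graph, verifier, suspect, length, rng):
    """Accept the suspect if its walk shares a node with the verifier's walk."""
    v_path = random_walk(graph, verifier, length, rng)
    s_path = random_walk(graph, suspect, length, rng)
    return bool(set(v_path) & set(s_path))

def build_toy_graph():
    """Hypothetical graph: honest clique 0-4, Sybil clique 5-9, attack edge (4, 5)."""
    graph = {n: [] for n in range(10)}

    def add_edge(a, b):
        graph[a].append(b)
        graph[b].append(a)

    for members in (list(range(0, 5)), list(range(5, 10))):
        for i, a in enumerate(members):
            for b in members[i + 1:]:
                add_edge(a, b)
    add_edge(4, 5)  # the single attack edge (g = 1)
    return graph

def acceptance_rate(graph, verifier, suspect, length, trials, seed=0):
    """Fraction of trials in which the verifier accepts the suspect."""
    rng = random.Random(seed)
    hits = sum(
        verifier_accepts(graph, verifier, suspect, length, rng)
        for _ in range(trials)
    )
    return hits / trials

toy = build_toy_graph()
honest_rate = acceptance_rate(toy, verifier=0, suspect=1, length=3, trials=2000)
sybil_rate = acceptance_rate(toy, verifier=0, suspect=8, length=3, trials=2000)
```

On this toy graph, walks from two honest nodes intersect far more often than a walk from the honest verifier and a walk from a Sybil node, because any intersection across the two regions has to cross the single attack edge.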
Another example: SumUp • A Sybil resilient vote aggregator • A central party collects all votes and the social graph • Goal: extract a subset of votes • include at most a few votes from Sybils • include most votes from honest users
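The idea of bounding Sybil votes by the capacity of the attack edges can be sketched with a max-flow computation. This is a minimal illustration, assuming unit capacity on every social link and a hypothetical super-source node named "SRC"; it does not reproduce how SumUp itself assigns link capacities, and the toy graph and function names are invented for the example.

```python
from collections import defaultdict, deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp max flow; `capacity` maps u -> {v: capacity of edge u->v}."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual[u][v] += c
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

def collected_votes(edges, voters, collector):
    """Upper bound on votes that can reach the collector over unit-capacity links."""
    capacity = defaultdict(lambda: defaultdict(int))
    for a, b in edges:
        capacity[a][b] += 1  # assumption: each social link carries one vote
        capacity[b][a] += 1
    for voter in voters:
        capacity["SRC"][voter] += 1  # each voter casts at most one vote
    return max_flow(capacity, "SRC", collector)

# Hypothetical toy graph: collector "c", honest h1-h3, Sybils s1-s5.
edges = [
    ("c", "h1"), ("c", "h2"), ("c", "h3"), ("h1", "h2"), ("h2", "h3"),
    ("s1", "s2"), ("s2", "s3"), ("s3", "s4"), ("s4", "s5"), ("s1", "s3"),
    ("h3", "s1"),  # the single attack edge
]
honest_votes = collected_votes(edges, ["h1", "h2", "h3"], "c")
sybil_votes = collected_votes(edges, ["s1", "s2", "s3", "s4", "s5"], "c")
```

In this toy setup all three honest votes reach the collector, while the five Sybil votes are cut down to the capacity of the one attack edge.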
Summary: Sybil defense schemes • A number of Sybil defense schemes have already been proposed • More with each passing conference • All schemes rely on two common assumptions • Honest nodes: they are fast mixing • Sybils: they do not mix quickly with honest nodes • But, each relies on its own graph analysis algorithm • E.g., back-traceable random walk intersection, Bayesian inference from modified random walks, max-flow between nodes, betweenness centrality of nodes
Problem with state of the art • Fast mixing assumption provides little insight • Into how the schemes work • Or what structural properties affect their effectiveness • Neither does the evaluation of the Sybil algorithms • Lots of sensitive parameters that impact results • Each scheme evaluated on different data sets • Each scheme performs differently on different data sets • Evaluations assume different adversarial models
Rest of the talk • Investigate several unanswered questions: • How do the different schemes compare against each other? • Do they all find Sybils similarly? • What types of network structures are vulnerable to Sybil attacks? • How prevalent are such structures in real-world social networks? • And discuss their implications
Results summary • How do the different schemes compare against each other? • Do they all find Sybils similarly? • All Sybil schemes work by detecting tightly-knit node communities • What types of network structures are vulnerable to Sybil attacks? • When honest nodes do not all form a single cohesive community • How prevalent are such structures in real-world social networks? • Very prevalent! Real-world social communities have bounded size
Communities in social networks • Group of users more densely connected than overall graph
How Sybil defense schemes work • At their core, Sybil schemes partition the network • Into Sybils and non-Sybils • Partitioning algorithms can be viewed as ranking nodes • With a sliding cutoff determined by parameters
How Sybil defense schemes work • Ranking is independent of an algorithm’s parameters • Changing parameters yields different partitions
Comparing Sybil defense schemes • Compare their node rankings at different partitionings • How do the partitions formed by the first k nodes compare? • Metric: Mutual information [Strehl ’02] • Varies between 0 and 1 • 0 => no correlation between the partitionings • 1 => perfect match
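The comparison on this slide can be sketched as follows: cut each ranking after its first k nodes, treat the result as a two-way partition, and score the agreement with mutual information scaled to the 0 to 1 range. Normalizing by the geometric mean of the two entropies is an assumption made here, since the slide only states the range; the function names and example rankings are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def normalized_mutual_information(labels_a, labels_b):
    """Mutual information of two labelings, scaled by sqrt(H(a) * H(b))."""
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    mi = sum(
        (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    h_a, h_b = entropy(labels_a), entropy(labels_b)
    if h_a == 0 or h_b == 0:
        # Degenerate case: at least one labeling puts every node in one group.
        return 1.0 if h_a == h_b else 0.0
    return mi / math.sqrt(h_a * h_b)

def partition_at_cutoff(ranking, k):
    """Label each node True if it is among the first k nodes of the ranking."""
    accepted = set(ranking[:k])
    return [node in accepted for node in sorted(ranking)]

def compare_rankings(ranking_a, ranking_b, k):
    """Agreement between the partitions two rankings induce at cutoff k."""
    return normalized_mutual_information(
        partition_at_cutoff(ranking_a, k), partition_at_cutoff(ranking_b, k)
    )

# Hypothetical rankings of the same eight nodes by two schemes.
rank_a = [0, 1, 2, 3, 4, 5, 6, 7]
rank_b = [2, 1, 0, 3, 7, 6, 5, 4]
same_top4 = compare_rankings(rank_a, rank_b, 4)  # same first-4 set, different order
diff_top2 = compare_rankings(rank_a, rank_b, 2)  # first-2 sets only partly overlap
```

Sweeping k from 1 to the number of nodes gives one agreement curve per pair of schemes, which is how a sliding cutoff can be compared without fixing any scheme's parameters.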
Comparing Sybil defense schemes • All Sybil schemes rank nodes in the local community before others • No correlation between rankings within or outside local community [Figure: toy topology with two well-defined communities]
Comparing Sybil defense schemes • Using a Facebook subgraph • Nodes from local community ranked before others • Little correlation between rankings within & outside the community
Comparing Sybil defense schemes • Using an Astrophysicist network • Nodes from local community ranked before others • Little correlation between rankings within & outside the community
Summary: Comparing Sybil defense schemes • All node rankings are biased towards decreasing conductance • When multiple nodes are similarly well connected, their orderings can vary in different schemes • Nodes in cohesive clusters around reference node are ranked before others in all schemes • Sybil defense schemes are effectively detecting communities!
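To make the conductance claim concrete, the sketch below computes the conductance of each prefix of a node ranking. It assumes the common definition (edges leaving the set divided by the smaller of the two degree sums) and a hypothetical toy graph of two 4-node cliques joined by one edge; the function names are illustrative only.

```python
def conductance(graph, subset):
    """Edges leaving `subset` divided by the smaller side's total degree."""
    subset = set(subset)
    if not subset or len(subset) == len(graph):
        raise ValueError("subset must be a non-empty proper subset of the nodes")
    cut = sum(1 for u in subset for v in graph[u] if v not in subset)
    vol_in = sum(len(graph[u]) for u in subset)
    vol_out = sum(len(graph[u]) for u in graph if u not in subset)
    return cut / min(vol_in, vol_out)

def prefix_conductance(graph, ranking):
    """Conductance of the first k ranked nodes, for k = 1 .. n - 1."""
    return [conductance(graph, ranking[:k]) for k in range(1, len(ranking))]

# Hypothetical toy graph: cliques {0, 1, 2, 3} and {4, 5, 6, 7}, bridge edge 3-4.
toy_graph = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4],
    4: [3, 5, 6, 7], 5: [4, 6, 7], 6: [4, 5, 7], 7: [4, 5, 6],
}
curve = prefix_conductance(toy_graph, [0, 1, 2, 3, 4, 5, 6, 7])
best_k = curve.index(min(curve)) + 1  # cutoff with the lowest conductance
```

For a ranking that starts at node 0, the conductance falls as the local clique fills in and rises again once the ranking crosses into the second clique, so the lowest point marks the community boundary.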
Rest of the talk • Investigate several unanswered questions: • How do the different schemes compare against each other? • Do they all find Sybils similarly? • All Sybil schemes work by detecting tightly-knit node communities • What types of network structures are vulnerable to Sybil attacks? • How prevalent are such structures in real-world social networks? • And discuss their implications
What networks are vulnerable to Sybil attacks? • When non-Sybils are divided into multiple communities • Cannot tell apart Sybils & non-Sybils in a distant community • Attackers can launch very effective targeted attacks
Do non-Sybils form multiple communities? • Some real-world social networks have high modularity • They exhibit well defined community structures
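As a reference point for what "high modularity" measures, here is a minimal sketch that assumes the slide means the standard Newman modularity Q: for each community, the fraction of edges inside it minus the fraction expected if edges were placed at random with the same degrees. The toy graph and function name are hypothetical.

```python
def modularity(graph, communities):
    """Newman modularity Q of a division of an undirected graph into communities."""
    m = sum(len(nbrs) for nbrs in graph.values()) / 2  # number of edges
    q = 0.0
    for community in communities:
        members = set(community)
        # Each internal edge is seen from both endpoints, hence the division by 2.
        internal = sum(1 for u in members for v in graph[u] if v in members) / 2
        degree_sum = sum(len(graph[u]) for u in members)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q

# Hypothetical toy graph: cliques {0, 1, 2, 3} and {4, 5, 6, 7}, bridge edge 3-4.
two_cliques = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4],
    4: [3, 5, 6, 7], 5: [4, 6, 7], 6: [4, 5, 7], 7: [4, 5, 6],
}
q_split = modularity(two_cliques, [[0, 1, 2, 3], [4, 5, 6, 7]])
q_single = modularity(two_cliques, [list(range(8))])
```

Splitting the toy graph along its bridge gives a clearly positive Q, while putting every node in one community gives Q = 0, which is the sense in which a network with well-defined communities scores higher.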
Are networks with stronger community structures more vulnerable? • Yes! Networks with higher modularity are more susceptible to attacks • Independent of the Sybil defense scheme used
Rest of the talk • Investigate several unanswered questions: • How do the different schemes compare against each other? • Do they all find Sybils similarly? • All Sybil schemes work by detecting tightly-knit node communities • What types of network structures are vulnerable to Sybil attacks? • When honest nodes do not all form a single cohesive community • How prevalent are such structures in real-world social networks? • And discuss their implications
How often do non-Sybils form one cohesive community? • Traditional methodology: • Analyze several real-world social network graphs • Generalize the results to the universe of social networks • A more scientific method: • Leverage insights from sociological theories on communities • Test if their predictions hold in online social networks • And then generalize the findings
Group attachment theory • Explains how humans join and relate to groups • Common-identity based groups • Membership based on self interest or ideology • E.g., NRA, Greenpeace, and PETA • Tend to be loosely-knit and less cohesive • Common-bond based groups • Membership based on inter-personal ties, e.g., family or kinship • Tend to form tightly-knit communities within the network
Dunbar’s theory • Limits the # of stable social relationships a user can have • To fewer than a couple of hundred • Linked to size of the neocortex region of the brain • Observed throughout history since hunter-gatherer societies • Also observed repeatedly in studies of OSN user activity • Users might have a large number of contacts • But regularly interact with fewer than a couple of hundred of them • Limits the size of cohesive common-bond based groups
Prediction and implication • Strongly cohesive communities in real-world social networks will be necessarily small • No larger than a few hundred nodes! • If true, it imposes a limit on the number of non-Sybils we can detect with high accuracy • Will be problematic as social networks grow large
Verifying the prediction • In all networks, groups larger than a few hundred nodes do not remain cohesive • Small cohesive groups tend to be family and alumni groups • Large groups are often on abstract topics like music or politics [Table: real-world data sets analyzed]
Rest of the talk • Investigate several unanswered questions: • How do the different schemes compare against each other? • Do they all find Sybils similarly? • All Sybil schemes work by detecting tightly-knit node communities • What types of network structures are vulnerable to Sybil attacks? • When honest nodes do not all form a single cohesive community • How prevalent are such structures in real-world social networks? • Very prevalent! Real-world social communities have bounded size • And discuss their implications
Implications • Fundamental limits on social network-based Sybil defenses • Can reliably identify only a limited number of honest nodes • In large networks, limits interactions to a small subset of honest nodes • Might still be useful in certain scenarios, e.g., whitelisting email from friends • Social network-based Sybil defense is a misnomer!
Future directions • Leverage information beyond social network structure • E.g., inter-user activity can reveal the strength of ties and help eliminate links to Sybils • Move towards Sybil tolerance • Rather than preventing users from creating multiple identities • Focus on limiting privileges
Summary • We discussed social network-based Sybil defenses • Lots of proposed schemes, but little understanding • Of how they compare with each other • Or what structural properties impact them • Or how well they would work in real-world social networks • We found that Sybil schemes • Work by effectively detecting communities • Are vulnerable in networks with well-defined community structures • Can find only a limited number of trustworthy nodes in real-world networks • Our findings suggest that we need to move beyond using only the social network to defend against Sybil attacks
Thanks! Questions? • Acknowledgements: • Joint work with Bimal Viswanath, Ansley Post, and Alan Mislove • Thanks to Haifeng Yu and Nguyen Tran for illustrations of SybilGuard and SumUp Sybil defense schemes