270 likes | 284 Views
Motivation. Distributed Systems SecurityByzantine ConsensusSecure routing in DHTsReputation SystemsAssumption : bound on fraction of dishonest identitiesA single entity/user can pretend to have multiple identitiesSybil Attack . Sybil Attack. Sybil identities can own a large fraction of all identitiesRedundancy does not help! Distributed systems security solutions fail...BotnetsZombie machinesAverage size > 20,000 .
E N D
1. SybilInfer: How to Win the Zombie Wars! Prateek Mittal, George Danezis (MSRC Intern) (MSR Cambridge)
2. Motivation Distributed Systems Security
Byzantine Consensus
Secure routing in DHTs
Reputation Systems
Assumption : bound on fraction of dishonest identities
A single entity/user can pretend to have multiple identities
Sybil Attack
3. Sybil Attack
Sybil identities can own a large fraction of all identities
Redundancy does not help!
Distributed systems security solutions fail...
Botnets
Zombie machines
Average size > 20,000
4. How to bound the fraction of dishonest nodes?
Trusted Central authority
Distributed Solutions?
Resource constraints
CPU, Bandwidth etc
CAPTCHAS
Social Networks
5. Leveraging Social Networks Resource Constraint
bound on number of trust relationships between attackers and honest nodes
Attacker cannot create edges between honest nodes and Sybil identities
6. Leveraging Social Networks Social networks are Fast Mixing
Random walks quickly convergence to stationary distribution
Sybil attacks induce a bottleneck cut
Fast mixing is disrupted
Knowledge of an apriori honest node
Breaks Symmetry
Related to privacy in personalized searchRelated to privacy in personalized search
7. Related Work SybilGuard[SIGCOMM 06] & SybilLimit [Oakland 08]
Assumes short random walks lie mostly in the honest region
Results in poor threshold to colluding attackers
Heuristic validation approach
Honest nodes random walks intersect
Birthday paradox
High false negatives
Related to privacy in personalized searchRelated to privacy in personalized search
8. Our Approach: Design Philosophy
Optimal use of all information available in the graph
No assumptions on threshold of attack edges
9. Formal Model Properties of Mixing times
Depend on random walks
and where they end
Each vertex performs S random walks
length l =log(|V|)
Transition probability
Uniform stationary distribution (without attack)
Let T = set of vertex pairs <start vertex, end vertex> for each random walk
10. Formal Model Assign probabilities of cuts being honest
Using Bayes Theorem, we have that :
Next Challenge: Model
11. Formal Model
12. Estimating EXX / probxx We could sample EXX as well
P(X,EXX|T)
Expensive
Instead, we shall directly estimate the best EXX
13.
14. Sampling
Sample from above distribution
Marginal Probabilities
P(Node j is honest) = # j appears in samples/ #samples
Can label nodes as honest/dishonest
Sampling algorithm : Metropolis-Hastings
Current State : X0
Propose a new state X1 with probability Q(X1|X0)
Accept new state with probability
15. Security Evaluation Theoretical Guarantees
Synthetic Topologies
Real Data Sets
LiveJournal
DBLP
16. Theoretical Guarantees Ideal Scenario:
Without attack, the cuts obtained from model have EXX=0
Under attack, the cuts obtained from the model have EXX >0 regardless of attacker strategy
Real World:
Without attack, we obtain cuts with EXX approx 0 (upper bounded by Emax)
Under a major Sybil attack, we obtain cuts with EXX > Emax regardless of attacker strategy
17. Implementation 2 implementations
Roughly 1000 LOC
C++
Can handle about 50000 nodes
Python
Can handle about 10000 nodes
Performance
20 samples
< 60 s
18. Synthetic Topology Scale Free Topology
1000 nodes
100 nodes are malicious and colluding
Application
considers the set of nodes whose marginal probability of being honest is > 0.5
Attacker Goal: Try to insert x addition Sybil identities.
What is the optimal x?
19. Security Evaluation
22. LiveJournal Extract a social sub graph from LiveJournal
Three hop neighbourhood of a random node
Processing
Remove nodes with degree < 3
33170 nodes
Our model found a bottleneck cut is this topology
False positive or Sybil attack?
Remove the bottleneck cut
31603 nodes
24. DBLP Extract a social sub graph from DBLP
Two hops from IEEE S&P (Oakland)
about 10000 nodes
Process the extracted social network
Remove nodes with degree < 15
Impose upper bound on node degree (1.2 *mean)
2000 nodes
250 colluding nodes/300 Sybil identities
26. Applications SPAM
Facebook/Orkut
Tor
File Sharing
27. Future Work Scaling
Model works well for
Local 2/3 hop neighbourhoods
Interest groups
Will it work for multi million node topologies?
Better detection of small attacks
Distributed SybilInfer
Sybil Proof DHT
28. Conclusions Proposed a formal model for inferring Sybil identities in a Social Network
Proposed solution can be applied to security critical centralized/distributed applications
High tolerance to colluding adversary
Low false negatives