Defending Sybil Attack in Peer2Peer Networks

Distributed Search Techniques • Md. Tanvir Al Amin (04 09 05 2064) • Shah Md. Rifat Ahsan (10 09 05 2060) • Adviser: Dr. Reaz Ahmed

Sybil Attack • A fundamental problem in distributed systems • A single user assumes many fake (sybil) identities • Already observed in real-world P2P systems • Sybil identities can become a large fraction of all identities • They can “out-vote” honest users in collaborative tasks [Figure: a malicious user launches a sybil attack among honest users]

Sybil attack • Present at both the application level and the P2P networking level • The attacker creates many fake (sybil) identities • Many real-world attacks have been reported, e.g. on Digg and YouTube • Several research works have shown how easy it is to subvert DHTs such as Chord or Kademlia using a Sybil attack • Example: an automated sybil attack on YouTube for $147

Defending against Sybil attacks • Traditional solutions rely on central trusted authorities • This runs counter to the open membership policies of online social networks (OSNs) • Recent proposals leverage social networks • Lots of recent research activity: SybilGuard [SIGCOMM’06], SybilLimit [Oakland’08], Ostra [NSDI’08], SumUp [NSDI’09], SybilInfer [NDSS’09], Whanau [NSDI’10], MobID [INFOCOM’10] • Each is optimized under its own assumptions about the graph structure • Each is evaluated on different datasets • All schemes analyze the graph structure to isolate Sybils

Defending against Sybil attacks • Recent proposals leverage social networks • Key insight: social links are hard to acquire in abundance • Look for small cuts in the graph • Conversely, look for communities around known trusted nodes • Dunbar’s number • Power-law node degrees • Links are difficult to create

What Do Social Networks Look Like?

SybilGuard: Defending Against Sybil Attacks via Social Networks • SybilGuard is a system for detecting Sybil nodes in social graphs. Features of SybilGuard • SybilGuard enables an honest node to identify other nodes • A verifier node V can verify whether a suspect node S is malicious • Guaranteed bound on the number of sybil groups • Guaranteed bound on the size of sybil groups • Completely decentralized Key insight: 1. Use a social network to limit Sybils 2. Social links are hard to acquire in abundance 3. Look for small cuts in the graph [Figure: DBLP network]

Dunbar’s number • Limits the # of stable social relationships a user can have • To less than a couple of hundred • Linked to size of neo-cortex region of the brain • Observed throughout history since hunter-gatherer societies • Roughly reported to be 150 • Also observed repeatedly in studies of OSN user activity • Users might have a large number of contacts • But, regularly interact with less than a couple of hundred of them

Power-law node degrees [Figure: U.S. highways network vs. U.S. airlines network]

Path lengths and diameter • All major networks have short path lengths, from 4.25 to 5.88 • Six degrees of separation • Facebook: 4.2 million for October 2007, 6.12 (from http://blog.paulwalk.net/2007/10/08/no-degrees-of-separation/)

Implications of Path lengths and diameter The small diameter and path lengths of social networks are likely to impact the design of techniques for finding paths in such networks

Link degree correlations • Do high-degree nodes tend to connect to other high-degree nodes? • Or do high-degree nodes tend to connect to low-degree nodes? • In real social networks, the former holds • This is shown by two metrics: the scale-free metric and assortativity • It suggests a tightly connected “core” of high-degree nodes that connect to each other, with the lower-degree nodes on the fringes of the network • The next question: how big is the core?
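
To make the assortativity metric concrete, here is a minimal sketch that computes the degree correlation across the two ends of every edge. The adjacency-dict representation and the function name `degree_assortativity` are assumptions made for illustration; they do not come from the slides. A positive value means high-degree nodes tend to link to other high-degree nodes, and a negative value means they tend to link to low-degree nodes.

```python
def degree_assortativity(graph):
    """Pearson correlation of the degrees at the two ends of each edge.

    `graph` is an undirected simple graph given as {node: [neighbors]}
    (an assumed representation). Every edge is counted in both directions.
    Returns None when all degrees are equal and the correlation is undefined.
    """
    xs, ys = [], []
    for u, nbrs in graph.items():
        for v in nbrs:
            xs.append(len(graph[u]))
            ys.append(len(graph[v]))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / (var_x * var_y) ** 0.5


# A star graph: one hub connected to nine leaves.
STAR = {0: list(range(1, 10)), **{leaf: [0] for leaf in range(1, 10)}}
```

On the star graph the hub only touches low-degree leaves, so the sketch reports a strongly negative value, the opposite of the "core" pattern described on the slide.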

Implications of link degree correlations: spread of information • “A Measurement-driven Analysis of Information Propagation in the Flickr Social Network” [WWW’09]

Densely connected core • The graphs have a densely connected core comprising between 1% and 10% of the highest-degree nodes, such that removing this core completely disconnects the graph • Sub-logarithmic growth
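
The "remove the core and the graph falls apart" observation can be checked on any graph with a sketch like the one below. It is only an illustration under assumptions: the graph is an adjacency dict, the core is taken to be the top `core_fraction` of nodes by degree, and the function name is hypothetical.

```python
def largest_component_without_core(graph, core_fraction):
    """Size of the largest connected component left after removing the
    highest-degree nodes.

    `graph` is {node: [neighbors]} (assumed representation); `core_fraction`
    is the share of nodes, by degree rank, treated as the core.
    """
    core_size = max(1, int(len(graph) * core_fraction))
    by_degree = sorted(graph, key=lambda v: len(graph[v]), reverse=True)
    remaining = set(graph) - set(by_degree[:core_size])

    seen = set()
    largest = 0
    for start in remaining:
        if start in seen:
            continue
        # Depth-first search restricted to the non-core nodes.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            v = stack.pop()
            size += 1
            for u in graph[v]:
                if u in remaining and u not in seen:
                    seen.add(u)
                    stack.append(u)
        largest = max(largest, size)
    return largest


# A star graph: removing the single hub isolates every leaf.
STAR = {0: list(range(1, 10)), **{leaf: [0] for leaf in range(1, 10)}}
```

A small result relative to the number of nodes indicates that connectivity depended on the removed core.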

Implications of densely connected core • The network contains a dense core of users • The core is necessary for the connectivity of 90% of users • Most short paths pass through the core • Could be used for quickly disseminating information • So 10% of nodes are at the core • What about the remaining nodes (the 90% at the fringe)?

What does the structure look like? • The networks contain a densely connected core of high-degree nodes • This core links small groups of strongly clustered, low-degree nodes at the fringes of the network • The overall shape resembles an octopus

Mixing time • Random walk: choose each hop randomly • Mixing time: #hops until uniform probability • Fast mixing network: mixing time = O(log n)
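
A rough way to see mixing time in practice is to push a probability distribution through the graph one hop at a time and count the hops until it is close to the walk's stationary distribution. The sketch below is an illustration under assumptions: the adjacency-dict representation, the function names, and the tolerance `eps` are not from the slides. For a simple random walk on an undirected graph the stationary distribution is proportional to node degree, which is uniform over edges and, on a regular graph such as the example, uniform over nodes.

```python
def stationary_distribution(graph):
    """Stationary distribution of a simple random walk: proportional to degree."""
    total_degree = sum(len(nbrs) for nbrs in graph.values())
    return {v: len(nbrs) / total_degree for v, nbrs in graph.items()}


def walk_one_hop(dist, graph):
    """Advance a probability distribution over nodes by one random hop."""
    nxt = {v: 0.0 for v in graph}
    for v, p in dist.items():
        for u in graph[v]:
            nxt[u] += p / len(graph[v])
    return nxt


def hops_until_mixed(graph, start, eps=0.01, max_hops=1000):
    """Number of hops until the total variation distance to the stationary
    distribution drops to `eps` or below; None if that never happens
    within `max_hops` (for example on a bipartite graph)."""
    target = stationary_distribution(graph)
    dist = {v: (1.0 if v == start else 0.0) for v in graph}
    for hops in range(max_hops + 1):
        distance = 0.5 * sum(abs(dist[v] - target[v]) for v in graph)
        if distance <= eps:
            return hops
        dist = walk_one_hop(dist, graph)
    return None


# Complete graph on four nodes: a small, very fast-mixing example.
K4 = {v: [u for u in range(4) if u != v] for v in range(4)}
```

Calling `hops_until_mixed(K4, 0)` gives the hop count for the example; on a fast-mixing graph this count grows only logarithmically with the number of nodes.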

Sampling by random walks • A random walk has an o(1) chance of escaping from the honest region • True when g, the number of attack edges, is bounded by o(n/log n) • Of r walks, (1-o(1))r = Ω(r) end nodes are good • But we can’t distinguish good from bad nodes in the set [Figure: honest region and sybil region, with escaping paths and a non-escaping path]
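
The escape probability of a short walk can be computed exactly on a small graph, which helps show why few attack edges mean few escaping walks. This is a minimal sketch under assumptions: the adjacency-dict graph, the `honest` node set, and the function name are introduced here for illustration only.

```python
def escape_probability(graph, honest, start, hops):
    """Probability that a random walk of `hops` hops, starting at the honest
    node `start`, crosses an attack edge into the sybil region at least once.

    `graph` is {node: [neighbors]} and `honest` is the set of honest nodes
    (both assumed representations). Probability mass that reaches a
    non-honest node is counted as escaped and no longer tracked.
    """
    dist = {start: 1.0}
    escaped = 0.0
    for _ in range(hops):
        nxt = {}
        for v, p in dist.items():
            share = p / len(graph[v])
            for u in graph[v]:
                if u in honest:
                    nxt[u] = nxt.get(u, 0.0) + share
                else:
                    escaped += share
        dist = nxt
    return escaped


# Honest triangle a-b-c with a single attack edge from c to sybil node s.
EXAMPLE = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "s"], "s": ["c"]}
HONEST = {"a", "b", "c"}
```

With one attack edge, a two-hop walk from `a` escapes only if it steps to `c` and then picks the attack edge, so most walks end at honest nodes.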

Creating Social Link Is Hard

Social links maintained over Internet

Social network …

Social network • Honest region and sybil region, connected by attack edges • A malicious user fools an honest user • This creates an attack edge

Sybil resilience & group attachment theory • Sybil schemes find bond groups around a trusted node • But, these are only a fraction of all honest nodes • Bond groups are hard for Sybils to infiltrate • Not the case with identity groups

SybilGuard (Yu, Kaminsky, Gibbons, Flaxman; SIGCOMM 2006)

Problem Formulation and Objective • Social network • n honest human users • One or more malicious users, each with multiple sybil identities • SybilGuard enables an honest node to identify other nodes • A verifier node V can verify whether a suspect node S is malicious

SybilGuard • Guaranteed bound on number of sybil groups • Divides n nodes into m equivalence classes • A group is sybil if it contains 1+ sybil nodes • Guaranteed bound on size of sybil groups • In a group, at most w sybil nodes • Completely decentralized • An honest node accepts honest nodes with high probability • Rejects malicious nodes with high probability • Accepts bounded number of sybil nodes

Random Routes • The foundation of SybilGuard; different from a random walk • A random route begins at a random edge of a node • At every node, for an incoming edge i there is a unique outgoing edge j • Thus the mapping from incoming to outgoing edges is one-to-one • A node A with d neighbors chooses a permutation “x1, x2, ..., xd” uniformly at random among all permutations of 1, 2, ..., d • If a random route comes in from the ith edge, A uses edge xi as the next hop
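
The per-node permutation can be sketched as a small routing table that maps each incoming edge to one outgoing edge. The code below is an illustration under assumptions: edges are identified by the neighbor at the other end, the graph is an adjacency dict of a simple undirected graph, and the function names and the `seed` parameter are hypothetical.

```python
import random


def make_routing_tables(graph, seed=0):
    """For every node, pick a uniformly random permutation of its edges.

    Returns {node: {incoming_neighbor: outgoing_neighbor}}. Because each
    table is a permutation, the incoming-to-outgoing mapping is one-to-one.
    """
    rng = random.Random(seed)
    tables = {}
    for node, nbrs in graph.items():
        outgoing = list(nbrs)
        rng.shuffle(outgoing)
        tables[node] = dict(zip(nbrs, outgoing))
    return tables


def random_route(tables, start, first_hop, length):
    """Follow a random route of `length` hops (length >= 1) that leaves
    `start` along the edge to `first_hop`. Returns the visited nodes."""
    route = [start, first_hop]
    prev, cur = start, first_hop
    for _ in range(length - 1):
        nxt = tables[cur][prev]  # fixed by the permutation, not re-drawn
        route.append(nxt)
        prev, cur = cur, nxt
    return route


EXAMPLE = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c"],
}
TABLES = make_routing_tables(EXAMPLE)
```

Unlike a random walk, repeating `random_route(TABLES, "a", "b", 5)` always returns the same path, because the randomness lives in the tables rather than in each hop.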

SybilGuard Algorithm • Attack model • n honest users: one identity (node) each • Malicious users: multiple identities each (sybil nodes) • Verifier node A verifies suspect node B • A computes d random routes (length w) • B computes d random routes (length w) • If d/2 random routes intersect, A accepts B • Otherwise A rejects B • With few attack edges, a sybil node’s random route is less likely to reach the honest region • And vice versa
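
The acceptance test on this slide can be sketched as a count of intersecting routes. This is one reading of the slide, not the authors' implementation: routes are plain lists of node names, intersection is taken at nodes, and "d/2 routes intersect" is interpreted as "at least half of the verifier's routes meet some route of the suspect". The function names are hypothetical.

```python
def routes_intersect(route_a, route_b):
    """True if the two routes share at least one node."""
    return not set(route_a).isdisjoint(route_b)


def verifier_accepts(verifier_routes, suspect_routes):
    """Accept the suspect when at least half of the verifier's routes
    intersect one or more of the suspect's routes (assumed threshold)."""
    hits = sum(
        1
        for v_route in verifier_routes
        if any(routes_intersect(v_route, s_route) for s_route in suspect_routes)
    )
    return 2 * hits >= len(verifier_routes)
```

For example, `verifier_accepts([["a", "x"], ["a", "y"]], [["b", "x"]])` is true because one of the two verifier routes meets the suspect's route at `x`, while replacing the suspect's route with `["b", "z"]` leads to rejection.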

Main Assumptions of SybilGuard [Figure: honest nodes and sybil nodes, connected by attack edges]

Properties of Random Routes • Convergence: once two routes merge, they remain merged • Routes are back-traceable • There can be only one route of length w that traverses edge e along a given direction at its ith hop • If two random routes ever share an edge in the same direction, then one of them must start in the middle of the other • Cycles can exist, but with low probability • Prob(diameter-k cycle) = 1/d^(k-2)

SybilGuard Algorithm • Step 1: Bootstrap the network. All users exchange signed keys. Key exchange implies that both parties are human and trustworthy. • Step 2: Choose a verifier (A) and a suspect (B). A and B send out random walks of a certain length (2). Look for intersections. A knows B is not a Sybil because multiple paths intersect, and they do so at different nodes. [Figure: routes from A and B]

SybilGuard Algorithm, cont. [Figure: verifier A and suspect B]

SybilGuard Caveats • Bootstrapping requires human interaction • Assumes short random walks lie mostly in the honest region • Results in a poor threshold against colluding attackers • In a million-node network, nearly 2000 sybil nodes are accepted per attack edge • In a million-node network, SybilGuard cannot bound the number of sybils at all if there are more than 15,000 attack edges

SybilLimit: A Near-Optimal Social Network Defense Against Sybil Attacks

SybilLimit: A Near-Optimal Social Network Defense Against Sybil Attacks • Motivation: to mitigate the problems of SybilGuard • Basic insight: social network (same as SybilGuard) • SybilLimit novelty: 1. Use many random routes, but shorter ones 2. Intersect edges, not nodes 3. Limit how often each edge is used

Identity Registration • Each node (honest or sybil) has a locally generated public/private key pair • “Identity”: V accepts S means V accepts S’s public key KS • No PKI is assumed or needed • Every suspect S “registers” KS on some other nodes

Registration Goals • Ensure that sybil nodes (collectively) register only on a limited number of honest nodes • Still provide enough “registration opportunities” for honest nodes [Figure: registered keys (K) of honest nodes and of sybil nodes, placed across the honest region and the sybil region]

Acceptance Criteria • Accept S only if KS is registered on sufficiently many honest nodes [Figure: registered keys (K) of honest nodes and of sybil nodes, placed across the honest region and the sybil region]

Key Idea • Take random “walks” of w hops • Honest nodes: likely to remain in the honest region • Sybil nodes: must cross an attack edge to reach the honest region • Register the key at the last hop of the “walk” [Figure: registered keys (K) in the honest region and the sybil region]

AB 1. request S’s set of tails 2. I have three tails AB; CD; EF 4. Is KS registered? EF CD F 5. Yes. Verification Procedure S V 3.common tail: EF 4 messages involved V accepts S Tails intersect + key registered

Sybil nodes accepted between unbounded and unbounded unbounded

SybilInfer: How to Win the Zombie Wars! • Prateek Mittal (MSRC Intern), George Danezis (MSR Cambridge)

SybilInfer • Work from UIUC and Microsoft Research • A centralized algorithm • Uses the fast mixing properties of social network to design a Bayesian Classifier • Classify nodes

Formal Model • Assign probabilities of cuts being honest • Using Bayes Theorem, we have that : • Next Challenge: Model
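
For reference, the generic form of Bayes' theorem for this setting can be written as follows. The symbols are assumptions introduced here for illustration and are not taken from the slide: X denotes a candidate honest set (a cut of the graph) and T denotes the observed random-walk traces.

```latex
P(X \mid T) = \frac{P(T \mid X)\, P(X)}{P(T)},
\qquad
P(T) = \sum_{X'} P(T \mid X')\, P(X')
```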

Formal Model X X

Sybil-proof DHT

Distributed Hash Table • Interface: PUT(key, value), GET(key) → value • Route to the peer responsible for the key • Example: PUT(sip://alice@foo, 18.26.4.9), GET(sip://alice@foo)
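
The PUT/GET interface and the idea of a "responsible peer" can be sketched with a toy, single-process table. Everything here is an assumption made for illustration: the class name `ToyDHT`, the use of SHA-1 for identifiers, and the Chord-style rule that a key belongs to the first peer whose identifier is at or after the key's hash on the ring. Real DHTs route the request across the network instead of looking it up locally.

```python
import hashlib
from bisect import bisect_left


def ring_id(text):
    """Map a string to a position on the identifier ring."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


class ToyDHT:
    def __init__(self, peer_names):
        # Peers sorted by their position on the ring.
        self.ring = sorted((ring_id(name), name) for name in peer_names)
        self.ids = [peer_id for peer_id, _ in self.ring]
        self.storage = {name: {} for name in peer_names}

    def responsible_peer(self, key):
        """First peer at or after the key's position, wrapping around."""
        index = bisect_left(self.ids, ring_id(key)) % len(self.ring)
        return self.ring[index][1]

    def put(self, key, value):
        self.storage[self.responsible_peer(key)][key] = value

    def get(self, key):
        return self.storage[self.responsible_peer(key)].get(key)


dht = ToyDHT(["peer-1", "peer-2", "peer-3", "peer-4"])
dht.put("sip://alice@foo", "18.26.4.9")
```

After the `put`, `dht.get("sip://alice@foo")` returns the stored address, and exactly one peer holds the entry. An attacker who controls many identifiers on the ring would be responsible for many keys, which is the weakness the following slides address.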

DHTs are subject to the Sybil attack • The attacker creates many pseudonyms • Disrupts routing or stabilization [Figure: route from s to t, labeled {IDt}]

The Sybil attack on open DHTs • Brute-force attack • Clustering attack
