Availability for DHT-Based Overlay Networks with Unidirectional Routing. Jan Seedorf ( jan.seedorf_at_nw.neclab.eu ) NEC Laboratories Europe Heidelberg, Germany. Christian Muus ( christian_at_muus.de ) University of Hamburg Hamburg, Germany. Outline.
Availability for DHT-Based Overlay Networks with Unidirectional Routing
Jan Seedorf (jan.seedorf_at_nw.neclab.eu), NEC Laboratories Europe, Heidelberg, Germany
Christian Muus (christian_at_muus.de), University of Hamburg, Hamburg, Germany
Outline
• Introduction: Distributed Hash Tables (DHTs)
• DHT Security and Lookup Availability
• Motivation: The Shield Problem in Chord
• Extending a Unidirectional DHT
• Analytical Observations
• Proposed Algorithms
• Simulation Results
• Conclusion
Introduction: Overlay Networks
• Why are P2P systems called overlay networks?
[Figure: an overlay network shown on top of the IP network]
Introduction: Scope
• Structured overlay networks
  • P2P networks with formal guarantees
  • Use Distributed Hash Tables (DHTs) as underlying structure
• Scope of this work
  • DHTs with unidirectional routing
  • Availability of the lookup service during attacks on DHT routing
• Our contributions
  • Analytical bounds
  • Concrete algorithms to increase lookup availability
  • Simulation results demonstrating the effectiveness of the proposed algorithms
What is a Distributed Hash Table?
A formally defined substrate to efficiently and consistently store data items in a P2P network
• Nodes participating in the network can use the DHT for fast and reliable search requests
• Given a key, the DHT returns the IP address of the node responsible for the key
• Usually, a predefined hash function is used to map nodes and keys onto an ID
• Examples: Chord, CAN, Pastry, Tapestry, Kademlia
→ DHT: Lookup(key) -> IP-address(node_resp[key])
A unidirectional DHT: Chord
• Chord
  • Uses a predefined hash function h() with an output of m bits
  • Nodes compute their NodeID by hashing their IP address: node_ID(n) = h(IP-address(n))
  • The KeyID for a key belonging to some content is computed by hashing the key: key_ID(key) = h(key)
  • Lookup(k) = IP-address(root_k), where root_k is the node responsible for k
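The ID assignment above can be sketched in a few lines. This is only an illustration: the slide does not name the hash function, so truncating SHA-1 to m bits, the value m = 32, and the helper names `h`, `node_id`, and `key_id` are assumptions.

```python
import hashlib

M = 32  # assumed identifier length in bits (the slide only says "m bits")

def h(data: bytes, m: int = M) -> int:
    """Map arbitrary bytes to an m-bit identifier (SHA-1 reduced mod 2**m; an assumption)."""
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def node_id(ip_address: str) -> int:
    """node_ID(n) = h(IP-address(n))"""
    return h(ip_address.encode("utf-8"))

def key_id(key: str) -> int:
    """key_ID(key) = h(key)"""
    return h(key.encode("utf-8"))
```

Because nodes and keys share the same ID space, a key can be assigned to the node whose ID follows it on the ring.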
Routing in Chord
• Routing structure
  • A virtual ring (of size $2^m$) is used for routing of messages
  • Successor: next node on the circle
  • Predecessor: previous node on the circle
  • Every node in the ring is responsible for storing the content of keys with an ID between its predecessor and itself
• Routing table Tr
  • m succeeding nodes in the ring (unidirectional routing)
  • The closest node preceding the key-ID is selected (greedy routing)
  • Tr(j): first node on the circle that succeeds $n + 2^{j-1} \bmod 2^m$ ($1 \le j \le m$)
  • Nodes in the routing table are at increasing distance from n
  • Any node can route at most halfway around the circle (roughly $2^{m-1}$ away from n)
• Routing table Ts
  • List of s direct successors in the ring
→ Unidirectional greedy routing
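A minimal sketch of the routing table Tr and the greedy next-hop choice, assuming a globally known sorted list of node IDs (`ring`) purely for illustration; a real node only knows its own tables. The function names and the example ring are hypothetical.

```python
from bisect import bisect_left

def successor(ring, ident):
    """First node ID on the sorted ring at or after ident, wrapping around."""
    i = bisect_left(ring, ident)
    return ring[i % len(ring)]

def finger_table(ring, n, m):
    """Tr(j) = first node succeeding n + 2^(j-1) mod 2^m, for 1 <= j <= m."""
    return [successor(ring, (n + 2 ** (j - 1)) % 2 ** m) for j in range(1, m + 1)]

def clockwise(a, b, m):
    """Clockwise distance from a to b on a ring of size 2^m."""
    return (b - a) % 2 ** m

def next_hop(n, fingers, key, m):
    """Greedy choice: the finger closest to the key that still precedes it, else None."""
    best = None
    for f in fingers:
        d = clockwise(n, f, m)
        if 0 < d < clockwise(n, key, m):
            if best is None or d > clockwise(n, best, m):
                best = f
    return best

ring = [0, 27, 55, 65, 89, 128, 161, 192, 215]  # hypothetical node IDs, m = 8
print(next_hop(0, finger_table(ring, 0, 8), 57, 8))
```

In this example ring, node 55 has no finger that precedes key 57, so routing can only continue via its direct successor.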
Routing in Chord: Lookup(57)
[Figure: Chord ring with example Lookup(57), shown once for recursive routing (steps 1 to 3) and once for iterative routing (steps 1 to 5)]
DHT Security
• No central authority in the network
  • Authentication is difficult
• Adversary nodes can:
  • Spoof identity, falsify messages in the overlay, ...
• Sybil Attack [1]
  • Without a trusted agency which certifies identities, adversary nodes can control a large fraction of an overlay network
• Security requirements [2]
  • Secure Node-ID assignment
  • Secure routing table maintenance
  • Secure message forwarding (focus of our work)
[1] Douceur: "The Sybil Attack", IPTPS 2002
[2] Castro et al.: "Secure routing for structured overlay networks", Usenix 2002
Lookup Availability
• Availability of the lookup service
  • The probability that the corresponding data item is returned by the DHT after a node has invoked an arbitrary lookup
• Metric: success rate
  • The probability that an arbitrary lookup will succeed (n_q: query node, k: key-ID)
• A lookup can consist of several paths
  • Path: any set of nodes such that routing from the query node for key k will pass through these nodes, including root_k
  • Alternate vs. independent paths
Attacker Model and Assumptions
• Attacker model
  • Colluding attacker nodes
  • f: fraction of adversary nodes in the network
  • Attacker nodes forward messages solely to attacker nodes
  • IP-layer attacks out of scope
• Assumptions
  • Secure Node-ID assignment
    • Adversary nodes are distributed uniformly over the ID space
  • Prevention against routing table poisoning
  • Integrity of data stored in the DHT can be verified by query nodes
    • Cryptographically signed data
• Problem to solve
  • Secure message forwarding to achieve lookup availability
The Shield Problem in Chord
[Figure: Chord ring showing Lookup(57), Shield(57), and the routing table of node 55]
The Shield Problem in Chord
• The query node does not know which node stores the desired content; only the shield has this knowledge
• Every lookup has to pass the predecessor of root_k
• There is only one independent path for every lookup
• If shield_k is an adversary node, no lookup for k can succeed
Shield Problem: Analytical Analysis
• Upper bound for lookup success in Chord [3]
• This bound is lower than the bound for DHTs in general [4]
[Figure: upper bounds on lookup success, DHTs in general vs. Chord]
[3] Seedorf, Muus: "Availability for Structured Overlay Networks: Considerations for Simulation and a new Bound on Lookup Success", NordSec 2007
[4] Srivatsa, Liu: "Vulnerabilities and Security Threats in Structured Overlay Networks: A Quantitative Analysis", ACSAC 2004
Shield Problem: Observations
The shield problem is a result of unidirectional greedy routing
• Every lookup has to pass the direct predecessor (shield_k) of the node responsible for storing the key (root_k)
• Lookup availability is specific to the DHT protocol used
  • Regular Chord routing yields a success rate worse than DHTs in general
  • Secure routing techniques for multidirectional DHTs (e.g., Pastry) are not applicable
Can we circumvent the shield problem while keeping a unidirectional routing structure?
• Unidirectional routing has the advantage of caching
• Chord: very popular DHT, many implementations and formal results
• Goal: enhance Chord
  • Keep formal properties
  • Develop security techniques that are compatible with nodes not supporting these techniques
Analytical Observations
• Success rate in regular Chord: $P(l_{success}) < (1 - f)(1 - f)$, where the two factors account for P(root is adversary) $= f$ and P(shield is adversary) $= f$
  • Only one node is responsible for storing the content for a key
  • Only one independent path from query node to root
  • Example: $f = 0.8$ gives $P(l_{success}) < 0.2^2 = 0.04$
• We need to enhance the protocol so that
  • There are multiple independent paths for each lookup
  • More than one node can be the destination of a lookup
• Then lookup success would be bounded by $P(l_{success}) < (1 - f^{ind})(1 - f^{rep})$
  • Example: $f = 0.8$, $ind = rep = 8$ gives $P(l_{success}) < (1 - 0.8^8)(1 - 0.8^8) \approx 0.69$
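The two numerical examples on this slide can be checked with a short calculation. The sketch assumes the bound has the form (1 - f^ind)(1 - f^rep), which reproduces both slide values; the function name is hypothetical.

```python
def lookup_success_bound(f, ind=1, rep=1):
    """Upper bound on lookup success for attacker fraction f.

    ind: number of independent paths, rep: number of (replica) roots.
    Assumed form: (1 - f**ind) * (1 - f**rep); ind = rep = 1 is regular Chord.
    """
    return (1 - f ** ind) * (1 - f ** rep)

print(round(lookup_success_bound(0.8), 2))        # regular Chord
print(round(lookup_success_bound(0.8, 8, 8), 2))  # ind = rep = 8
```

With f = 0.8, going from one path and one root to eight of each raises the bound from 0.04 to roughly 0.69, which is the motivation for the extensions that follow.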
Extensions to Chord: Design Decisions
• Use iterative routing
  • Gives the query node the option to decide on the next hop
• Complete-knowledge routing
  • At each hop, all information this hop node has (i.e., its routing tables Tr and Ts) is returned to the query node
  • This only increases the size of messages
  • Nodes do not need to store more information than in regular Chord
  • Compatible with nodes not supporting this
• Local vs. global extensions
  • All other techniques introduced are computed solely at the query node
  • The query node neither trusts nor depends on any other node in the network for these techniques
• Multiple routing paths
  • Two variations explored: backtracking / independent restart
Multiple Independent Paths
• Regular Chord
  • Only one independent path between query node and root_k
  • Every lookup path has to pass the shield node
  • The direct successor list Ts is only used for redundancy
• Idea: use the direct successor list Ts in routing
  • The query node nq uses a temporary memory list Tm, keeping track of all nodes visited so far in the lookup
  • At each iterative routing hop nj, the node closest to the key from Tr(nj) that is not in Tm(nq) is selected as the next routing hop (unidirectional greedy routing)
  • If all nodes in Tr(nj) are in Tm(nq), the node closest to the key from Ts(nj) that is not in Tm(nq) is chosen, including root_k
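The selection rule can be sketched as follows. This is one possible reading, not the authors' implementation: "all nodes in Tr(nj) are in Tm(nq)" is interpreted as "Tr offers no unvisited node preceding the key", and a node in Ts at or after the key is treated as a root candidate. All names are hypothetical.

```python
def cw(a, b, m):
    """Clockwise distance from a to b on a ring of size 2^m."""
    return (b - a) % 2 ** m

def select_next_hop(n_j, k, tr, ts, visited, m):
    """Choose the next hop at the query node from the tables returned by hop n_j.

    tr, ts: routing table and successor list returned by n_j.
    visited: the query node's memory list Tm.
    """
    span = cw(n_j, k, m)

    def best_preceding(table):
        # Unvisited nodes strictly between n_j and the key; pick the one closest to the key.
        cands = [x for x in table if x not in visited and 0 < cw(n_j, x, m) < span]
        return min(cands, key=lambda x: cw(x, k, m)) if cands else None

    hop = best_preceding(tr)
    if hop is not None:
        return hop
    # Fallback to Ts: a successor at or after the key is a root candidate.
    roots = [x for x in ts if x not in visited and cw(n_j, x, m) >= span]
    if roots:
        return min(roots, key=lambda x: cw(n_j, x, m))
    return best_preceding(ts)
```

For example, with hop node 31, key 57, and node 55 already visited, the rule picks node 65 from Ts and so reaches the root candidate without relying on node 55.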
Multiple Independent Paths
• Using the direct successor list Ts(nj)
  • In combination with complete-knowledge routing:
    • As soon as a node in Ts(nj) received at hop j directly succeeds k in the ring, it must be root_k (if nj is non-adversary), and the query node routes to this node
  • Enables the query node to learn which node is storing the desired content from nodes other than the shield
  • Stretches the set of potential nodes in the penultimate routing hop to s = size[Ts] nodes
• Can achieve up to s = size[Ts] independent paths
  • At the penultimate hop, at most s independent paths converge
• Adversary nodes
  • Return successor lists (Ts) with only adversary nodes
  • As long as one of the s = size[Ts] nodes directly preceding root_k in the virtual routing ring is non-adversary, root_k can be reached
Multiple Independent Paths
[Figure: Lookup(k) from the query node; shield_k is an adversary; node x returns its list of direct successors Ts(x), s = 3, to the query node]
Using the direct successor list gives the query node knowledge of the root without using the shield
Direct Replica Routing
• Regular Chord
  • Content for a key k is stored at r replica roots directly succeeding root_k in the virtual ring (only used for redundancy)
• Idea:
  • Route directly (without passing root_k) to the replica roots
• The query node can determine whether a node received in Ts(nj) at hop nj is a replica root
  • Ts(nj) contains direct successors in the ring
  • If a node in Ts(nj) has an ID larger than the key k, it can be a replica root
  • Only the r nodes directly succeeding k are replica roots
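A minimal sketch of how a query node might pick replica-root candidates out of a returned successor list. It assumes the hop node precedes the key, that the returned Ts is honest and contiguous, and that "the r nodes directly succeeding k" are the candidates; the function name is hypothetical.

```python
def replica_root_candidates(n_j, k, ts, r, m):
    """Return up to r nodes from Ts(n_j) that directly succeed key k on a ring of size 2^m."""
    ring = 2 ** m
    span = (k - n_j) % ring  # clockwise distance from the hop node to the key
    # Nodes at or beyond the key, ordered by their clockwise distance from n_j.
    after_key = sorted(
        (x for x in ts if (x - n_j) % ring >= span),
        key=lambda x: (x - n_j) % ring,
    )
    return after_key[:r]

print(replica_root_candidates(27, 57, [31, 55, 65, 88], 3, 8))
```

If an adversary hop node suppresses nodes in its list, the candidates may be wrong, which is why the sketch only speaks of candidates.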
Multipath Replica Routing
• Combining direct replica routing with independent multipath routing
• MRR: Multipath Replica Routing
  • Effectively results in s shield nodes and r (replica) root nodes for every key
  • As long as one of these shield nodes and one of these replica roots is non-adversary, lookups can succeed
[Figure legend: adversary nodes, non-adversary nodes]
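The success condition can be illustrated with a small Monte Carlo experiment. This is an idealized sketch, not the simulation used in the talk: it assumes each of the s shield nodes and r replica roots is adversarial independently with probability f, and the function names, trial count, and seed are arbitrary choices.

```python
import random

def mrr_can_succeed(shields_adversary, roots_adversary):
    """A lookup can succeed if at least one shield and at least one replica root is honest."""
    return (not all(shields_adversary)) and (not all(roots_adversary))

def estimate_success(f, s, r, trials=20000, seed=1):
    """Estimate the fraction of keys for which the MRR success condition holds."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        shields = [rng.random() < f for _ in range(s)]
        roots = [rng.random() < f for _ in range(r)]
        ok += mrr_can_succeed(shields, roots)
    return ok / trials

print(estimate_success(0.5, 3, 3))
```

Under these assumptions the estimate should land near (1 - f^s)(1 - f^r), i.e. about 0.77 for f = 0.5 and s = r = 3.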
Detecting Node-ID Suppression Attacks
• Node-ID suppression attacks
  • In our model, attacker nodes route exclusively to attacker nodes
  • They suppress existing good nodes in the routing tables they return (Tr, Ts)
• Idea:
  • If attacker nodes are distributed uniformly in the node-ID space, the average distance between nodes in Ts(adversary_node) should be higher than in Ts(query_node)
• Density checks: if the density is higher than a threshold, the hop node is considered to be adversary and ignored immediately
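One way to sketch such a check is shown below. The slide does not define how "density" is measured; here it is assumed to be the ratio between the average node spacing in the returned successor list and the average spacing in the query node's own successor list, compared against a threshold t_d. Names and example values are hypothetical, and both lists are assumed to be non-empty.

```python
def average_gap(start, successors, m):
    """Average clockwise spacing along start -> successors[0] -> successors[1] -> ..."""
    ring = 2 ** m
    gaps = []
    prev = start
    for node in successors:
        gaps.append((node - prev) % ring)
        prev = node
    return sum(gaps) / len(gaps)

def is_suspicious(hop_id, hop_ts, own_id, own_ts, t_d, m):
    """Flag the hop node if its successor list is much sparser than the query node's own."""
    ratio = average_gap(hop_id, hop_ts, m) / average_gap(own_id, own_ts, m)
    return ratio > t_d

# Own list has spacing 10; the returned list has spacing 50, so the ratio is 5.
print(is_suspicious(100, [150, 200, 250], 0, [10, 20, 30], 2.5, 8))
```

A hop node flagged this way would be dropped from the lookup, and the query node would continue with another unvisited candidate.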
Simulation Results
[Figure: analytical upper bound, m = 32]
Simulation Results
[Figure: MRR-r with th = ∞ compared to the analytical upper bound (N = 1000/2000)]
Simulation Results
• Simulations with no hop threshold
  • The analytical bound can be closely reached even though MRR cannot guarantee s independent paths
  • With large f, the hop count gets extremely high:
    • e.g., f = 0.7, N = 2000: success rate of 92% but an average of 635 hops per lookup
→ Further simulations with a hop threshold are necessary
Simulation Results
[Figure: success rate for MRR compared to regular Chord and the upper bound (th = 50, N = 4000)]
Simulation Results
• Simulations with hop threshold
  • Independent restart performs better than backtracking for attacker rates up to f = 0.6
• Density checks on every hop significantly increase lookup availability
  • A higher threshold td is better suited for low attacker rates, whereas a lower threshold results in better performance for high attacker rates
  • With higher attacker rates, the query node already has many adversary nodes in its own table of direct successors Ts(nq)
• Density checks also decrease the average hop count
  • MRR-r, th = 100, f = 0.6 (ρ: success rate, Χ: average hop count):
    • Without density checks: ρ = 0.49; Χ = 74.1
    • With density checks, td = 2.5: ρ = 0.61; Χ = 68.1
    • With density checks, td = 1.5: ρ = 0.62; Χ = 59.8
→ Tradeoff between success rate, hop threshold, and average hop count
Conclusion
• Unidirectional DHT routing has implications for security
  • Shield problem in Chord
  • DHT security is specific to the DHT protocol used
• Based on analytical observations, a DHT with unidirectional routing (Chord) has been extended
  • Minimal changes, keeping formal properties of the DHT
  • Three techniques:
    • Unidirectional multipath routing
    • Direct replica routing
    • Node-ID suppression detection with density checks
• Proposed algorithms have been simulated
  • Significantly increase the success rate for lookups
  • Come very close to analytical bounds
Outlook
• More simulations
  • Investigate the tradeoff between
    • Hop threshold
    • Average hop count
    • Success rate
  • Simulate larger networks
    • ~10,000 nodes
• Investigate other unidirectional DHTs
  • Kademlia
→ Generalise results
Contact Details
Jan Seedorf, Research Scientist (jan.seedorf_at_nw.neclab.eu)
NEC Laboratories Europe (NLE), Network Division
NEC Europe Ltd., Heidelberg, Germany