CSE 548 Advanced Computer Network Security – Content Overlay Dijiang Huang Arizona State University, Fall 2007
Outline • Structured Overlay • Chord • Security Issues
Chord: Overview • What is Chord? [Stoica et al.] • A scalable, distributed "lookup service" • Lookup service: a service that maps keys to values (e.g., DNS, directory services, etc.) • Key technology: consistent hashing • Major benefits of Chord over other lookup services • Simplicity • Provable correctness • Provable "performance"
What is a DHT? • A distributed hash table provides the information lookup service (put(key, data) / get(key)) that a distributed P2P application builds on • Nodes are uniformly distributed across the key space • Nodes form an overlay network • Each node maintains a list of neighbors in a routing table • The overlay is decoupled from the physical network topology (Figure adapted from Frans Kaashoek)
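A minimal sketch of the put/get interface a DHT exposes, assuming a toy in-process setup where each "node" is just a local dictionary and a SHA-1 based hash maps keys and node names onto a shared ID space (the node addresses and parameters are illustrative, not from the slides):

```python
import hashlib

class ToyDHT:
    """Toy, single-process stand-in for a DHT: each "node" is a local dict."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}
        self.ring = sorted(self.nodes, key=self._hash)   # nodes ordered by ID

    @staticmethod
    def _hash(s):
        # Map any string uniformly onto a 32-bit ID space.
        return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** 32)

    def _owner(self, key):
        # Owner = first node whose ID is >= the key's ID, wrapping around.
        kid = self._hash(key)
        for name in self.ring:
            if self._hash(name) >= kid:
                return name
        return self.ring[0]

    def put(self, key, data):
        self.nodes[self._owner(key)][key] = data

    def get(self, key):
        return self.nodes[self._owner(key)].get(key)

dht = ToyDHT(["198.10.10.1", "198.10.10.2", "198.10.10.3"])
dht.put("LetItBe", b"MP3 data")
print(dht.get("LetItBe"))
```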
Applications • Anything that requires a hash table • Databases, file systems, storage, archival • Web serving, caching • Content distribution • Query & indexing • Naming systems • Communication primitives • Chat services • Application-layer multicast • Event notification services • Publish/subscribe systems
Chord: Primary Motivation • Scalable location of data in a large distributed system • A publisher stores (Key="LetItBe", Value=MP3 data); a client issues Lookup("LetItBe") • Key problem: lookup (Figure: a publisher and a client attached to nodes N1–N5)
Chord: Design Goals • Load balance: Chord acts as a distributed hash function, spreading keys evenly over the nodes. • Decentralization: Chord is fully distributed: no node is more important than any other. • Scalability: the cost of a Chord lookup grows as the log of the number of nodes, so even very large systems are feasible. • Availability: Chord automatically adjusts its internal tables to reflect newly joined nodes as well as node failures, ensuring that the node responsible for a key can always be found. • Flexible naming: Chord places no constraints on the structure of the keys it looks up.
Consistent Hashing • Uniform hash: assigns values to "buckets" • e.g., H(key) = f(key) mod k, where k is the number of nodes • Achieves load balance if keys are randomly distributed • Problems with uniform hashing • How do we perform the hashing in a distributed fashion? • What happens when nodes join and leave? (k changes, so nearly every key is remapped; see the sketch below) • Consistent hashing addresses these problems
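A small sketch of the remapping problem mentioned above, assuming a toy setup where keys are hashed with SHA-1 and assigned by mod k; when k grows from 4 to 5 nodes, most keys land in a different bucket (the key names are made up for illustration):

```python
import hashlib

def bucket(key, k):
    """Uniform hashing: H(key) = f(key) mod k, with k = number of nodes."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % k

keys = [f"key-{i}" for i in range(10_000)]
before = {key: bucket(key, 4) for key in keys}   # 4 nodes
after = {key: bucket(key, 5) for key in keys}    # a 5th node joins
moved = sum(1 for key in keys if before[key] != after[key])
print(f"{moved / len(keys):.0%} of keys changed nodes")   # typically ~80%
```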
Consistent Hashing • Main idea: map both keys and nodes (node IP addresses) to the same (metric) ID space • A ring is one option; any metric space will do • Initially proposed for relieving Web cache hot spots [Karger97, STOC]
Consistent Hashing • The consistent hash function assigns each node and key an m-bit identifier using SHA-1 as a base hash function • Node identifier: SHA-1 hash of IP address • Key identifier: SHA-1 hash of key
Chord Identifiers • m-bit identifier space for both keys and nodes • Key identifier: SHA-1(key), e.g., Key="LetItBe" → ID=60 • Node identifier: SHA-1(IP address), e.g., IP="198.10.10.1" → ID=123 • Both are uniformly distributed • How to map key IDs to node IDs?
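A sketch of Chord-style identifier assignment under the assumptions above (SHA-1 truncated to m bits); the concrete IDs printed are whatever SHA-1 yields, not necessarily the 60 and 123 used in the slides' figure:

```python
import hashlib

M = 7   # m-bit identifier space: IDs live on a ring of 2^m points

def chord_id(value, m=M):
    """SHA-1 hash truncated to m bits, used for both keys and node IPs."""
    digest = hashlib.sha1(value.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

print("node ID:", chord_id("198.10.10.1"))   # SHA-1(IP address) mod 2^m
print("key ID :", chord_id("LetItBe"))       # SHA-1(key) mod 2^m
```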
Consistent Hashing in Chord • A key is stored at its successor: the node with the next higher ID (Figure: circular 7-bit ID space with nodes N32, N90, N123 and keys K5, K20, K60, K101; Key="LetItBe" hashes to K60, IP="198.10.10.1" hashes to N123)
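A sketch of the successor rule, assuming the node set from the figure (N32, N90, N123) is known locally as a sorted list:

```python
import bisect

def successor(key_id, node_ids):
    """Return the first node ID >= key_id, wrapping past the top of the ring."""
    node_ids = sorted(node_ids)
    i = bisect.bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]

nodes = [32, 90, 123]                  # N32, N90, N123 from the figure
for k in (5, 20, 60, 101):             # K5, K20, K60, K101
    print(f"K{k} -> N{successor(k, nodes)}")
# K5 -> N32, K20 -> N32, K60 -> N90, K101 -> N123
```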
Consistent Hashing Properties • Load balance: all nodes receive roughly the same number of keys • Flexibility: when a node joins (or leaves) the network, only an O(1/N) fraction of the keys are moved to a different location • This is optimal (i.e., the minimum movement necessary to maintain a balanced load)
Consistent Hashing • Every node knows of every other node • Requires global information • Routing tables are large: O(N) • Lookups are fast: O(1) (Figure: any node can answer "Where is LetItBe?" directly: Hash("LetItBe") = K60, and "N90 has K60")
Load Balance Results (Theory) • For N nodes and K keys, with high probability • each node holds at most (1+ε)K/N keys • when node N+1 joins or leaves, O(K/N) keys change hands, and only to/from node N+1
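An illustrative (not rigorous) check of the load-balance claim: hash N node IDs and K keys into a large ID space and count the keys that fall into each node's successor interval; the node and key names are made up:

```python
import bisect
import hashlib

def h(x):
    return int(hashlib.sha1(x.encode()).hexdigest(), 16) % (2 ** 32)

N, K = 100, 10_000
node_ids = sorted(h(f"node-{i}") for i in range(N))
loads = [0] * N
for j in range(K):
    # Each key is counted against its successor node (wrapping around).
    loads[bisect.bisect_left(node_ids, h(f"key-{j}")) % N] += 1

print("ideal K/N =", K / N, "| max load =", max(loads))
```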
Lookups in Chord • Every node knows only its successor in the ring • A lookup may take O(N) hops (Figure: the query "Where is LetItBe?" (Hash("LetItBe") = K60) is forwarded around the ring until "N90 has K60")
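A sketch of successor-only lookup, assuming the ring is known locally as a list of illustrative node IDs; each hop moves to the next node, so the cost grows linearly with N:

```python
def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b], handling wrap-around."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def linear_lookup(start, key_id, succ):
    """succ maps each node ID to its successor's ID; returns (owner, hops)."""
    node, hops = start, 0
    # Forward one hop at a time until key_id falls in (node, successor(node)].
    while not in_half_open(key_id, node, succ[node]):
        node, hops = succ[node], hops + 1
    return succ[node], hops

ring = [10, 32, 55, 90, 123]
succ = {n: ring[(i + 1) % len(ring)] for i, n in enumerate(ring)}
print(linear_lookup(10, 60, succ))   # -> (90, 2); worst case is O(N) hops
```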
Reducing Lookups: Finger Tables • Every node knows m other nodes in the ring • The distance to them increases exponentially (Figure: N80's finger targets at 80 + 2^0, 80 + 2^1, …, 80 + 2^6 on a ring containing N16, N96, N112)
Reducing Lookups: Finger Tables • Finger i points to the successor of n + 2^i (Figure: N80's finger targets 80 + 2^0 … 80 + 2^6 map to successor nodes such as N96, N112, N120, N16)
Finger Table Lookups • Each node knows its immediate successor • To resolve id, find the predecessor of id and ask for its successor • Move forward around the ring looking for the node whose successor's ID is ≥ id (i.e., id falls in (node, successor(node)])
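A sketch of finger-table routing over a static, locally known set of node IDs (a simplification: real Chord performs these steps as RPCs between nodes). The node IDs mirror the Lookup(K19) figure on the next slide, and the example lookup starts at N80:

```python
M = 7
RING = 2 ** M

def in_open(x, a, b):        # x in (a, b) on the ring, wrap-aware
    return (a < x < b) if a < b else (x > a or x < b)

def in_half_open(x, a, b):   # x in (a, b] on the ring, wrap-aware
    return (a < x <= b) if a < b else (x > a or x <= b)

def successor(i, node_ids):
    return min((n for n in node_ids if n >= i), default=min(node_ids))

def build_fingers(n, node_ids):
    # Finger i of node n points to successor(n + 2^i).
    return [successor((n + 2 ** i) % RING, node_ids) for i in range(M)]

def find_successor(n, key, fingers, node_ids):
    hops = 0
    while True:
        succ_n = successor((n + 1) % RING, node_ids)   # n's immediate successor
        if in_half_open(key, n, succ_n):
            return succ_n, hops                        # key is owned by succ_n
        # Jump to the closest finger strictly between n and the key;
        # fall back to the immediate successor if no finger qualifies.
        n = next((f for f in reversed(fingers[n]) if in_open(f, n, key)), succ_n)
        hops += 1

nodes = [5, 10, 20, 32, 60, 80, 99, 110]
fingers = {n: build_fingers(n, nodes) for n in nodes}
print(find_successor(80, 19, fingers, nodes))   # -> (20, 2): roughly log2(N) hops
```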
Faster Lookups • Lookups take O(log N) hops (Figure: Lookup(K19) issued at N80 hops across the ring of N5, N10, N20, N32, N60, N80, N99, N110 to reach K19's successor N20)
Example: How Lookup Works (Chord [Stoica et al.]) • Node 2 wants to find key 0 (Figure: a 16-position ring (0–15) with the finger table for Node 2)
How Lookup Works: Example (Chord) • The query is forwarded to Node 10, whose finger table is consulted next (Figure: the 16-position ring with the finger table for Node 10)
How Lookup Works: Example (Chord) • The query reaches Node 14, whose finger table is consulted (Figure: the 16-position ring with the finger table for Node 14)
How Lookup Works: Example (Chord) • Now Node 2 can retrieve the information for key 0 from Node 1 (Figure: the 16-position ring)
Summary of Performance Results • Efficient:O(log N) messages per lookup • Scalable:O(log N) state per node • Robust: survives massive membership changes
Joining the Ring • Three-step process • Initialize all fingers of the new node • Update fingers of existing nodes • Transfer keys from the successor to the new node • Two invariants to maintain • Each node's successor is correctly maintained • successor(k) is responsible for k
Join: Initialize New Node's Finger Table • Locate any node p in the ring • Ask node p to look up the fingers of the new node (Figure: new node N36 issues Lookup(37,38,40,…,100,164) through the ring of N5, N20, N40, N60, N80, N99)
Join: Update Fingers of Existing Nodes • The new node calls an update function on existing nodes • Existing nodes recursively update the fingers of other nodes (Figure: N36 joins the ring of N5, N20, N40, N60, N80, N99)
Join: Transfer Keys • Only keys in the new node's range are transferred (see the sketch below) (Figure: keys 21..36, e.g., K30, are copied from N40 to N36; K38 stays on N40)
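A sketch of the key-transfer step of join (steps 1 and 2, finger initialization and updates, are left as comments), assuming each node's key store is a local dictionary; the example mirrors the figure, with N36 joining and taking K30, but not K38, from N40:

```python
def successor(i, node_ids):
    return min((n for n in node_ids if n >= i), default=min(node_ids))

def predecessor(n, node_ids):
    return max((x for x in node_ids if x < n), default=max(node_ids))

def in_half_open(x, a, b):   # x in (a, b] on the ring, wrap-aware
    return (a < x <= b) if a < b else (x > a or x <= b)

def join(new_id, node_ids, store):
    node_ids = sorted(node_ids + [new_id])
    # Step 1 (omitted): initialize the new node's fingers via lookups through
    # any existing node p.
    # Step 2 (omitted): update the finger tables of existing nodes.
    # Step 3: keys in (predecessor(new), new] move from the successor.
    succ = successor(new_id + 1, node_ids)
    pred = predecessor(new_id, node_ids)
    store[new_id] = {k: store[succ].pop(k) for k in list(store[succ])
                     if in_half_open(k, pred, new_id)}
    return node_ids

nodes = [5, 20, 40, 60, 80, 99]
store = {n: {} for n in nodes}
store[40] = {30: "data-30", 38: "data-38"}
nodes = join(36, nodes, store)
print(store[36], store[40])   # K30 moves to N36; K38 stays on N40
```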
Handling Failures • Problem: failures could cause incorrect lookups • Solution (fallback): keep track of successor fingers (a successor list) (Figure: Lookup(90) on a ring with N10, N80, N85, N102, N113, N120 when an intermediate node has failed)
Handling Failures • Use successor list • Each node knows r immediate successors • After failure, will know first live successor • Correct successors guarantee correct lookups • Guarantee is with some probability • Can choose r to make probability of lookup failure arbitrarily small
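A sketch of the successor-list fallback, assuming failure detection is just membership in a `live` set (real Chord learns about failures and repairs its lists through periodic stabilization); the node IDs are illustrative:

```python
R = 3   # r: number of successors each node tracks

def successor_list(n, ring, r=R):
    """The r nodes that immediately follow n on the ring."""
    i = ring.index(n)
    return [ring[(i + j) % len(ring)] for j in range(1, r + 1)]

def next_hop(n, ring, live, r=R):
    # Skip over failed successors; the first live one takes over.
    for s in successor_list(n, ring, r):
        if s in live:
            return s
    raise RuntimeError("all r successors failed; lookup cannot proceed")

ring = [10, 32, 55, 90, 123]
live = {10, 55, 90, 123}          # N32 has failed
print(next_hop(10, ring, live))   # -> 55, N10's first live successor
```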
Alternatives to DHTs • Distributed file system • Centralized lookup (a central server and DB map keys to nodes) • P2P flooding queries (the query is flooded from client to client) (Figures adapted from Frans Kaashoek)
Outline • Structured Overlay • Chord • Security Issues
Security – Incorrect Lookup (1) • When asked for the "next hop", a node can give a wrong answer • An individual malicious node could forward lookups to an incorrect or non-existent node (Figure: 16-position ring with the finger table for Node 2; Node 2 to Node 10: "Please tell me how to reach key 0 …")
Security – Incorrect Lookup (2) • When asked for the "next hop", give a wrong answer (Figure: the finger table for Node 10; Node 2 to Node 10: "Please tell me how to reach key 0 …"; Node 10 answers: "ask Node 14")
Security – Incorrect Lookup (3) • When asked for the "next hop", give a wrong answer (Figure: the finger table for Node 14; Node 2 to Node 14: "Please tell me how to reach key 0 …"; Node 14 answers: "ask Node 10")
Security – Incorrect Lookup (4) Solution [Sit and Morris]: • "Define verifiable system invariant" • "Allow the querier to observe lookup progress" How this can be implemented: • Concretely, use an integer-valued, monotonically decreasing quantity to capture the idea of "progress": each hop must strictly reduce the remaining ID distance to the key (see the sketch below) • Monotonically decreasing quantities have long been used in program construction to guarantee total correctness (termination)
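A sketch of the progress check described above, assuming the querier drives the lookup iteratively and measures progress as the clockwise ID distance to the key on a 4-bit ring (matching the 16-position examples in the previous slides); for simplicity the sketch stops when a hop lands exactly on the key's position, whereas real Chord stops at the key's successor:

```python
M = 4                       # 16-position ring, as in the lookup example slides
RING = 2 ** M

def clockwise_distance(frm, to):
    return (to - frm) % RING

def verified_lookup(start, key, next_hop):
    """next_hop(node, key) is each node's (possibly malicious) routing answer."""
    node = start
    remaining = clockwise_distance(node, key)
    while remaining > 0:
        candidate = next_hop(node, key)
        new_remaining = clockwise_distance(candidate, key)
        # The invariant: every hop must strictly shrink the distance to the key.
        if new_remaining >= remaining:
            raise ValueError(f"node {node} gave a non-progressing hop {candidate}")
        node, remaining = candidate, new_remaining
    return node

honest = {2: 10, 10: 14, 14: 0}       # a well-behaved routing path
malicious = {2: 10, 10: 14, 14: 10}   # node 14 points back at node 10

print(verified_lookup(2, 0, lambda n, k: honest[n]))        # reaches position 0
try:
    verified_lookup(2, 0, lambda n, k: malicious[n])
except ValueError as err:
    print("detected:", err)
```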
Security – Inconsistent Behaviour • Inconsistent behaviour, i.e., a node lies selectively, giving correct answers to some queriers and wrong ones to others • Sybil attack [Kaashoek] • Solution 1: public keys (sign responses so that conflicting answers can be detected and attributed)
Security – Inconsistent Behaviour • Inconsistent behaviour, i.e., a node lies selectively • Sybil attack [Kaashoek] • Solution 1: public keys • Solution 2: Byzantine agreement protocol [Lamport] • Byzantine Generals Problem: how to find out the traitors among the generals? [Lamport] (Figure: the Commander orders "attack" to both lieutenants, but one lieutenant relays "he said 'retreat'" to the other)
Security – Inconsistent Behaviour • Inconsistent behaviour, i.e., a node lies selectively • Sybil attack [Kaashoek] • Solution 1: public keys • Solution 2: Byzantine agreement protocol • Byzantine Generals Problem: how to find out the traitors among the generals? [Lamport] (Figure: a traitorous Commander orders one lieutenant to "attack" and the other to "retreat"; a lieutenant relays "he said 'retreat'")
Research Projects • Iris – security & fault tolerance – US Gov't • Chord – circular key space • Pastry – circular key space • Tapestry – hypercube space • CAN – n-dimensional key space • Kelips – n-dimensional key space • DDS – middleware platform for Internet service construction – cluster-based – incremental scalability
References • L. Lamport et al. The Byzantine Generals Problem. ACM Transactions on Programming Languages and Systems (TOPLAS), 1982. • E. Sit and R. Morris. Security Considerations for Peer-to-Peer Distributed Hash Tables. IPTPS '02. • F. Kaashoek. Distributed Hash Tables: Building Large-Scale, Robust Distributed Applications (presentation). ACM PODC, 2002. • I. Stoica et al. Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications. Proceedings of ACM SIGCOMM 2001. • D. Karger, E. Lehman, T. Leighton, M. Levine, D. Lewin, and R. Panigrahy. Consistent Hashing and Random Trees: Tools for Relieving Hot Spots on the World Wide Web. STOC 1997.