P2P Databases
P2P Today
edonkey, bittorrent, pastry, jxta, can, fiorana, napster, freenet, united devices, open cola, aim, ocean store, netmeeting, farsite, gnutella, icq, ebay, morpheus, limewire, seti@home, bearshare, uddi, grove, jabber, popular power, kazaa, folding@home, tapestry, mojo nation, process tree, chord
Object Representation and Storage
• Objects
• Attributes: Name, Artist, Album, Genre
• Pointer to object
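A minimal sketch of this object representation in Python (the deck's implementation language). The class and field values are illustrative assumptions, not taken from the original system:

```python
from dataclasses import dataclass

@dataclass
class SharedObject:
    name: str       # attribute: Name
    artist: str     # attribute: Artist
    album: str      # attribute: Album
    genre: str      # attribute: Genre
    pointer: str    # pointer to the actual object, e.g. "peer_ip:/path/to/file"

# Hypothetical example entry:
obj = SharedObject("Seven Innovation Myths", "Some Artist",
                   "Some Album", "Talk", "10.0.0.5:/files/talk.mp3")
```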
P2P vs. Distributed DBMS
Traditional DDBMS issues:
• Transactions
• Distributed query optimization
• Interoperation of heterogeneous data sources
• Reliability/failure of nodes
Complex features do not scale.
P2P vs. Distributed DBMS
Example application: file sharing
• Simple data model and query language
• No complex query optimization
• Easy interoperation
• No guarantee on quality of results
• Individual site availability unimportant
• Local updates
• No transactions
• Network partitions OK
Simple, and therefore amenable to a large-scale network of PCs.
Example: file sharing
[Figure: a peer asks "File X?" of several other peers]
• Challenge #1: Performance
• Asking everyone is expensive!
• If I am smart, I only need to ask one peer
• How can I be smart?
Search in P2P
• The system can control:
• Connections made by users/topology
• Data placement
• Query type
• Tight control ("structured"): efficient, comprehensive
• Loose control ("unstructured"): inefficient and not comprehensive, but simple and expressive; used in real life
Both are useful to study.
Centralized
• Napster model
• Benefits:
• Efficient search
• Limited bandwidth usage
• No per-node state
• Drawbacks:
• Central point of failure
• Limited scale
[Figure: peers Bob, Alice, Judy, and Jane connected to a central index server]
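A hypothetical sketch of the centralized model: one index server maps object names to the peers holding them, so search is a single dictionary lookup (efficient, little bandwidth) but the server is a single point of failure. All names here are assumptions:

```python
from collections import defaultdict

class IndexServer:
    def __init__(self):
        self.index = defaultdict(set)   # object name -> set of peer addresses

    def publish(self, peer, name):      # a peer registers a file it holds
        self.index[name].add(peer)

    def search(self, name):             # O(1) lookup, no flooding
        return self.index.get(name, set())

server = IndexServer()
server.publish("alice:6699", "song.mp3")
server.publish("bob:6699", "song.mp3")
print(server.search("song.mp3"))        # {'alice:6699', 'bob:6699'} (order may vary)
```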
Unstructured – Query Flooding
[Figure legend: query source; forward query; processed query; found result; forward response]
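A toy sketch of the flooding protocol the figure depicts; the data layout and TTL value are assumptions. Each node checks its own files and forwards the query to its neighbors until the TTL runs out (a real system returns results along the reverse path; here they are collected directly):

```python
def flood(nodes, start, query, ttl=3):
    """nodes: {node_id: {'neighbors': [...], 'files': set(...)}}"""
    results, seen = [], {start}
    frontier = [start]
    while frontier and ttl > 0:
        next_frontier = []
        for n in frontier:
            if query in nodes[n]['files']:        # "found result"
                results.append(n)
            for nb in nodes[n]['neighbors']:      # "forward query"
                if nb not in seen:
                    seen.add(nb)
                    next_frontier.append(nb)
        frontier, ttl = next_frontier, ttl - 1
    return results

net = {
    'A': {'neighbors': ['B', 'C'], 'files': set()},
    'B': {'neighbors': ['A', 'D'], 'files': set()},
    'C': {'neighbors': ['A'], 'files': {'X'}},
    'D': {'neighbors': ['B'], 'files': {'X'}},
}
print(flood(net, 'A', 'X'))   # ['C', 'D']: every reachable node was asked
```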
Problems with Unstructured
• Inefficient
• Query messages are flooded
• Even if routing is intelligent, worst-case load is still O(n), where n is the number of nodes in the system
• Not comprehensive
• If I do not get a result for my query, is it because none exists?
• (Of course, many optimizations are possible...)
Structured systems address these problems.
Distributed Hash Tables (DHTs)
• Model: key/object pairs; the key is hashed to get an ID
• Example:
• Objects are files
• The key is the content of the file
• The ID is the hash of the file contents
• Single operation: Lookup(ID)
• Input: an integer ID
• Output: the object with the corresponding ID
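A minimal sketch of this model, assuming SHA-1 as the hash and a local dict standing in for the whole network (a real DHT distributes the store across nodes; only the interface matters here):

```python
import hashlib

M = 32                                    # m-bit identifier space

def make_id(key: bytes) -> int:
    digest = hashlib.sha1(key).digest()   # hash the key (file contents)
    return int.from_bytes(digest, 'big') % (2 ** M)

store = {}                                # stands in for the whole network

def put(contents: bytes):
    store[make_id(contents)] = contents

def lookup(obj_id: int):
    return store.get(obj_id)              # the single DHT operation

put(b"file contents")
print(lookup(make_id(b"file contents")))  # b'file contents'
```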
Identifiers
• IDs are m-bit integers
• Nodes are also assigned IDs
• Commonly assigned by hashing a node's IP address, although this has many problems
• An object is stored on the node with the smallest ID greater than or equal to the object's ID
• This node is called the successor of the object's ID
• IDs are arranged on a circle, so 0 follows 2^m - 1
Data Placement
[Figure: identifier circle with m = 3; nodes at IDs 0, 1, and 3; data keys 1, 2, and 6. Key 1 is stored at node 1, key 2 at node 3, and key 6 at node 0.]
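A sketch of the placement rule on this ring, assuming the standard successor definition (first node clockwise at or after the key's ID, wrapping around the circle); it reproduces the figure's placement:

```python
def successor(key_id, node_ids, m=3):
    ring = sorted(node_ids)
    for n in ring:
        if n >= key_id:                   # first node at or after the key
            return n
    return ring[0]                        # wrap: 0 follows 2**m - 1

nodes = [0, 1, 3]
for key in [1, 2, 6]:
    print(key, "->", successor(key, nodes))
# 1 -> 1, 2 -> 3, 6 -> 0   (matching the figure)
```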
Connections
• Each node keeps "finger pointers" to the nodes at distances 2^0, 2^1, ..., 2^(m-1) around the circle
[Figure: identifier circle with m = 3 showing node 0's finger pointers]
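A sketch of how the fingers are chosen, under the usual assumption that finger i of node n points at the successor of n + 2^i; helper names are my own:

```python
def finger_targets(n, m):
    """IDs at distances 2^0, 2^1, ..., 2^(m-1) from node n."""
    return [(n + 2 ** i) % (2 ** m) for i in range(m)]

def successor(x, ring):                   # first node at or after x, wrapping
    ring = sorted(ring)
    return next((v for v in ring if v >= x), ring[0])

nodes = [0, 1, 3]
targets = finger_targets(0, 3)            # [1, 2, 4]
print([successor(t, nodes) for t in targets])   # node 0's fingers: [1, 3, 0]
```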
Query
• Lookup(objectID)
• objectID is typically the ID of the object you are looking for, but not necessarily
• Approach:
• Find the predecessor of the object, i.e., the node with the largest ID that is smaller than the object's ID
• Return the successor of the predecessor
Query Example
• Say node 0 wants to find the object with ID = 7
• For simplicity, we will assume a node exists at every ID in the space
Query Example
[Figure sequence on the m = 3 ring: node 0 issues Lookup(7) and calls FindPred(7); the call is forwarded to node 4 (FindPred(7)), then to node 6 (FindPred(7)); node 6 is the predecessor and returns its successor, node 7.]
Query Characteristics
• With high probability, a query can be answered by contacting O(log N) nodes, where N is the total number of nodes in the network
• Efficient!
• Also notice: if an object with the given ID exists in the network, it will be found
• Comprehensive!
• Per-node state is also O(log N) in size
Query Characteristics
• Note that finger pointers are not required for correct operation
• Only successor pointers are needed
• But then the cost of a query increases: O(N) in the worst case
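A self-contained sketch of the lookup walk, under the slides' simplifying assumption that a node exists at every ID (m = 3); helper names are assumptions. Each hop jumps via the largest finger that still precedes the target, falling back to the plain successor pointer, which is exactly the degradation to O(N) noted above. Starting at node 0 and looking up ID 7, the walk visits 0 -> 4 -> 6 and returns 7, matching the example:

```python
M = 3

def succ(n):                                    # immediate successor pointer
    return (n + 1) % 2 ** M

def fingers(n):                                 # nodes at n + 2^i (all IDs occupied)
    return [(n + 2 ** i) % 2 ** M for i in range(M)]

def in_half_open(x, a, b):                      # x in ring interval (a, b]
    return a < x <= b if a < b else x > a or x <= b

def in_open(x, a, b):                           # x in ring interval (a, b)
    return a < x < b if a < b else x > a or x < b

def lookup(start, target):
    n, hops = start, [start]
    while not in_half_open(target, n, succ(n)): # n is not yet the predecessor
        cands = [f for f in fingers(n) if in_open(f, n, target)]
        n = max(cands, key=lambda f: (f - n) % 2 ** M) if cands else succ(n)
        hops.append(n)
    return succ(n), hops                        # successor of the predecessor

print(lookup(0, 7))                             # (7, [0, 4, 6])
```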
Advantages of Structured?
• Scalability/efficiency: load grows as O(log N)
• Comprehensiveness
Disadvantages? (cont)
• Availability of data
• If a node dies suddenly, what happens to the data it was storing?
• Data MUST be replicated across multiple nodes
• Query language
• How can we express keyword queries efficiently?
• Many useful applications require different languages
Magnolia
• Current approach: hash each keyword separately and store pointers at h(keyword)
[Figure: for the title "Seven Innovation Myths", pointers to the object are stored at h(seven), h(innovation), h(myths), and h(title); a query for "Innovation" hashes to the same ID, e.g. 1100100101]
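A sketch of this per-keyword approach, with SHA-1 standing in for h and a dict standing in for the DHT (both assumptions): every keyword of the title gets its own ID, and a pointer to the object is stored at each one:

```python
import hashlib

M = 16

def h(keyword: str) -> int:
    digest = hashlib.sha1(keyword.lower().encode()).digest()
    return int.from_bytes(digest, 'big') % 2 ** M

dht = {}
title = "Seven Innovation Myths"
for word in title.split():
    dht.setdefault(h(word), []).append(title)   # pointer stored at h(keyword)

print(dht[h("innovation")])                     # ['Seven Innovation Myths']
```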
Prefix Hashing
• h_P is an m'-bit hash function; its output forms the first m' bits of the m-bit ID (e.g., Innovation -> h_P(innovation) -> prefix 100...)
• This partitions the network into ~2^m' separate sibling groups of ~n/2^m' nodes each (n = number of nodes, m' = partitioning factor)
• For m' = 12 and n = 1 million, ~256 nodes will share the same prefix
• Assumption: h is uniformly distributed
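A minimal sketch of prefix hashing, assuming h_P yields the m' most significant bits of the m-bit ID and the remaining m - m' bits are chosen uniformly, so a keyword's entries spread across one sibling group:

```python
import hashlib, random

M, M_PRIME = 20, 12                       # m bits total, m' prefix bits

def h_p(keyword: str) -> int:             # m'-bit group prefix
    d = hashlib.sha1(keyword.encode()).digest()
    return int.from_bytes(d, 'big') % 2 ** M_PRIME

def keyword_id(keyword: str) -> int:      # prefix + random suffix
    suffix = random.getrandbits(M - M_PRIME)
    return (h_p(keyword) << (M - M_PRIME)) | suffix

# Every ID generated for "innovation" lands in the same sibling group:
ids = [keyword_id("innovation") for _ in range(3)]
print({i >> (M - M_PRIME) for i in ids})  # a single group prefix
```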
Balancing
• Keyword entries are balanced over the sibling group
• All siblings in a group share the same prefix (e.g., sibling group ID = 100)
Lookup
• Insert: keyword -> h_P -> sibling group ID -> random sibling; locate a sibling node via SIFT
• Lookup: group broadcast or multicast within the sibling group; replies in O(1)
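A hypothetical end-to-end sketch of these two flows under the assumptions above; SIFT (the slide's sibling-location primitive) is simulated by picking from a known membership list, and all names are mine:

```python
import hashlib, random

M_PRIME = 4
SIBS = 4
# group ID -> list of sibling nodes; each node keeps its own keyword store
groups = {g: [dict() for _ in range(SIBS)] for g in range(2 ** M_PRIME)}

def h_p(keyword):
    d = hashlib.sha1(keyword.encode()).digest()
    return int.from_bytes(d, 'big') % 2 ** M_PRIME

def insert(keyword, pointer):
    sib = random.choice(groups[h_p(keyword)])        # random sibling in the group
    sib.setdefault(keyword, []).append(pointer)

def lookup(keyword):
    hits = []                                        # group broadcast: ask every sibling
    for sib in groups[h_p(keyword)]:
        hits.extend(sib.get(keyword, []))
    return hits

insert("innovation", "peer42:/seven_innovation_myths")
print(lookup("innovation"))                          # ['peer42:/seven_innovation_myths']
```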
Advantages
• Good balancing properties
• Low traffic load on nodes for popular queries
• Quick lookup
• Popularity ranking of objects
• Distributed replication for resilience
Implementing Magnolia
• Developed on top of a Chord clone written in Python
• If you're going to write a peer-to-peer app, why not leverage existing modules and libraries?
• Challenge: how do we implement group-based stores and queries without requiring additional network maintenance?
Chord's Finger Table
• A Chord node maintains a finger table of M entries (IP addresses) pointing to nodes ahead of it in the ring
• The pointer at index i is the successor of id + 2^(i-1); this lets us reach any node in the network in O(log M) hops
• We use the M' most significant bits of a node's ID to indicate its group; we want to reach any group in O(log M') hops
• Do we need another table?
• No: the last M' entries in our finger table already provide this
Talking to Siblings
• How do we propagate queries through the group?
• Naïve solution: send to our predecessor and successor
• A better solution: we can send a query throughout the group by treating the sibling group as a tree
Sibling Tree
[Figure: example with N/N' = 16 and M/M' = 4. Node 0 is the root; every edge goes from a node to that node plus a power of two (e.g., 0 -> 0+2^3 = 8, 0 -> 0+2^0 = 1, 1 -> 1+2^2 = 5, 8 -> 8+2^2 = 12, ..., 14 -> 14+2^0 = 15), and the tree spans all 16 siblings.]
Every edge can be found in the finger table!
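A sketch of tree-structured group broadcast, written as a standard binomial-tree recursion (the figure's tree has a slightly different shape, but the key property is the same: every edge is a power-of-two jump, i.e. a finger-table entry, so no extra network maintenance is needed):

```python
def broadcast(node, span, visit, group_size):
    """`node` delivers the query, then forwards within the next `span` IDs."""
    visit(node)
    step = 1
    while step < span:
        child = (node + step) % group_size      # child = node + 2^i: a finger edge
        broadcast(child, step, visit, group_size)
        step *= 2

order = []
broadcast(0, 16, order.append, 16)              # 16-node sibling group
print(sorted(order) == list(range(16)))         # True: every sibling is reached once
```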
Sibling Tree Problems
• Problems:
• Not every possible node will exist
• Not every node will have results to report
• The query maker needs to know when the search is done
• But we're okay!
• Nodes can determine whether a child subtree is dead
• Even if a child node in our sibling table has a higher ID than expected, its subtree contains all existing descendants of the expected ID
• We can predict when a child actually belongs in a sibling subtree of an ancestor's tree
Bigger Problems
• What if a pointer in our finger table fails?
• We either have to find the successor of its ID or fail to query that subtree
• What if the lowest-ID node isn't the root of our tree?
• Some of our edges won't be in our finger table