
Introduction (26/10/05)


mdesjardins


  1. Introduction (26/10/05)

  2. P2P systems in general (basics) • Distributed systems • Nodes – usually at the edge of the network • Some characteristics: • Churn (dynamic join and leave) • No central control – all nodes are equal • Issues of trust, incentives for sharing (selfish behavior)

  3. Nodes joining the P2P system connect to (know about) a subset of other nodes; this leads to an overlay network. This overlay network is built on top of the physical network. We shall first see ways of building and maintaining the overlay and routing queries (first, resource location by keyword: a query includes just the keyword). Issue 1: What is the topology, that is, how are nodes connected to each other? What is the structure of the overlay (graph of nodes): ring, grid, balanced tree, unstructured (randomly connect to some nodes), or hybrid (super-peers)? Issue 2: How to build and maintain such a structure (under high churn, failures, etc.)

  4. Nodes participating in the P2P system share resources (data, computational resources, etc.). Beyond resource sharing: publish/subscribe or dissemination systems. In general, STRUCTURED P2P: map data (metadata/index) to nodes. Issue 2: Choose an identifier (address) space (e.g., binary numbers up to 2^m). Issue 3: Map peers and resources to this identifier space. Issue 4: How peers manage this identifier space: assign resources to peers. For example: hash the IP of a node to produce a pid; hash the key of a resource to produce a rid; the peer with a given pid handles all resources within a specific range of rids. Important problem here: load balancing. Issue 5: Maintain this assignment (under high churn, failures, etc.)
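
The hashing step above can be sketched in a few lines. This is only an illustration under stated assumptions: the slide does not name a hash function or an assignment rule, so SHA-1, the constant `M`, and the "first pid at or after the rid, wrapping around" rule in the hypothetical helper `responsible_peer` are all choices made here for the example.

```python
import hashlib
from bisect import bisect_left

M = 16  # assumed size parameter: the identifier space is [0, 2**M)

def to_id(text, m=M):
    """Hash a string (an IP address or a resource key) into [0, 2**m)."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def responsible_peer(rid, sorted_pids):
    """Assumed rule: the first pid >= rid handles it, wrapping to the smallest pid."""
    i = bisect_left(sorted_pids, rid)
    return sorted_pids[i % len(sorted_pids)]

# Hypothetical peers and resource, used only to exercise the functions.
pids = sorted(to_id(ip) for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
rid = to_id("song.mp3")
owner = responsible_peer(rid, pids)
```

With only a few peers the ranges between consecutive pids can be very uneven, which is one way to see why load balancing is listed as an important problem.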

  5. Issue 6: Routing. A query is posed at a node in the overlay; route the query through the overlay towards the node that has the rid or the index entry for the rid, that is, find the peer that handles the rid. How? Assume I know one peer; this peer should direct me to the peer with the right pid (if it exists). How should routing tables be distributed so that they lead me there? For example: all peers can be connected with just two other peers, say, in a ring. The routing table at each peer then has size 2 (just the two neighbors). A query can move towards any other peer, but in O(N) hops. What if I could “jump”? Faster, but the size of the routing table increases (along with the cost of building and maintaining it).
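
The trade-off between table size and hop count can be made concrete with a small sketch. It assumes peers numbered 0..n-1 on a ring, and it assumes that "jumping" means links at power-of-two distances in one direction; the slide does not fix the jump distances, so that part is an illustrative choice.

```python
def hops_ring(src, dst, n):
    """Hops when each peer knows only its two ring neighbours."""
    d = (dst - src) % n
    return min(d, n - d)

def hops_with_jumps(src, dst, n):
    """Hops when each peer can also jump forward by 1, 2, 4, ... positions.

    Greedy routing: always take the largest jump that does not overshoot.
    """
    d = (dst - src) % n
    hops = 0
    while d > 0:
        jump = 1 << (d.bit_length() - 1)  # largest power of two <= d
        d -= jump
        hops += 1
    return hops
```

For a ring of 1024 peers, going from peer 0 to peer 511 takes 511 hops with neighbour links only and 9 hops with the jump links, at the price of about log2(1024) = 10 routing entries per peer instead of 2.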

  6. We shall start with structured overlays and then move to unstructured ones: • Four examples: • CAN • CHORD • SkipNet • BATON

  7. BATON (2/11/05)

  8. The BATON Structure: The overlay network in BATON is a binary balanced tree structure. A tree is balanced if and only if, at any node in the tree, the heights of its two subtrees differ by at most one. It has been shown that a binary balanced tree with N nodes has height no greater than 1.44 log N.
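
The height bound can be checked numerically. The sketch below assumes that height is counted in levels (a single node has height 1) and that the logarithm is base 2; it compares against 1.44 log2(N + 2), the form in which the bound appears in the standard analysis of height-balanced trees, which for large N is essentially the 1.44 log N quoted above. The function name `min_nodes` is introduced here for illustration.

```python
import math

def min_nodes(height):
    """Fewest nodes a height-balanced binary tree with `height` levels can have.

    The sparsest such tree has one subtree of height-1 levels and one of
    height-2 levels, plus the root.
    """
    a, b = 0, 1  # min_nodes(0), min_nodes(1)
    if height == 0:
        return a
    for _ in range(height - 1):
        a, b = b, a + b + 1
    return b

# Compare each height with the bound computed for the sparsest tree of that height.
for h in (1, 5, 10, 20):
    n = min_nodes(h)
    print(h, n, round(1.44 * math.log2(n + 2), 2))
```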

  9. The BATON Structure: Associate with each node in the tree a level and a number. Level: The level of the root is 0, its immediate children are at level 1, and so on. The level of any node is one greater than the level of its parent. The maximum level number in the tree is one less than the height of the tree. Number: At level L there are at most 2^L nodes in a binary tree. Number these 2^L positions from left to right, from 1 to 2^L, within each level, whether or not there is a node currently instantiated at that position. The level and number together precisely determine the location of a node in the binary tree. NOTE: We can use these to determine structural relationships, if any, between a given pair of nodes: not just parent-child and ancestor-descendant relationships, but also siblings, neighbors, and so forth.
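
As a minimal sketch of how level and number determine structural relationships, the helpers below work on `(level, number)` pairs with numbers starting at 1, as defined above. The function names are introduced here for illustration and are not part of BATON itself.

```python
def parent(level, number):
    """Position of the parent, or None for the root."""
    if level == 0:
        return None
    return (level - 1, (number + 1) // 2)

def children(level, number):
    """Positions of the left and right child (whether or not they are instantiated)."""
    return (level + 1, 2 * number - 1), (level + 1, 2 * number)

def is_ancestor(a, d):
    """True if position a is a proper ancestor of position d."""
    (la, na), (ld, nd) = a, d
    if la >= ld:
        return False
    # Climb d up (ld - la) levels using zero-based numbers, then compare.
    return (nd - 1) >> (ld - la) == na - 1
```

For example, position 3 at level 2 has its parent at position 2 of level 1, and the root is an ancestor of every other position.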

  10. The BATON Structure: There is also a linear ordering of the nodes in the tree, given by an in-order traversal. Given a node x, we say that the node immediately prior to it in the traversal is left adjacent to it, and the node immediately after x is right adjacent to it. Adjacent nodes may be at very different levels. In fact, in a complete tree, every alternate node in the traversal is a leaf node, and every other alternate node is an interior node. Even when the tree is not complete, it is easy to show that each interior node must have at least one adjacent node that is either a leaf node or an interior node with fewer than two children.
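
The in-order ordering can be computed directly from level and number. The sketch below is one way to do it, not something prescribed by the slides: it maps each `(level, number)` position to the fraction (2·number − 1) / 2^(level+1), whose ordering coincides with an in-order traversal, and then reads off the adjacent nodes among the positions that are actually instantiated.

```python
from fractions import Fraction

def inorder_key(level, number):
    """Key whose ordering matches an in-order traversal of the tree."""
    return Fraction(2 * number - 1, 2 ** (level + 1))

def adjacent(node, nodes):
    """Return (left adjacent, right adjacent) of `node` among instantiated `nodes`."""
    ordered = sorted(nodes, key=lambda p: inorder_key(*p))
    i = ordered.index(node)
    left = ordered[i - 1] if i > 0 else None
    right = ordered[i + 1] if i + 1 < len(ordered) else None
    return left, right

# A complete tree with levels 0..2 (seven nodes), used only as an example.
complete = [(0, 1), (1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (2, 4)]
```

In this complete tree the root's adjacent nodes are the leaves at positions 2 and 3 of level 2, which illustrates that adjacent nodes may sit at very different levels.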

  11. The BATON Structure: Each node in the tree typically maps to exactly one peer node. Each node has a logical id in terms of its level and number, and a physical id in terms of its IP address.

  12. The BATON Structure • Each node in the tree maintains links to its • parent, • children, • adjacent nodes, and • selected neighbor nodes, which are nodes at the same level. Links to parent, children and adjacent nodes are the physical ids of • the parent, • the left child and the right child, • the left adjacent node and the right adjacent node, if any.

  13. The BATON Structure • Links to selected neighbors are maintained by means of two special sideways routing tables: • a left routing table and • a right routing table • Each routing table contains links to nodes at the same level with numbers that are less (respectively greater) than the number of the source node by a power of 2 (jumps to same-level nodes on the left and right) • The jth element in the left (right) routing table at the node numbered N contains a link to the node at number N - 2^(j-1) (respectively N + 2^(j-1)) at the same level in the tree. • If there is no such node, an entry is still made but marked as null • A routing table is considered full if all valid links are not null.

  14. The BATON Structure: For example, consider node h. Its left routing table has no valid links. Its right routing table contains neighbor links to nodes i, j, and l. Some similarity with CHORD, but: the nodes lie in a straight line (not a ring), entries may be null, and there is additional information (to be discussed later).
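
The routing-table rule can be sketched as follows. The reading of "valid link" used here, a target number that falls within 1..2^L at level L, is an interpretation of the slides, and the function names are introduced for illustration.

```python
def routing_targets(level, number):
    """Return (left, right) lists of valid target numbers at the same level.

    The j-th candidate is number - 2**(j-1) on the left and
    number + 2**(j-1) on the right; targets outside 1..2**level are not valid.
    """
    width = 2 ** level
    left, right = [], []
    j = 1
    while 2 ** (j - 1) < width:
        step = 2 ** (j - 1)
        if number - step >= 1:
            left.append(number - step)
        if number + step <= width:
            right.append(number + step)
        j += 1
    return left, right

def is_full(targets, occupied):
    """A table is full if every valid target position holds a node."""
    return all(t in occupied for t in targets)
```

If h is taken to be position 1 at level 3, with the nodes of that level lettered h, i, j, k, l, ... from left to right (an assumption, since the figure is not reproduced here), the left table has no valid targets and the right table points at positions 2, 3 and 5, that is, i, j and l, which agrees with the example.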

  15. The BATON Structure: Theorem 1: The tree is a balanced tree if every node in the tree that has a child also has both its left and right routing tables full. Theorem 2: If a node, say x, contains a link to another node, say y, in its left or right routing table, the parent node of x must also contain a link to the parent node of y, unless the same node is the parent of both x and y. There are at most L entries in a left (right) routing table at level L. The total number of entries is O(log N), the same asymptotic bound as for Chord, though in the worst case the number of entries in BATON could be twice as many as in Chord, and each entry is also larger.

  16. Node Join • A node wanting to join the network must know at least one node inside the network and sends a JOIN request to that node. • Two phases • determine where the new node should join • actually include it in the network at the specified place

  17. Node Join: When a node receives a JOIN request, if it has both its left routing table and its right routing table full while it has fewer than two children, it can accept the new node as its child. Otherwise, it needs to forward the JOIN request to other nodes.

  18. Node Join: For example, assume that node u wants to join the network and it sends a JOIN request to node b. b then forwards the request to p, which is its adjacent node. As p's routing tables are not full, it forwards the request to its parent j. In turn, j checks its routing tables and forwards the request to the neighbor node n, which doesn't have enough children. Finally, n accepts u as its child.
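
The decision each node makes in the example above can be sketched as a single function. The acceptance test comes from the slides; the order of the forwarding choices (parent if the tables are not full, otherwise a neighbor lacking children, otherwise an adjacent node) is inferred from the example and should be read as an assumption. `NodeView` and `join_action` are names introduced for this sketch.

```python
from dataclasses import dataclass, field

@dataclass
class NodeView:
    """What a node knows locally when a JOIN request arrives (illustrative)."""
    tables_full: bool                  # both sideways routing tables are full
    num_children: int                  # 0, 1 or 2
    neighbor_children: dict = field(default_factory=dict)  # neighbor name -> child count

def join_action(node):
    """Return ("accept", None) or ("forward", target) for a JOIN request."""
    if node.tables_full and node.num_children < 2:
        return ("accept", None)
    if not node.tables_full:
        return ("forward", "parent")
    for name, count in node.neighbor_children.items():
        if count < 2:
            return ("forward", name)   # a neighbor without enough children
    return ("forward", "adjacent")

# Local views loosely modelled on the example; the child counts are assumed.
b = NodeView(tables_full=True, num_children=2, neighbor_children={"c": 2})
p = NodeView(tables_full=False, num_children=0)
j = NodeView(tables_full=True, num_children=2, neighbor_children={"k": 2, "n": 1})
n = NodeView(tables_full=True, num_children=1)
```

Evaluating `join_action` on b, p, j and n in turn reproduces the path of the request in the example: adjacent node, parent, neighbor n, accept.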

  19. Node Join (Complexity) • Suppose that an adjacent link is traversed to a leaf node w. • Either w • is able to accept the new node as a child, or • has an incomplete neighbor table. • In the latter case, • w forwards the request to its parent, which can locate in its neighbor table a node v that is the parent of a missing neighbor of w. • Node v can now accept the new node as a child, unless its own neighbor table is not full, in which case it forwards the request to its parent. • Since the height of the tree is O(log N), the request cannot be forwarded up in this manner more than O(log N) times.

  20. Node Join (Load): The algorithm seeks out leaf nodes and parents of nodes with incomplete neighbor tables, which must all be leaf nodes due to Theorem 1. Ancestor nodes are never required, and there is no involvement of the root other than just as an ordinary node. As such, we expect that the load is not disproportionately applied to the root.

  21. Node Join (Phase 2: the position is found, insert the node): When a node x accepts the new node y as its child, it splits off half of its content to y. In addition, if y is accepted as x's left child, x also sends its left adjacent link, which points to z, to y, and updates its left adjacent link to point to y. y then creates its left adjacent link pointing to z and its right adjacent link pointing to x, and also notifies z that z should update its right adjacent link to point to y instead of x as before. Similarly, if y is accepted as x's right child, x's right adjacent link is transferred to y. Finally, node x contacts all neighbor nodes in its left and right (sideways) routing tables, asking them to inform their relevant children about y, and in turn responding with information regarding their relevant children that y will require. Cost: O(log N).
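
The adjacent-link updates described above amount to inserting y into a doubly linked list right next to x. A minimal sketch, covering only the adjacent links (content splitting and routing-table updates are left out), with class and function names introduced here for illustration:

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.left_adj = None   # previous node in the in-order traversal
        self.right_adj = None  # next node in the in-order traversal

def attach_left_child(x, y):
    """y becomes x's left child: y slots in between x's old left adjacent z and x."""
    z = x.left_adj
    y.left_adj = z
    y.right_adj = x
    x.left_adj = y
    if z is not None:
        z.right_adj = y  # z is notified to point at y instead of x

def attach_right_child(x, y):
    """y becomes x's right child: the symmetric case on the right side."""
    z = x.right_adj
    y.right_adj = z
    y.left_adj = x
    x.right_adj = y
    if z is not None:
        z.left_adj = y
```

Because a new child of x always lands immediately before or after x in the in-order traversal, only x, y and one old adjacent node z need to change their adjacent links.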

  22. Node Departure: Only leaf nodes may voluntarily leave the network, and only if their departure will not upset the tree balance. In other cases, a node that wishes to leave the network must find a replacement for itself, which will be a leaf node whose absence does not affect the tree balance.

  23. Node Departure: If a leaf node x wishes to leave the network, and there is no neighbor node in its routing tables with children, it can leave the network without affecting the tree balance because the requirement in Theorem 1 is still satisfied at its neighbor nodes. In this case, x transfers to its parent all its content and the range of index values it is in charge of, together with its left adjacent link if it is a left child or its right adjacent link if it is a right child, and sends LEAVE messages to its neighbor nodes so that they update their routing tables. After receiving the content from x, the parent node of x also needs to send messages to its neighbor nodes to notify them of its new content and children. It also notifies the affected adjacent node to update the corresponding adjacent link to point to the parent. Cost: O(log N).

  24. Node Departure: If a leaf node wishes to leave the network, and there are neighbor nodes in its routing tables with children, it needs to find a node to replace it by sending a FINDREPLACEMENT request to a child node of one of its neighbor nodes. If a non-leaf node wishes to leave the network, it finds a node to replace it by sending a FINDREPLACEMENT request to one of its adjacent nodes, choosing one that is a leaf node or as deep as possible. Such messages only move down the tree, so the cost is O(log N).
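
The three departure cases can be summarised in one decision function. This is a sketch of the case analysis only, with a hypothetical function name and return values chosen for the example; it does not model the content transfer or the replacement search itself.

```python
def departure_action(is_leaf, neighbors_with_children):
    """Decide how a node leaves, given only its local knowledge.

    neighbors_with_children: names of same-level neighbors (from the
    sideways routing tables) that currently have at least one child.
    """
    if is_leaf and not neighbors_with_children:
        # Theorem 1 still holds at the neighbors, so the leaf can simply go.
        return ("LEAVE", None)
    if is_leaf:
        # Ask a child of some neighbor to find a replacement.
        return ("FINDREPLACEMENT", "child of " + neighbors_with_children[0])
    # Interior node: start the search at one of its adjacent nodes.
    return ("FINDREPLACEMENT", "adjacent node")
```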

  25. Assigning Data: A range of values is assigned to each node (both leaf and internal). For each link to a node (that is, each neighbor), record the range of values managed by that node. Binary tree – actually a variation of a B-tree (difference from B+?). The range of values managed by a node must lie to the right of (be larger than) the range managed by its left subtree and to the left of (be smaller than) the range managed by its right subtree.
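
The ordering constraint on ranges is equivalent to saying that an in-order traversal visits the ranges in increasing, non-overlapping order. A minimal check, assuming half-open ranges [lo, hi) and a hypothetical `RangeNode` class introduced only for this sketch:

```python
class RangeNode:
    def __init__(self, lo, hi, left=None, right=None):
        self.lo, self.hi = lo, hi        # this node manages values in [lo, hi)
        self.left, self.right = left, right

def inorder_ranges(node):
    """Ranges in in-order: left subtree, the node itself, right subtree."""
    if node is None:
        return []
    return inorder_ranges(node.left) + [(node.lo, node.hi)] + inorder_ranges(node.right)

def ranges_ordered(root):
    """True if every range ends before the next one (in in-order) begins."""
    r = inorder_ranges(root)
    return all(a_hi <= b_lo for (_, a_hi), (b_lo, _) in zip(r, r[1:]))

good = RangeNode(40, 60, left=RangeNode(0, 40), right=RangeNode(60, 100))
bad = RangeNode(40, 60, left=RangeNode(50, 70))
```

Here `good` satisfies the constraint, while `bad` does not because its left child manages values above the start of the root's range.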

  26. More next time
