
CAN extra features



  1. CAN extra features

  2. Resources on the Net • Google Scholar: http://scholar.google.com/ • DBLP: http://www.informatik.uni-trier.de/~ley/db/ • DBWORLD: http://www.cs.wisc.edu/dbworld/browse.html • Top conferences: Databases: SIGMOD, VLDB, ICDE, EDBT, CIKM … Distributed systems: IPDPS, ICDCS – (networking) INFOCOM … P2P: IPTPS, IEEE P2P … Many on systems (USENIX association)

  3. Structured Peer-to-Peer Systems • The data (or the index) is not placed on "random" nodes • Each data item (its id, e.g. the name of a file) is mapped to a specific node of the system • data id → node id • How? With a hash function • Distributed Hash Tables (DHT) • Lookup in a DHT • Lookup(data-id) returns the id of the node that holds information about the data item • ROUTING towards that node (basic routing principle: at every step we decrease the distance to that node)
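
A minimal sketch of the data-id → node-id idea (the SHA-1 hash and the toy modulo placement rule are illustrative assumptions, not the scheme of any particular DHT):

```python
import hashlib

def data_id(name: str) -> int:
    """Hash a data item's name (e.g. a file name) to an integer id."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

def responsible_node(name: str, node_ids: list[int]) -> int:
    """Deterministically map the data id onto one of the known node ids."""
    return sorted(node_ids)[data_id(name) % len(node_ids)]

nodes = [11, 42, 97, 180]
print(responsible_node("song.mp3", nodes))   # always the same node for this key
```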

  4. Search • What is actually stored in the DHT? (i) (key, IP of node(s)): a DHT index (ii) (key, data): a DHT storage system • Some DHTs are directional: u → v but not v → u

  5. Search • Iterative lookups vs recursive lookups • Also, how is the actual data item transferred to the requestor (e.g. along the reverse search path, or directly)? privacy vs efficiency • Cache and replication decisions may be affected by this
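
A hedged sketch of the two lookup styles; next_hop() and owns() are hypothetical helpers standing in for a concrete overlay's routing table and zone test:

```python
def iterative_lookup(start_node, key, next_hop, owns, max_hops=64):
    # The requestor itself contacts one node after another; each contacted
    # node only returns a node that is closer to the key.
    node = start_node
    for _ in range(max_hops):
        if owns(node, key):          # this node is responsible for the key
            return node
        node = next_hop(node, key)   # ask the current node for a closer node
    raise RuntimeError("lookup did not converge")

def recursive_lookup(node, key, next_hop, owns):
    # Each intermediate node forwards the request itself; only the final
    # answer travels back to the requestor.
    if owns(node, key):
        return node
    return recursive_lookup(next_hop(node, key), key, next_hop, owns)
```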

  6. Distributed Databases vs Peer-to-Peer Systems (p2p) • Data placement • Distributed DBs: horizontal or vertical fragmentation of relations, based on query frequency • Parallel DBs: round-robin, range, hashing • Here, we know neither the set of nodes nor the data schema

  7. Structured Peer-to-Peer Systems • Additional issues: fault tolerance, load balancing, network awareness, concurrency; replication & caching; performance evaluation • CAN • CHORD • BATON

  8. Additional Issues Load balance • (data) each node gets an equal-size partition of the data space • (workload) each node gets a similar query and update load (including routing load) How can this be achieved?

  9. Additional Issues Physical awareness All approaches strive to minimize overlay hops – does this lead to better response time? Nodes that are neighbors in the overlay should be close to each other in the physical network How can this be achieved? • during construction of the overlay: a node chooses its neighbors based on their physical network location • during query routing: if possible, a node chooses the neighbor that is physically closer – a path that includes nearby nodes

  10. Additional Issues Physical awareness Important note: physical network proximity does not necessarily mean better response time; the quality of the connection is more crucial, and it needs to be estimated. Consider the network-level round-trip time (RTT) between two nodes (again, this can change at run time)

  11. Additional Issues Failures What happens when a node fails or departs? How to improve availability/robustness/fault tolerance Two aspects: • (routing) correct operation of the overlay: maintain the routing information • (data) avoid losing data – that is, the database of (key, value) pairs stored in the overlay

  12. Additional Issues Failures In general: increase REDUNDANCY • More than one route to the data (e.g., also know the successor of my successor; more than one neighbor) • replication, etc.

  13. A note on caching Cache issues • At various levels (besides the operating-system level) • centralized systems (cache in main memory) • distributed systems (cache in the client's main memory or on its disk) • hit rate, cache overhead on a cache miss • replacement policy: in general, predict future references • if no other information is available, use the past to predict the future (LRU, LFU) • aging • prefetching
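
To make the replacement-policy discussion concrete, a minimal LRU cache sketch (capacity and eviction rule are the only policy choices shown; a real cache would also handle coherency):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None                      # cache miss
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used entry
```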

  14. A note on caching Cache issues (continued) • cache coherency – cached items may become obsolete; most often handled by cache invalidation (stateful vs stateless servers) *note on push vs pull • more recently, cooperative caching – a cache shared among a cluster of nodes

  15. A note on replication Distributed systems: create copies of an item • How many? • Where to place them? • How to update them (keep them consistent)? • How to make replication adaptive?

  16. A note on replication Replicating (key, value) pairs vs Replicating routing information Replication for • performance (bring data closer to their requestors) • load balance • availability (fault-tolerance) Discuss: how many? where? update? for each of the above

  17. A note on replication From D. Kossmann, "The State of the Art in Distributed Query Processing", ACM Computing Surveys, 32(4), December 2000 • Caching is client-initiated, based on local access (it exploits locality of reference); replication is more proactive (based on more general statistics), targeting a larger number of clients • Replica locations are maintained in the catalog • Cache replacement policies (e.g. LRU) are usually associated with caching • Web: client caches and replication (mirroring)

  18. Additional Issues Concurrency Multiple concurrent updates and lookups Need some form of synchronization

  19. Additional Issues CAN

  20. CAN: Number of Dimensions For a system with n nodes and d dimensions (for a perfectly partitioned coordinate space): state per node: 2d neighbors path length: O(d·n^(1/d)) hops Increasing the number of dimensions: (+) reduces latency (-) increases the number of neighbors of each node (storage and maintenance cost) (+) improves (routing) fault tolerance, since each node has more potential next-hop nodes
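
Back-of-the-envelope arithmetic for this trade-off (the system size is an illustrative assumption, and the path-length figure is only up to a constant factor):

```python
n = 1_000_000                      # illustrative system size
for d in (2, 3, 4, 6, 8, 10):
    neighbors = 2 * d              # two neighbors per dimension
    path_len = d * n ** (1 / d)    # O(d * n^(1/d)), constant factor omitted
    print(f"d={d:2}  neighbors={neighbors:3}  ~path length={path_len:8.1f}")
```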

  21. CAN: Number of Dimensions

  22. CAN: Number of Realities Maintain multiple, independent coordinate spaces; each coordinate space is called a reality With r realities: each node is assigned r coordinate zones (it owns one (different) zone per reality) and has r independent neighbor sets

  23. CAN: Number of Realities • Why multiple realities? • (+) Data availability (fault tolerance) • Each (key, item) pair is hashed to r different nodes • An item becomes unavailable only when all r nodes are unavailable • (+) Route availability (fault tolerance) • There is a path to the item in each reality; if the path breaks in one reality, follow a path in some other reality • (+) Reduced path length • Forward the message in all r realities in parallel • At each step, check the neighbors in all realities and forward the message to the neighbor whose coordinates are closest to the destination • (-) maintenance cost, storage and complexity
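
A hedged sketch of the per-hop choice across realities; dest_point() and neighbors_in_reality() are hypothetical accessors for the key's per-reality coordinates and the node's per-reality neighbor sets:

```python
import math

def best_next_hop(node, key, realities, dest_point, neighbors_in_reality):
    best, best_dist = None, math.inf
    for r in range(realities):
        target = dest_point(key, r)          # the key hashes to a different point per reality
        for nbr_id, nbr_coords in neighbors_in_reality(node, r):
            d = math.dist(nbr_coords, target)
            if d < best_dist:
                best, best_dist = (nbr_id, r), d
    return best                              # (neighbor, reality) to forward on
```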

  24. CAN: Number of Realities

  25. CAN: Multiple Realities vs Multiple Dimensions For the same space overhead, increasing the number of dimensions is the better way to improve query latency But multiple realities improve data availability, since the data is replicated in each reality

  26. CAN: Multiple Hash Functions • Use k hash functions to map a single key (item) to k points in the coordinate space • (+) Improves data availability (fault tolerance) • A key becomes unavailable only when all k replicas are simultaneously unavailable • (+) Reduces path length • Send a query to all k nodes in parallel, or • choose to send it to the node that is closest in the coordinate space • (-) Increases storage • (-) If queried in parallel, increases query traffic
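
A sketch of mapping one key to k points in a 2-d space; the k hash functions are simulated here by salting a single SHA-1 hash, and the unit-square coordinate space is an illustrative assumption:

```python
import hashlib

def point(key: str, i: int, dims: int = 2):
    """Map key to a point in [0,1)^dims using the i-th (salted) hash function."""
    h = hashlib.sha1(f"{i}:{key}".encode()).digest()
    # take 4 bytes per dimension and scale into [0, 1)
    return tuple(int.from_bytes(h[4 * j:4 * j + 4], "big") / 2**32 for j in range(dims))

k = 3
print([point("song.mp3", i) for i in range(k)])   # k replica locations for one key
```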

  27. CAN: Multiple Hash Functions

  28. CAN: Physical Network Awareness Average total latency of a lookup = average number of hops × average latency per hop What is the best we can do? Achieve the underlying IP path latency between the requester and the CAN node holding the key Stretch = average total overlay latency / average IP latency • How to reduce the second factor: • Better CAN routing metrics • Increase the choices for the next hop • Topologically sensitive construction of the overlay

  29. CAN: Better Routing Metrics To route a message, each node: • Measures the network-level round-trip time (RTT) to each of its neighbors • Forwards the message to the neighbor with the maximum ratio of progress to RTT (to avoid long hops) – a weighted decision • Goal: reduce the delay of each hop of the search path, since per-hop latency = overall path latency / path length
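
A sketch of the ratio-weighted next-hop choice: prefer neighbors that make good progress toward the destination per millisecond of RTT. The neighbor tuples (coords, rtt_ms) are illustrative; a real node would measure the RTTs itself:

```python
import math

def pick_next_hop(my_coords, dest, neighbors):
    """neighbors: list of (neighbor_id, coords, rtt_ms) tuples."""
    my_dist = math.dist(my_coords, dest)

    def score(nbr):
        _, coords, rtt = nbr
        progress = my_dist - math.dist(coords, dest)   # how much closer this hop gets us
        return progress / rtt

    return max(neighbors, key=score)[0]
```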

  30. CAN: Better Routing Metrics How to evaluate this? Use simulated topologies; in the paper, Transit-Stub topologies Routing domains in the Internet are either transit domains (which interconnect different domains) or stub domains Link latencies: 100ms for intra-transit domain links, 10ms for stub-transit links, 1ms for intra-stub domain links End-to-end latency between randomly selected source-destination pairs: ~115ms E. Zegura, K. Calvert, S. Bhattacharjee, "How to Model an Internetwork", INFOCOM 1996

  31. CAN: Better Routing Metrics For systems of between 2^8 and 2^18 nodes, this reduces per-hop latency by 24%–40%, depending on d Higher dimensions give more next-hop forwarding choices and hence better improvements

  32. CAN: Topologically Sensitive Construction • How to choose neighbors that are nearby in the IP topology? • Assumes the existence of landmarks: a well-known set of m machines (say, DNS servers), usually placed at random at k hops from each other – all nodes measure their distances to these landmarks • Distributed binning of CAN nodes • Every node • measures its round-trip time to each of these landmarks • orders the landmarks in order of increasing RTT (m! possible orderings) • This results in partitioning the coordinate space into m! portions
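
A minimal sketch of distributed binning: a node orders the m landmarks by its measured RTT to each, and that ordering is its "bin". The RTT values below are made-up placeholders for real measurements:

```python
landmarks = ["L0", "L1", "L2", "L3"]                           # m = 4 well-known machines
my_rtts = {"L0": 80.0, "L1": 12.5, "L2": 45.0, "L3": 140.0}    # ms, hypothetical measurements

bin_for_node = tuple(sorted(landmarks, key=lambda l: my_rtts[l]))
print(bin_for_node)   # ('L1', 'L2', 'L0', 'L3') -- one of m! = 24 possible bins
```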

  33. CAN: Topologically Sensitive Construction Upon joining, each node selects a random point in the portion of the coordinate space associated with its landmark ordering Why does this work? Topologically close nodes are likely to have similar orderings and thus end up in the same portion of the coordinate space

  34. CAN: Topologically Sensitive Construction • Transit-Stub topology • m = 4 landmarks • Two random points 5 (network) hops away from each other

  35. CAN: Topologically Sensitive Construction One problem: The m! portions may not be equally populated

  36. CAN: Zone Overloading Allow multiple nodes to share the same zone MAXPEERS: the maximum number of peers allowed to share a zone (typically 3-4 peers per zone) Why? We want to forward a message to the node in a zone that is closest to us (in latency); if there are more nodes in the zone, we have a better chance of finding a close one

  37. CAN: Zone Overloading Extra bookkeeping: • The peers in each zone must know each other, thus each node (in addition to its neighbor lists) maintains a list (of size up to MAXPEERS) of its peers (not necessarily all of them) • The (key, item) database of the zone must be shared among them, as well as the routing tables (the IPs of neighboring nodes and their coordinates)

  38. CAN: Zone Overloading Say A joins a zone owned by B If B's zone has fewer than MAXPEERS nodes, then A joins B's zone and B sends its lists of neighbors to A (periodic soft-state updates from A to its peers and neighbors) Else the zone is split in half using a deterministic rule, and the nodes in the peer list, together with A, divide themselves equally between the two halves (periodic soft-state updates from A to its peers and neighbors)
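
A hedged sketch of this join rule: if the zone still has room, the newcomer simply joins it; otherwise the zone splits and the existing peers plus the newcomer divide themselves between the two halves. Zone and peer bookkeeping are simplified to plain lists:

```python
MAXPEERS = 4

def join_zone(peers: list[str], newcomer: str):
    if len(peers) < MAXPEERS:
        return peers + [newcomer], None          # newcomer shares the existing zone
    everyone = peers + [newcomer]
    half = len(everyone) // 2
    return everyone[:half], everyone[half:]      # zone split in half, peers divided

print(join_zone(["B", "C"], "A"))                # room left: A joins B's zone
print(join_zone(["B", "C", "D", "E"], "A"))      # full: split into two peer sets
```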

  39. CAN: Zone Overloading Periodically, • a node sends each of its coordinate neighbors a request for its list of peers • measures the RTT to all the nodes in that neighbor's zone • retains the node with the lowest RTT as its neighbor in that zone

  40. CAN: Zone Overloading How are the (key, value) pairs of a zone maintained among its peers? Replicate vs partition (+) higher availability (-) increased storage (-) data-consistency overhead

  41. CAN: Zone Overloading Reduced path length: overloading has the same effect as reducing the number of nodes in the system Increased fault tolerance: a zone becomes vacant only when all of its (up to MAXPEERS) peers crash simultaneously Reduced per-hop latency: a node has multiple choices when selecting a neighbor Increased complexity

  42. CAN: Zone Overloading • Two layer architecture • an intra-zone overlay among MAXPEERS • an inter-zone CAN overlay • Discuss super-peer architectures

  43. CAN: Load balancing • data load balance: uneven partition of the key (data) space Proposed solution: uniform partitioning • workload balance: uneven distribution of the load among the nodes (either due to the structure of the overlay or due to non-uniform data popularity) Proposed solution: replication and caching

  44. CAN: Uniform Partitioning • We would like a uniform partition of the coordinate space • Why? Since the hash function is uniform, the volume of a zone is indicative of the number of (key, value) pairs it stores • If the total volume of the data space is VT and there are n nodes • Perfect partition: each node owns VT/n of the data space • Simple heuristic • The node that owns a zone • does not split the zone itself • compares its volume with those of its immediate neighbors • chooses and splits the one with the maximum volume
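
A sketch of this heuristic, assuming zones are modelled as axis-aligned boxes given by (low, high) corner tuples; the node asked to split hands the split to whichever of itself and its immediate neighbors owns the largest volume:

```python
def volume(zone):
    lo, hi = zone
    v = 1.0
    for a, b in zip(lo, hi):
        v *= (b - a)
    return v

def choose_zone_to_split(my_zone, neighbor_zones):
    # compare the node's own volume with its immediate neighbors' volumes
    return max([my_zone] + neighbor_zones, key=volume)

me = ((0.0, 0.0), (0.5, 0.5))
nbrs = [((0.5, 0.0), (1.0, 1.0)), ((0.0, 0.5), (0.5, 1.0))]
print(choose_zone_to_split(me, nbrs))   # the largest zone is the one that gets split
```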

  45. CAN: Uniform Partitioning V = VT/n Ideally, every node should own a zone of volume V

  46. CAN: Cache and Replication • Cache • Each node maintains a cache of the data keys it has recently accessed • When it receives a request for a key, it first checks its cache; if the key is in the cache (cache hit), it saves the lookup, else it forwards the request This reduces latency, but why does it help load balance? Path replication is also possible

  47. CAN: Cache and Replication • Replication • A node that becomes overloaded replicates its content to its neighbors • Note: the owner pushes an item out, while with caching the requestor that has accessed it keeps a copy locally • Upon receiving a request for a key, a node chooses with a certain probability either to serve it or to forward it on its way • An item is eventually replicated within a region surrounding the original storage node – why around the owner?

  48. Evaluation • Why/what to evaluate? • Proof-of-concept (it can be done) • Performance (evaluate a number of metrics: load balance, response time, fault tolerance, etc.) • Tuning: a lot of "magic" constants, e.g. the number of dimensions – self-organization • Show the benefits of specific optimizations / compare alternative implementations

  49. Evaluation How to evaluate • Build the system (small scale, up to ~500 nodes (PlanetLab)) • Simulate the system (allows large-scale experiments and flexibility in setting the various parameters) • Run it on a real platform or a test-bed • Emulation (combines simulation with the actual implementation)

  50. CAN Metrics for System Performance • Path length: overlay hops to route between two nodes in the CAN space • Latency: • end-to-end latency between two nodes • per-hop latency: end-to-end latency / path length • Neighbor state: the number of nodes for which a CAN node must maintain state • Volume per node (indicative of its data and query/lookup load)
