  1. CS 294-8 Distributed Data Structures http://www.cs.berkeley.edu/~yelick/294

  2. Agenda • Overview • Interface Issues • Implementation Techniques • Fault Tolerance • Performance

  3. Overview • Distributed data structures are an obvious abstraction for distributed systems. Right? • What do you want to hide within one? • Data layout? • When communication is required? • # and location of replicas • Load balancing

  4. Distributed Data Structures • Most of these are containers • Two fundamentally different kinds: • Those with iterators or the ability to look at all container elements • Arrays, meshes, databases*, graphs* and trees* (sometimes) • Those with only single-element ops • Queue, directory (hash table or tree), all *’d items above

  5. DDS in Ninja • Described in Gribble, Brewer, Hellerstein, Culler • A distributed data structure (DDS) is a self-managing layer for persistent data. • High availability, concurrency, consistency, durability, fault tolerance, scalability • A distributed hash table is an example • Uses two-phase commits for consistency • Partitioning for scalability

  6. Scheduling Structures • In serial code, most scheduling is done with a stack (often implicit), a FIFO queue, or a priority queue • Do all of these make sense in a distributed setting? • Are there others?

  7. Distributed Queues • Load balancing (work stealing…) • Push new work onto a stack • Execute locally by popping from the stack • Steal remotely by removing from the bottom of the stack (FIFO)
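
A minimal sketch of the stealing discipline described on the slide above, assuming one shared deque per processor; the class and method names are illustrative, not from any of the papers:

```java
import java.util.concurrent.ConcurrentLinkedDeque;

// Owner pushes and pops at the top of its own deque (LIFO); thieves steal
// from the opposite end (FIFO order), as described on the slide above.
class WorkStealingQueue<T> {
    private final ConcurrentLinkedDeque<T> deque = new ConcurrentLinkedDeque<>();

    void pushLocal(T task) { deque.addFirst(task); }   // new work goes on top
    T popLocal()           { return deque.pollFirst(); } // owner runs the newest work
    T stealRemote()        { return deque.pollLast(); }  // thief takes the oldest work
}
```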

  8. Interfaces (1) • Blocking atomic interfaces: operations happen between invocation and return • Internally each operation performs locking or other form of synchronization • Non-blocking “atomic” interfaces: operation happens sometime after invocation • Often paired with completion synchronization • Request/response for each operation • Wait for all “my” operations to complete • Wait for all operations in the world to complete
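
A sketch of what the two interface styles might look like for a hypothetical distributed map; the names (DistMap, quiesceMine, quiesceAll) are assumptions for illustration, not an actual DDS API:

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical distributed map illustrating the two interface styles above.
interface DistMap<K, V> {
    // Blocking atomic interface: the operation (with its internal locking or
    // other synchronization) completes between invocation and return.
    V get(K key);
    void put(K key, V value);

    // Non-blocking "atomic" interface: the operation happens some time after
    // invocation; the future is the per-operation request/response handle.
    CompletableFuture<V> getAsync(K key);
    CompletableFuture<Void> putAsync(K key, V value);

    // Completion synchronization:
    void quiesceMine();  // wait for all of *my* outstanding operations
    void quiesceAll();   // wait for all outstanding operations "in the world"
}
```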

  9. Interfaces (2) • Non-atomic interface: use external synchronization • Undefined under certain kinds of (or all) concurrency • May be paired with bracketing synchronization • Acquire-insert-lock, insert, insert, Release-insert-lock • Begin-transaction… • Operations with no semantics (no-ops) • Prefetch, Flush copies, … • Operations that allow for failures • Signal “failed”
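
For contrast, a sketch of a non-atomic interface with bracketing synchronization, a no-op hint, and an operation that is allowed to fail; all names are illustrative:

```java
// Hypothetical non-atomic container: concurrent use is undefined unless the
// caller brackets its updates with the insert lock.
interface BatchedSet<E> {
    void acquireInsertLock();      // Acquire-insert-lock
    void insert(E element);        // only defined between acquire and release
    void releaseInsertLock();      // Release-insert-lock

    void prefetch(E element);      // hint with no semantics (a "no-op")
    boolean tryInsert(E element);  // may fail: returns false to signal "failed"
}
```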

  10. DDS Interfaces • Contrast: • RDBMSs provide ACID semantics on transactions • Distributed file systems: NFS is weak; Frangipani and AFS are stronger • DDS: • All operations on elements are atomic (indivisible, all or nothing) • This seems to mean that hash table operations involving a single element are atomic • One-copy equivalence: replication of elements is invisible • No transactions across elements or operations

  11. Implementation Strategies (1) • Two simple techniques • Partitioning: • Used when the d.s. is large • Used when writes/updates are frequent • Replication: • Used when writes are infrequent and reads are very frequent • Used to tolerate failures • Full static replication is extreme; dynamic partial replication is more common • Many hybrids and variations
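
A toy sketch of the two placement techniques, assuming a fixed set of numbered storage nodes; placing replicas on consecutive nodes is just one simple choice, not taken from the paper:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class Placement {
    // Partitioning: each key lives on exactly one of numNodes nodes.
    static int partitionOf(Object key, int numNodes) {
        return Math.floorMod(key.hashCode(), numNodes);
    }

    // Replication: the r nodes following the key's partition form its replica
    // set, so reads can use any of them and node failures are tolerated.
    static List<Integer> replicasOf(Object key, int numNodes, int r) {
        int first = partitionOf(key, numNodes);
        return IntStream.range(0, r)
                        .map(i -> (first + i) % numNodes)
                        .boxed()
                        .collect(Collectors.toList());
    }
}
```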

  12. Implementation Strategies (2) • Moving data to computation good for: • dynamic load balancing • I.e., idle processors grab work • smaller objects in ops involving > 1 object • Moving computation to data good for: • large data structures • Other?

  13. DDS: Distributed Hash Table • Operations include: • Create, Destroy • Put, Get, and Remove • Built with storage “bricks” • Each manages a single-node, network-visible hash table • Each contains a buffer cache, lock manager, and network stubs and skeletons • Data is partitioned, and partitions are replicated • Replica groups are used for each partition
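
A rough sketch of the client-visible operations and the per-node brick interface implied by this slide; the operation names follow the slide, but the Java signatures are assumptions:

```java
// Client-visible operations named on the slide (signatures are assumptions).
interface DistributedHashTable {
    void create(String tableName);       // Create
    void destroy(String tableName);      // Destroy
    void put(byte[] key, byte[] value);  // Put
    byte[] get(byte[] key);              // Get
    void remove(byte[] key);             // Remove
}

// Each storage "brick" manages one node-local, network-visible hash table;
// the real brick also holds a buffer cache, lock manager, and network
// stubs/skeletons, which are elided here.
interface Brick {
    byte[] localGet(byte[] key);
    void localPut(byte[] key, byte[] value);
    void localRemove(byte[] key);
}
```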

  14. DDS: Distributed Hash Table • Operations on elements: • Get – use any replica in the appropriate group • Put or Remove – update all replicas in the group using two-phase commit • The DDS library is the commit coordinator • If an individual node crashes during the commit phase, it is removed from its replica group • If the DDS library fails during the commit phase, the individual nodes coordinate among themselves: if any has committed, all must
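
A very small sketch of the update path described here, with the DDS library acting as two-phase-commit coordinator over a replica group; ReplicaStub is a hypothetical RPC stub, and the recovery path (nodes coordinating after a coordinator failure) is omitted:

```java
import java.util.List;

// The coordinator updates every replica in the group; any "no" vote in
// phase one aborts the put at all replicas.
class TwoPhaseCommitPut {
    interface ReplicaStub {                        // hypothetical RPC stub per replica
        boolean prepare(byte[] key, byte[] value); // phase 1: vote yes/no
        void commit(byte[] key);                   // phase 2: make the update visible
        void abort(byte[] key);                    // phase 2: discard the prepared update
    }

    static boolean put(List<ReplicaStub> replicaGroup, byte[] key, byte[] value) {
        for (ReplicaStub r : replicaGroup) {       // phase 1: prepare at every replica
            if (!r.prepare(key, value)) {
                replicaGroup.forEach(x -> x.abort(key));
                return false;                      // some replica refused: abort everywhere
            }
        }
        replicaGroup.forEach(x -> x.commit(key));  // phase 2: all voted yes, commit everywhere
        return true;
    }
}
```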

  15. DDS: Hash Table • [Figure: lookup of key 110011 walks a binary trie of 0/1 branches in the data partition (DP) map to select a partition; the replica group (RG) map for that partition lists its replicas]
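
A sketch of the two-level metadata lookup the figure suggests, with the DP map flattened into a power-of-two table indexed by the low-order key bits instead of a trie; the concrete types here are assumptions:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Two-level lookup: low-order bits of the hashed key index the data partition
// (DP) map, and the replica group (RG) map lists the bricks for that partition.
class DdsMetadata {
    private final int[] dpMap;                      // partition id per low-order bit pattern
    private final Map<Integer, List<String>> rgMap; // partition id -> its replica bricks

    DdsMetadata(int[] dpMap, Map<Integer, List<String>> rgMap) {
        this.dpMap = dpMap;                         // length assumed to be a power of two
        this.rgMap = rgMap;
    }

    List<String> replicasFor(byte[] key) {
        int h = Arrays.hashCode(key);
        int index = h & (dpMap.length - 1);         // e.g. a key hashing to ...110011
        return rgMap.get(dpMap[index]);
    }
}
```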

  16. Example: Aleph Directory • Maps names to mobile objects • Files, locks (?), processes,… • Interested in performance at scale, not reliability • Two basic protocols: • Home: each object has a fixed “home” PE that keeps track of cache copies • Arrow: based on path-reversal idea

  17. Path Reversal Find

  18. Path Reversal
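
A toy, single-address-space sketch of the path-reversal idea behind the arrow protocol on slides 17-18: each node keeps one pointer toward the object, and a find follows the pointers, flipping each one it crosses back toward the requester so later requests are routed to the new owner. In Aleph these pointers live on separate PEs and each hop is a message; this flattened version is only illustrative:

```java
// arrow[n] == n means node n is (or will become) the owner of the object.
class ArrowDirectory {
    private final int[] arrow;

    ArrowDirectory(int numNodes, int initialOwner) {
        arrow = new int[numNodes];
        for (int n = 0; n < numNodes; n++) arrow[n] = initialOwner;
    }

    // Returns the node that will hand the object over to 'requester'.
    int find(int requester) {
        int current = arrow[requester];  // first hop
        arrow[requester] = requester;    // requester becomes the new end of the path
        while (arrow[current] != current) {
            int next = arrow[current];
            arrow[current] = requester;  // path reversal: point back at the requester
            current = next;
        }
        arrow[current] = requester;      // previous owner now points at the requester
        return current;
    }
}
```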

  19. Aleph Directory Performance • Aleph is implemented as Java packages on top of RMI (and UDP?) • Run on small systems (up to 16 nodes) • Assumed that the “home” centralized solution would be faster at this scale • 2 messages to request; 2 to retrieve • Arrow was actually faster • log2 p to request; 1 to retrieve • In practice, only 2 to request (counter benchmark example)

  20. Hybrid Directory Protocol • Essentially the same as the “home” protocol, except • Link waiting processors into a chain (across the processors) • Each keeps the id of the processor ahead of it in the chain • Under high contention, resource moves down the chain • Performance: • Faster than home and arrow on counter benchmark and some others…
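
A minimal sketch of the chaining idea: the home node only appends requesters to a chain and tells each one whom to wait behind, so under contention the resource moves straight down the chain without revisiting home. Names are illustrative:

```java
// The home node remembers only the tail of the chain of waiting processors;
// each requester learns the id of the processor ahead of it in the chain.
class HybridHome {
    private int tail;   // last processor to have requested the object

    HybridHome(int initialOwner) { tail = initialOwner; }

    // Called at the home node: append the requester to the chain and return
    // the processor it should wait behind.
    synchronized int enqueueRequest(int requester) {
        int ahead = tail;
        tail = requester;
        return ahead;
    }
}
```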

  21. How Many Data Structures? • Gribble et al. claim: • “We believe that given a small set of DDS types (such as a hash table, a tree, and an administrative log), authors will be able to build a large class of interesting and sophisticated servers.” • Do you believe this? • What does it imply about tools vs. libraries?

  22. Administrivia • Gautam Kar and Joe L. Hellerstein speaking Thursday • Papers online • Contact me about meeting with them • Final projects: • Send mail to schedule meeting with me • Next week: • Tuesday: guest lecture by Aaron Brown on benchmarks; related to Kar and Hellerstein work. • Still to come: Gray, Lamport, and Liskov
