Availability of Multi-Object Operations By Haifeng Yu, Phillip B. Gibbons, Suman Nath Presented by Lucas T. Cook September 08, 2006
The History: Availability & Distributed Objects • Availability is a primary concern in systems that offer services (‘four nines’ vs ‘five nines’) • In systems that allow querying for objects (files, data, code, images, etc.), redundancy has been the primary way of providing availability despite failures. • Redundancy techniques such as replica placement and erasure coding have been thoroughly studied in distributed systems.
The Problem: Availability in Multi-Object Requests • Yet previous research on replica placement focuses on requests for individual objects… • How does the situation change for requests that want multiple objects at once? • e.g. compilers, databases, overseers • Do results for individual queries tell us anything? • How does the relative placement (assignment) of replicas in a system affect availability under unpredictable failures?
“Brief Summary” of Yu et al What to expect: • Study of multiple protocols! • Nifty and provable generalizations about these protocols! • Limited test data! • A bunch of variables to remember!
A Simple Example • An 8-machine system, 16 objects, 2 copies per object • Each machine may crash (making all of its data unavailable) • Each machine can hold only 4 copies • How can we provide availability for queries?
A Simple Example • Well, it depends on the request! • Say that the 16 objects are ID pictures, 8 of criminals (labeled A–H), and airline security is cross-checking your photo ID at the airport… • We need all 8 criminal images or we may miss that you’re a criminal! (this is a strict operation) • What’s the best assignment of replicas to servers?
A Simple Example • I and II look good, but which is better? • It turns out that I has better availability • But now let’s say the objects are survey data, 8 are from Illinois participants, and we’re querying data to calculate average responses from Illinois residents. • Say we only need 5 of the objects for a good calculation (this is a tolerant or loose operation)
A Simple Example • III and IV look good, but which is better? • It turns out that IV has better availability • It seems intuitive to choose the degree of spread based on strict vs. tolerant operations, but what about the finer details?
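The intuition above can be checked with a quick Monte Carlo simulation. The two placements below are hypothetical stand-ins, since the slides' assignment figures (I–IV) are not reproduced in this text: a clustered, PTN-like layout and a spread, SW-like layout for the 8 criminal images with 2 replicas each.

```python
import random

random.seed(0)
P_FAIL = 0.1          # independent crash probability per machine
N_MACHINES = 8
OBJECTS = range(8)    # the 8 criminal images, 2 replicas each
TRIALS = 100_000

# Hypothetical stand-ins for the slides' assignments:
# clustered (PTN-like): objects 0-3 mirrored on machines {0,1}, 4-7 on {2,3}
clustered = {o: ((0, 1) if o < 4 else (2, 3)) for o in OBJECTS}
# spread (SW-like): object o's replicas go on machines o and (o+1) mod 8
spread = {o: (o, (o + 1) % N_MACHINES) for o in OBJECTS}

def success_rate(placement, t):
    """Estimate P(at least t of the 8 objects are still readable)."""
    wins = 0
    for _ in range(TRIALS):
        up = [random.random() >= P_FAIL for _ in range(N_MACHINES)]
        available = sum(any(up[m] for m in placement[o]) for o in OBJECTS)
        wins += available >= t
    return wins / TRIALS

for t in (8, 5):      # strict (need all 8) vs tolerant (need only 5)
    print(f"t={t}: clustered={success_rate(clustered, t):.4f}  "
          f"spread={success_rate(spread, t):.4f}")
```

Under these assumptions the clustered layout wins for the strict query (t=8) and the spread layout wins for the tolerant one (t=5), matching the slides' claim.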
Some notation • N: # objects in the system • k: # of fragments or replicas (FORs) per object • m: # of FORs needed to reconstruct an object (m = 1 for replicas, but may be larger for fragments) • n: # objects that an operation requests • t: # objects needed for an operation to succeed • s: # machines in the system • l: # FORs on each machine (load) • p: probability of failure of each machine • FP(a): probability of failure of an assignment (unavailability) Note: “a is x nines better than b” means log₀.₁ FP(a) − log₀.₁ FP(b) = x, i.e. log₁₀ FP(b) − log₁₀ FP(a) = x
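The "nines" note can be made concrete in a couple of lines (the helper name `nines_better` is ours, not the paper's):

```python
import math

def nines_better(fp_a, fp_b):
    """How many nines better assignment a is than b, given their
    unavailabilities FP(a) and FP(b); positive means a is better."""
    return math.log10(fp_b) - math.log10(fp_a)

# 'five nines' (FP = 1e-5) is exactly one nine better than 'four nines':
print(nines_better(1e-5, 1e-4))  # 1.0
```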
Consistent Hashing for Assignment • Each machine gets an ID from 0 to MAXID (typically from some form of hashing). Organize the machines into a ring where IDs are non-decreasing going clockwise (wrapping around after MAXID) • Run an assignment algorithm to place the FORs on the ring appropriately • Ideally, we’d like perfect load balancing: each machine holds l = kN/s FORs
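A minimal sketch of such a ring, using SHA-1 as a stand-in hash (the `Ring` class and `ring_id` helper are illustrative, not from the paper):

```python
import bisect
import hashlib

MAXID = 2**32

def ring_id(key):
    """Map a string key to an ID on the ring [0, MAXID)."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % MAXID

class Ring:
    """Machines arranged by ID; a lookup walks clockwise to the successor."""
    def __init__(self, machines):
        self.points = sorted((ring_id(m), m) for m in machines)

    def successor(self, key):
        i = bisect.bisect_right(self.points, (ring_id(key), ""))
        return self.points[i % len(self.points)][1]  # wrap around the ring

ring = Ring([f"machine-{i}" for i in range(8)])
print(ring.successor("object-42"))  # machine whose ID follows h("object-42")
```

Because machine IDs stay fixed, adding or removing one machine only moves the objects in its arc of the ring, which is the point of consistent hashing.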
Common Assignment Algorithms • Chord: k successors of h(o) • CAN (Multi-hash): h(o,1…k) • Pastry: k closest to h(o) • Glacier: k equidistant points from successor of h(o) • “Group DHT”: group of k successors of h(o)
Ideal Assignment Algorithms • RAND: randomly permute all FORs and deal a perfect load of l to each server • SW (Sliding Window): FORs for l/k objects go on machines 1…k, FORs for the next l/k go on machines 2…k+1, etc. • PTN: partition the objects into sets of l and mirror each set across k machines
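SW and PTN are simple enough to sketch directly. The code below assumes the divisibility needed for perfect load balancing (RAND is omitted, since it is just a random permutation of FORs subject to the load cap):

```python
# A sketch of SW and PTN for N objects with k FORs each on s machines,
# assuming divisibility so every machine gets exactly load l = k*N/s.

def sliding_window(N, k, s):
    """SW: FORs for each batch of l/k objects go on machines w, w+1, ..., w+k-1."""
    l = k * N // s
    per_window = l // k  # objects whose windows start at each machine
    return {o: [(o // per_window + j) % s for j in range(k)] for o in range(N)}

def partition(N, k, s):
    """PTN: split objects into groups of l and mirror each group on k machines."""
    l = k * N // s
    return {o: [(o // l) * k + j for j in range(k)] for o in range(N)}

for name, assign in (("SW", sliding_window(16, 2, 8)),
                     ("PTN", partition(16, 2, 8))):
    loads = [0] * 8
    for fors in assign.values():
        for m in fors:
            loads[m] += 1
    print(name, "loads:", loads)  # perfectly balanced: 4 FORs per machine
```

With N=16, k=2, s=8 (the running example), both produce a load of exactly 4 FORs per machine; they differ only in how much the replica sets of different objects overlap.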
The relation? When there are ideal conditions (perfect load balancing, evenly distributed object hashes): • Multi-hash acts like RAND • Chord and Pastry act like SW • Glacier and Group act like PTN
Performance of Ideal Assignments • Looking at n=N, check unavailability for different values of t/n • N = 24000, s = 240, k = 3, p = 0.1 • For tight queries (t/n=1), PTN is much better! • But for loose queries, RAND is much better! • SW is in the middle, in general.
Performance of Ideal Assignments • The steps in the PTN curve come from its clustering • The crossover occurs around the availability level of individual objects
Performance of Ideal Assignments • Why this kind of behavior? • Inter-object correlation: how objects are correlated by failures in the system (replicas share servers) • Analytical results from a previous paper show that: • When t=n, PTN is the best and RAND is the worst of all possible ideal assignments • When t=l+1, n=N (or t=1, n<N), RAND is the best and PTN is the worst of all possible ideal assignments • It is impossible to achieve the best of both assignments.
Performance of Practical Assignments • Yu et al look at two particular “real world” effects: • Consistent Hashing (hashing into fairly static buckets so that changes don’t force reorganization), which yields load imbalance with crashing and rebooting machines. • Failure Correlation (correlated failures instead of independent ones) • Also, they test with runs involving multiple operations (“real workloads”) • Note that for individual object queries, the assignments tend to all have very high reliability
Consistent Hashing Performance • Looking at n=N, check unavailability for different values of t/n • N = 24000, s = 240, k = 3, p = 0.05 • Group is close to PTN and Multi-hash is farther from RAND. • Suggests imperfect load balancing increases inter-object correlation
Consistent Hashing Performance • N = 24000, s = 240, k = 3, p = 0.05 • n is only 600 this time • Note that Group needs an order-preserving hash function when n is small
Failure Correlation Performance • n = N = 24000, s = 240, k = 3 • Plot has (c) correlated and (i) independent failures • p = 0.0215 for indep. (chosen to mimic individual object failure: pᵏ = 10⁻⁵) • Curves move towards Group (decreasing slope) • Inter-object correlation!
Real Workload Availability • s = 240, k = 3 • Left plot is from a TPC-H benchmark with 22 different queries • Right plot is from an IrisLog log, with 6,467 queries over 3,530 objects, averaging 704 objects/query • t=n: Group outperforms Multi-hash by almost four nines! • t/n = 90%: Multi-hash outperforms Group by more than two nines!
Summary • Individual operations have high availability across the popular replication schemes; multiple object operations do not. (varies from <50% to multiple nines for the same scheme!) • Identified the best and worst assignments for strict and tolerant operations • Real environment constraints change inter-object correlation, but not relative ordering (actually consistent hashing and failure correlation hurt loose ops and help strict ops)
Room for more work? • Different objects (Multimedia databases!) • Varied k • Data for erasure coding • FOR Repair • Arguments are unclear from graphs?