
Distributed Transactional Memory

Presented by Gala Yadgar.


Presentation Transcript


  1. Distributed Transactional Memory • Presented by Gala Yadgar

  2. Model • A network of nodes • Transactions are immobile • Objects move from node to node

  3. Model • Cache coherence protocol • Locate the current copy • Move and invalidate • Metric • Location aware • [Figure: nodes labeled A, B, C and D]

  4. Outline • Ballistic protocol • Hierarchical clustering • Operations • Requirements • Finite response time • Performance • Publish • Move • Additional results • Summary

  5. Motivation • The contention manager guarantees atomicity • Should be obstruction free • Performance goals • Makespan • Competitive ratio • Makespan of optimal

  6. Transactional memory proxy • Local request: • Local object – return copy • Remote object – locate with Ballistic • Remote request: • Object not in use – invalidate copies and send • Object in use – abort or postpone response • Commit: • No invalidations – commit • Invalidations – abort
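The proxy rules on the slide above can be sketched in a few lines of Python. This is an illustrative sketch only: the class name `Proxy`, its method names, the `postpone` flag and the `locate_remote` callback (a stand-in for a Ballistic lookup or move) are all assumptions made for the example and do not come from the presentation.

```python
class Proxy:
    """Per-node transactional memory proxy (illustrative sketch only)."""

    def __init__(self, locate_remote):
        # locate_remote is a hypothetical stand-in for a Ballistic lookup/move.
        self.locate_remote = locate_remote
        self.copies = {}          # object id -> copy cached at this node
        self.in_use = set()       # objects opened by the running local transaction
        self.invalidated = False  # set when an in-use object was taken away

    def local_request(self, oid):
        if oid not in self.copies:                  # remote object: locate it
            self.copies[oid] = self.locate_remote(oid)
        self.in_use.add(oid)
        return self.copies[oid]                     # local object: return copy

    def remote_request(self, oid, postpone=False):
        if oid in self.in_use:
            if postpone:
                return None                         # postpone the response
            self.invalidated = True                 # abort the local transaction
            self.in_use.discard(oid)
        return self.copies.pop(oid)                 # invalidate local copy and send

    def commit(self):
        ok = not self.invalidated                   # invalidations mean abort
        self.in_use.clear()
        self.invalidated = False
        return ok
```

In this sketch the choice between aborting and postponing is passed in by the caller; in the presentation that decision belongs to the contention manager.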

  7. Outline • Ballistic protocol • Hierarchical clustering • Operations • Requirements • Finite response time • Performance • Publish • Move • Additional results • Summary

  8. Hierarchical clustering • [Figure: a hierarchy with levels 0 through 3, L = 3]

  9. Hierarchical clustering • Level 0 • Physical nodes are leaf nodes • x and y are connected iff $d(x,y) < 2^1$ • $\mathrm{leader}_0$ is a maximal independent set of the resulting graph

  10. Hierarchical clustering • Level $l$ • Only nodes from $\mathrm{leader}_{l-1}$ • x and y are connected iff $d(x,y) < 2^{l+1}$ • $\mathrm{leader}_l$ is a maximal independent set of the resulting graph • Level $L$ • Root • $L \le \log_2 \mathrm{Diam} + 1$
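The level-by-level construction described on the hierarchical clustering slides can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the maximal independent set is chosen greedily in scan order (the slides do not say how it is chosen), and the smallest inter-node distance is assumed to be at least 1.

```python
def maximal_independent_set(nodes, dist, radius):
    """Greedy maximal independent set of the graph that joins
    two nodes whenever their distance is below `radius`."""
    chosen = []
    for v in nodes:  # fixed scan order keeps the result deterministic
        if all(dist(v, u) >= radius for u in chosen):
            chosen.append(v)
    return chosen


def build_hierarchy(points, dist):
    """Return levels[0..L]: levels[0] holds all physical nodes and
    levels[l + 1] is a maximal independent set of levels[l]."""
    levels = [list(points)]
    l = 0
    while len(levels[-1]) > 1:
        # Level-l nodes are connected iff d(x, y) < 2^(l+1).
        levels.append(maximal_independent_set(levels[-1], dist, 2 ** (l + 1)))
        l += 1
    return levels


# Eight nodes on a line, one unit apart (diameter 7).
levels = build_hierarchy(range(8), lambda x, y: abs(x - y))
```

For this example the hierarchy has root level 3, which respects the bound $L \le \log_2 \mathrm{Diam} + 1$ with $\mathrm{Diam} = 7$.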

  11. Hierarchical clustering • Level $l$, node x • Lookup parent set • Level $l+1$ nodes within distance $10 \cdot 2^{l+1}$ from x • Home parent • Closest lookup parent • Move parent set • Level $l+1$ nodes within distance $4 \cdot 2^{l+1}$ from x
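The three parent notions on the slide above can be computed directly from their definitions. The sketch below assumes that "within distance" means less than or equal, that ties for the home parent are broken by list order, and that the function name `parent_sets` is hypothetical.

```python
def parent_sets(x, level, next_level_nodes, dist):
    """Lookup parents, home parent and move parents of a level-`level` node x.

    next_level_nodes: the nodes that exist at level + 1.
    """
    scale = 2 ** (level + 1)
    lookup = [y for y in next_level_nodes if dist(x, y) <= 10 * scale]
    move = [y for y in next_level_nodes if dist(x, y) <= 4 * scale]
    home = min(lookup, key=lambda y: dist(x, y))  # closest lookup parent
    return lookup, home, move


# Level-1 nodes at the even positions of a line; x = 1 is a level-0 node.
level1 = list(range(0, 32, 2))
lookup, home, move = parent_sets(1, 0, level1, lambda a, b: abs(a - b))
```

Here the move parent set (radius 8) is a strict subset of the lookup parent set (radius 20), matching the constants 4 and 10 on the slide.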

  12. Outline • Ballistic protocol • Hierarchical clustering • Operations • Requirements • Finite response time • Performance • Publish • Move • Additional results • Summary

  13. Publish() • Object at node p • Create a single directed path from the root to p • $\mathrm{home}_i(p).\mathrm{link} = \mathrm{home}_{i-1}(p)$ • Note: we deal with a single object

  14. lookup() • Request at node q • Up phase • $\mathrm{home}_{i-1}(q)$ initiates a search for a non-null link at $\mathrm{lookupProbe}_i(q)$ • Down phase • Follow links to a leaf • Obtain a copy or wait at the leaf

  15. move() • Request at node q • Up phase • $\mathrm{home}_{i-1}(q)$ initiates a search for a non-null link at $\mathrm{moveProbe}_i(q)$ • $\mathrm{home}_i(q).\mathrm{link} = \mathrm{home}_{i-1}(q)$ • Redirect if found • Down phase • Follow links to a leaf • Erase links • Wait in queue
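The up and down phases of publish() and move() can be illustrated with a heavily simplified, sequential sketch. The simplifications are assumptions of this example and not part of the protocol as presented: each probe set is reduced to the single home parent (the real protocol probes every move parent at each level), there is no concurrency or queueing, and the names `Directory`, `locate` and the `home` table layout are hypothetical.

```python
class Directory:
    """Sequential, single-object sketch with probe sets reduced to the home parent."""

    def __init__(self, home):
        # home[i][x] is the home parent (at level i + 1) of the level-i node x.
        self.home = home
        self.link = {}  # (level, node) -> child one level down, or None

    def publish(self, p):
        node = p
        for i, parents in enumerate(self.home, start=1):
            parent = parents[node]
            self.link[(i, parent)] = node  # home_i(p).link = home_{i-1}(p)
            node = parent

    def locate(self, root):
        """Follow links from the root down to the leaf that holds the object."""
        node = root
        for level in range(len(self.home), 0, -1):
            node = self.link[(level, node)]
        return node

    def move(self, q):
        # Up phase: set links toward q until a non-null link is found, then redirect it.
        node, old, peak = q, None, 0
        for i, parents in enumerate(self.home, start=1):
            parent = parents[node]
            old = self.link.get((i, parent))
            self.link[(i, parent)] = node
            peak = i
            if old is not None:
                break
            node = parent
        if old is None:
            raise LookupError("no published object found")
        # Down phase: follow the old path to a leaf, erasing links on the way.
        level, node = peak - 1, old
        while level > 0:
            nxt = self.link[(level, node)]
            self.link[(level, node)] = None
            level, node = level - 1, nxt
        return node  # leaf that held the object before this move


# Four leaves 0..3, two level-1 nodes 'a' and 'b', root 'r' at level 2.
home = [{0: "a", 1: "a", 2: "b", 3: "b"}, {"a": "r", "b": "r"}]
d = Directory(home)
d.publish(0)
previous_owner = d.move(3)
```

After the move, the directed path from the root leads to leaf 3 and the links on the old path to leaf 0 have been erased.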

  16. Outline • Ballistic protocol • Hierarchical clustering • Operations • Requirements • Finite response time • Performance • Publish • Move • Additional results • Summary

  17. Overtaking • Object at node p • a: 1st request • b: 2nd request • b is enqueued first

  18. Outline • Ballistic protocol • Hierarchical clustering • Operations • Requirements • Finite response time • Performance • Publish • Move • Additional results • Summary

  19. Finite write response time • Every move request is satisfied within time $n \cdot T_E + n \cdot T_O$ from when it is generated • $T_E$: maximum enqueue delay • $T_O$: maximum time to reach a successor • $n$: number of nodes • Equivalent: finite read response time

  20. Proof • By time at most $t + n \cdot T_E$, either • All successor links between r and its n predecessors $r_1, r_2, \dots, r_n$ have been established • There is $k \le n-1$ such that $r_k$ is p, the publish request • At least two requests $r_i, r_j$ come from the same node • They are different (Lemma 1) • One was satisfied • The object reached a predecessor

  21. Proof • Let x be the location of the object at time $t + n \cdot T_E$ • r is at most n steps away from x by taking the successor links • r will have the object by time at most $t + n \cdot T_E + n \cdot T_O$

  22. Bounded overtaking (corollary) • (Every move request is satisfied within time $n \cdot T_E + n \cdot T_O$ from when it is generated) • Request r is generated at time t • All requests generated after time $t + n \cdot T_E$ will be ordered after r • All requests generated prior to time $t - n \cdot T_E$ will be ordered before r
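The response-time bound and the overtaking window from the slides above are plain arithmetic, shown here as a small sketch. The function names and the sample values are assumptions chosen for illustration.

```python
def response_time_bound(n, t_e, t_o):
    """Upper bound n * T_E + n * T_O on the time to satisfy a move request."""
    return n * t_e + n * t_o


def ordering_relative_to(t, other, n, t_e):
    """Where a request generated at time `other` falls relative to a
    request r generated at time t, according to the corollary."""
    if other > t + n * t_e:
        return "after r"
    if other < t - n * t_e:
        return "before r"
    return "not determined by the bound"


bound = response_time_bound(n=4, t_e=2, t_o=3)
```

With 4 nodes, $T_E = 2$ and $T_O = 3$, a request generated at time 100 can only be reordered against requests generated in the window from 92 to 108.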

  23. Lemma 1 • There exists no finite set of requests $R = \{r_1, r_2, \dots, r_f\}$ whose successor links form a cycle • r’s arrow: a downward link added by r’s visit • Outside arrows: established by requests outside R • [Figure: a parent node P and a child node C]

  24. Invariants • The root always has an arrow • Requests see an arrow at the peak level before the down phase • During the down phase, requests see an arrow until they reach a leaf • r’s arrow at level i points to $C = \mathrm{home}_{i-1}(r)$

  25. Invariants • r adds an arrow P→C at time t • At time $t^-$, r added an arrow from C to a grandchild • At time $t^+$ that arrow will be erased by r’ • r’ reached C from P • r’ erased r’s arrow P→C • During $[t^-, t^+]$, C always has an arrow • May be redirected from one grandchild to another

  26. Proof • H: the highest peak level reached by requests in R • The first request to reach H sees an outside arrow • We show: • In any level $l < H$ some request from R sees an outside arrow • That request is queued behind an outside request

  27. Proof (by induction) • Base: at level H • Step: • At time t, r in R sees an outside arrow at level k in node P • The arrow was established by x not in R • P→C, where C is x’s home directory • x also established C at level k-1, at time $t^-$ • At time $t^+$, r reaches C. Either • Another request from R sees C during $[t^-, t^+]$ • r sees C at time $t^+$

  28. Outline • Ballistic protocol • Hierarchical clustering • Operations • Requirements • Finite response time • Performance • Publish • Move • Additional results • Summary

  29. Overtaking revisited • Object at node p • a: 1st request • b: 2nd request • b is enqueued first • What if a’s priority is higher?

  30. Intuitively… • Optimal schedule • Minimum cost Hamiltonian path • Visit each node once • Greedy schedule worst case • Tx aborted by all higher priority Txs • Each abort requires a move() • Node with timestamp k visited k times

  31. Performance • Work • An operation’s communication overhead • Distance • The cost of communicating directly from the requesting node to its destination • Stretch • work/distance • Executions can be sequential or concurrent
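The work, distance and stretch definitions above translate into a short calculation. In this sketch the amortized stretch of an execution is taken to be total work divided by total distance; that reading, the function names and the sample numbers are assumptions made for illustration.

```python
def stretch(work, distance):
    """Stretch of one operation: communication work over direct distance."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return work / distance


def amortized_stretch(operations):
    """Total work over total distance for a list of (work, distance) pairs."""
    total_work = sum(w for w, _ in operations)
    total_distance = sum(d for _, d in operations)
    return stretch(total_work, total_distance)


# Two operations: one with stretch 3, one with stretch 1.
ops = [(12, 4), (6, 6)]
overall = amortized_stretch(ops)
```

A single expensive operation can have a high stretch while the amortized figure over the whole execution stays low, which is why the move bound is stated in amortized form.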

  32. Performance • Publish cost • The publish operation has work O(Diam) • Move cost • If an object has moved a combined distance d since its initial publication, the amortized move stretch is $O(\min\{\log_2 d, L\})$

  33. Outline • Ballistic protocol • Hierarchical clustering • Operations • Requirements • Finite response time • Performance • Publish • Move • Additional results • Summary

  34. Performance • Publish cost • The publish operation has work O(Diam)

  35. 1. Bounded link property • The metric distance between a level-$l$ child and its level-$(l+1)$ parent is less than or equal to $c_b \cdot 2^l$, for some constant $c_b$ • x and y are connected iff $d(x,y) < 2^{l+1}$

  36. 2. Constant expansion property • Any node has no more than a constant number ($c_e$) of lookup parents and lookup children • Note: this property requires a constant doubling metric

  37. Constant doubling dimension • Metric: distances between all pairs, non-negative, triangle inequality • Ball: $B_u(r) = \{ v \mid d(u,v) \le r \}$ • $2^\alpha$ balls of radius $r/2$ cover a ball of radius $r$ • Doubling dimension: $\alpha$ is constant • Based on “Ad Hoc Sensor Networks” by Roger Wattenhofer
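The covering condition behind the doubling dimension can be explored on a small finite metric. The sketch below greedily covers one ball of radius r with balls of radius r/2 whose centers are points of the ball. The greedy count is only an upper bound on the number of half-radius balls needed for that one ball, not the doubling dimension itself, and the function name is hypothetical.

```python
def half_radius_cover(points, dist, center, r):
    """Greedily pick centers inside B_center(r) so that balls of
    radius r / 2 around them cover every point of B_center(r)."""
    uncovered = [v for v in points if dist(center, v) <= r]
    centers = []
    while uncovered:
        c = uncovered[0]
        centers.append(c)
        uncovered = [v for v in uncovered if dist(c, v) > r / 2]
    return centers


# Integer points on a line; cover the ball of radius 4 around 0.
centers = half_radius_cover(range(-10, 11), lambda a, b: abs(a - b), 0, 4)
```

On the integer line three half-radius balls suffice for this ball, consistent with a small constant $\alpha$ for a one-dimensional metric.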

  38. 3. Lookup property • For any two leaves p, q • $p_l$: any of p’s level-$l$ ancestors reached by following move parents only • If $p_l$ is not in $\mathrm{lookupProbe}_l(q)$, then $d(p,q) \ge c_l \cdot 2^l$, for some constant $c_l$

  39. 4. Move property • For any two leaves p, q • $p_l$: p’s level-$l$ home directory • If $p_l$ is not in $\mathrm{moveProbe}_l(q)$, then $d(p,q) \ge c_m \cdot 2^l$, for some constant $c_m$

  40. Lemma 3 • There exists a constant $c_w$ such that for any operation that peaks at level $l$, the work it performs is at most $c_w \cdot 2^l$ • Proof • Bounded link ⇒ link cost is at most $c_b \cdot 2^l$ • Constant expansion ⇒ bounds the number of steps in each level

  41. Publish performance • The publish operation has work O(Diam) • Publish operations peak at level $L$ • The work is $\le c_w \cdot 2^L$ (Lemma 3) • $L \le \log_2 \mathrm{Diam} + 1$
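The bullets above chain together into a one-line derivation, written out here as a worked step using only the slide's own quantities ($c_w$ from Lemma 3 and the bound on $L$):

```latex
\[
\mathrm{work}(\mathrm{publish})
  \;\le\; c_w \cdot 2^{L}
  \;\le\; c_w \cdot 2^{\log_2 \mathrm{Diam} + 1}
  \;=\; 2\, c_w \cdot \mathrm{Diam}
  \;=\; O(\mathrm{Diam}).
\]
```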

  42. Outline • Ballistic protocol • Hierarchical clustering • Operations • Requirements • Finite response time • Performance • Publish • Move • Additional results • Summary

  43. Performance • Move cost • If an object has moved a combined distance d since its initial publication, the amortized move stretch is $O(\min\{\log_2 d, L\})$

  44. Lemma 4 • q: a move request in a sequential execution that discovers a non-null link at node P at level $l$ • p: the move/publish request that was last to visit P • Then $d(p,q) \ge c_m \cdot 2^{l-1}$

  45. Proof • The non-null link at P points to $\mathrm{home}_{l-1}(p)$ • $\mathrm{home}_{l-1}(p).\mathrm{link}$ is non-null at least until q removes its link (Invariant 5) • It was there during q’s up phase • q did not visit $\mathrm{home}_{l-1}(p)$ going up level $l-1$ • Move property: $d(p,q) \ge c_m \cdot 2^{l-1}$

  46. Lemma 5 • Distance of a sequential execution • Sum of distances for all move operations • In a sequential execution with distance d, the maximum level reached by any move request does not exceed $\min(\log_2 d + c, L)$ • c is a constant

  47. Proof • $q_0$: the initial publish request • $l$: highest level reached by a move request • q: the request that peaked at level $l$ (first) • $l \le L$ • q saw a non-null link at level $l$ • established by $q_0$ • $d(q, q_0) \ge c_m \cdot 2^{l-1}$ (Lemma 4) • $d \ge d(q, q_0)$ • $l \le \log_2(d / c_m) + 1$
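The last three bullets of this proof combine as follows. The slides only state that c is a constant; reading it as $c = 1 - \log_2 c_m$ is an inference from this algebra, not something the presentation spells out.

```latex
\[
d \;\ge\; d(q, q_0) \;\ge\; c_m \cdot 2^{\,l-1}
\;\Longrightarrow\;
2^{\,l-1} \le \frac{d}{c_m}
\;\Longrightarrow\;
l \;\le\; \log_2\!\left(\frac{d}{c_m}\right) + 1
  \;=\; \log_2 d + \underbrace{(1 - \log_2 c_m)}_{c}.
\]
```

Together with $l \le L$ this gives $l \le \min(\log_2 d + c, L)$, which is the statement of Lemma 5.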

  48. Move performance • Lemma 6 • For any sequential execution $\alpha$, $\mathrm{work}(\alpha) \le (c_w / c_m) \cdot l(\alpha) \cdot \mathrm{distance}(\alpha)$ • $l(\alpha)$: the maximum level reached by a move request of execution $\alpha$ • By Lemma 5 • $\mathrm{distance}(\alpha) = d$, $l(\alpha) \le \min(\log_2 d + c, L)$ • The amortized move stretch is $O(\min\{\log_2 d, L\})$ • No proof here…
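Taking Lemma 6 as given (the slides do not prove it), the amortized stretch bound follows by dividing by the distance and applying Lemma 5. The final step treats $c_w$, $c_m$ and $c$ as constants and assumes d is large enough that $\log_2 d \ge 1$:

```latex
\[
\frac{\mathrm{work}(\alpha)}{\mathrm{distance}(\alpha)}
  \;\le\; \frac{c_w}{c_m} \cdot l(\alpha)
  \;\le\; \frac{c_w}{c_m} \cdot \min(\log_2 d + c,\; L)
  \;=\; O(\min\{\log_2 d,\, L\}).
\]
```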

  49. Outline • Ballistic protocol • Hierarchical clustering • Operations • Requirements • Finite response time • Performance • Publish • Move • Additional results • Summary

  50. Additional results • Move cost • Amortized move stretch is $O(\min\{\log_2 d, L\})$ for concurrent executions as well • Idea • “Lock” critical section in each level • Prevent neighbors from “stealing” links
