Distributed Transactional Memory Presented by Gala Yadgar
Model • A network of nodes • Transactions are immobile • Objects move from node to node
Model • Cache coherence protocol • Locate the current copy • Move and invalidate • Metric • Location aware [Figure labels: A, B, C, D]
Outline • Ballistic protocol • Hierarchical clustering • Operations • Requirements • Finite response time • Performance • Publish • Move • Additional results • Summary
Motivation • The contention manager guarantees atomicity • Should be obstruction free • Performance goals • Makespan • Competitive ratio: makespan divided by the makespan of an optimal schedule
Transactional memory proxy • Local request: • Local object – return copy • Remote object – locate with Ballistic • Remote request: • Object not in use – invalidate copies and send • Object in use – abort or postpone response • Commit: • No invalidations – commit • Invalidations – abort
Hierarchical clustering [Figure: example hierarchy with L = 3, levels 0 to 3]
Hierarchical clustering • Level 0 • Physical nodes are leaf nodes • $x$ and $y$ are connected iff $d(x,y) < 2^{1}$ • $\mathrm{leader}_0$ is a maximal independent set of the resulting graph
Hierarchical clustering • Level $l$ • Only nodes from $\mathrm{leader}_{l-1}$ • $x$ and $y$ are connected iff $d(x,y) < 2^{l+1}$ • $\mathrm{leader}_l$ is a maximal independent set of the resulting graph • Level $L$ • Root • $L \le \log_2 \mathit{Diam} + 1$
Hierarchical clustering • Level $l$, node $x$ • Lookup parent set • Level $l+1$ nodes within distance $10 \cdot 2^{l+1}$ from $x$ • Home parent • Closest lookup parent • Move parent set • Level $l+1$ nodes within distance $4 \cdot 2^{l+1}$ from $x$
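The clustering rules above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's construction: the function names (`build_levels`, `lookup_parents`, `move_parents`, `home_parent`) are hypothetical, the maximal independent set is picked greedily in input order (any maximal independent set would do), "within distance" is read as `<=`, and the minimum distance between distinct nodes is assumed to be 1.

```python
def maximal_independent_set(nodes, connected):
    """Greedy MIS: keep a node unless it is connected to one already kept."""
    chosen = []
    for v in nodes:
        if all(not connected(v, u) for u in chosen):
            chosen.append(v)
    return chosen


def build_levels(points, dist):
    """levels[0] holds every physical node; levels[l] holds leader_{l-1}.

    At level l, x and y are connected iff dist(x, y) < 2**(l + 1).
    Construction stops when a single node (the root) remains.
    """
    levels = [list(points)]
    l = 0
    while len(levels[-1]) > 1:
        radius = 2 ** (l + 1)
        leaders = maximal_independent_set(
            levels[-1], lambda x, y: dist(x, y) < radius
        )
        levels.append(leaders)
        l += 1
    return levels


def lookup_parents(x, l, levels, dist):
    """Level l+1 nodes within distance 10 * 2**(l+1) of the level-l node x."""
    return [y for y in levels[l + 1] if dist(x, y) <= 10 * 2 ** (l + 1)]


def move_parents(x, l, levels, dist):
    """Level l+1 nodes within distance 4 * 2**(l+1) of the level-l node x."""
    return [y for y in levels[l + 1] if dist(x, y) <= 4 * 2 ** (l + 1)]


def home_parent(x, l, levels, dist):
    """Closest lookup parent of x (ties broken by list order)."""
    return min(lookup_parents(x, l, levels, dist), key=lambda y: dist(x, y))


# Eight nodes on a line, with |a - b| as the metric.
line_dist = lambda a, b: abs(a - b)
levels = build_levels(range(8), line_dist)
```

For the eight-node line, the sketch yields four levels (L = 3), which agrees with the bound $L \le \log_2 \mathit{Diam} + 1$ for $\mathit{Diam} = 7$.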
Publish() • Object at node $p$ • Create a single directed path from the root to $p$ • $\mathrm{home}_i(p).\mathrm{link} = \mathrm{home}_{i-1}(p)$ • * We deal with a single object
lookup() • Request at node $q$ • Up phase • $\mathrm{home}_{i-1}(q)$ initiates a search for a non-null link at $\mathrm{lookupProbe}_i(q)$ • Down phase • Follow links to a leaf • Obtain a copy or wait at the leaf
move() • Request at node $q$ • Up phase • $\mathrm{home}_{i-1}(q)$ initiates a search for a non-null link at $\mathrm{moveProbe}_i(q)$ • $\mathrm{home}_i(q).\mathrm{link} = \mathrm{home}_{i-1}(q)$ • Redirect if found • Down phase • Follow links to a leaf • Erase links • Wait in queue
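The publish and move operations can be illustrated with a heavily simplified, sequential sketch. Everything here is an assumption made for illustration: the class name `ToyDirectory`, the binary-tree hierarchy in which the level-`i` home directory of leaf `x` is `(x >> i) << i`, and above all the reduction of each probe set to the home parent alone. There is no concurrency and no queueing, so this shows only how links are created, redirected, and erased.

```python
class ToyDirectory:
    """Sequential toy model of publish/move for a single object."""

    def __init__(self, top_level):
        self.L = top_level   # level of the root
        self.link = {}       # (level, node) -> (level - 1, node)
        self.owner = None    # leaf currently holding the object

    @staticmethod
    def home(leaf, level):
        # Hypothetical hierarchy: clear the low `level` bits of the leaf id.
        return (level, (leaf >> level) << level)

    def publish(self, p):
        # Create a single directed path of links from the root down to p.
        for i in range(1, self.L + 1):
            self.link[self.home(p, i)] = self.home(p, i - 1)
        self.owner = p

    def move(self, q):
        if self.owner is None:
            raise LookupError("object was never published")
        # Up phase: install q's link at each level until an old link is found.
        found, peak = None, None
        for i in range(1, self.L + 1):
            parent = self.home(q, i)
            found = self.link.get(parent)
            self.link[parent] = self.home(q, i - 1)   # redirect toward q
            if found is not None:
                peak = i
                break
        # Down phase: follow the old links to a leaf, erasing them on the way.
        cur = found
        while cur[0] > 0:
            cur = self.link.pop(cur)
        previous_owner = cur[1]
        self.owner = q
        return previous_owner, peak

    def locate(self):
        # Follow links from the root to the leaf that holds the object.
        cur = (self.L, 0)
        while cur[0] > 0:
            cur = self.link[cur]
        return cur[1]


directory = ToyDirectory(top_level=3)   # leaves 0..7
directory.publish(5)
first_move = directory.move(6)          # nearby request, peaks low
second_move = directory.move(1)         # distant request, peaks at the root
```

In this toy run the request from leaf 6 finds the object's path at level 2, while the request from leaf 1 has to climb to the root, which matches the intuition that nearby requests peak at low levels.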
Overtaking • Object at node $p$ • $a$: 1st request • $b$: 2nd request • $b$ is enqueued first
Finite write response time • Every move request is satisfied within time $n \cdot T_E + n \cdot T_O$ from when it is generated • $T_E$: maximum enqueue delay • $T_O$: maximum time to reach a successor • $n$: number of nodes • Equivalent: finite read response time
Proof • Let request $r$ be generated at time $t$ • By time at most $t + n \cdot T_E$, either • All successor links between $r$ and its $n$ predecessors $r_1, r_2, \ldots, r_n$ have been established • There is $k \le n-1$ such that $r_k$ is $p$, the publish request • At least two requests $r_i, r_j$ come from the same node • They are different (Lemma 1) • One was satisfied • The object reached a predecessor
Proof • Let $x$ be the location of the object at time $t + n \cdot T_E$ • $r$ is at most $n$ steps away from $x$ by taking the successor links • $r$ will have the object by time at most $t + n \cdot T_E + n \cdot T_O$
Bounded overtaking (corollary) • (Every move request is satisfied within time $n \cdot T_E + n \cdot T_O$ from when it is generated) • Request $r$ is generated at time $t$ • All requests generated after time $t + n \cdot T_E$ will be ordered after $r$ • All requests generated prior to time $t - n \cdot T_E$ will be ordered before $r$
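As a small worked illustration of the bound and its corollary, the sketch below evaluates $n \cdot T_E + n \cdot T_O$ and classifies a second request relative to $r$. The function names and the numeric values are hypothetical; the label "undetermined" is our wording for requests generated inside the window $[t - n \cdot T_E,\ t + n \cdot T_E]$, about which the corollary says nothing.

```python
def response_bound(n, t_enqueue, t_successor):
    """Upper bound n*T_E + n*T_O on the time to satisfy a move request."""
    return n * t_enqueue + n * t_successor


def ordering_relative_to(t_r, t_other, n, t_enqueue):
    """Where a request generated at t_other is ordered relative to request r
    generated at t_r, according to the bounded-overtaking corollary."""
    window = n * t_enqueue
    if t_other > t_r + window:
        return "after"
    if t_other < t_r - window:
        return "before"
    return "undetermined"   # inside the window: either order is possible


# Hypothetical numbers: 4 nodes, T_E = 2 time units, T_O = 3 time units.
bound = response_bound(4, 2, 3)
```

With these numbers the overtaking window is 8 time units on either side of the moment $r$ is generated.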
Lemma 1 • There is no finite set of requests $R = \{r_1, r_2, \ldots, r_f\}$ whose successor links form a cycle • $r$'s arrow: a downward link added by $r$'s visit • Outside arrows: established by requests outside $R$
Invariants • The root always has an arrow • Requests see an arrow at the peak level before the down phase • During the down phase, requests see an arrow until they reach a leaf • $r$'s arrow at level $i$ points to $C = \mathrm{home}_{i-1}(r)$
Invariants • $r$ adds an arrow $P \to C$ at time $t$ • At time $t^-$, $r$ added an arrow from $C$ to a grandchild • At time $t^+$ the arrow $P \to C$ will be erased by $r'$ • $r'$ reached $C$ from $P$ • $r'$ erased $r$'s arrow $P \to C$ • During $[t^-, t^+]$, $C$ always has an arrow • May be redirected from one grandchild to another
Proof • $H$: the highest peak level reached by requests in $R$ • The first request to reach $H$ sees an outside arrow • We show: • At every level $l < H$, some request from $R$ sees an outside arrow • That request is queued behind an outside request
Proof (by induction) • Base: at level $H$ • Step: • At time $t$, $r \in R$ sees an outside arrow at level $k$ in node $P$ • The arrow was established by $x \notin R$ • The arrow is $P \to C$, where $C$ is $x$'s home directory • $x$ also established $C$ at level $k-1$, at time $t^-$ • At time $t^+$, $r$ reaches $C$. Either • Another request from $R$ sees $C$ during $[t^-, t^+]$ • $r$ sees $C$ at time $t^+$
Overtaking revisited • Object at node $p$ • $a$: 1st request • $b$: 2nd request • $b$ is enqueued first • What if $a$'s priority is higher?
Intuitively… • Optimal schedule • Minimum cost Hamiltonian path • Visit each node once • Greedy schedule worst case • Tx aborted by all higher priority Txs • Each abort requires a move() • Node with timestamp k visited k times
Performance • Work • An operation’s communication overhead • Distance • The cost of communicating directly from the requesting node to its destination • Stretch • work/distance • Executions can be sequential or concurrent
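The three quantities can be made concrete with a short sketch. The function names and the sample numbers are hypothetical, and the amortized variant assumes "amortized stretch" means total work divided by total distance over an execution.

```python
def stretch(work, distance):
    """Stretch of one operation: communication work over direct distance."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return work / distance


def amortized_stretch(operations):
    """Total work over total distance for a list of (work, distance) pairs."""
    total_work = sum(w for w, _ in operations)
    total_distance = sum(d for _, d in operations)
    return stretch(total_work, total_distance)


# Hypothetical execution: one costly operation followed by an ideal one.
single = stretch(12, 4)
overall = amortized_stretch([(12, 4), (6, 6)])
```

Amortizing lets a few expensive operations be paid for by many cheap ones, which is why the move bound is stated over the combined distance rather than per operation.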
Performance • Publish cost • The publish operation has work $O(\mathit{Diam})$ • Move cost • If an object has moved a combined distance $d$ since its initial publication, the amortized move stretch is $O(\min\{\log_2 d, L\})$
Performance • Publish cost • The publish operation has work O(Diam)
1. Bounded link property • The metric distance between a level-$l$ child and its level-$(l+1)$ parent is at most $c_b \cdot 2^{l}$, for some constant $c_b$ • Recall: $x$ and $y$ are connected iff $d(x,y) < 2^{l+1}$
2. Constant expansion property • Any node has no more than a constant number ($c_e$) of lookup parents and lookup children • * This property requires a constant doubling metric
Constant doubling dimension • Metric: distances between all pairs, non-negative, triangle inequality • Ball $B_u(r) = \{ v \mid d(u,v) \le r \}$ • $2^{\alpha}$ balls of radius $r/2$ cover a ball of radius $r$ • Doubling dimension: $\alpha$ is constant • Based on “Ad Hoc Sensor Networks” by Roger Wattenhofer
3. Lookup property • For any two leaves $p, q$ • $p_l$: any of $p$'s level-$l$ ancestors reached by following move parents only • If $p_l$ is not in $\mathrm{lookupProbe}_l(q)$, then $d(p,q) \ge c_l \cdot 2^{l}$, for some constant $c_l$
4. Move property • For any two leaves $p, q$ • $p_l$: $p$'s level-$l$ home directory • If $p_l$ is not in $\mathrm{moveProbe}_l(q)$, then $d(p,q) \ge c_m \cdot 2^{l}$, for some constant $c_m$
Lemma 3 • There exists a constant $c_w$ such that any operation that peaks at level $l$ performs work at most $c_w \cdot 2^{l}$ • Proof • Bounded link property: each link costs at most $c_b \cdot 2^{l}$ • Constant expansion property: bounds the number of steps in each level
Publish performance • The publish operation has work $O(\mathit{Diam})$ • Publish operations peak at level $L$ • The work is $\le c_w \cdot 2^{L}$ (Lemma 3) • $L \le \log_2 \mathit{Diam} + 1$
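To make the covering condition concrete, the sketch below covers a ball of radius `r` with balls of radius `r/2` on a finite point set. It is an illustration under assumptions: the function names are hypothetical, centers are chosen greedily from inside the ball, and the resulting count is only an upper bound on the number of half-radius balls actually needed, so it does not by itself determine $\alpha$.

```python
def ball(points, dist, center, r):
    """B_center(r): all points within distance r of center."""
    return [v for v in points if dist(center, v) <= r]


def greedy_half_radius_cover(points, dist, center, r):
    """Greedily pick centers whose radius-r/2 balls cover B_center(r)."""
    uncovered = ball(points, dist, center, r)
    centers = []
    while uncovered:
        c = uncovered[0]
        centers.append(c)
        # Drop everything the new half-radius ball covers.
        uncovered = [v for v in uncovered if dist(c, v) > r / 2]
    return centers


# Nine points on a line; cover the ball of radius 4 around point 4.
line = list(range(9))
line_metric = lambda a, b: abs(a - b)
cover = greedy_half_radius_cover(line, line_metric, 4, 4)
```

On this line the greedy cover uses three half-radius balls, consistent with a line having a small, constant doubling dimension.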
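The three bullets above chain into a one-line derivation, spelled out here for convenience (the constant factor 2 follows from the stated bound on $L$):

```latex
\begin{align*}
\mathrm{work}(\mathrm{publish})
  &\le c_w \cdot 2^{L}
     && \text{(Lemma 3, peak level } L\text{)} \\
  &\le c_w \cdot 2^{\log_2 \mathit{Diam} + 1}
     && (L \le \log_2 \mathit{Diam} + 1) \\
  &= 2\, c_w \cdot \mathit{Diam} \;=\; O(\mathit{Diam}).
\end{align*}
```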
Performance • Move cost • If an object has moved a combined distance $d$ since its initial publication, the amortized move stretch is $O(\min\{\log_2 d, L\})$
Lemma 4 • In a sequential execution, let $q$ be a move request that discovers a non-null link at node $P$ at level $l$ • Let $p$ be the move or publish request that was the last to visit $P$ • Then $d(p,q) \ge c_m \cdot 2^{l-1}$
Proof • The non-null link at $P$ points to $\mathrm{home}_{l-1}(p)$ • $\mathrm{home}_{l-1}(p).\mathrm{link}$ is non-null at least until $q$ removes its link (Invariant 5) • It was there during $q$'s up phase • $q$ did not visit $\mathrm{home}_{l-1}(p)$ going up at level $l-1$ • Move property: $d(p,q) \ge c_m \cdot 2^{l-1}$
Lemma 5 • Distance of a sequential execution • Sum of distances for all move operations • In a sequential execution with distance $d$, the maximum level reached by any move request does not exceed $\min(\log_2 d + c, L)$ • $c$ is a constant
Proof • $q_0$: the initial publish request • $l$: highest level reached by a move request • $q$: the request that peaked at level $l$ (first) • $l \le L$ • $q$ saw a non-null link at level $l$ • established by $q_0$ • $d(q,q_0) \ge c_m \cdot 2^{l-1}$ (Lemma 4) • $d \ge d(q,q_0)$ • $l \le \log_2(d/c_m) + 1$
Move performance • Lemma 6 • For any sequential execution $\alpha$: $\mathrm{work}(\alpha) \le (c_w/c_m) \cdot l(\alpha) \cdot \mathrm{distance}(\alpha)$ • $l(\alpha)$: the maximum level reached by a move request of execution $\alpha$ • By Lemma 5 • $\mathrm{distance}(\alpha) = d$ • $l(\alpha) \le \min(\log_2 d + c, L)$ • The amortized move stretch is $O(\min\{\log_2 d, L\})$ • No proof here…
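The last three bullets can be written out as a short chain. The explicit value of the constant, $c = 1 - \log_2 c_m$, is our reading of the slide and is not stated there:

```latex
\begin{align*}
d \;\ge\; d(q, q_0) \;\ge\; c_m \cdot 2^{\,l-1}
  &\;\Longrightarrow\; 2^{\,l-1} \le \frac{d}{c_m} \\
  &\;\Longrightarrow\; l \le \log_2\!\left(\frac{d}{c_m}\right) + 1
     \;=\; \log_2 d + \underbrace{(1 - \log_2 c_m)}_{c}.
\end{align*}
```

Combined with $l \le L$, this gives the bound $\min(\log_2 d + c, L)$ of Lemma 5.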
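Dividing Lemma 6 by the distance and substituting Lemma 5 gives the stated stretch. The derivation below is a sketch that takes both lemmas as given and treats $c_w$, $c_m$, and $c$ as constants:

```latex
\begin{align*}
\frac{\mathrm{work}(\alpha)}{\mathrm{distance}(\alpha)}
  &\le \frac{c_w}{c_m} \cdot l(\alpha)
     && \text{(Lemma 6)} \\
  &\le \frac{c_w}{c_m} \cdot \min(\log_2 d + c,\; L)
     && \text{(Lemma 5, } \mathrm{distance}(\alpha) = d\text{)} \\
  &= O\!\left(\min\{\log_2 d,\; L\}\right).
\end{align*}
```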
Additional results • Move cost • Amortized move stretch is $O(\min\{\log_2 d, L\})$ for concurrent executions as well • Idea • “Lock” critical section in each level • Prevent neighbors from “stealing” links