Remote Reference Counting: Distributed Garbage Collection with Low Communication and Computation Overhead. www.cs.technion.ac.il/~assaf/publications/gc.ps
Distributed Systems • Consist of nodes: • Lowest level: local address space • Next level: disk partition, processor • Top level: local net • Interaction through message passing • Failures: • Due to hardware or software problems • Disconnection: due to network overload, reboot...
Distributed GC • Motivations: • Transparent object management • Storage management is complex - not to be handled by users • Goals: • Efficiency • Scalability • Fault tolerance
Distributed GC • The main problem: • A section of GC code running on one node must verify that no other node needs an object before collecting it • Result: • Many modules must cooperate closely, leading to a tight binding between supposedly independent modules
Distributed GC • Problems with simple approaches: • Determining the status of a remote node is costly • Asynchronous systems lead to inconsistent data • Failures
Remote References • Terminology: • Owner - node which contains the object • Client - node which has a reference to the object • Creation: • A reference to an object crosses node boundaries • Side effect of message passing • Duplication: • Client of a remote object sends to a receiver node a reference to that object
Naive Reference Counting • Keep a reference count for each object • Upon duplication or creation, inform the owner to update the counter by sending it a control message • Problems: • Increases communication overhead • Loss or duplication of messages • Race between decrement/increment messages
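The race between increment and decrement messages can be illustrated with a minimal sketch. The `Owner` class and the `run` helper are hypothetical names introduced for this illustration; the sketch assumes the owner simply applies control messages in the order they arrive.

```python
class Owner:
    """Owner of object V; applies +1/-1 control messages in arrival order."""

    def __init__(self, count):
        self.count = count
        self.collected = False

    def deliver(self, delta):
        if self.collected:
            return  # too late: the object has already been reclaimed
        self.count += delta
        if self.count == 0:
            self.collected = True


def run(arrival_order):
    """Deliver the control messages in the given order; report whether V was collected."""
    owner = Owner(count=1)  # one client holds the only remote reference
    for delta in arrival_order:
        owner.deliver(delta)
    return owner.collected


# The reference is duplicated (+1) and the original copy is then deleted (-1).
in_order = run([+1, -1])   # counter goes 1 -> 2 -> 1, V survives
reordered = run([-1, +1])  # counter goes 1 -> 0, V is collected while still referenced
```

When the decrement overtakes the increment, the counter reaches zero even though a live reference still exists, which is the premature collection the slide warns about.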
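Race Conditions in Naive Reference Counting: Decrement/Increment [Figure: nodes RA, RB and RC with objects X, U and V; messages &V, +1 and -1; Counter_V = 1]
Race Conditions in Naive Reference Counting: Increment/Decrement [Figure: nodes RA, RB and RC with objects X, U and V; messages &V, +1 and -1; Counter_V = 1]
Avoiding Race by Acknowledge Messages [Figure: nodes RA, RB and RC with objects X, U and V; labels &V, +1, ack and 2; Counter_V = 1]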
Weighted Reference Counting • Each object referenced has a partial weight and a total weight • Object creation: • total weight = partial weight = even value > 0 [Figure: object V on node RB with Total = 64, Partial = 64]
Weighted Reference Counting: Reference Duplication • The partial weight is halved and sent with the reference [Figure: RA holds a reference to V with Partial_V = 32 and sends &V/16 to RC; afterwards RA and RC each hold Partial_V = 16; on RB, Total_V = 64 and Partial_V = 32]
Weighted Reference Counting: Reference Deletion • The partial weight is sent to the owner and subtracted from the total weight [Figure: RA and RC each hold Partial_V = 16; one reference is deleted and -16 is sent to RB; Total_V drops from 64 to 48 while the owner's Partial_V stays 32]
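Weighted Reference Counting • Invariant: $\mathrm{total\ weight}_V = \sum \mathrm{partial\ weight}_V$, summed over the owner's partial weight and those of all remote references • When total weight = partial weight at the owner, there are no remote references • Advantage: Eliminates increment messages, and therefore race conditions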
Weighted Reference Counting • Shortcomings: • Weight underflow • Possible solutions: • Use partial weights which are powers of 2, keep only the exponent • [Yu-Cox] “Stop the world”, last resort global trace • Not resilient to message loss or duplication: • Loss may cause garbage objects to remain uncollected • Duplication may cause an object to be prematurely collected
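The weight bookkeeping and the underflow problem can be sketched as follows, using the numbers from the figures (initial weight 64). The names `OwnedObject`, `Ref`, `duplicate` and `WeightUnderflow` are hypothetical, and message passing is reduced to direct method calls.

```python
class WeightUnderflow(Exception):
    """Raised when a partial weight of 1 can no longer be halved."""


class Ref:
    """A remote reference carrying a partial weight."""

    def __init__(self, weight):
        self.weight = weight


class OwnedObject:
    """Owner-side record: total weight plus the owner's own partial weight."""

    def __init__(self, weight=64):
        self.total = weight
        self.partial = weight

    def export(self):
        """Create the first remote reference by splitting the owner's partial weight."""
        half = self.partial // 2
        self.partial -= half
        return Ref(half)

    def on_delete_message(self, weight):
        """A client deleted a reference and returned its weight."""
        self.total -= weight

    def has_remote_refs(self):
        return self.total != self.partial


def duplicate(ref):
    """Split a reference locally; no message to the owner is needed."""
    if ref.weight < 2:
        raise WeightUnderflow("cannot halve a weight of 1")
    half = ref.weight // 2
    ref.weight -= half
    return Ref(half)


v = OwnedObject(64)
ref_a = v.export()                  # owner keeps 32, RA gets 32
ref_c = duplicate(ref_a)            # RA keeps 16, RC gets 16
v.on_delete_message(ref_c.weight)   # total: 64 -> 48
after_first_delete = (v.total, v.partial, ref_a.weight)
v.on_delete_message(ref_a.weight)   # total: 48 -> 32, equal to the owner's partial
```

Starting from 64, a reference can be halved only six times before `duplicate` raises `WeightUnderflow`, which is why the slide lists exponent encoding and a global trace as fallbacks.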
Indirect Reference Counting • Stub contains strong and weak locators • Strong: refers to a scion in the sender node; used only for distributed GC • Weak: refers to the node where target object is located; used to invoke target object in a single hop • Duplication performed locally without informing the owner node • The weak reference is sent along with the message containing the reference
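Indirect Reference Duplication [Figure: nodes RA, RB and RC with objects X, U and V; RB holds V and a scion with count 1; RA holds the stub V_A with a strong locator and a weak locator; labels &scionB and &scionA on the duplicated reference]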
Indirect Reference Deletion [Figure sequence: nodes RA, RB and RC with objects X, U and V; RA holds the stub V_A with strong and weak locators; RB holds V and a scion with count 1; a -1 message is sent when the reference is deleted]
Indirect Reference Counting • Advantages: • Unlimited number of duplications • Access to object in one hop through weak locator • Disadvantages: • Not resilient to message failures • Messages are sent whenever an object is deleted
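A minimal sketch of the indirect scheme, under simplifying assumptions: each node holds at most one stub for the object, the strong locator is modeled as a `parent` pointer to the node the reference came from, and the -1 messages are recorded in a list instead of being sent. All names (`Node`, `duplicate`, `delete`, `release`) are hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.scion = 0       # number of copies this node has handed out
        self.parent = None   # strong locator: node the reference came from
        self.owner = None    # weak locator: node where the object lives
        self.holds = False   # True while the node still uses the reference


def duplicate(sender, receiver):
    """Pass the reference on; only the sender's scion changes, the owner is not informed."""
    sender.scion += 1
    receiver.parent = sender
    receiver.owner = sender.owner or sender  # weak locator travels with the reference
    receiver.holds = True


def release(node, log):
    """Send -1 up the strong locator once the stub is unused and has no children."""
    if node.holds or node.scion > 0 or node.parent is None:
        return
    parent, node.parent = node.parent, None
    log.append((node.name, parent.name))  # stands in for the -1 message
    parent.scion -= 1
    release(parent, log)


def delete(node, log):
    node.holds = False
    release(node, log)


rb, ra, rc = Node("RB"), Node("RA"), Node("RC")  # RB owns the object
duplicate(rb, ra)
duplicate(ra, rc)
log = []
delete(ra, log)              # RA's stub must stay: RC still depends on its scion
log_after_ra = list(log)
delete(rc, log)              # -1 goes RC -> RA, then RA -> RB
```

In this run RC reaches the object in one hop through `rc.owner`, while the decrement messages follow the chain of strong locators back to the owner.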
Reference Listing • The object’s owner allocates a table of outgoing pointers (scions), one for each client that holds a reference to the object • Client nodes hold tables of incoming pointers (stubs) [Figure: nodes RA, RB and RC; objects X, Y and Z; stubs labeled X_B; scions labeled A_x and C_x]
Use of Timestamps [Figure: nodes RC and RB with objects X and Y, stub X_B and scion C_x; events shown: Sent &X/1, Sent delete X/1, Sent &X/2, Received delete X/1 (ignored)]
Reference Listing • Advantages: • Resilience to message duplication when timestamps are used • Resilience to node failure: Owner can prompt client to send a live/delete message • Owner may explicitly query about a reference that is suspected to be part of a distributed garbage cycle • Owner can decide whether to keep objects referred to by a crashed client node until it recovers or not • Disadvantages: • Memory overhead • Doesn’t collect cycles of garbage
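One way timestamps can make the owner resilient to stale or duplicated delete messages is sketched below. The slides do not spell out the exact rule, so this assumes the owner accepts a delete only if it carries the timestamp of the latest reference sent to that client; `OwnerTable` and its methods are hypothetical names.

```python
class OwnerTable:
    """Owner-side scion table for one object, with a per-client timestamp."""

    def __init__(self):
        self.last_sent = {}  # client -> timestamp of the latest reference sent (never reset)
        self.scions = set()  # clients currently believed to hold a reference

    def send_reference(self, client):
        ts = self.last_sent.get(client, 0) + 1
        self.last_sent[client] = ts
        self.scions.add(client)
        return ts

    def receive_delete(self, client, ts):
        """Return True if the delete was applied, False if it was ignored."""
        if client not in self.scions or ts != self.last_sent[client]:
            return False  # stale or duplicated message
        self.scions.discard(client)
        return True

    def is_referenced(self):
        return bool(self.scions)


table = OwnerTable()
t1 = table.send_reference("RC")            # &X/1
t2 = table.send_reference("RC")            # &X/2, sent before delete X/1 arrives
stale = table.receive_delete("RC", t1)     # delete X/1 is ignored
still_referenced = table.is_referenced()
applied = table.receive_delete("RC", t2)   # delete X/2 removes the scion
duplicate_msg = table.receive_delete("RC", t2)  # a duplicated delete is harmless
```

The sequence mirrors the figure: the delete for timestamp 1 arrives after the reference with timestamp 2 was sent, so it is ignored and X stays alive.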
Remote Reference Counting • Advantages: • Overhead depends only on the number of nodes in the system • Independent of pointer operations • Independent of heap size • Messages are sent only during GC, when the chance of collecting an object is very high • Independent of consistency protocols and global order of operations
Remote Reference Counting • Disadvantages: • Doesn’t collect cycles of garbage • Dependent on the number of nodes in the system
The System Model • Communication through a reliable asynchronous message-passing system • Messages are never lost, duplicated or altered • Messages can be delayed or arrive out of order • Processors can share objects • Objects can be replicated
Local and Remote Counters • Local and remote counters are attached to every shared object • $\mathrm{Local}_i(X)$: • Increased by m when node i receives a message containing m pointers to X • Otherwise maintained as in traditional reference counting • When $\mathrm{Local}_i(X) = 0$, node i is clean: it has no references to X
Local and Remote Counters • $\mathrm{Remote}_i(X)$: • Increased by m when some object Y containing m pointers to X is sent from node i • Decreased by m when some object Y containing m pointers to X is received at node i • The sum $\sum_i \mathrm{Remote}_i(X)$ is the number of pointers to X in transit in the system
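The counter rules above can be sketched in a few lines. `Node`, `send`, `receive` and `in_transit` are hypothetical names, and a message is modeled as a `(target, m)` pair meaning "an object containing m pointers to target".

```python
from collections import defaultdict


class Node:
    def __init__(self, name):
        self.name = name
        self.local = defaultdict(int)   # Local_i(X)
        self.remote = defaultdict(int)  # Remote_i(X)

    def send(self, target, m):
        """Send an object containing m pointers to `target`."""
        self.remote[target] += m
        return (target, m)

    def receive(self, message):
        target, m = message
        self.local[target] += m
        self.remote[target] -= m

    def is_clean(self, target):
        return self.local[target] == 0


def in_transit(nodes, target):
    """Pointers to `target` that were sent but not yet received."""
    return sum(node.remote[target] for node in nodes)


i, j = Node("i"), Node("j")
i.local["X"] = 1
j.local["X"] = 1
message = i.send("X", 1)
while_in_flight = in_transit([i, j], "X")  # 1: the message has not arrived yet
j.receive(message)
after_delivery = in_transit([i, j], "X")   # 0: sent and received cancel out
```

After delivery the counters match the values used in the examples: Remote_i(X) = 1, Remote_j(X) = -1 and Local_j(X) = 2.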
The Algorithm - Layout • Build a spanning tree covering all the nodes • Collection of object X: • The root sends signals to all its children • Inner nodes pass the signal down • When a leaf is clean it sends up a token • An inner node sends up a token when it has received tokens from all its children and is clean • When the root has received tokens from all its children it checks a condition C: • If C = true, X is garbage • Otherwise, another wave begins
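The token rule can be sketched as a function that computes which nodes have not yet been able to send a token. This is a synchronous simplification of an asynchronous protocol; the function name, the dictionary encoding of the tree and the node names are all assumptions made for the illustration.

```python
def nodes_without_token(tree, root, local):
    """Return the set of nodes that cannot yet send a token up.

    tree  : dict mapping a node to the list of its children
    local : dict mapping a node to its local counter for the object
    The root never sends a token, so it is always in the result.
    """
    pending = set()

    def visit(node):
        # A list (not a generator) so that every child is visited.
        children_done = all([visit(child) for child in tree.get(node, [])])
        done = children_done and local[node] == 0 and node != root
        if not done:
            pending.add(node)
        return done

    visit(root)
    return pending


tree = {"root": ["a", "b"], "a": ["c", "d"]}
local = {"root": 0, "a": 0, "b": 0, "c": 1, "d": 0}
blocked = nodes_without_token(tree, "root", local)   # c is dirty, so a waits too
local["c"] = 0
finished = nodes_without_token(tree, "root", local)  # only the root is left
```

A dirty leaf holds back every ancestor on its path to the root, so the wave completes only when all nodes are clean.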
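The Algorithm: Signals [Figure: spanning tree with signals passed down from the root. Legend: a circle labeled a is a node with $\mathrm{Local}(X) = a$]
The Algorithm: Tokens [Figure: the same tree with local counter values 0 and 1 and tokens passed up. Legend: a circle labeled a is a node with $\mathrm{Local}(X) = a$]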
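The Algorithm • S: the set of nodes that haven’t sent a token • $R_0 \equiv$ all the nodes outside S are clean • $R = R_0$ [Figure: the spanning tree with local counter values 0 and 1, tokens passed up, and the set S marked]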
Example: $R_0$ falsification [Figure sequence: node i sends Z, which holds a pointer to X, to node j, which is outside S and executes Y := Z; afterwards $\mathrm{Local}_i(X) = 1$, $\mathrm{Remote}_i(X) = 1$, $\mathrm{Local}_j(X) = 2$, $\mathrm{Remote}_j(X) = -1$, so a node outside S is no longer clean]
The Algorithm • Use the remote counter to count pointers sent and received • Definition of $\gamma_i$: • for a node i outside S, $\gamma_i$ is the value held at $\mathrm{Remote}_i(X)$ when i sent its token • for a node i in S, $\gamma_i$ is the value held at $\mathrm{Remote}_i(X)$ • $\gamma = \sum_i \gamma_i$ • $\gamma_{fin}$ = $\gamma$ at the end of the wave
The Algorithm • A leaf sends the value of its remote counter in its token • An inner node sends up the sum of its remote counter and those of its descendants • $R_1 \equiv \gamma > 0$ • $R = R_0 \lor R_1$
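The aggregation of remote counters along the tree can be sketched as a recursive sum. The function name, the tree shape and the node names are assumptions; `recorded` holds, for each node, the remote-counter value it contributes (the value at token time for nodes that already sent a token, the current value otherwise).

```python
def token_value(tree, node, recorded):
    """Sum of the recorded remote counters of `node` and all its descendants."""
    return recorded[node] + sum(
        token_value(tree, child, recorded) for child in tree.get(node, [])
    )


tree = {"root": ["i", "j"]}

# Node j sent its token with remote counter 0 and only later received the
# pointer (its current counter is -1); node i has sent one pointer.
recorded = {"root": 0, "i": 1, "j": 0}
gamma = token_value(tree, "root", recorded)
r1_holds = gamma > 0
```

Here the sum is 1, so the "more sent than received" condition holds and the root cannot yet conclude that X is garbage.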
Example (cont.) [Figure: $\mathrm{Local}_i(X) = 1$, $\mathrm{Remote}_i(X) = 1$, $\mathrm{Local}_j(X) = 2$, $\mathrm{Remote}_j(X) = -1$; node j executes Y := Z; $\gamma = 1$, so $R_1$ is true]
Example: $R_1$ Falsification [Figure sequence: node j sends Y to node k, which executes W := Y; afterwards $\mathrm{Local}_i(X) = 1$, $\mathrm{Remote}_i(X) = 1$, $\mathrm{Local}_k(X) = 2$, $\mathrm{Remote}_k(X) = -1$, $\mathrm{Local}_j(X) = 2$, $\mathrm{Remote}_j(X) = 0$; $\gamma = 0$, so $R_1$ is false]
The Algorithm • Detect whether $\gamma$ may have decreased due to a node in S: • Initially paint all nodes white • A node that decreases $\mathrm{Remote}(X)$ turns black • $R_2 \equiv$ at least one node in S is black • $R = R_0 \lor R_1 \lor R_2$
Example: $R_2$ Falsification [Figure sequence: $\mathrm{Local}_i(X) = 1$, $\mathrm{Remote}_i(X) = 1$, $\mathrm{Local}_j(X) = 2$, $\mathrm{Remote}_j(X) = 0$, $\mathrm{Remote}_k(X) = -1$; $\mathrm{Local}_k(X)$ drops from 2 to 0 and node k sends its token; no node in S is black, so $R_2$ is false]
The Algorithm • Propagate the color information: • A node that is black or has received a black token transmits a black token • Otherwise, it transmits a white token • A node that transmits a black token becomes white • $R_3 \equiv$ some node in S has a black token • $R = R_0 \lor R_1 \lor R_2 \lor R_3$
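The coloring rules can be sketched as follows. The `Node` class and its method names are hypothetical; tokens are reduced to their color and the remote counter to a single integer for one object.

```python
WHITE, BLACK = "white", "black"


class Node:
    def __init__(self):
        self.color = WHITE
        self.remote = 0

    def decrease_remote(self, m):
        """Receiving m pointers decreases the remote counter and blackens the node."""
        self.remote -= m
        self.color = BLACK

    def make_token(self, child_tokens):
        """Color of the token sent up, given the colors received from the children."""
        token = BLACK if self.color == BLACK or BLACK in child_tokens else WHITE
        self.color = WHITE  # a node that transmits a black token becomes white
        return token


k = Node()
k.decrease_remote(1)
color_before = k.color
token_from_k = k.make_token([])                 # black: k itself was black
color_after = k.color

parent = Node()
token_from_parent = parent.make_token([token_from_k, WHITE])  # black is passed on
clean_token = parent.make_token([WHITE])        # nothing black below: white token
```

A single black node is enough to blacken every token on the path to the root, which forces another wave.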
Example (cont.) [Figure: $\mathrm{Local}_i(X) = 1$, $\mathrm{Remote}_i(X) = 1$, $\mathrm{Local}_k(X) = 0$, $\mathrm{Remote}_k(X) = -1$, $\mathrm{Local}_j(X) = 2$, $\mathrm{Remote}_j(X) = 0$; node j executes Y := Z; node k's token is shown]
The Algorithm • $C \equiv [\,S = \{\mathrm{root}\} \land$ root is white and $\mathrm{Local}_{root}(X) = 0 \land$ all tokens at the root are white $\land\ \gamma_{fin} = 0\,]$ • Once the root has received tokens from all its children and $\mathrm{Local}_{root}(X) = 0$, it checks C: • C = true $\Rightarrow$ object X is garbage • Otherwise, the root becomes white and initiates another wave
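The root's final check can be written as a single predicate. The function name and parameter names are assumptions; the conjuncts are those listed for C above, with the sum of remote counters at the end of the wave passed in as `gamma_fin`.

```python
def condition_c(pending, root, root_color, local_root, tokens_at_root, gamma_fin):
    """True when the root may declare the object garbage at the end of a wave."""
    return (
        pending == {root}                               # every other node sent a token
        and root_color == "white"
        and local_root == 0                             # the root itself is clean
        and all(t == "white" for t in tokens_at_root)   # no black token arrived
        and gamma_fin == 0                              # no pointers in transit
    )


garbage = condition_c({"root"}, "root", "white", 0, ["white", "white"], 0)
retry_black = condition_c({"root"}, "root", "white", 0, ["white", "black"], 0)
retry_transit = condition_c({"root"}, "root", "white", 0, ["white", "white"], 1)
```

If any conjunct fails, the root cannot decide and has to start another wave.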
Correctness Proof • Layout: • Show that $R = (R_0 \lor R_1 \lor R_2 \lor R_3)$ is invariant • C = true $\Rightarrow$ $(R_1 \lor R_2 \lor R_3)$ = false $\Rightarrow$ $R_0$ = true $\Rightarrow$ object X is garbage
$R_0 \lor R_1 \lor R_2 \lor R_3$ is invariant • Assume, by way of contradiction, that R becomes false • Look at the wave in which R first became false: • R = false $\Rightarrow$ $R_0$ = false $\Rightarrow$ some node outside S was dirty • Let i be the first node outside S to become dirty • Case 1: R became false before i first became dirty • Implies that some node became dirty before i, which is impossible by the definition of i
$R_0 \lor R_1 \lor R_2 \lor R_3$ is invariant • Case 2: R became false after i first became dirty • i received a message containing a pointer to X after sending its token • Case 2.1: the message was sent in a previous wave • More pointers sent than received $\Rightarrow$ $\gamma > 0$ at the beginning of the wave • If $\gamma$ doesn’t decrease $\Rightarrow$ $R_1$ = true • Otherwise: some node becomes black $\Rightarrow$ $R_2 \lor R_3$ = true