More Distributed Garbage Collection Chapter 4: Naming (2)
Reference Listing
• Idea: have the skeleton keep track of the processes that reference it. The skeleton must keep more information, but references can be verified.
• Concept: idempotent operations. Adding a proxy (process) to the reference list (RL) when the proxy is already in the RL has no further effect, and neither does deleting a proxy that is not in the RL. These operations can therefore be repeated without harm.
• So detection of duplicate messages can be relaxed. (Could still be a problem if P1 creates, deletes, and creates again in quick succession.)
• Java RMI does this.
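The idempotence argument can be illustrated with a small sketch. The `Skeleton` class and its method names are hypothetical, chosen only for this example; the reference list is modeled as a Python set, so repeated adds and deletes of the same process ID leave it unchanged.

```python
class Skeleton:
    """Hypothetical server-side stub holding a reference list (RL) of process IDs."""

    def __init__(self):
        self.reference_list = set()

    def add_reference(self, process_id):
        # Adding an ID that is already listed leaves the set unchanged.
        self.reference_list.add(process_id)
        return "ack"

    def remove_reference(self, process_id):
        # discard() does nothing when the ID is absent, so a duplicate
        # delete message is harmless.
        self.reference_list.discard(process_id)
        return "ack"

    def is_garbage(self):
        # With an empty RL, no known process references the object.
        return not self.reference_list


skeleton = Skeleton()
skeleton.add_reference("P1")
skeleton.add_reference("P1")      # duplicated add message
skeleton.remove_reference("P2")   # P2 was never listed
print(skeleton.reference_list)
```

Note that a set does not protect against reordering: an old duplicate "add" that arrives after a later "delete" would re-insert the process, which is the create, delete, create caveat mentioned above.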
Reference Listing (2)
• In reference listing, if P1 wants to create a reference to object O:
(1) P1 sends its ID to the skeleton for O.
(2) O’s skeleton adds P1 to the RL and acks P1.
(3) When P1 gets the ack, it creates a proxy for O.
Reference Listing (3)
• If P1 wants to copy the reference for P2:
(1) P1 sends reference info about O to P2.
(2) P2 sends a message to O’s skeleton requesting to be added to the RL.
(3) The skeleton for O adds P2 and acks P2.
(4) When P2 gets the ack, it can install the proxy.
Reference Listing (4)
• If P1 wants to pass its reference to P2 (and delete its own), the process is the same, except that a race condition can occur: the delete request from P1 may arrive at O before the add request from P2 gets there.
• Solution: P1 must wait for an ack from P2 that the process is complete before requesting the delete.
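The race can be made concrete by replaying the two possible arrival orders at O's skeleton. This is a minimal sketch; the function `apply_messages` and the tuple message format are assumptions made for illustration only.

```python
def apply_messages(reference_list, messages):
    """Apply ('add' | 'delete', process_id) messages in arrival order.

    Returns True if the reference list became empty at any point, meaning
    the object could have been reclaimed while a reference was in flight.
    """
    refs = set(reference_list)
    became_empty = False
    for operation, process_id in messages:
        if operation == "add":
            refs.add(process_id)
        else:
            refs.discard(process_id)
        if not refs:
            became_empty = True
    return became_empty


# P1 deletes without waiting: its delete may overtake P2's add.
racy = apply_messages({"P1"}, [("delete", "P1"), ("add", "P2")])

# P1 waits for P2's ack, so the add is always processed first.
safe = apply_messages({"P1"}, [("add", "P2"), ("delete", "P1")])

print(racy, safe)
```

In the first ordering the list is momentarily empty, so O could be collected even though P2 is about to use it; waiting for the ack rules that ordering out at the cost of an extra message.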
Reference Listing
• Advantage: the RL can be checked in case of failure or suspected failure: O just pings the processes in its RL (this cannot be done with counts alone).
• Disadvantage: extra communication cost to handle the race condition. RLs can get long and use a lot of memory, so the scheme does not scale well.
• Solution to scalability: many of the processes on a long RL do not really need the reference anymore, so encourage processes to get off the list when it is not needed. The skeleton registers a reference only for a specific time period. This is called a LEASE, and the concept is used in many situations.
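A lease can be sketched as an expiry time stored per process. The class `LeasedSkeleton`, its methods, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are all assumptions for illustration, not a description of any particular system.

```python
class LeasedSkeleton:
    """Hypothetical skeleton whose reference list entries expire."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.expiry = {}  # process ID -> time at which its lease runs out

    def renew(self, process_id, now):
        # Registering and renewing are the same operation: (re)start the lease.
        self.expiry[process_id] = now + self.lease_seconds

    def expire(self, now):
        # Drop every process whose lease has run out; return the survivors.
        for process_id in [p for p, t in self.expiry.items() if t <= now]:
            del self.expiry[process_id]
        return set(self.expiry)


skeleton = LeasedSkeleton(lease_seconds=10)
skeleton.renew("P1", now=0)
skeleton.renew("P2", now=0)
skeleton.renew("P1", now=8)            # P1 still needs the reference
remaining = skeleton.expire(now=12)    # P2's lease ended at time 10
print(remaining)
```

A process that crashes or simply loses interest stops renewing and falls off the list without sending any delete message.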
Unreachable Objects: section 4.3.4
• The previous algorithms deal with unreferenced objects, but an object may still be referenced, though only by other unreachable objects.
• Problem: objects that reference each other while none of them is reachable from the “root set”, i.e., valid users and processes (the zombie process problem).
Trace-Based Garbage Collection
• Idea: check which entities can be reached from the root set (by tracing from the root set) and remove all others.
• The centralized solution is Mark and Sweep:
• Mark all objects white.
• Trace pointers starting at the root set. Gray indicates in-process.
• Mark black all objects reachable from the root set.
• When finished, delete the white objects.
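The centralized algorithm can be sketched in a few lines. The function name and the dictionary representation of the object graph are assumptions for this example; the three colors follow the slide.

```python
from collections import deque


def mark_and_sweep(graph, roots):
    """graph maps each object to the list of objects it points to.

    Returns the set of objects that are still white after marking,
    i.e. the garbage.
    """
    color = {obj: "white" for obj in graph}
    worklist = deque()
    for root in roots:
        color[root] = "gray"
        worklist.append(root)
    while worklist:
        obj = worklist.popleft()
        for child in graph[obj]:
            if color[child] == "white":
                color[child] = "gray"   # discovered, not yet fully traced
                worklist.append(child)
        color[obj] = "black"            # all outgoing pointers traced
    return {obj for obj, c in color.items() if c == "white"}


# C and D reference each other but are unreachable from the root A.
graph = {"A": ["B"], "B": [], "C": ["D"], "D": ["C"]}
garbage = mark_and_sweep(graph, roots=["A"])
print(garbage)
```

The cycle between C and D would never be reclaimed by reference counting or reference listing, since each object is still referenced; tracing from the root set finds it.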
Distributed Mark and Sweep (1)
1. Each process runs a local GC. Initially all proxies, skeletons and objects are marked white.
2. Objects at node P reachable from a root at P are marked gray. When an object is marked gray, its skeleton and all proxies in it are marked gray (objects containing pointers or references are our concern).
3. When a proxy is marked gray, a message is sent to the associated skeleton to mark itself gray.
4. All objects whose skeletons are gray are marked gray.
Distributed Mark and Sweep (2)
5. Whenever an object and its skeleton are gray and all proxies in the object have informed their skeletons, the object becomes black and a message is sent (backwards) to its associated proxies that it is now black. (This requires that a skeleton can contact the proxies that reference it.)
6. When a proxy receives a message that its associated skeleton is black, it turns black.
7. The mark phase ends when all proxies, skeletons and objects are either white or black. White objects can then be removed (along with their skeletons and the proxies within them).
Disadvantage: the reachability graph must remain unchanged while the algorithm runs.
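The gray-propagation part (steps 2 to 4) can be simulated in a single thread. This is a simplified sketch under several assumptions: objects are named by a hypothetical `(node, name)` pair, a reference to an object on another node stands for a proxy, following it is counted as one "mark gray" message, and the black acknowledgements of steps 5 and 6 are collapsed into a single pass once no messages remain in flight.

```python
from collections import deque


def distributed_mark(objects, roots):
    """objects maps (node, name) to the list of (node, name) it references.

    Returns (garbage, messages), where messages counts the proxy-to-skeleton
    'mark gray' messages that crossed node boundaries.
    """
    color = {obj: "white" for obj in objects}
    inbox = deque(roots)        # pending 'mark gray' requests
    messages = 0
    while inbox:
        obj = inbox.popleft()
        if color[obj] != "white":
            continue            # already gray: nothing more to do
        color[obj] = "gray"
        for target in objects[obj]:
            if target[0] != obj[0]:
                messages += 1   # the proxy notifies the remote skeleton
            inbox.append(target)
    # Simplification: with no messages left, every gray entity turns black.
    for obj, c in color.items():
        if c == "gray":
            color[obj] = "black"
    garbage = {obj for obj, c in color.items() if c == "white"}
    return garbage, messages


objects = {
    ("P", "a"): [("Q", "b")],
    ("Q", "b"): [("P", "a")],
    ("Q", "c"): [("R", "d")],   # c and d form an unreachable cycle
    ("R", "d"): [("Q", "c")],
}
garbage, messages = distributed_mark(objects, roots=[("P", "a")])
print(garbage, messages)
```

The simulation assumes the graph is frozen for the duration of the run, which is exactly the disadvantage noted above.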
Tracing in Groups (1)
• To address scalability issues, introduce tracing in groups. A group is a collection of processes that may or may not be on the same node (for example, all processes on the same node, or all processes running the same application and thus using the same objects).
• First, collect all garbage within the group, then combine groups that have been internally cleaned.
• The skeleton maintains a reference counter RC, which counts the number of associated processes.
• Assumption: a process has no more than one proxy for each distributed object.
Tracing in Groups (2)
• A skeleton can be marked soft or hard. A soft skeleton is reachable only from proxies within the group. A hard skeleton is reachable from a proxy outside the group or from a root set object within the group.
• The marking of a skeleton can only change from soft to hard; once a skeleton is hard, it cannot be changed back to soft.
Tracing in Groups (3)
• A proxy can be hard, soft, or none. A hard proxy is reachable from an object in the root set. A soft proxy is reachable from a skeleton that is marked soft and is potentially not reachable from the root set. None means it is, at this time, reachable neither from a soft skeleton nor from an object in the root set.
• Only proxies marked none can be changed to hard (no change from soft to hard).
• Note that a proxy being reachable from a skeleton means the proxy is within the object of that skeleton.
Tracing in Groups (4): the Algorithm
• Step 1: Mark skeletons soft or hard as follows. Look at the reference counter for O; say the count is R. Count the processes in the group that have proxies referring to object O. If that number is R, mark the skeleton SOFT, since it is referred to only by proxies in the group. Otherwise there is a reference from outside the group, so mark the skeleton HARD.
Tracing in Groups (5): the Algorithm
• Step 2: Each process in the group propagates marks from skeletons to the proxies in the same process or object (a process may contain objects and proxies). Initially all proxies are labeled none.
• Trace from skeletons marked hard as well as from the root set. Hard marks are propagated to all objects and proxies within the process that are reachable from the hard set.
• Now trace from the skeletons marked soft. A proxy that was none can change to soft. A hard proxy does not change.
Tracing in Groups (6): the Algorithm
• Step 3: Propagate HARD marks from proxies to their skeletons in other processes within the group. Soft marks do not propagate.
• Repeat steps 2 and 3 if some skeleton changed from SOFT to HARD.
• Garbage collect SOFT proxies and their skeletons and objects. Combine some groups and start over.
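Steps 1 to 3 can be sketched for a single group as follows. The function `trace_group` and its four input dictionaries are hypothetical names for this example. It assumes every proxy is identified by a (holding process, target object) pair, consistent with the one-proxy-per-process assumption, and it models only proxies held by objects or by a process's root set; local pointers between objects in the same process are left out.

```python
def trace_group(home, holds, root_holds, rc):
    """home: object -> process hosting it (all inside the group).
    holds: object -> set of objects for which it contains a proxy.
    root_holds: process -> set of objects its root set has a proxy for.
    rc: object -> reference counter kept by the object's skeleton.
    Returns (skeleton marks, proxy marks, garbage objects).
    """
    proxies = {(p, t) for p, targets in root_holds.items() for t in targets}
    proxies |= {(home[o], t) for o, targets in holds.items() for t in targets}

    # Step 1: soft if every counted reference comes from inside the group.
    skeleton = {}
    for obj in home:
        in_group = sum(1 for (_, target) in proxies if target == obj)
        skeleton[obj] = "soft" if in_group == rc[obj] else "hard"

    changed = True
    while changed:
        # Step 2: local propagation; all proxies start as none.
        proxy = {px: "none" for px in proxies}
        for process, targets in root_holds.items():
            for target in targets:
                proxy[(process, target)] = "hard"
        for mark in ("hard", "soft"):       # hard skeletons are traced first
            for obj, targets in holds.items():
                if skeleton[obj] != mark:
                    continue
                for target in targets:
                    if proxy[(home[obj], target)] == "none":
                        proxy[(home[obj], target)] = mark
        # Step 3: hard proxies harden their skeletons; soft marks stay put.
        changed = False
        for (_, target), mark in proxy.items():
            if mark == "hard" and skeleton.get(target) == "soft":
                skeleton[target] = "hard"
                changed = True

    garbage = {obj for obj, mark in skeleton.items() if mark == "soft"}
    return skeleton, proxy, garbage


home = {"x": "A", "w": "A", "y": "B", "z": "B"}
holds = {"y": {"x"}, "z": {"w"}, "w": {"z"}}   # w and z form a cycle
root_holds = {"A": {"y"}}                      # a root in A references y
rc = {"x": 1, "w": 1, "y": 1, "z": 1}
skeleton, proxy, garbage = trace_group(home, holds, root_holds, rc)
print(skeleton, garbage)
```

In this example all four skeletons start soft, the root's proxy hardens y, y's proxy then hardens x on the next round, and the cycle of w and z stays soft and is collected. Raising `rc["z"]` to 2 would model an extra reference from outside the group and keep both z and w alive.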
Tracing in Groups (7) Initial marking of skeletons.
Tracing in Groups (8) After local propagation in each process.
Tracing in Groups (9) Final marking.