Naming CSCI 6900/4900
Unreferenced Objects in Distributed Systems • An object is no longer needed once nobody holds a reference to it, since it can never be used again • Garbage collection is necessary because such objects still consume resources • Garbage collection is much simpler in non-distributed systems than in distributed systems • Cross-machine references, transitive references, and references passed as parameters complicate matters • Objects may reference each other cyclically without being referenced by anyone else
The Problem of Unreferenced Objects • An example of a graph representing objects containing references to each other.
Reference Counting • Count the number of references to an object • Simple reference counting works well on uniprocessor systems but runs into problems in distributed systems • The problem of maintaining a proper reference count in the presence of unreliable communication.
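The unreliable-communication problem can be illustrated with a minimal sketch. The `CountedObject` class and the message names are hypothetical, not taken from the slides; the only point is that a retransmitted increment (sent again because its acknowledgment was lost) is counted twice.

```python
class CountedObject:
    """Hypothetical object with a plain reference counter."""

    def __init__(self):
        self.count = 0

    def handle(self, message):
        # Every message changes the counter, so duplicates are NOT harmless.
        if message == "increment":
            self.count += 1
        elif message == "decrement":
            self.count -= 1
        else:
            raise ValueError(f"unknown message: {message}")

    @property
    def collectable(self):
        return self.count == 0


def deliver(messages):
    """Apply the messages in arrival order to a fresh object."""
    obj = CountedObject()
    for message in messages:
        obj.handle(message)
    return obj


# Reliable network: one increment, later one decrement.
reliable = deliver(["increment", "decrement"])

# Lost acknowledgment: the increment is retransmitted and arrives twice.
duplicated = deliver(["increment", "increment", "decrement"])

print(reliable.collectable)    # True
print(duplicated.collectable)  # False: the object is never reclaimed
```

A duplicated decrement is worse: the counter can reach zero while a reference still exists, so the object would be removed too early.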
Copying References Across Processes • Race conditions can easily arise • Need careful design to overcome the problems • Copying a reference to another process and incrementing the counter too late • A solution.
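The race can be made concrete by replaying the counter messages in the order they reach the object. This is only a sketch: the `replay` helper and the `"inc"`/`"dec"` message names are assumptions for illustration, and the "safe" ordering shows what any solution has to guarantee rather than the specific protocol on the slide.

```python
def replay(messages, initial_count=1):
    """Apply counter messages in arrival order.

    Returns (removed, late): whether the object was reclaimed, and how
    many messages arrived after it had already been removed.
    """
    count = initial_count
    removed = False
    late = 0
    for message in messages:
        if removed:
            late += 1  # the sender now holds a dangling reference
            continue
        count += 1 if message == "inc" else -1
        if count == 0:
            removed = True
    return removed, late


# P1 holds the only reference, copies it to P2, then drops its own.
racy = replay(["dec", "inc"])  # P1's decrement overtakes P2's increment
safe = replay(["inc", "dec"])  # P2's increment is registered first

print(racy)  # (True, 1)
print(safe)  # (False, 0)
```

In the racy ordering the object is removed while P2 still holds a reference to it; a correct design must ensure the increment is registered before the sender is allowed to drop its own reference.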
Weighted Reference Counting • Designed to overcome race conditions of simple reference counting • Each object has a fixed total weight • When a remote reference is created, half the weight is assigned to the new reference • Object can be safely removed when the weight becomes zero • The initial assignment of weights in weighted reference counting • Weight assignment when creating a new reference.
Weighted Reference Counting - 2 • When a copy of a remote reference is created, the new copy receives half of the weight of the reference it was copied from • When a reference is deleted, its current weight is subtracted from the weight recorded at the object • The object can be safely removed when the weight becomes zero
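The weight bookkeeping can be sketched as follows. The class names, the total weight of 128, and the separate `outstanding` field (the weight currently held by remote references) are assumptions of this sketch, chosen so that the "weight becomes zero" check is explicit. Note that copying a reference needs no message to the object at all, which is what removes the race.

```python
TOTAL_WEIGHT = 128  # assumed value; any power of two works


class Skeleton:
    """Object-side bookkeeping (hypothetical names)."""

    def __init__(self, total=TOTAL_WEIGHT):
        self.partial = total   # weight not yet handed out
        self.outstanding = 0   # weight held by remote references

    def create_reference(self):
        half = self.partial // 2
        if half == 0:
            raise RuntimeError("weight exhausted")
        self.partial -= half
        self.outstanding += half
        return Proxy(self, half)

    def release(self, weight):
        # Handler for the decrement message sent when a reference is deleted.
        self.outstanding -= weight

    @property
    def collectable(self):
        return self.outstanding == 0


class Proxy:
    """A remote reference carrying part of the weight."""

    def __init__(self, skeleton, weight):
        self.skeleton = skeleton
        self.weight = weight

    def copy(self):
        half = self.weight // 2
        if half == 0:
            raise RuntimeError("weight is 1 and cannot be split")
        self.weight -= half
        return Proxy(self.skeleton, half)  # no message to the skeleton

    def delete(self):
        self.skeleton.release(self.weight)
        self.weight = 0


skeleton = Skeleton()
p1 = skeleton.create_reference()  # p1 gets 64
p2 = p1.copy()                    # p1 and p2 hold 32 each
p1.delete()
print(skeleton.collectable)       # False: p2 still holds weight 32
p2.delete()
print(skeleton.collectable)       # True
```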
Indirection in Weighted Referencing • Creating an indirection when the partial weight of a reference has reached 1.
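A minimal sketch of the indirection step, under the assumption that the copying process creates a fresh skeleton of its own that points back at the exhausted reference. All names here (`Skeleton`, `Proxy`, `resolve`) and the default weight of 128 are hypothetical.

```python
class Skeleton:
    def __init__(self, target, total=128):
        self.target = target   # the object itself, or a Proxy (indirection)
        self.partial = total   # weight still available to hand out

    def create_reference(self):
        half = self.partial // 2
        if half == 0:
            raise RuntimeError("weight exhausted")
        self.partial -= half
        return Proxy(self, half)


class Proxy:
    def __init__(self, skeleton, weight):
        self.skeleton = skeleton
        self.weight = weight

    def copy(self):
        if self.weight > 1:
            half = self.weight // 2
            self.weight -= half
            return Proxy(self.skeleton, half)
        # Weight 1 cannot be halved: create a new skeleton in the copying
        # process that points back at this proxy, with a fresh total weight.
        hop = Skeleton(target=self)
        return hop.create_reference()

    def resolve(self):
        """Follow indirections; return (object, number of extra hops)."""
        target = self.skeleton.target
        hops = 0
        while isinstance(target, Proxy):
            hops += 1
            target = target.skeleton.target
        return target, hops


root = Skeleton("object O", total=2)
p1 = root.create_reference()  # p1 ends up with weight 1
p2 = p1.copy()                # forces an indirection through p1's process

print(p1.weight, p2.weight)   # 1 64
print(p2.resolve())           # ('object O', 1)
```

The price of the indirection is the extra hop on every access through `p2`, so long chains of indirections degrade performance.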
Generation Reference Counting • Each proxy stores two numbers • A counter indicating the number of times the proxy has been copied • A generation number indicating how the proxy was created: a copy of a generation-n proxy belongs to generation n+1 • The original skeleton maintains information about the outstanding copies of each generation • When a proxy is removed, the original skeleton is informed of its generation number and its number of copies
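The bookkeeping can be sketched as below. The table name `G` and the update rule (decrement the removed proxy's generation by one, add its copy count to the next generation) follow the commonly described form of the scheme and are assumptions beyond what the slide states; the class names are hypothetical.

```python
from collections import defaultdict


class Skeleton:
    def __init__(self):
        # G[i] = outstanding proxies of generation i (may go negative
        # temporarily when removals are reported out of order).
        self.G = defaultdict(int)

    def create_proxy(self):
        self.G[0] += 1
        return Proxy(self, generation=0)

    def proxy_removed(self, generation, copies):
        self.G[generation] -= 1        # the removed proxy itself
        self.G[generation + 1] += copies  # the copies it handed out

    @property
    def collectable(self):
        return all(count == 0 for count in self.G.values())


class Proxy:
    def __init__(self, skeleton, generation):
        self.skeleton = skeleton
        self.generation = generation
        self.copies = 0

    def copy(self):
        # Copying is purely local: no message to the skeleton.
        self.copies += 1
        return Proxy(self.skeleton, self.generation + 1)

    def delete(self):
        self.skeleton.proxy_removed(self.generation, self.copies)


skeleton = Skeleton()
p = skeleton.create_proxy()  # generation 0
c1 = p.copy()                # generation 1
c2 = p.copy()                # generation 1
c3 = c1.copy()               # generation 2

c3.delete()                  # reported before its ancestors
p.delete()
c1.delete()
print(skeleton.collectable)  # False: c2 is still alive
c2.delete()
print(skeleton.collectable)  # True
```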
Reference Listing • All of the above approaches assume reliable communication • Idempotent operations – can be repeated safely without affecting the end result • The result depends only on whether an operation has been invoked at all, not on how many times • How can adding and deleting references be made idempotent? • Maintain an explicit list of references rather than a counter • Check for duplicates when performing an operation
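A minimal sketch of the idea, assuming each referencing process is identified by a hypothetical process ID string. A set gives the duplicate check for free, so a retransmitted add or delete leaves the list unchanged.

```python
class ReferenceList:
    """Object-side list of the processes that hold a reference."""

    def __init__(self):
        self.holders = set()

    def add(self, process_id):
        # Adding an ID that is already present has no effect.
        self.holders.add(process_id)

    def remove(self, process_id):
        # discard() ignores IDs that are already gone, unlike remove().
        self.holders.discard(process_id)

    @property
    def collectable(self):
        return not self.holders


refs = ReferenceList()
refs.add("P1")
refs.add("P1")           # retransmission after a lost acknowledgment
refs.add("P2")
refs.remove("P1")
refs.remove("P1")        # duplicate delete is harmless
print(refs.collectable)  # False: P2 still holds a reference
refs.remove("P2")
print(refs.collectable)  # True
```

With a plain counter the same message sequence would end at a count other than zero, so the object would either leak or be removed while still referenced.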
Identifying Unreachable Entities • Reference counting and reference listing cannot reclaim unreachable entities that reference each other cyclically • Need techniques that check all entities for unreachability – tracing-based garbage collection • Mark & Sweep Collectors • Mark phase: entities are traced from the root set and marked • Sweep phase: all entities are examined and the unmarked ones are removed • Three-color approach • Each entity is initially white • An entity is grey if it has been found reachable but its references still need to be explored • It is marked black once it is reachable and completely explored • All nodes that are still white at the end are unreachable and can be removed
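The three-color approach can be sketched for a single process as follows. The graph representation (a dict mapping each entity to the entities it references) and the function name are assumptions for illustration.

```python
from collections import deque

WHITE, GREY, BLACK = "white", "grey", "black"


def mark_and_sweep(graph, roots):
    """Return the set of unreachable entities in `graph`.

    `graph` maps every entity to a list of the entities it references.
    """
    color = {node: WHITE for node in graph}  # everything starts white
    worklist = deque()

    # Mark phase: roots are reachable but not yet explored.
    for root in roots:
        if color[root] == WHITE:
            color[root] = GREY
            worklist.append(root)

    while worklist:
        node = worklist.popleft()
        for child in graph[node]:
            if color[child] == WHITE:
                color[child] = GREY
                worklist.append(child)
        color[node] = BLACK  # all outgoing references explored

    # Sweep phase: whatever is still white was never reached.
    return {node for node, c in color.items() if c == WHITE}


graph = {
    "root": ["A"],
    "A": ["B"],
    "B": ["A"],   # cycle that IS reachable from the root
    "D": ["E"],
    "E": ["D"],   # cycle that is NOT reachable from the root
}
print(sorted(mark_and_sweep(graph, roots=["root"])))  # ['D', 'E']
```

The unreachable cycle between `D` and `E` is exactly the case that reference counting and reference listing cannot detect, because each of the two still has one incoming reference.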
Distributed Mark & Sweep • Each process starts a local garbage collector • These garbage collectors run in parallel • Proxies, skeletons and objects are colored • Initially all proxies, skeletons and objects are white • When an object residing in process P is reachable from a root in P, it is marked grey • All proxies in that object are marked grey as well • When a proxy is marked grey, its skeleton is marked grey, which in turn marks the associated object grey • Local garbage collectors collect white objects • Does not scale and requires the reachability graph to remain unchanged during collection – STOP & GO approach
Tracing in Groups • Designed to deal with the scalability issue • Processes are organized into groups for scalability purposes • Group is a collection of processes • Combination of Mark & sweep and reference counting • Basic idea – collect garbage within each group and then recursively consider larger groups • Assumption – Remote references are implemented as proxy-skeleton pair
Steps in Group Tracing • Initial marking – only skeletons are marked • Intra-process propagation – propagate marks from skeletons to proxies • Inter-process propagation – propagate marks from proxies to skeletons • Stabilization – repeat the two propagation steps until the marks no longer change • Garbage reclamation
Marks in Group Tracing • Skeletons – Hard or Soft • Hard implies the skeleton is reachable from a proxy outside the group or from an object in the root set of a process • Soft implies it is reachable only from proxies within the group • Proxy – Hard, Soft or None • Hard implies the proxy is reachable from an object in the root set • Soft implies the proxy is reachable from a skeleton marked soft • None – the proxy is reachable neither from a skeleton nor from an object in the root set
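The steps and marks above can be sketched for one group as follows. This is a simplified model under stated assumptions: objects are folded into their skeletons, `skeleton_reaches` lists the proxies reachable from each skeleton inside its process, `proxy_target` gives the skeleton each proxy refers to, and all names are hypothetical. Hard marks from root-set objects are applied up front for brevity.

```python
RANK = {"none": 0, "soft": 1, "hard": 2}


def trace_group(skeleton_reaches, proxy_target, root_proxies, initially_hard):
    # 1. Initial marking: only skeletons are marked.
    skeleton_mark = {
        s: "hard" if s in initially_hard else "soft" for s in skeleton_reaches
    }
    # Proxies reachable from a root-set object are hard; others start as none.
    proxy_mark = {
        p: "hard" if p in root_proxies else "none" for p in proxy_target
    }

    changed = True
    while changed:  # 4. Stabilization
        changed = False
        # 2. Intra-process propagation: skeleton -> proxies it can reach.
        for skeleton, proxies in skeleton_reaches.items():
            mark = skeleton_mark[skeleton]
            for proxy in proxies:
                if RANK[mark] > RANK[proxy_mark[proxy]]:
                    proxy_mark[proxy] = mark
                    changed = True
        # 3. Inter-process propagation: hard proxy -> skeleton in the group.
        for proxy, skeleton in proxy_target.items():
            if (proxy_mark[proxy] == "hard"
                    and skeleton in skeleton_mark
                    and skeleton_mark[skeleton] != "hard"):
                skeleton_mark[skeleton] = "hard"
                changed = True

    # 5. Reclamation: soft skeletons are reachable only from inside the group.
    garbage = {s for s, m in skeleton_mark.items() if m == "soft"}
    return skeleton_mark, proxy_mark, garbage


skeleton_reaches = {"A": ["p1"], "B": ["p2"], "C": [], "D": ["p3"], "E": ["p4"]}
proxy_target = {"p1": "B", "p2": "C", "p3": "E", "p4": "D", "p5": "C"}

skeleton_mark, proxy_mark, garbage = trace_group(
    skeleton_reaches, proxy_target,
    root_proxies=set(),
    initially_hard={"A"},  # A is referenced from outside the group
)
print(sorted(garbage))     # ['D', 'E']
print(proxy_mark["p5"])    # none
```

Here `A`, `B` and `C` end up hard because the chain starts at a skeleton referenced from outside the group, while `D` and `E` only reference each other and stay soft.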
Tracing in Groups (1) • Initial marking of skeletons.
Tracing in Groups (2) • After local propagation in each process.
Tracing in Groups (3) • Final marking.