Correctness-Preserving Derivation of Concurrent Garbage Collection Algorithms
Martin T. Vechev, David F. Bacon, Eran Yahav
University of Cambridge; IBM T.J. Watson Research Center
PLDI – June 2006
Why Concurrent Garbage Collection?
• Java and C#: garbage-collected languages are prevalent
• Multicore: concurrency is becoming prevalent
• Cheap RAM: large heaps are becoming prevalent
• Real-time systems: more widely used
Existing Way to Create a Concurrent GC
• REQUIREMENTS: Throughput, Memory Consumption, Pause Time
• ENVIRONMENT: Memory Model, Thread Model, Concurrency Primitives, CPU primitives
• TECHNIQUES: Tracing / reference counting; moving; Allocate White / Black; Dijkstra / Steele / Yuasa Barrier; Atomic / Incremental Stack Snapshot; Atomic / Non-atomic Write Barrier; color toggle, stacklets, etc.
• Implementation ??
  • Hard to verify/test
  • Often buggy
  • Did the monkey choose well??
Concurrent GC algorithms and proofs are hard
Legend: Incorrect / Correct / (C) Corrected
• FAMILY: Yuasa ’90, Steele(C) ’75, Dijkstra(C) ’78
• ALGORITHMS: Ben-Ari Base ’84, Doligez(C) ’93, Boehm ’91, Ben-Ari Extended ’84, Azatchi ’03, Barabash ’03, Domani ’03
• PROOFS: Ben-Ari Base ’84, Doligez ’94, Pixley ’88
• THEOREM PROVING
Our Research Vision
• ENVIRONMENT (Declarative Specification): Memory Model, Thread Model, Concurrency Primitives, CPU primitives
• REQUIREMENTS: Throughput, Memory Consumption, Pause Time
• Automated System, drawing on Formally Defined Techniques
• Result: Optimal Correct Implementation
In This Work
• FIXED ENVIRONMENT: Memory Model, Thread Model, Concurrency Primitives, CPU primitives
• REQUIREMENTS: Throughput, Pause Time, Memory Consumption
• Automated System, drawing on Formally Defined Techniques for Tracing Non-Moving GC
• Result: Algorithm 1 < Algorithm 2 < Algorithm 3 < … Algorithm N
Problem: Interference
SYSTEM = MUTATOR || GC
Heap: objects A, B and C, with A initially pointing to C (legend: Traced / Not Traced)
1. GC traced B
2. Mutator links C to B
3. Mutator unlinks C from A
4. GC traced A
Result: C is LOST. It is still reachable through B, but B was traced before the link existed and A no longer points to C, so the collector never visits C.
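The four steps can be replayed on a toy heap. This is only a minimal sketch: the dictionary-based heap, the `trace` helper and the choice to visit B before A are assumptions made for illustration, not part of the slides.

```python
# Hypothetical toy heap: object name -> set of names it points to.
heap = {"A": {"C"}, "B": set(), "C": set()}
roots = ["A", "B"]
marked = set()

def trace(obj):
    """Mark obj and everything reachable from it through unmarked objects."""
    stack = [obj]
    while stack:
        o = stack.pop()
        if o not in marked:
            marked.add(o)
            stack.extend(heap[o])

def reachable(start):
    """Objects reachable from the given roots in the current heap."""
    seen, stack = set(), list(start)
    while stack:
        o = stack.pop()
        if o not in seen:
            seen.add(o)
            stack.extend(heap[o])
    return seen

trace("B")                  # 1. GC traced B
heap["B"].add("C")          # 2. Mutator links C to B
heap["A"].discard("C")      # 3. Mutator unlinks C from A
trace("A")                  # 4. GC traced A

# Live objects the collector never marked: these would be reclaimed wrongly.
lost = reachable(roots) - marked
print(sorted(lost))  # ['C']
```

Without any cooperation from the mutator, the collector finishes with C unmarked even though C is still reachable.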
The 3 Families of Concurrent GC Algorithms
1. Mark C when C is linked to B (DIJKSTRA)
2. Mark C when the link to C is removed (YUASA)
3. Rescan B when C is linked to B (STEELE)
• Solutions are applied uniformly for all objects
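The three remedies can be compared on the same interleaving. The sketch below is a simplification under stated assumptions: `run`, `link`, `unlink` and the `pending` worklist are hypothetical names, the barriers fire unconditionally (real barriers usually test colors or collector state first), and the worklist is drained once at the end instead of through a proper termination protocol.

```python
def run(barrier=None):
    """Replay the interference scenario under one barrier; return the lost objects."""
    heap = {"A": {"C"}, "B": set(), "C": set()}
    marked, pending = set(), []   # pending: objects the collector must still trace

    def trace(obj):
        stack = [obj]
        while stack:
            o = stack.pop()
            if o not in marked:
                marked.add(o)
                stack.extend(heap[o])

    def link(src, dst):
        heap[src].add(dst)
        if barrier == "dijkstra":
            pending.append(dst)       # mark the newly linked target
        elif barrier == "steele" and src in marked:
            marked.discard(src)       # un-mark the source so it is rescanned
            pending.append(src)

    def unlink(src, dst):
        if barrier == "yuasa":
            pending.append(dst)       # mark the target whose link is removed
        heap[src].discard(dst)

    trace("B")            # GC traced B
    link("B", "C")        # mutator links C to B
    unlink("A", "C")      # mutator unlinks C from A
    trace("A")            # GC traced A
    while pending:        # collector processes what the barriers recorded
        trace(pending.pop())

    live, stack = set(), ["A", "B"]
    while stack:
        o = stack.pop()
        if o not in live:
            live.add(o)
            stack.extend(heap[o])
    return live - marked

for name in (None, "dijkstra", "yuasa", "steele"):
    print(name, sorted(run(name)))
```

With no barrier C is lost; with any of the three barriers the lost set is empty for this interleaving.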
Contributions
• Systematic Exploration
• A new parametric model of concurrent GC
• Better understanding
• New algorithms – potentially useful
• Formal Relationship between algorithms
• Space - Relative precision between algorithms
• Sharing Proof Burden
• Correctness-preserving “transformations”
A Parametric Concurrent GC Skeleton
• Intuition: factor out as much common structure as possible
• Record the interaction history between collector and mutator during tracing
• Collector exposes “hidden objects” based on the entire interaction history
A Parametric Concurrent GC Skeleton
• Complete Garbage Collection
• COLLECTOR: mark, Expose(L,D), mark, Expose(L,D), reclaim, …
• MUTATOR: Change Heap, Change Heap
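One way to read the mark / Expose(L,D) / reclaim cycle is the sequential sketch below. It is not the authors' algorithm: the log format, the `dims["record"]` parameter, the deliberately simple `expose` policy and the phase-by-phase interleaving of mutator and collector are all assumptions chosen to keep the example small.

```python
def mark(heap, origins, marked):
    """Mark everything reachable from origins through unmarked objects."""
    stack = list(origins)
    while stack:
        o = stack.pop()
        if o not in marked:
            marked.add(o)
            stack.extend(heap[o])

def expose(log, dims):
    """Hypothetical Expose(L, D): targets of the logged event kinds selected by dims."""
    return {dst for (kind, src, dst) in log if kind in dims["record"]}

def collect(heap, roots, mutator_phases, dims):
    """One complete collection: alternate mark and expose, then reclaim."""
    log, marked = [], set()
    origins = set(roots)
    phases = iter(mutator_phases)
    while origins:
        mark(heap, origins, marked)
        step = next(phases, None)      # mutator changes the heap and logs it
        if step is not None:
            step(heap, log)
        origins = expose(log, dims) - marked   # hidden objects not yet marked
    garbage = set(heap) - marked
    for o in garbage:                  # reclaim
        del heap[o]
    return marked, garbage

heap = {"A": {"C"}, "B": set(), "C": set(), "G": set()}

def phase1(heap, log):
    heap["D"] = set()                                   # allocate D
    heap["B"].add("D"); log.append(("link", "B", "D"))
    heap["A"].discard("C"); log.append(("unlink", "A", "C"))

marked, garbage = collect(heap, ["A", "B"], [phase1], {"record": {"link"}})
print(sorted(marked), sorted(garbage))  # ['A', 'B', 'C', 'D'] ['G']
```

Here D is hidden behind the already-marked B and is recovered only through the log. C was unlinked after being marked, so it survives this cycle as floating garbage.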
Dimensions: an intuition
• The effect of each Mutator/GC action is controlled by a dimension
• Collector Scans Pointer: Wavefront Granularity
• Mutator Creates Pointer: Counting
• Mutator Overwrites Pointer: Snapshot
• Mutator Allocates Object: Allocation Color
Implementation Choice: Wavefront
• Per-Field Wavefront: exact information; one bit per field; more expensive; more synchronization; more garbage collected
• Per-Object Wavefront: approximate information; one bit per object; less expensive; less synchronization; less garbage collected
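The trade-off between the two granularities can be sketched with two small bookkeeping classes. The class and method names are hypothetical, and the sketch assumes the per-object bit is set as soon as the collector scans any field of the object, which is the conservative direction.

```python
class PerFieldWavefront:
    """Exact: one bit per (object, field) slot the collector has scanned."""
    def __init__(self):
        self.scanned = set()
    def scan(self, obj, field):
        self.scanned.add((obj, field))
    def behind(self, obj, field):
        return (obj, field) in self.scanned

class PerObjectWavefront:
    """Approximate: one bit per object, set once any of its fields is scanned."""
    def __init__(self):
        self.scanned = set()
    def scan(self, obj, field):
        self.scanned.add(obj)
    def behind(self, obj, field):
        return obj in self.scanned

exact, approx = PerFieldWavefront(), PerObjectWavefront()
for wf in (exact, approx):
    wf.scan("A", "f0")          # collector has scanned only A.f0 so far

# Mutator now stores a pointer into A.f1, which the collector has not reached.
print(exact.behind("A", "f1"))   # False: the collector will still see the store
print(approx.behind("A", "f1"))  # True: must be treated as already scanned
```

The per-object answer forces the store to be recorded even though the collector would have found it anyway, which is why the coarser wavefront retains more garbage while needing less state.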
Choice: Record on Link or Unlink
• Record on Link: more synchronization; more garbage collected
• Record on Unlink: less synchronization; less garbage collected
Combined Choices
(Figure: heap diagrams of objects A and B for each combination of Per-Object WF / Per-Field WF with Record on Link / Record on Unlink.)
Combined Choices Per Object
(Figure: grid over objects A and B. Wavefront choices: Per-Obj A / Per-Obj B, Per-Obj A / Per-Field B, Per-Field A / Per-Obj B, Per-Field A / Per-Field B. Recording choices: Rec. Link A / Rec. Link B, Rec. Link A / Unlink B, Rec. Unlink A / Rec. Link B, Rec. Unlink A / Rec. Unlink B.)
Correctness
• Transformations = Proof Steps
• START WITH A CORRECT ALGORITHM: APEX (U, U, U, U, {})
• RETAIN LESS GARBAGE (APEX end of the diagram)
• Algorithms in between, in diagram order: STEELE, DIJKSTRA (stacks,U,{},U,{}), STEELE-YC, STEELE-D, STEELE-D-YC, STEELE-BC, DIJKSTRA-OLD, DIJKSTRA-YC, DIJKSTRA-BC, HYBRID-YC (stacks,A,{},{},{}), STEELE-D-BC
• RETAIN MORE GARBAGE: YUASA (stacks, A, {}, {}, U)
Relative Precision
• Intuition: an algorithm is more precise than another if it collects more garbage
• An algorithm that is less precise (more conservative) than a correct algorithm is guaranteed to be correct
• Should be a reference point for practical comparisons
  • no ad-hoc methods
• Hard to do manually: need a tool to provide insights
• Finding the “right” definition was harder than proving safety, yet simpler than “relative concurrency”
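One simple way to write down the intuition, and why it carries correctness from a precise algorithm to a more conservative one, is the following. The notation is assumed for illustration and is not the formal definition from the talk: $M_X(\sigma)$ is the set of objects algorithm $X$ retains on execution $\sigma$, and $\mathit{Live}(\sigma)$ is the set of objects still reachable at the end of $\sigma$.

```latex
\begin{align*}
X \text{ is at least as precise as } Y
  &\iff \forall \sigma.\; M_X(\sigma) \subseteq M_Y(\sigma) \\
X \text{ is correct}
  &\iff \forall \sigma.\; \mathit{Live}(\sigma) \subseteq M_X(\sigma) \\
\text{hence, for all } \sigma:\quad
  \mathit{Live}(\sigma) &\subseteq M_X(\sigma) \subseteq M_Y(\sigma)
  \quad\Longrightarrow\quad Y \text{ is correct.}
\end{align*}
```

Comparing two algorithms on the "same" execution is itself delicate, since different barriers change what the mutator and collector do, which is presumably part of why finding the right definition was hard.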
Precision
• MORE PRECISE: APEX (U, U, U, U, {})
• Algorithms in between, in diagram order: STEELE, DIJKSTRA (stacks,U,{},U,{}), STEELE-YC, STEELE-D, STEELE-D-YC, STEELE-BC, DIJKSTRA-OLD, DIJKSTRA-YC, DIJKSTRA-BC, HYBRID-YC (stacks,A,{},{},{}), STEELE-D-BC
• LESS PRECISE: YUASA (stacks, A, {}, {}, U)
Conclusions
• Systematic exploration of an algorithm space
• Useful new algorithms
• Formal definition of relative precision between algorithms
• A first step towards automatic derivation of concurrent garbage collectors