This paper presents a method using generational garbage collection to organize data for improved cache locality. It discusses copying GC, generational GC, real-time profiling, affinity graphs, and a cache-conscious copying algorithm.
Using Generational Garbage Collection To Implement Cache-conscious Data Placement Trishul M. Chilimbi & James R. Larus Presenter: ראובן ביק
Introduction • The cost of main memory access is increasing • Goal: improve cache locality • The paper introduces a technique that uses a (copying) generational GC to reorganize data, so that objects with high temporal affinity are placed next to each other and are therefore likely to reside in the same cache block
Contents • background • copying GC • generational GC • the method • profiling instrumentation • affinity graph • algorithm steps • results & conclusions
Copying GC • Two memory areas: FROMSPACE and TOSPACE • When FROMSPACE is full, the collector copies all live objects from FROMSPACE to TOSPACE, and the two areas then swap roles
Copying GC (Cheney's algorithm) • Breadth-first scan of the live object graph • One continuous scan of TOSPACE
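The breadth-first copy described on this slide can be illustrated with a minimal Python sketch. It is not the collector from the paper: the `Obj` class, the `forward` dictionary (standing in for forwarding pointers) and the list used as TOSPACE are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Obj:
    """Hypothetical heap object: a name plus outgoing pointers."""
    name: str
    refs: list = field(default_factory=list)


def cheney_copy(roots):
    """Copy everything reachable from roots; return (new_roots, tospace)."""
    tospace = []   # the end of this list plays the role of the free pointer
    forward = {}   # id(old object) -> its copy (stand-in for forwarding pointers)

    def copy(obj):
        if id(obj) not in forward:
            clone = Obj(obj.name, list(obj.refs))  # refs still point into FROMSPACE
            forward[id(obj)] = clone
            tospace.append(clone)
        return forward[id(obj)]

    new_roots = [copy(r) for r in roots]
    scan = 0       # the unprocessed pointer
    while scan < len(tospace):
        obj = tospace[scan]
        # Copy each child on first sight and redirect the pointer to the copy.
        obj.refs = [copy(child) for child in obj.refs]
        scan += 1
    return new_roots, tospace


a, b, c, d, e = (Obj(n) for n in "abcde")
a.refs, b.refs, c.refs, e.refs = [b, c], [d], [d], [a]   # e is unreachable garbage
new_roots, tospace = cheney_copy([a])
print([o.name for o in tospace])
```

The single `scan` index sweeping over `tospace` is the "one continuous scan": objects are placed in the order they are first reached, which is breadth-first and unrelated to how the program actually accesses them.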
Why Generational GC? • Most objects live a very short time, while a small percentage of them live much longer • Problem: a plain copying collector copies the long-lived objects again at every collection
Generational GC • Segregates objects into multiple areas (generations) by age • Scavenges older generations less frequently • Copies long-surviving objects to older generations
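A toy Python sketch of the promotion policy, under strong simplifying assumptions: liveness is passed in as a set instead of being traced from roots, there are only two generations, and the `PROMOTE_AGE` threshold and the `Heap` class are hypothetical names introduced here.

```python
PROMOTE_AGE = 2  # hypothetical: scavenges an object must survive before promotion


class Heap:
    def __init__(self):
        self.young = {}    # object name -> number of young scavenges survived
        self.old = set()   # names of promoted objects

    def allocate(self, name):
        self.young[name] = 0   # new objects always start in the young generation

    def scavenge_young(self, live):
        """Frequent collection: touches only the young generation."""
        survivors = {}
        for name, age in self.young.items():
            if name not in live:
                continue                 # dead object: simply not copied
            if age + 1 >= PROMOTE_AGE:
                self.old.add(name)       # long survivor: move to the old generation
            else:
                survivors[name] = age + 1
        self.young = survivors

    def scavenge_old(self, live):
        """Infrequent collection of the old generation."""
        self.old &= set(live)


heap = Heap()
heap.allocate("config")
heap.allocate("tmp1")
heap.scavenge_young(live={"config"})   # tmp1 dies young, config survives once
heap.allocate("tmp2")
heap.scavenge_young(live={"config"})   # config survives twice and is promoted
print(heap.young, heap.old)
```

Once `config` is in `old`, further young scavenges no longer copy it, which is the saving this slide describes.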
Real-time data access profiling • Real-time profiling is more effective than an earlier training run • Must be low overhead • Done by a modified compiler • Upon each access, the object's address is written into an object access buffer
Profiling is low-overhead • Implemented at object, not field, granularity • Most object accesses are not lightweight, so the extra buffer write adds relatively little to the cost of each access
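As a rough illustration of object-granularity profiling, the Python sketch below logs one buffer entry per access, no matter which field is read. In the technique described on the slides the compiler emits this instrumentation; here the hypothetical helper `read_field` is called by hand, `id(obj)` stands in for the object address, and the buffer is an unbounded list with no overflow handling.

```python
class AccessBuffer:
    """Hypothetical object access buffer: a log of object 'addresses'."""

    def __init__(self):
        self.entries = []

    def record(self, obj):
        self.entries.append(id(obj))   # id() stands in for the object address


def read_field(buffer, obj, field_name):
    """Instrumented field read: log the object, not the field, then read."""
    buffer.record(obj)
    return getattr(obj, field_name)


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


buf = AccessBuffer()
p, q = Point(1, 2), Point(3, 4)
total = read_field(buf, p, "x") + read_field(buf, p, "y") + read_field(buf, q, "x")
print(total, len(buf.entries))
```

Reading `p.x` and `p.y` yields two identical entries, since only the object identity is kept. That keeps the per-access work to a single buffer write.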
Affinity graph • Built from the object access buffer • Created prior to each scavenge • Separate graph for each generation • Nodes = objects • Edges = temporal affinity between objects
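Affinity graph example • Object access buffer: A D F G D A C C A F D G A C • [Figure, four animation steps: the affinity graph is built incrementally from the buffer; nodes A, D, F and G are added one by one, and edge weights grow from 1 to 2 as the same pairs of objects are accessed close together again]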
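One way to turn an access buffer into weighted affinity edges is a sliding window, sketched below in Python. This is a simplification and not necessarily the exact construction used in the paper: the `window` size and the function name `build_affinity_graph` are assumptions made for illustration.

```python
from collections import Counter, deque


def build_affinity_graph(access_buffer, window=3):
    """Return Counter mapping frozenset({x, y}) -> affinity weight.

    Two distinct objects gain one unit of affinity each time one of them is
    accessed while the other is among the previous (window - 1) accesses.
    """
    edges = Counter()
    recent = deque(maxlen=window - 1)   # the most recent accesses
    for obj in access_buffer:
        for other in set(recent):       # count each nearby object once
            if other != obj:
                edges[frozenset((obj, other))] += 1
        recent.append(obj)
    return edges


small = build_affinity_graph("ADAF")
print(small[frozenset("AD")], small[frozenset("AF")], small[frozenset("DF")])

# The access sequence from the slide can be fed in the same way:
graph = build_affinity_graph("ADFGDACCAFDGAC")
heaviest_pair, weight = graph.most_common(1)[0]
```

Heavier edges mark pairs of objects that are repeatedly used close together in time, which makes them the candidates for sharing a cache block.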
Cache-conscious copying algorithm, Step 1: • From the set of roots, pick the one with the highest-weight affinity edge • Perform a greedy depth-first traversal of the affinity graph starting from this node • While traversing, copy each visited object to TOSPACE
Cache-conscious copying algorithm, Step 2: • Process all objects between the unprocessed and free pointers, using Cheney's algorithm
Cache-conscious copying algorithm, Step 3: • Cleanup: copy any roots not yet present to TOSPACE and process them using Cheney's algorithm
This algorithm is not used in the youngest generation (where new objects are allocated and most of the garbage is generated)
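The three steps can be sketched in Python as follows. This is an illustrative reading of the slides, not the authors' implementation: objects are plain names, `refs` and `affinity` are hypothetical dictionaries, "copying" means appending to a list that models TOSPACE placement order, and the affinity graph is assumed to contain only live objects.

```python
def cache_conscious_copy(roots, refs, affinity):
    """roots: list of names; refs: name -> pointed-to names;
    affinity: name -> {neighbour: weight}. Returns TOSPACE placement order."""
    tospace, copied = [], set()

    def copy(name):
        if name not in copied:
            copied.add(name)
            tospace.append(name)

    def cheney_scan(scan):
        # Process everything between the unprocessed (scan) and free pointers.
        while scan < len(tospace):
            for child in refs.get(tospace[scan], []):
                copy(child)
            scan += 1
        return scan

    # Step 1: greedy depth-first traversal of the affinity graph, starting
    # from the root that has the heaviest affinity edge.
    graph_roots = [r for r in roots if affinity.get(r)]
    if graph_roots:
        start = max(graph_roots, key=lambda r: max(affinity[r].values()))
        stack = [start]
        while stack:
            node = stack.pop()
            if node in copied:
                continue
            copy(node)
            # Push light edges first so the heaviest neighbour is popped next.
            by_weight = sorted(affinity.get(node, {}).items(), key=lambda kv: kv[1])
            stack.extend(nb for nb, _ in by_weight if nb not in copied)

    # Step 2: ordinary Cheney processing of what has been copied so far.
    scan = cheney_scan(0)

    # Step 3: cleanup, copying any remaining roots and processing them too.
    for r in roots:
        copy(r)
    cheney_scan(scan)
    return tospace


refs = {"A": ["B", "C"], "C": ["D"], "R": ["E"], "G": ["A"]}   # G is garbage
affinity = {"A": {"C": 3, "D": 1}, "C": {"A": 3, "D": 2}, "D": {"A": 1, "C": 2}}
order = cache_conscious_copy(["A", "R"], refs, affinity)
print(order)   # ['A', 'C', 'D', 'B', 'R', 'E']
```

In this example A, C and D end up adjacent because of their affinity edges, while a plain breadth-first copy from the same roots would start with A, R, B, C.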
Results • Tested on five programs written in the Cecil language • Run on a Sun machine with 2 GB of memory and a two-level cache, running Solaris
Conclusions • This is an attractive technique: it reduces cache miss rates by 21-42% and improves program performance by 14-37% compared with the commonly used alternative, Cheney's copying algorithm.