A Study of Java Object Demographics
Richard Jones, Chris Ryder
Computing Laboratory, University of Kent
ISMM 2008, Tucson, AZ
Overview
• Motivation & Contribution
• Object demographics examples
• Data capture
• Clustering
• Program inputs
• Calling context
• Related work
Tomorrow Never Dies? Exploiting program behaviour
• Object segregation
  • Generations
  • Older-First
  • Immortal region
  • Large object areas
  • Idea: Segregate objects by age, size, type, mortality, etc. Collect regions under different policies and mechanisms.
• Choice of GC
  • Select the best GC for the application a priori.
  • Hot-swap running GC.
  • Idea: Different applications have different demographics. Respond to phase changes.
Dr. No ‘one size fits all’
• Most systems manage objects uniformly
  • E.g. allocate all objects in a nursery and collect all nursery objects at the same time, promoting to the same older generation.
• Pre-tenuring GC uses a very simple classification
  • E.g. short-lived, long-lived, immortal.
• Contributions
  • A detailed study of Java object demographics reveals
    • A richer landscape than short/long/immortal.
    • Distinct behaviour of application, library and JVM objects.
    • Clusters of allocation sites, stable across program inputs.
    • A small number of clusters dominate.
    • Context is an important predictor for library allocation.
The Living Daylights
_213_javac, speed 100
[Figure: lifespan plot with axes labelled Age and ToD; the region where ToD < Age is marked "No go area".]
• Compiles JavaLex scanner 4 times.
• Allocates [char] for a GNU classpath internal String constructor.
• 6% of total allocation.
• 85% of these objects are very short-lived.
• A few are immortal.
• Some survive to the end of the phase.
• A few are long-lived.
Die Another Day
• DaCapo hsqldb, default input.
• 4 sites
• 17% volume
• 95% space rental (volume × lifetime)
• Scarcely any objects are very short-lived.
Live And Let Die
• DaCapo fop, default input
• 18 sites
• 19% volume
• 8.29% short-lived
• 9.27% immortal
• 16% space rental
For Your Eyes Only
• MemTrace compiles method to…
  • Record allocation sites.
  • Modify allocation routines.
  • Tag object header with site & position in calling context tree.
  • Emit allocation record.
  • Benefit: same framework as for method specialisation [ISMM07].
• MemTrace profiles using…
  • Baseline compiler: focus on application objects.
  • Forced full collections (64K granularity).
  • GCspy framework to log death events.
  • Exaggerates lifetimes of short-lived objects.
Casino Royale
• Aim
  • Characterise lifetimes of objects allocated by a site.
  • Identify sites with similar lifetimes.
• We call the cumulative frequency curve of a site's object lifetimes the lifetime distribution function (ldf) of the site.
• Expect collaborating sites to have similar ldfs.
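A minimal sketch of an ldf as an empirical cumulative frequency curve. The function name, the sample lifetimes and their time unit are hypothetical and are not taken from the talk.

```python
from bisect import bisect_right


def lifetime_distribution(lifetimes):
    """Return F, where F(t) is the fraction of objects with lifetime <= t."""
    ordered = sorted(lifetimes)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one lifetime")

    def ldf(t):
        # bisect_right counts how many recorded lifetimes are <= t.
        return bisect_right(ordered, t) / n

    return ldf


# Hypothetical lifetimes for the objects of one allocation site
# (arbitrary time units; the unit is an assumption).
site_a = [10, 12, 15, 20, 5000]
ldf_a = lifetime_distribution(site_a)
```

Here `ldf_a(15)` is the share of the site's objects that have died by time 15; a site dominated by short-lived objects has a curve that rises steeply near zero.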
From Russia With Love
• Compare ldfs statistically for some confidence n%
• Kolmogorov-Smirnov Two Sample test
  • D = the maximum difference between 2 frequency distributions Ei(t)
  • p(D is significant) < n?
• Benefit: non-parametric, distribution-free, cheap.
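The D statistic of the two-sample Kolmogorov-Smirnov test can be sketched as below. This computes only D, not its significance, and the function name and sample data are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest vertical gap between the two empirical cumulative curves."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # Both curves are step functions that only change at observed values,
    # so it is enough to compare them at those points.
    for t in set(a) | set(b):
        fa = bisect_right(a, t) / len(a)
        fb = bisect_right(b, t) / len(b)
        d = max(d, abs(fa - fb))
    return d


# Hypothetical lifetimes from two allocation sites.
d_same = ks_statistic([1, 2, 3], [1, 2, 3])
d_shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
d_disjoint = ks_statistic([1, 2, 3], [10, 11, 12])
```

D ranges from 0 (identical curves) to 1 (no overlap). To turn D into a significance decision, a library routine such as `scipy.stats.ks_2samp` is one option.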
Thunderball
• Greedy, gravitational clustering
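The slides do not spell out the clustering algorithm, so the following is only one possible greedy scheme and should not be read as the authors' method. Assumptions: sites are visited heaviest first (most objects), a site joins the nearest existing cluster whose pooled lifetimes are within a KS distance `threshold`, and otherwise it seeds a new cluster. All names and data are hypothetical.

```python
from bisect import bisect_right


def ks_distance(a, b):
    """Maximum gap between the empirical cumulative curves of a and b."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, t) / len(a) - bisect_right(b, t) / len(b))
        for t in set(a) | set(b)
    )


def greedy_cluster(sites, threshold):
    """sites maps a site name to a non-empty list of object lifetimes."""
    # Assumption: heavier sites are placed first and so seed the clusters.
    order = sorted(sites, key=lambda s: len(sites[s]), reverse=True)
    clusters = []
    for name in order:
        best = None
        for cluster in clusters:
            d = ks_distance(sites[name], cluster["pooled"])
            if d <= threshold and (best is None or d < best[0]):
                best = (d, cluster)
        if best is None:
            clusters.append({"members": [name], "pooled": list(sites[name])})
        else:
            # The absorbed site adds its lifetimes to the cluster's pool.
            best[1]["members"].append(name)
            best[1]["pooled"].extend(sites[name])
    return [c["members"] for c in clusters]


sites = {
    "short_a": [1, 2, 3, 4, 5, 6],
    "short_b": [2, 3, 4, 5],
    "immortal": [1000, 1001, 1002],
}
clusters = greedy_cluster(sites, threshold=0.3)
```

With this data the two short-lived sites end up together and the long-lived site forms its own cluster. The result depends on visiting order and on the threshold, which is typical of greedy schemes.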
DaCapo: all allocation
• The immortal cluster is always cluster 0.
DaCapo: application packages
• The immortal cluster is always cluster 0.
You Only Live Twice
• Does an allocation site generate the same lifetime behaviour regardless of input?
• Do allocation sites share the same cluster from one input to another?
  • i.e. do they continue to behave in the same way as each other?
• Compare cluster membership with Adjusted Rand Index
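The Adjusted Rand Index compares two clusterings of the same items while ignoring how the clusters are labelled. A minimal sketch, assuming each allocation site has a cluster id under input A and under input B; the function name and the label lists are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_x, labels_y):
    """ARI of two clusterings given as parallel lists of cluster labels."""
    if len(labels_x) != len(labels_y):
        raise ValueError("label lists must have the same length")
    n = len(labels_x)
    if n < 2:
        return 1.0  # no pairs to disagree on
    # Count pairs of items that share a cluster in both, in x, and in y.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_x, labels_y)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_x).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_y).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every item in its own cluster
    return (sum_cells - expected) / (maximum - expected)


# Hypothetical cluster ids for four sites under two program inputs.
same_grouping = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed_grouping = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

A value of 1 means the sites group identically under both inputs, values near 0 mean agreement no better than chance, and negative values mean less agreement than chance.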
The World Is Not Enough?
• Earlier studies
  • Java: site is sufficient [Blackburn et al, OOPSLA01]
  • C: more context required [Zorn & Seidl, ASPLOS98]
• Calling context
  • <site+method0, method1, method2, …>
  • Increasing depth of context splits an ldf into 1 or more.
• Compare the variance of site ldfs
  • Variance of program = weighted sum of the variances of its ldfs
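The weighted-sum idea can be sketched as follows. Two assumptions are made here that the slide does not state: the variance of an ldf is taken as the population variance of its lifetimes, and each ldf is weighted by its share of objects. All names and data are hypothetical.

```python
from statistics import pvariance


def program_variance(groups):
    """groups maps a context (site, or site plus callers) to its lifetimes."""
    total = sum(len(lifetimes) for lifetimes in groups.values())
    # Assumption: each group's weight is its fraction of all objects.
    return sum(
        len(lifetimes) / total * pvariance(lifetimes)
        for lifetimes in groups.values()
    )


# Depth 0: one site with mixed behaviour.
by_site = {"site1": [1, 1, 100, 100]}
# Depth 1: the same objects, split by (hypothetical) caller.
by_context = {
    ("site1", "callerA"): [1, 1],
    ("site1", "callerB"): [100, 100],
}

variance_site = program_variance(by_site)
variance_context = program_variance(by_context)
```

In this example the extra level of context separates short-lived from long-lived objects, so the weighted variance drops from 2450.25 to 0. When added context does not separate behaviours, the weighted variance stays close to its site-only value.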
Context (2)
[Figure: variance as a multiple of depth = All, plotted against context; series: Application, Jikes RVM, Library.]
A View To A Kill
• Related work: choice of GC
  • Fitzgerald and Tarditi [ISMM00].
  • Hot-swapping: Printezis [JVM01]; Soman, Krintz, Bacon [ISMM04]; Singer, Brown & Watson [ISMM07].
  • Thomas [Inf Proc Letters '95] tailors GC to the program.
• Demographics
  • Dieckmann and Hölzle [ECOOP98] focus on reference densities, proportion of arrays, etc.
  • DaCapo [OOPSLA06] characterise benchmarks by heap-related metrics.
  • Pretenuring: Cheng, Harper, Lee [PLDI98]; Harris [ISMM00]; Blackburn et al [TOPLAS07]; Marion, Jones, Ryder [ISMM07].
  • Merlin [SIGMETRICS02].
Conclusions
• No one size of collector fits all.
• Programs exhibit only a few distinct object lifetime distributions.
  • These are richer than short/long/immortal.
  • A very small number of clusters dominate.
• Clusterings are stable across inputs.
• Calling context is important for libraries.
http://www.cs.kent.ac.uk/projects/gc/demographics
Questions?