
A Study of Java Object Demographics




  1. A Study of Java Object Demographics. Richard Jones and Chris Ryder, Computing Laboratory, University of Kent. ISMM 2008, Tucson, AZ

  2. Overview • Motivation & Contribution • Object demographics examples • Data capture • Clustering • Program inputs • Calling context • Related work

  3. Tomorrow Never Dies? Exploiting program behaviour • Object segregation: generations, older-first, immortal region, large object areas. Idea: segregate objects by age, size, type, mortality, etc. Collect regions under different policies and mechanisms. • Choice of GC: select the best GC for the application a priori, or hot-swap the running GC. Idea: different applications have different demographics. Respond to phase changes.

  4. Dr. No ‘one size fits all’ • Most systems manage objects uniformly, e.g. allocate all objects in a nursery and collect all nursery objects at the same time, promoting to the same older generation. • Pre-tenuring GC uses a very simple classification, e.g. short-lived, long-lived, immortal. • Contributions: a detailed study of Java object demographics reveals • A richer landscape than short/long/immortal. • Distinct behaviour of application, library and JVM objects. • Clusters of allocation sites, stable across program inputs. • A small number of clusters dominate. • Context is an important predictor for library allocation.

  5. The Living Daylights • _213_javac, speed 100. • Compiles JavaLex scanner 4 times. • Allocates [char] for a GNU classpath internal String constructor. • 6% of total allocation. • 85% of these objects are very short-lived. A few are immortal. Some survive to the end of the phase. A few are long-lived. • [Figure: lifespan plot; axis labels Age and ToD; the region ToD < Age is marked "No go area".]

  6. Die Another Day • DaCapo hsqldb, default input. • 4 sites • 17% volume • 95% space rental (volume × lifetime) • Scarcely any objects are very short-lived.
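Space rental weights each object's size by how long it stays live, which is how a handful of sites with a modest share of allocation volume can account for most of the rental. A minimal Python sketch, assuming each allocation record is a hypothetical `(site, size_bytes, lifetime)` tuple; the record layout, the function name `shares` and the sample numbers are illustrative and not taken from the study:

```python
from collections import defaultdict

def shares(records):
    """Return {site: (volume share, space-rental share)}.

    records: iterable of (site, size_bytes, lifetime) tuples (hypothetical layout).
    The space rental of one object is taken to be size * lifetime.
    """
    volume = defaultdict(int)
    rental = defaultdict(int)
    for site, size, lifetime in records:
        volume[site] += size
        rental[site] += size * lifetime
    total_volume = sum(volume.values())
    total_rental = sum(rental.values())
    return {
        site: (volume[site] / total_volume, rental[site] / total_rental)
        for site in volume
    }

# Illustrative data only: site "A" allocates little, but its objects live long.
records = [("A", 10, 1000), ("B", 90, 10)]
result = shares(records)
```

Here site "A" holds a tenth of the volume but most of the space rental, the same shape of result the hsqldb slide reports.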

  7. Live And Let Die • DaCapo fop, default input. • 18 sites • 19% volume • 8.29% short-lived • 9.27% immortal • 16% space rental

  8. For Your Eyes Only • MemTrace compiles method to… • Record allocation sites. • Modify allocation routines. • Tag object header with site & position in calling context tree. • Emit allocation record. • Benefit: same framework as for method specialisation [ISMM07]. • MemTrace profiles using… • Baseline compiler: focus on application objects. • Forced full collections (64K granularity). • GCspy framework to log death events. • Exaggerates lifetimes of short-lived objects.

  9. Casino Royale • Aim • Characterise lifetimes of objects allocated by a site. • Identify sites with similar lifetimes. • We call the cumulative frequency curve the lifetime distribution function (ldf) of the site. • Expect collaborating sites to have similar ldfs.
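The ldf of a site can be read as an empirical cumulative distribution over the lifetimes of the objects that site allocated. A minimal Python sketch, assuming lifetimes are plain numbers and that the curve counts objects rather than bytes (the slides do not say which); `make_ldf` is a hypothetical helper name:

```python
from bisect import bisect_right

def make_ldf(lifetimes):
    """Return F, where F(t) is the fraction of the site's objects with lifetime <= t."""
    ordered = sorted(lifetimes)
    n = len(ordered)
    if n == 0:
        raise ValueError("site has no recorded lifetimes")

    def ldf(t):
        # bisect_right counts how many recorded lifetimes are <= t.
        return bisect_right(ordered, t) / n

    return ldf

# Illustrative lifetimes for one site: mostly short-lived, one long-lived object.
site_ldf = make_ldf([1, 1, 2, 8, 100])
```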

  10. From Russia With Love • Compare ldfs statistically at some confidence level n%. • Kolmogorov-Smirnov two-sample test. • D = the maximum difference between 2 frequency distributions Ei(t). • Is D significant at the n% level? • Benefit: non-parametric, distribution-free, cheap.
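The two-sample Kolmogorov-Smirnov statistic is the largest vertical gap between two empirical cumulative distributions. The sketch below computes D directly from two lists of lifetimes and compares it with the standard large-sample critical value c(α)·sqrt((n+m)/(n·m)), where c(α) = sqrt(−ln(α/2)/2). The function names and the choice of the asymptotic critical value are assumptions for illustration; the slides do not say how significance was evaluated.

```python
import math

def ks_statistic(a, b):
    """Maximum absolute gap between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        t = min(a[i], b[j])
        # Step both CDFs past every observation equal to t.
        while i < len(a) and a[i] == t:
            i += 1
        while j < len(b) and b[j] == t:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    # Once one sample is exhausted the gap can only shrink, so d is final.
    return d

def differ(a, b, alpha=0.05):
    """True if the two samples differ at level alpha (large-sample approximation)."""
    n, m = len(a), len(b)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(a, b) > critical
```

Two sites would be treated as candidates for the same cluster when `differ` returns False for their lifetime samples.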

  11. Thunderball • Greedy, gravitational clustering

  15. DaCapo: all allocation • Immortal cluster: always cluster 0

  16. DaCapo: application packages • Immortal cluster: always cluster 0

  17. You Only Live Twice • Does an allocation site generate the same lifetime behaviour regardless of input? • Do allocation sites share the same cluster from one input to another? • i.e. continue to behave in the same way as each other? • Compare cluster membership with the Adjusted Rand Index.
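The Adjusted Rand Index scores the agreement between two partitions of the same items, corrected for chance: 1 means identical groupings (cluster labels may be renamed), and values near 0 mean agreement no better than random. A minimal Python sketch using the standard pair-counting formula; here the two label lists stand for the cluster assigned to each allocation site under two program inputs, in the same site order, which is an assumed encoding:

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index of two clusterings given as parallel label lists."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same sites")
    n = len(labels_a)
    if n < 2:
        return 1.0  # fewer than two sites: no pairs to disagree on
    # Pairs of sites placed together in both, in A only, and in B only.
    sum_ij = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial and identical
    return (sum_ij - expected) / (max_index - expected)

# Same grouping with renamed clusters, then a grouping that cuts across it.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```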

  18. The World Is Not Enough? • Earlier studies • Java: site is sufficient [Blackburn et al, OOPSLA01] • C: more context required [Zorn & Seidl, ASPLOS98] • Calling context • <site+method0, method1, method2, …> • Increasing depth of context splits an ldf into 1 or more. • Compare the variance of site ldfs • Variance of program = weighted sum of the variances of its ldfs
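One way to read the variance comparison: group objects by calling context truncated to a given depth, take the variance of lifetimes within each group, and sum those variances weighted by each group's share of objects. If deeper context separates objects with different behaviour, the weighted sum falls. The sketch below makes those assumptions explicit; the weighting by object count, the record layout and the method names in the sample data are hypothetical, not taken from the study.

```python
from collections import defaultdict
from statistics import pvariance

def program_variance(records, depth):
    """Weighted sum of per-context lifetime variances.

    records: iterable of (context, lifetime), where context is a tuple
    (site, caller1, caller2, ...). Depth 0 groups by allocation site alone.
    """
    groups = defaultdict(list)
    for context, lifetime in records:
        groups[context[: depth + 1]].append(lifetime)
    total = sum(len(g) for g in groups.values())
    # Weight each group's variance by its share of all objects (an assumption).
    return sum(len(g) / total * pvariance(g) for g in groups.values())

# Illustrative data: one library site called from two places that behave differently.
records = [
    (("String.<init>", "Parser.parse"), 1),
    (("String.<init>", "Parser.parse"), 3),
    (("String.<init>", "Cache.put"), 101),
    (("String.<init>", "Cache.put"), 103),
]
site_only = program_variance(records, depth=0)
one_caller = program_variance(records, depth=1)
```

In this toy data the site alone looks highly variable, while one level of caller context splits it into two tight groups.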

  19. Context (2) • [Figure: x-axis Context; y-axis Variance as a multiple of depth = (value lost in extraction); series: All, Application, Jikes RVM, Library.]

  20. A View To A Kill • Related work: choice of GC • Fitzgerald and Tarditi [ISMM00]. • Hot-swapping: Printezis [JVM01]; Soman, Krintz, Bacon [ISMM04]; Singer, Brown & Watson [ISMM07]. • Thomas [Inf Proc Letters '95] tailors GC to the program. • Demographics • Dieckman and Holzle [ECOOP98] focus on reference densities, proportion of arrays, etc. • DaCapo [OOPSLA06] characterise benchmarks by heap-related metrics. • Pretenuring: Cheng, Harper, Lee [PLDI98]; Harris [ISMM00]; Blackburn et al [TOPLAS07]; Marion, Jones, Ryder [ISMM07]. • Merlin [SIGMETRICS02].

  21. Conclusions • No one size of collector fits all. • Programs exhibit only a few distinct object lifetime distributions. • These are richer than short/long/immortal. • A very small number of clusters dominate. • Clusterings are stable across inputs. • Calling context is important for libraries. http://www.cs.kent.ac.uk/projects/gc/demographics

  22. Questions?
