No Bit Left Behind: The Limits of Heap Data Compression • Jennifer B. Sartor*, Martin Hirzel†, Kathryn S. McKinley* • *U Texas at Austin, †IBM Watson • ISMM 2008
Current State • Managed languages ubiquitous • Embedded devices • Multicore • Need memory efficiency!
Memory Efficiency of Managed Languages • COST • 8-94% information content in heap in 37 benchmarks [Mitchell & Sevitsky, OOPSLA 07] • Boxed objects • Trailing zeros in arrays • Redundant objects • Extra bit-width • Data structure backbones • bzip2 86% • OPPORTUNITY • Memory layout abstraction • (Location + size) != identity
Related Work
Limit Study • Quantitatively compare heap data compression techniques • Surveyed literature • Savings equations • Methodology for evaluation • Apples-to-apples comparison • Future work: implementation • Hybrid techniques 58% • Findings: array & hybrid compression
Hybrid Array Compression • Redundancy • Equal array sharing
Equal Object Sharing • Marinov & O’Callahan, OOPSLA 03; Appel & Goncalves, Tech Report 93 • Two objects are equal if both • Same class & all fields have the same value • Strictly-equal: pointer fields identical • Deep: the objects' pointer targets are equal • JVM stores only 1 copy in a hashtable 14% • Class C, N objects, D distinct; save:
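The strictly-equal case above can be sketched in a few lines. The savings equation itself is not present in this transcript, so the model used here, (N - D) × object size per class, is an assumption that merely matches the "N objects, D distinct" wording; `sharing_savings` and `size_of` are hypothetical names, and hashtable overhead is ignored.

```python
from collections import Counter

def sharing_savings(objects, size_of):
    """Estimate bytes saved if strictly-equal objects were stored once.

    objects: iterable of (class_name, field_values) pairs, where
             field_values is a tuple; two objects count as equal when
             both the class and every field value match.
    size_of: hypothetical helper mapping a class name to its size in bytes.
    """
    per_class_total = Counter()   # N for each class
    distinct = set()              # one entry per distinct (class, fields)
    for cls, fields in objects:
        per_class_total[cls] += 1
        distinct.add((cls, fields))
    per_class_distinct = Counter(cls for cls, _ in distinct)  # D for each class
    # Assumed model: each class saves (N - D) * object size.
    return sum((n - per_class_distinct[cls]) * size_of(cls)
               for cls, n in per_class_total.items())

heap = [
    ("Point", (0, 0)),
    ("Point", (0, 0)),
    ("Point", (1, 2)),
    ("Label", ("ok",)),
    ("Label", ("ok",)),
]
saved = sharing_savings(heap, size_of=lambda cls: 16)
```

With a flat 16-byte object size, the toy heap has one redundant Point and one redundant Label. Deep equality would additionally need to compare pointer targets recursively, which this sketch does not attempt.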
Hybrid Array Compression • Redundancy • Equal array sharing • Value set indirection (dictionary)
Value Set Indirection & Caching • Cooprider & Regehr; Titzer et al., PLDI 07 • For object fields or array elements with a large range of values • Dictionary (or cache) of the 256 most frequent values; each instance stores a small 1-byte index • If > 256 values: 255 in dictionary, the 256th index says to store the rest (M) in a hashtable keyed by object ID 14%
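The dictionary scheme above can be illustrated with a minimal sketch. It assumes values are hashable and uses the element position as a stand-in for the object ID; `encode_with_dictionary` and `decode` are hypothetical names, not taken from the cited work.

```python
from collections import Counter

def encode_with_dictionary(values, dict_size=256):
    """Replace each value with a 1-byte index into a dictionary of frequent values.

    Returns (dictionary, indices, overflow). If there are more distinct
    values than dict_size, the last index is reserved as an escape marker
    and the real value is kept in `overflow`, keyed by element position
    (an assumption standing in for the object ID). dict_size must be <= 256.
    """
    counts = Counter(values)
    if len(counts) <= dict_size:
        dictionary = [v for v, _ in counts.most_common()]
        escape = None
    else:
        dictionary = [v for v, _ in counts.most_common(dict_size - 1)]
        escape = dict_size - 1
    index_of = {v: i for i, v in enumerate(dictionary)}
    indices = bytearray()
    overflow = {}
    for pos, v in enumerate(values):
        if v in index_of:
            indices.append(index_of[v])
        else:
            indices.append(escape)   # escape marker: look in overflow
            overflow[pos] = v
    return dictionary, indices, overflow

def decode(dictionary, indices, overflow):
    """Rebuild the original values from the three parts."""
    return [overflow[pos] if pos in overflow else dictionary[i]
            for pos, i in enumerate(indices)]

values = [1000000, 7, 1000000, 7, 7, 123456]
small = encode_with_dictionary(values, dict_size=2)  # forces the escape path
```

Passing a small `dict_size` makes it easy to exercise the overflow path without generating more than 256 distinct values.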
Hybrid Array Compression 2 • Remove zeros • Trim trailing zeros • Bit width reduce • Zero compress
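Two of the steps listed above, trimming trailing zeros and reducing bit width, can be sketched as follows. The slides do not spell out the exact rules, so this assumes signed integer elements and byte-granular widths of 1, 2, 4, or 8; all function names are hypothetical.

```python
def trim_trailing_zeros(values):
    """Drop zero elements at the end of an array.

    The original length would have to be kept separately to restore them.
    """
    end = len(values)
    while end > 0 and values[end - 1] == 0:
        end -= 1
    return values[:end]

def min_bytes_per_element(values):
    """Smallest of 1, 2, 4, 8 bytes that holds every element as a signed integer."""
    for width in (1, 2, 4, 8):
        lo = -(1 << (8 * width - 1))
        hi = (1 << (8 * width - 1)) - 1
        if all(lo <= v <= hi for v in values):
            return width
    raise ValueError("value does not fit in 64 bits")

def estimated_bytes(values, declared_width=4):
    """Estimated payload size after trimming and narrowing (headers ignored)."""
    trimmed = trim_trailing_zeros(values)
    width = min(declared_width, min_bytes_per_element(trimmed))
    return len(trimmed) * width

arr = [3, 100, -2, 0, 7, 0, 0, 0]   # e.g. an int[8] declared with 4-byte elements
compact = estimated_bytes(arr)
```

Note that the interior zero at index 3 survives trimming; only the zero-compression step would remove it.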
Zero-based Object Compression • Chen, et al. OOPSLA 03 • Remove bytes that are entirely zero • Per object bit-map: 1 bit per byte • Store only non-zero bytes • Savings: 45%
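The per-object bitmap idea above can be sketched on raw bytes. This is an illustration under the assumption that an object is just a byte string; the bit ordering inside the bitmap and the function names are choices made here, not details from Chen et al.

```python
def zero_compress(data: bytes):
    """Return (bitmap, payload).

    bitmap has 1 bit per input byte (1 = non-zero byte present);
    payload holds only the non-zero bytes, in order.
    """
    bitmap = bytearray((len(data) + 7) // 8)
    payload = bytearray()
    for i, b in enumerate(data):
        if b != 0:
            bitmap[i // 8] |= 1 << (i % 8)
            payload.append(b)
    return bytes(bitmap), bytes(payload)

def zero_decompress(bitmap: bytes, payload: bytes, length: int) -> bytes:
    """Rebuild the original bytes; `length` is the uncompressed size."""
    out = bytearray(length)          # starts out all zero
    it = iter(payload)
    for i in range(length):
        if bitmap[i // 8] & (1 << (i % 8)):
            out[i] = next(it)
    return bytes(out)

def savings(data: bytes) -> int:
    """Bytes saved: original size minus bitmap and non-zero payload."""
    bitmap, payload = zero_compress(data)
    return len(data) - (len(bitmap) + len(payload))

# Three 4-byte little-endian fields holding 5, 0 and 258.
obj = (5).to_bytes(4, "little") + (0).to_bytes(4, "little") + (258).to_bytes(4, "little")
saved = savings(obj)
```

The bitmap is a fixed cost of one bit per byte, so an object with few zero bytes can end up larger after this transformation.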
Methodology • [Diagram: program run, garbage collection snapshot, heap dump series, analysis representation, Model 1 … Model n, limit savings]
Experimental Details • Jikes Research Virtual Machine • Java-in-Java • DaCapo benchmarks + pseudojbb • 20-25 heap snapshots per benchmark • MarkSweep with 2x min heap • Analysis • Per class • Objects and arrays separated • JVM+app vs application (separated in paper) • Per heap snapshot, and over all snapshots
[Charts: Value Indirection & Cache • Deep Equal Sharing • Zero Compression • Hybrid Compression]
Stability of Savings • [Chart: fop, snapshots over time]
Conclusions • Limit study compares heap data compression techniques apples-to-apples • Potential to reduce memory inefficiencies in managed languages • Arrays • Hybrids • Future: save space • Challenge: efficient detection & recovery • Thank you!