Automatic Heap Sizing: Taking Real Memory into Account. Ting Yang, Emery Berger, Matthew Hertz, Scott Kaplan¶, Eliot Moss. Department of Computer Science, University of Massachusetts; ¶Department of Computer Science, Amherst College
Automatic Heap Sizing: Taking Real Memory into Account
Ting Yang, Emery Berger, Matthew Hertz, Scott Kaplan¶, Eliot Moss
Department of Computer Science, University of Massachusetts, Amherst MA 01003, {tingy,emery,hertz,moss}@cs.umass.edu
¶Department of Computer Science, Amherst College, Amherst MA 01002, sfkaplan@cs.amherst.edu
Problem & Motivation [Figure: heap size vs. running time; Appel collector, _213_javac, 60MB real memory. Annotations: too small: GC a lot; optimal; too large: page a lot]
Problem & Motivation • Multiprogramming makes it harder: • Amount of available real memory changes • Impossible to select heap size a priori • Strategy: adjust adaptively during execution
Outline • Problem & Motivation • Paging behavior of Garbage Collection • VMM: collecting the information we need • Collector: adjusting heap size adaptively • Experimental results • Conclusion & Future work
GC Paging Behavior • For the strategy to work, we need to relate: • GC algorithm, heap size, and footprint • Analysis methodology: • Obtain reference trace: simulate Jikes RVM under Dynamic SimpleScalar (DSS) • Process the trace with an LRU stack to get the number of faults at all memory sizes • GCs and programs traced: • Mark-Sweep (MS), Semi-Space (SS), and Appel GCs • SPECjvm98, ipsixql, and pseudojbb benchmarks
Fault curve: Relationship of heap size, real memory & page faults [Figure: fault curves labeled by heap size. Regions annotated: extreme paging; substantial paging (“looping” behavior); fits into memory. Callouts: heap size = 240MB, memory = 145MB, # of faults ≈ 1000; time annotations: 50 seconds and 0.5 second]
Relationship between heap size and footprint [Figure: footprint vs. heap size; a “page fault threshold =” label appears without its value] • Our definition of footprint: the amount of memory needed so that the time spent on page faults is lower than a certain percentage of total execution time
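The footprint definition can be sketched as a lookup on a fault curve. This is a minimal illustration, not the authors' implementation: the function name, the list-based fault curve, and the sample numbers are hypothetical, and it assumes a single fixed cost per fault and a known total execution time.

```python
def footprint_pages(fault_curve, fault_cost_s, total_time_s, threshold=0.05):
    """Smallest memory size (in pages) whose page-fault time stays within
    threshold * total_time_s.

    fault_curve[m] is the number of faults incurred with m pages of memory.
    It is assumed to be non-increasing in m, which holds for LRU.
    """
    budget_s = threshold * total_time_s
    for pages, faults in enumerate(fault_curve):
        if faults * fault_cost_s <= budget_s:
            return pages
    # Even the largest measured size exceeds the budget.
    return len(fault_curve)


# Hypothetical fault curve: index = pages of memory, value = number of faults.
curve = [900, 400, 120, 30, 8, 0]
pages_needed = footprint_pages(curve, fault_cost_s=0.005, total_time_s=2.0)
```

With a 5% threshold and 2 seconds of execution, the fault-time budget is 0.1 s, so the first size whose faults cost no more than that is the footprint.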
A Linear Model: Heap Size vs. Footprint • Heap footprint model: • HeapUtil: SS: 0.5; MS: 1 • base: Jikes RVM plus live data size • How the GC can use this model:
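A minimal sketch of such a model, assuming the linear form footprint = HeapUtil × heap size + base that the two parameters suggest. The exact formulas are an assumption here, and the function names and the 20MB base are hypothetical.

```python
def predicted_footprint(heap_size_mb, heap_util, base_mb):
    """Assumed linear model: footprint grows with heap size at slope heap_util."""
    return heap_util * heap_size_mb + base_mb


def heap_size_for(target_footprint_mb, heap_util, base_mb):
    """Invert the assumed model: the heap size whose footprint hits the target."""
    return (target_footprint_mb - base_mb) / heap_util


# Semi-Space collector (HeapUtil = 0.5), hypothetical 20MB base, 60MB available.
ss_heap_mb = heap_size_for(60, heap_util=0.5, base_mb=20)
# Mark-Sweep collector (HeapUtil = 1) under the same hypothetical conditions.
ms_heap_mb = heap_size_for(60, heap_util=1, base_mb=20)
```

Under this assumed form, a semi-space heap can be about twice as large as a mark-sweep heap for the same footprint, because only half of it is in use at a time.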
Strategy Overview • VMM (in OS): • Knows memory allocation / available memory • Needs to track/calculate application footprint • Garbage Collector (in user space): • Has ability to change heap size • Needs info: available memory, footprint • Messages between the two: • Request: mem alloc, footprint • Send: mem alloc, footprint
Approach to Measuring Footprint • Memory reference sequence • LRU Queue: pages in Least Recently Used order • Hit Histogram: associated with each LRU position • Fault Curve [Figure: animation of a reference sequence updating the LRU queue, hit histogram, and fault curve]
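The pipeline on this slide (reference sequence, LRU queue, hit histogram, fault curve) can be sketched with a plain LRU stack simulation. This is an illustrative, unoptimized version with hypothetical function names, not the VMM's implementation.

```python
def lru_hit_histogram(references):
    """Return (hist, cold_misses); hist[d] counts hits at LRU position d (1-based)."""
    stack = []        # most recently used page at index 0
    hist = {}
    cold_misses = 0   # first-ever references, which fault at every memory size
    for page in references:
        if page in stack:
            depth = stack.index(page) + 1
            hist[depth] = hist.get(depth, 0) + 1
            stack.remove(page)
        else:
            cold_misses += 1
        stack.insert(0, page)   # the referenced page becomes most recently used
    return hist, cold_misses


def fault_curve(hist, cold_misses, max_pages):
    """curve[m] = number of faults with m pages of memory under LRU."""
    total_hits = sum(hist.values())
    curve = []
    hits_within = 0
    for m in range(max_pages + 1):
        hits_within += hist.get(m, 0)   # hits at positions <= m stay in memory
        curve.append(cold_misses + total_hits - hits_within)
    return curve


hist, cold = lru_hit_histogram(list("abcab"))
curve = fault_curve(hist, cold, max_pages=3)
```

A reference that hits at LRU position d would be a fault in any memory smaller than d pages, which is why one pass over the trace yields the fault count for every memory size.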
VMM design: SegQ [SKW’99,KMC’02] [Figure: LRU queue divided into hot set, cold set, and evicted set, with a hit histogram over queue positions; labels: strict LRU, CLOCK algorithm, hot/cold boundary, footprint, minor fault (in memory), major fault (on disk)] • Decay histogram periodically • Adaptive control of hot set size
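Periodic decay can be sketched as scaling every histogram bucket, so that old hits count for less than recent ones. The multiplicative form, the 0.5 factor, and the function name are assumptions for illustration; the slide does not specify the decay rule or its schedule.

```python
def decay_histogram(hist, factor=0.5):
    """Scale every bucket in place so that older hits weigh less (assumed rule)."""
    for position in range(len(hist)):
        hist[position] *= factor
    return hist


# Hypothetical hit counts per LRU position.
hits = [8.0, 4.0, 0.0, 2.0]
decay_histogram(hits)
hits[1] += 1   # a hit recorded after the decay counts at full weight
```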
VMM design: SegQ [SKW’99,KMC’02] [Figure: LRU queue divided into hot set, cold set, and evicted set, with the hot/cold boundary and the footprint marked; labels: strict LRU, CLOCK algorithm] • What is the footprint w.r.t. 5%?
Collector Design: • Communicate with VMM after GC • First GC: • Appel, SS: HeapUtil = 0.5 • Following GCs: • Calculate HeapUtil from history
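One way to calculate HeapUtil from history is to take the slope between the two most recent (heap size, footprint) observations. This is a sketch under that assumption; the slide does not say how the history is used, and the function name and sample values are hypothetical.

```python
def estimate_heap_util(history, default=0.5):
    """Estimate HeapUtil as the footprint change per unit of heap size change.

    history is a list of (heap_size, footprint) pairs, one per completed GC,
    in the same unit. With fewer than two usable points, fall back to the
    default (0.5 is the first-GC value the slide gives for Appel and SS).
    """
    if len(history) < 2:
        return default
    (heap_prev, foot_prev), (heap_last, foot_last) = history[-2], history[-1]
    if heap_last == heap_prev:
        return default   # no heap size change, so no slope to measure
    return (foot_last - foot_prev) / (heap_last - heap_prev)


# Hypothetical observations in MB after two GCs.
util = estimate_heap_util([(80, 60), (100, 70)])
```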
Experiments • Experimental setting: Jikes RVM 2.0.3 • Dynamic SimpleScalar: extended with new VMM model • Major fault = 5 million instructions = 5 ms @ 1 Gips • Minor fault = 2000 instructions = 2 µs • Page fault cost threshold = 5%-10% • Histogram collecting cost threshold = 1% • Adapting to fixed memory pressure • Adapting to dynamic memory pressure • Add/remove 15MB real memory after 2 billion instructions
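The fault costs above can be checked with a little arithmetic. The constants come from the slide; the function name and the rule for combining fault time with program time are assumptions made for this sketch.

```python
INSTRUCTIONS_PER_SECOND = 1_000_000_000   # 1 Gips, from the slide
MAJOR_FAULT_INSTRUCTIONS = 5_000_000      # from the slide
MINOR_FAULT_INSTRUCTIONS = 2_000          # from the slide

major_fault_seconds = MAJOR_FAULT_INSTRUCTIONS / INSTRUCTIONS_PER_SECOND
minor_fault_seconds = MINOR_FAULT_INSTRUCTIONS / INSTRUCTIONS_PER_SECOND


def fault_time_fraction(major_faults, minor_faults, program_instructions):
    """Share of total time spent on faults, with time counted in instructions.

    Assumes total time = program instructions + fault-handling instructions.
    """
    fault_instructions = (major_faults * MAJOR_FAULT_INSTRUCTIONS
                          + minor_faults * MINOR_FAULT_INSTRUCTIONS)
    return fault_instructions / (program_instructions + fault_instructions)


# Hypothetical run: 20 major faults during 2 billion program instructions.
fraction = fault_time_fraction(20, 0, 2_000_000_000)
```

In this hypothetical run the 20 major faults cost 100 million instructions, a little under 5% of the total, so the run would just meet a 5% page fault cost threshold.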
[Figure: Appel, _213_javac, 60MB real memory. Annotations: paging a lot; larger heap; fewer GCs; less paging; memory under-utilized; optimal heap]
[Figure: Appel, _213_javac, 60MB real memory. Increase memory: 15MB added at 2 billion instructions]
[Figure: Appel, _213_javac, 60MB real memory. Decrease memory: 15MB removed at 2 billion instructions]
Conclusion: Automatic Heap Sizing • New collector usually picks heap size that: • Maximizes memory utilization (reducing GCs) • While avoiding paging • Linear model works well in practice • Improves performance by up to 8x under pressure • Cost of collecting information is low: around 1% • New collector adapts quickly to steady and to changing real memory allocations • Within 1 or 2 major GCs
Ongoing Work • Implement in real kernel • Extend to more collectors • Adjust during allocation, not just after GC Detailed graphs & tech report: http://www-ali.cs.umass.edu/~tingy/CRAMM