
Reducing Pause Time of Conservative Collectors

Reducing Pause Time of Conservative Collectors. Toshio Endo (National Institute of Informatics) Kenjiro Taura (Univ. of Tokyo). Incremental GC for soft-realtime applications [Steele 75] [Yuasa 90] [Doligez 93]. Target: Multimedia, game etc. Pauses should be <10ms


Presentation Transcript


  1. Reducing Pause Time of Conservative Collectors Toshio Endo (National Institute of Informatics) Kenjiro Taura (Univ. of Tokyo)

  2. Incremental GC for soft-realtime applications [Steele 75] [Yuasa 90] [Doligez 93] • Target: multimedia, games, etc. • Pauses should be <10ms • Collection tasks are divided into small pieces • Success: pauses of <5ms [Cheng 01] • These collectors assume compiler cooperation • Pause reduction for ‘conservative’ GCs is still insufficient

  3. Conservative GC [Boehm et al. 88] • Mark-sweep GC for C/C++ programs • No compiler cooperation (e.g., write barriers) • Mostly parallel GC [Boehm et al. 91]: incremental, conservative • Pauses >100ms are fairly common

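  4. Write barriers in conservative GCs • No fine-grained write barrier inserted by the compiler • Instead, the collector relies on the VM’s write protection, which is coarse-grained: it works at page level and detects only the first update after protection • This restricts collector design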
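
The coarse granularity described on this slide can be illustrated with a toy model. This is only a sketch: the `PageBarrier` class, the 4096-byte `PAGE_SIZE`, and the method names are assumptions made for illustration, and no real memory protection is involved.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (not stated in the slides)


class PageBarrier:
    """Toy model of a write barrier built on page-level write protection."""

    def __init__(self):
        self.protected = set()  # page numbers currently write-protected
        self.dirty = set()      # page numbers known to have been updated
        self.traps = 0          # number of simulated write faults

    def protect(self, page):
        """Write-protect a page and forget that it was dirty."""
        self.protected.add(page)
        self.dirty.discard(page)

    def write(self, addr):
        """Simulate a mutator write to byte address `addr`."""
        page = addr // PAGE_SIZE
        if page in self.protected:
            # Simulated fault: remember the page, then unprotect it
            # so that the write can proceed.
            self.traps += 1
            self.dirty.add(page)
            self.protected.remove(page)
        # Writes to an unprotected page are not observed at all.


barrier = PageBarrier()
barrier.protect(0)
barrier.write(8)     # first update after protection: trapped
barrier.write(16)    # second update to the same page: not trapped
barrier.write(4000)  # still page 0: not trapped
```

In this model the barrier learns only that page 0 was updated, not which words changed or how often, which is why the collector later has to rescan the whole page.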

  5. Incremental mark sweep algorithms • Snapshot-at-beginning and DLG [Yuasa 90] [Doligez 93]: make a (conceptual) heap snapshot before marking; promise short pauses; large space overhead with a VM write barrier • Incremental update [Steele 75] [Dijkstra 78]: maintains consistency after marking; needs a final marking phase before finishing, which can be unboundedly long; the only choice with a VM write barrier

  6. Contributions • Analyze why previous algorithms fail • Propose techniques to bound pauses and guarantee progress • Show a ‘stress-test’ benchmark: iukiller • Demonstrate experimental results: <5ms in applications; <12ms in the stress-test benchmark (constant across all heap sizes) • (This talk omits parallel issues)

  7. Overview of presentation • Mostly parallel GC • Techniques to reduce pause time • Experimental results • Related work • Summary

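  8. Mostly parallel garbage collector (1) • GC cycle: start GC, write-protect the heap, incremental mark (interleaved with the user program), final marking, incremental sweep (interleaved with the user program), end GC • On a write fault, the trap handler remembers the address of the dirty (= updated) page and unprotects it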

  9. Mostly parallel garbage collector (2) • The second update to a page is not trapped • r is marked in the final phase • Final marking is therefore needed [Figure: objects p, q, and r shown across two successive writes]

  10. Final marking • Scan all dirty pages + roots • Mark all unmarked objects reachable from the scanned region • The amount of work is unbounded: it depends on the # of dirty pages and on the objects reachable from a dirty page • This makes pauses >100ms [Figure: roots and heap]
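
Why the final-marking pause is unbounded can be seen in a small simulation. This is a sketch under stated assumptions: the heap is modeled as a dictionary from object names to lists of referenced names, `page_of` maps each object to a page number, and `final_marking` is a hypothetical helper rather than the collector's actual routine.

```python
def final_marking(heap, page_of, roots, dirty_pages, marked):
    """Rescan roots and dirty pages, then mark everything reachable.

    Returns the number of newly marked objects, used here as a
    stand-in for the length of the pause.
    """
    stack = list(roots)
    # Already-marked objects on dirty pages may have had fields updated,
    # so the objects they reference must be reconsidered.
    for obj in list(marked):
        if page_of[obj] in dirty_pages:
            stack.extend(heap[obj])
    newly_marked = 0
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        newly_marked += 1
        stack.extend(heap[obj])
    return newly_marked


# Hypothetical heap: marked object "a" on dirty page 0 points to a long
# chain of unmarked objects that live on other, clean pages.
CHAIN = 1000
heap = {"a": ["c0"]}
page_of = {"a": 0}
for i in range(CHAIN):
    heap[f"c{i}"] = [f"c{i + 1}"] if i + 1 < CHAIN else []
    page_of[f"c{i}"] = 1 + i // 64
marked = {"a"}
work = final_marking(heap, page_of, roots=["a"], dirty_pages={0}, marked=marked)
```

A single dirty page forces the whole chain to be marked during the pause, so limiting the number of dirty pages does not by itself limit the work.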

  11. Overview of presentation • Mostly parallel garbage collector • Techniques to reduce pause time • Experimental results • Related work • Summary

  12. Goal of our collector • Bound pause time (< constant) • Mutator utilization is important, but we focus on pauses • Guarantee progress of collection • Combine two techniques: bounding dirty pages (BD) and retrying incremental marking (RI)

  13. Bounding dirty pages (1) • The basic collector produces many dirty pages • Keep the # of dirty pages < a given limit • If the limit is exceeded, choose a dirty page, then re-protect, scan, and clean it • Good: reduces the work left for final marking • Bad: more protection cost
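
The bounding step can be sketched as part of the write-fault handler. The slide does not say which dirty page is chosen, so the oldest-first (FIFO) policy below is an assumption, as are the `BoundedDirtyPages` class and the `scan_page` callback.

```python
from collections import deque


class BoundedDirtyPages:
    """Toy model that keeps the number of dirty pages below a limit."""

    def __init__(self, limit, scan_page):
        self.limit = limit          # the dirty-page count must stay < limit
        self.scan_page = scan_page  # hypothetical callback that scans one page
        self.dirty = deque()        # dirty pages, oldest first (assumed policy)
        self.protected = set()      # pages currently write-protected
        self.reprotections = 0      # extra protection operations caused by BD

    def on_write_fault(self, page):
        """Called when the mutator writes to a protected page."""
        self.protected.discard(page)
        self.dirty.append(page)
        while len(self.dirty) >= self.limit:
            victim = self.dirty.popleft()
            # Re-protect first so later writes to the page trap again,
            # then scan the page; it is now clean.
            self.protected.add(victim)
            self.reprotections += 1
            self.scan_page(victim)


scanned = []
bd = BoundedDirtyPages(limit=16, scan_page=scanned.append)
for page in range(100):
    bd.protected.add(page)
    bd.on_write_fault(page)
```

With a limit of 16, the 100 faults above leave 15 pages dirty and cause 85 extra protect-and-scan operations, which mirrors the trade-off on the slide: less work in final marking, more protection cost.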

  14. Bounding dirty pages (2) • Is the pause now bounded? … No! • The unmarked objects reachable from a dirty page are not bounded [Figure: roots and heap]

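  15. Retrying incremental marking (1) • Keep the work of each final marking < a given limit • GC cycle: start GC, write-protect the heap, incremental mark (interleaved with the user program; write faults go to the trap handler), final marking • Finished before the limit? No: retry incremental marking. Yes: incremental sweep (interleaved with the user program), end GC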

  16. Retrying incremental marking (2) • Good: Bound length of single final marking • Bad: Risk of starvation (no progress) • Final marking may abort before finishing scanning (unbounded) dirty pages • Unmarked objects may ‘escape’ from collector
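
The retry loop can be sketched as follows. This toy model is an assumption-laden illustration: `mark_some` and `collect_with_retry` are hypothetical names, the work limit is counted in marked objects, and mutator writes between phases are left out, so the starvation risk mentioned above is not reproduced here.

```python
def mark_some(stack, heap, marked, budget):
    """Mark at most `budget` objects from the mark stack; return the count."""
    work = 0
    while stack and work < budget:
        obj = stack.pop()
        if obj not in marked:
            marked.add(obj)
            work += 1
            stack.extend(heap[obj])
    return work


def collect_with_retry(heap, roots, budget):
    """Alternate bounded final-marking attempts with more incremental marking."""
    marked, stack = set(), list(roots)
    pauses = []  # work done in each final-marking attempt
    while True:
        # Final marking attempt: the mutator is stopped, work is bounded.
        pauses.append(mark_some(stack, heap, marked, budget))
        if not stack:
            return marked, pauses  # finished before the limit: go on to sweep
        # Aborted: resume the mutator and retry incremental marking
        # (modeled here as one more bounded marking step).
        mark_some(stack, heap, marked, budget)


# Hypothetical heap: a chain of 10 objects reachable from one root.
heap = {f"c{i}": ([f"c{i + 1}"] if i < 9 else []) for i in range(10)}
marked, pauses = collect_with_retry(heap, roots=["c0"], budget=3)
```

Each attempt stops after at most `budget` objects, so no single pause exceeds the limit; the price is that a cycle may need several attempts.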

  17. The worst case • A final marking is aborted with no progress [Timeline figure: incremental marking finishes, a write occurs, final marking aborts; the same sequence then repeats]

  18. Ensuring bounded pause and progress • Either technique alone is insufficient… • Both are needed: bounding dirty pages (BD) and retrying incremental marking (RI) • BD ⇒ every final marking can scan all dirty pages, so it finds some unmarked objects, if any

  19. Overview of presentation • Mostly parallel garbage collector • Techniques to reduce pause time • Experimental results • Related work • Summary

  20. Experimental environment • 400MHz UltraSPARC, Solaris 8 • Four GCs: Stop (stop-the-world GC); Basic (basic incremental GC); BD (bounding dirty pages); BD+R (bounding dirty pages + retrying incremental marking) • Basic/BD/BD+R: GC starts when heap usage > 75% • BD/BD+R: # of dirty pages < 16

  21. The iukiller synthetic benchmark • ‘Stress-test’ benchmark for mostly parallel GC • Trees tend to escape from the collector • Final marking tends to be long [Figure: large binary trees attached to the roots, repeated]

  22. Results of iukiller benchmark: the maximum pause time • Previous collectors fail: > 1.8 seconds; the larger the heap, the longer the pause • BD+R achieves <12ms pauses, independent of heap size

  23. Application benchmarks • Programs written in C/C++ • deltablue: an incremental constraint solver (25MB) • espresso: a logic optimizer for PLA (10MB) • N-Body: an N-Body solver with Barnes-Hut (15MB) • CKY: a context free grammar parser (40MB) • Cube: a Rubik’s cube puzzle solver (8MB)

  24. Results of application benchmarks: the maximum pause time • BD+R achieves <5ms pauses in all five applications • BD is also OK (< 16ms) [Chart residue: bar labels 215ms and 283ms]

  25. Results of application benchmarks: overhead • Total execution times (‘Stop’ = 1) • BD/BD+R is <9% slower than Basic: more protection • All incremental GCs are 1–53% slower than Stop: VM write barrier; floating garbage; more GC cycles

  26. Related work • [Appel et al. 88] • Copy GC with VM read barrier. Slower than write barrier • [Furuso et al. 91] • Snapshot-at-beginning on VM. Large space overhead • Recent version of [Boehm et al. 91] • Time limit on final marking. Risks of starvation • [Printezis et al. 00] [Ossia et al. 02] • Keep # of dirty cards small. Final marking is still unbounded

  27. Summary • An incremental conservative GC with short pauses (<5ms in 5 applications) and guaranteed GC progress • Uses both techniques: bounding dirty pages and retrying incremental marking

  28. Future direction • Reducing overhead of BD • Strategy for proper limit for dirty pages • Bounding roots to be scanned • Protect stacks partially
