An Implementation of Mostly-Copying GC on Ruby VM
Tomoharu Ugawa
The University of Electro-Communications, Japan
Background (1/2)
• Scripting languages are now used in a wide range of settings
• Before: only for tiny applications
  • Short lifetime
  • Runs with little memory
  ⇒ GC (Garbage Collection) was not important
• Now: for servers as well, such as Rails applications
  • May have a long lifetime
  • May create a lot of objects
  ⇒ GC has a great impact on overall performance
Background (2/2)
• Ruby’s GC
  • Conservative Mark-Sweep GC ⇒ does not move objects
• Once the heap has been expanded, it can hardly be shrunk
  • A heap region cannot be released unless it contains NO object
  • Such lucky cases rarely happen
Ex) Once a server uses a lot of memory for a heavy request, it keeps running with a large heap even after responding to the request.
[Figure: live objects scattered across the initial heap, additional heap 1, and additional heap 2]
Goal
• Compact the heap so that Ruby can return unused memory to the OS
• Use Mostly-Copying GC
• Modify the algorithm for Ruby
• Minimize the changes to C-libraries
Agenda
• Shrinking the heap using Mostly-Copying GC
• Modified Mostly-Copying algorithm
• Evaluation
• Related work
• Conclusion
Why does Ruby not move objects?
• Moving an object requires updating every pointer to it
• Ruby’s GC does not recognize all pointers to Ruby objects
  • Those in the C runtime stack
  • Those in regions allocated using “malloc” by C-libraries
  ⇒ Such pointers cannot be updated
[Figure legend: ambiguous pointer (blue arrow); ambiguous root (cloud mark); exact pointer; the object to be moved]
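The reason such pointers are "ambiguous" can be sketched in a few lines. This is a minimal illustration, not the VM's actual code: the heap bounds, the 32-bit word size, and the name `ambiguous_pointer?` are all assumptions made for the example. A conservative collector sees only raw words, so any word that could be the address of an object slot has to be treated as a possible pointer.

```ruby
# Hypothetical heap layout, chosen only for illustration.
WORD_SIZE   = 4                # bytes per word (a 32-bit machine is assumed)
OBJECT_SIZE = 5 * WORD_SIZE    # fixed-size object slots (assumed to be 5 words)
HEAP_START  = 0x1000_0000
HEAP_END    = HEAP_START + 1000 * OBJECT_SIZE

# True if the word lies inside the heap and is aligned to an object slot.
# An integer that merely happens to have such a value also passes the test.
def ambiguous_pointer?(word)
  return false unless word >= HEAP_START && word < HEAP_END
  (word - HEAP_START) % OBJECT_SIZE == 0
end

# Words as they might appear on the C stack.
stack_words = [42, HEAP_START + 3 * OBJECT_SIZE, HEAP_START + 7, HEAP_END]
candidates  = stack_words.select { |w| ambiguous_pointer?(w) }
```

Because a word that passes the test may really be an integer, the collector cannot safely overwrite it with a new address, which is why the referenced object has to stay where it is.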
Even so, we CAN move most objects
• We can update pointers contained in Ruby objects
• Objects referred to only from Ruby objects can be moved
• Most objects are referred to only from Ruby objects
⇒ Most objects can be moved
This is the basic idea of the Mostly-Copying GC
Mostly-Copying GC [Bartlett ’88]
• Objects referred to only by exact pointers ⇒ move them and update the referencing pointers
• Objects referred to by ambiguous pointers (possibly by exact pointers as well) ⇒ do not move them
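The move-or-pin rule can be sketched as a small classification over a reference list. This is only an illustration under assumed names (`classify`, the object symbols, and the `:exact`/`:ambiguous` tags are hypothetical); a real collector discovers these references by tracing.

```ruby
# Each reference is [source, target, kind]; kind is :exact or :ambiguous.
refs = [
  [:obj_a,   :obj_b, :exact],      # pointer stored in a Ruby object
  [:obj_a,   :obj_c, :exact],
  [:c_stack, :obj_c, :ambiguous],  # word on the C stack that looks like a pointer
  [:c_stack, :obj_d, :ambiguous]
]

# An object is pinned as soon as one ambiguous pointer refers to it,
# even if exact pointers refer to it as well.
def classify(refs)
  targets = refs.map { |_, target, _| target }.uniq
  pinned  = refs.select { |_, _, kind| kind == :ambiguous }
                .map { |_, target, _| target }.uniq
  { movable: targets - pinned, pinned: pinned }
end

result = classify(refs)
```

Here `obj_c` is pinned although an exact pointer also refers to it, matching the "(as well)" case on the slide.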
The heap of Mostly-Copying GC
• Break the heap into equal-sized blocks
• The from-space of the copying GC is a set of blocks
[Figure: a root pointing into a heap whose blocks are labeled From or To in non-contiguous order]
Shrinking the heap
Free blocks are not contiguous in a mostly-copying collector
• Release memory block by block
  • Block = hardware page
• To release a block, do not access the block
  • Because such a block has no live object, all we have to do is not allocate new objects in it
  • The virtual memory system automatically reuses the page frame assigned to the block
• (optional) We can tell the OS that the page has no valid data
  • madvise system call (Linux)
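Why compaction makes blocks releasable can be shown with a toy heap. The sketch below is an idealized model, not the collector itself: `SLOTS_PER_BLOCK`, `releasable_blocks`, and `compact` are hypothetical names, and the compaction ignores pinned objects, which a mostly-copying collector cannot move.

```ruby
SLOTS_PER_BLOCK = 4  # object slots per block; a small hypothetical value

# heap: array of blocks, each an array of slots (an object name or nil).
# A block can be released only if none of its slots holds an object.
def releasable_blocks(heap)
  heap.each_index.select { |i| heap[i].compact.empty? }
end

# Idealized compaction: pack all live objects into as few blocks as possible.
def compact(heap)
  live   = heap.flatten.compact
  packed = live.each_slice(SLOTS_PER_BLOCK).map do |objs|
    objs + [nil] * (SLOTS_PER_BLOCK - objs.size)
  end
  packed + Array.new(heap.size - packed.size) { [nil] * SLOTS_PER_BLOCK }
end

heap = [
  [:a,  nil, nil, nil],
  [nil, :b,  nil, nil],
  [nil, nil, nil, :c],
  [nil, nil, nil, nil]
]
before = releasable_blocks(heap)           # only the block that was already empty
after  = releasable_blocks(compact(heap))  # every block except the first
```

In a real collector, the releasable blocks would simply be left out of allocation, and could optionally be reported to the OS with `madvise`.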
C-libraries
• C-libraries wrap “malloc”-ed data so that it can be handled as Ruby objects. A wrapper object has:
  • A pointer to the “malloc”-ed area
  • A function that “marks” objects referred to from the data
  • NO pointer-updating interface
⇒ Treat all pointers from “malloc”-ed data as ambiguous pointers
    traverse(data) {
      mark(data->p1);
      mark_location(…);
    }
[Figure label: p1]
Agenda
• Shrinking the heap using Mostly-Copying GC
• Modified Mostly-Copying algorithm
• Evaluation
• Related work
• Conclusion
Mostly-Copying GC of Bartlett
• Objects referred to only by exact pointers ⇒ copy them to to-space
• Objects referred to by ambiguous pointers ⇒ move the containing block to to-space logically (this is called promotion)
• The algorithm may encounter new ambiguous pointers during collection, and the pointed-to object may already have been copied
  • Bartlett’s algorithm copies all objects, even those pointed to by ambiguous pointers
  • Objects in promoted blocks are eventually written back from their copies
Problem
• Memory efficiency
  • Objects are copied even if they are referred to by ambiguous pointers
  • Garbage in promoted pages is not collected
[Figure: root]
Modify the algorithm
• Mark-Sweep GC before copying
  • Mark: find the ambiguous roots
    • Objects referred to by ambiguous pointers are no longer copied
  • Sweep (promoted blocks only)
    • Each block has a free-list
    • All Ruby objects are 5 words ⇒ no (external) fragmentation
Modified Algorithm (1/4)
• Trace pointers from the root set
  • Mark all visited objects
  • Promote blocks containing objects referred to by ambiguous pointers
[Figure legend: root; promoted block (thick border); live mark]
Modified Algorithm (2/4)
• Sweep promoted blocks
  • Collect objects that are not marked
[Figure: root]
Modified Algorithm (3/4)
• Copying GC (using the promoted blocks as the root set)
  • Do not copy objects in promoted blocks
[Figure: root]
Modified Algorithm (4/4)
• Scan promoted blocks to clear the mark of each object
[Figure: root; three blocks labeled “free”]
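The four phases can be put together in a small simulation. This is a sketch under stated assumptions, not the VM's implementation: objects are symbols, so "updating pointers" is implicit; ambiguous pointers come only from the roots; and `collect`, `BLOCK_SLOTS`, and the `"to0"`-style block names are hypothetical.

```ruby
require "set"

BLOCK_SLOTS = 2  # hypothetical number of object slots per block

# heap: { block_id => [object names] }
# refs: { object name => [names it points to with exact pointers] }
def collect(heap, refs, exact_roots, ambiguous_roots)
  block_of = {}
  heap.each { |id, objs| objs.each { |o| block_of[o] = id } }

  # (1/4) Trace from the root set, mark visited objects, and promote blocks
  # that hold an object referred to by an ambiguous pointer.
  marked = Set.new
  stack  = exact_roots + ambiguous_roots
  until stack.empty?
    obj = stack.pop
    next unless marked.add?(obj)
    stack.concat(refs.fetch(obj, []))
  end
  promoted = ambiguous_roots.map { |o| block_of[o] }.uniq

  # (2/4) Sweep promoted blocks: unmarked objects there are garbage.
  new_heap = {}
  promoted.each { |id| new_heap[id] = heap[id].select { |o| marked.include?(o) } }

  # (3/4) Copying phase with the promoted blocks as roots. Objects in
  # promoted blocks stay in place; every other reachable object is copied.
  to_space = []
  visited  = Set.new
  stack    = exact_roots + new_heap.values.flatten
  until stack.empty?
    obj = stack.pop
    next unless visited.add?(obj)
    to_space << obj unless promoted.include?(block_of[obj])
    stack.concat(refs.fetch(obj, []))
  end
  to_space.each_slice(BLOCK_SLOTS).with_index do |objs, i|
    new_heap["to#{i}"] = objs
  end

  # (4/4) Clear the marks so that the next collection starts clean.
  marked.clear
  new_heap
end

heap   = { "b0" => [:a, :g1], "b1" => [:b, :g2], "b2" => [:c, :g3] }
refs   = { a: [:b], b: [:c] }
result = collect(heap, refs, [:a], [:b])
```

In this run, block `b1` is promoted because `:b` is an ambiguous root, its garbage `:g2` is swept in place, and `:a` and `:c` are copied into one to-space block, leaving `b0` and `b2` free.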
The only change of C-libraries
• Mark-array
  • An array that holds the same pointers as those held in “malloc”-ed data
  • The C-library marks only the mark-array
  • The collector can traverse further from it
  • But it cannot recognize that they are ambiguous pointers
  • Remember: all pointers from “malloc”-ed data are treated as ambiguous ones
⇒ Change C-libraries so that THEY scan the mark-array as ambiguous roots
• Impact
  • 2 modules
  • 3 parts
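The effect of the change can be sketched with a toy collector. This is an illustration only: `Collector`, `Wrapper`, `mark_ambiguous`, and `mark_children` are hypothetical names standing in for the VM's C-level interface, which the slides do not show.

```ruby
# Hypothetical stand-in for the collector's bookkeeping.
class Collector
  attr_reader :marked, :pinned

  def initialize
    @marked = []
    @pinned = []
  end

  # Exact mark: the collector knows where the pointer is stored,
  # so the referent may be moved later.
  def mark(obj)
    @marked << obj unless @marked.include?(obj)
  end

  # Ambiguous mark: the referent must stay where it is.
  def mark_ambiguous(obj)
    mark(obj)
    @pinned << obj unless @pinned.include?(obj)
  end
end

# Hypothetical wrapper around "malloc"-ed data. The mark-array mirrors
# the pointers held in that data.
class Wrapper
  def initialize(mark_array)
    @mark_array = mark_array
  end

  # The library itself scans its mark-array as ambiguous roots.
  def mark_children(gc)
    @mark_array.each { |obj| gc.mark_ambiguous(obj) }
  end
end

gc = Collector.new
gc.mark(:plain_object)
Wrapper.new([:callback, :buffer]).mark_children(gc)
```

After the run, `:callback` and `:buffer` are both marked and pinned, while `:plain_object` is marked but remains movable.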
Evaluation
• Ruby VM
  • YARV r590 (this is old but has essentially the same GC as Ruby 1.9)
• Items
  • Heap size
  • Elapsed time
• Environment
  • CPU: Pentium 3GHz
  • OS: Linux 2.6.22
  • Compiler: gcc 4.1.3 (-O2)
Benchmark Program
    2.times {
      ary = Array.new
      # Increases live objects (processing a heavy request)
      10000.times { |i|
        ary[i] = Array.new
        (1..100).each { |j| ary[i][j-1] = 1.to_f / j.to_f }
        if (i % 100 == 0) then CP() end  # checkpoint: profile the heap every 100 iterations
      }
      # Decreases live objects (end of the heavy request)
      10000.times { |i|
        ary[i] = nil
        if (i % 100 == 0) then CP() end
      }
      # Makes short-lived objects (series of ordinary requests)
      30000.times { |i|
        100.times { "" }
        if (i % 100 == 0) then CP() end
      }
    }
[Figure: heap size (MB) at each checkpoint, traditional VM vs. our VM; black line: amount of live objects]
[Figure: elapsed time of our VM relative to the traditional VM (%)]
Average (except for thread): 102%
Related work
• Customizable Memory Management Framework [Attardi et al. ’94]
  • Collects garbage by sweeping promoted blocks
  • Ambiguous pointers are discovered during copying
    • If an object has already been copied when the collector recognizes that it should not be copied, its copy becomes garbage
  • Our algorithm detects such objects before copying
Related work
• MCC [Smith et al. ’98]
  • Pins objects referred to from ambiguous roots
  • Always manages the locations of ambiguous roots in a list
    • C-libraries have to register/unregister ambiguous roots each time they “malloc”/“free”
  • Our algorithm finds ambiguous roots by tracing at the beginning of GC
Related work
• Ruby 1.9
  • Reduces the size of each additional heap to 16KB (i.e., the heap is expanded in 16KB blocks)
    • Increases the opportunity for releasing memory
  • Objects become distributed all over the heap as execution advances
  • We compact the heap
Conclusion
• Implemented mostly-copying GC on the Ruby VM
  • Modified the algorithm for memory efficiency
• Evaluated the implementation
  • Shrank the heap after those phases of a program in which it temporarily uses a lot of memory
  • Elapsed time to execute the benchmarks is comparable to the traditional VM
[Figure: heap size (MB) at each checkpoint for Ruby 1.9, YARV, and our VM; black line: amount of live objects. Annotation: the heap size increases as time passes, even with Ruby 1.9]
Benchmark Program 2
    2.times {
      ary = Array.new
      10000.times { |i|
        ary[i] = Array.new
        (1..100).each { |j| ary[i][j-1] = 1.to_f / j.to_f }
        if (i % 100 == 0) then CP() end
      }
      10000.times { |i|
        ary[i] = nil
        if (i % 100 == 0) then CP() end
      }
      30000.times { |i|
        100.times { "" }
        if (i % 100 == 0) then CP() end
      }
    }
Additional code (makes some long-lived objects during the decreasing phase):
    sum = 0
    ary[i].each { |x| sum += x }
    ary[i] = sum
[Figure: heap size (MB) at each checkpoint for benchmark 2: Ruby 1.9, YARV, and our VM]
[Figure: elapsed time of the VM with Bartlett’s algorithm relative to the traditional VM (%)]
Related work
• Generational GC for Ruby [Kiyama ’01]
  • Generational Mark-Sweep GC
  • Reduced GC time
  • Uses much memory
    • All objects have two extra words (doubly linked list) for representing generations
  • Mostly-Copying GC can divide the space into generations [Bartlett et al. ’89]