
Efficient Memory Management in Programming Languages

Explore various memory management strategies in programming languages such as C, C#, and Java, including garbage collection, reference counting, reachability trees, mark and sweep, and copy collectors. Learn about advantages, disadvantages, and implementation details.


Presentation Transcript


  1. Memory Management Tom Roeder CS215 2006fa

  2. Motivation • Recall unmanaged code • e.g. C: { double* A = malloc(sizeof(double)*M*N); for (int i = 0; i < M*N; i++) { A[i] = i; } } • What’s wrong? • memory leak: forgot to call free(A); • common problem in C
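
The fix the slide points at can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function name `fill_and_sum` and the returned sum are assumptions added so the example has something to check.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates an m*n array, fills it as on the slide,
   and returns the sum of its elements (or -1.0 if allocation fails). */
double fill_and_sum(size_t m, size_t n)
{
    double *a = malloc(sizeof(double) * m * n);
    if (a == NULL) {
        return -1.0;
    }

    double sum = 0.0;
    for (size_t i = 0; i < m * n; i++) {
        a[i] = (double)i;
        sum += a[i];
    }

    free(a);   /* the call the slide's version forgot */
    return sum;
}
```

Every exit path after a successful `malloc` has to reach `free`, which is exactly the bookkeeping a garbage collector takes over.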

  3. Motivation • What’s wrong here? char* f() { char c[100]; for (int i = 0; i < 100; i++) { c[i] = i; } return c; } • Returning memory allocated on the stack • Can you still do this in C#? • no: arrays must be created with “new” expressions (with the size given there), so they live on the heap
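
One common way to avoid the dangling pointer in C is to let the caller own the buffer. The sketch below assumes that approach; `fill_bytes` and `demo_fill` are hypothetical names, and returning a `malloc`'d buffer would be another valid fix.

```c
#include <assert.h>
#include <stddef.h>

/* Fills a caller-owned buffer instead of returning a pointer to a local array. */
void fill_bytes(char *out, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        out[i] = (char)i;
    }
}

/* Hypothetical caller: the array lives in this frame, which outlives the call
   to fill_bytes, so no pointer to dead stack memory ever escapes. */
int demo_fill(void)
{
    char c[100];
    fill_bytes(c, sizeof c);
    return c[99];
}
```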

  4. Motivation • Solution: no explicit malloc/free (new/delete) • e.g. in Java/C#: { double[] A = new double[M*N]; for (int i = 0; i < M*N; i++) { A[i] = i; } } • No leak: memory is “lost” but freed later • A garbage collector tries to free memory • keeps track of used information somehow

  5. COM’s Solution • Reference Counting • AddRef/Release • each time a new reference is created: call AddRef • each time released: call Release • must be called by programmer • leads to difficult bugs • forgot to AddRef: objects disappear underneath • forgot to Release: memory leaks • Entirely manual solutions unacceptable

  6. Garbage Collection • Why must we do this in COM? • no way to tell what points to what • C/C++ pointers can point to anything • C#/Java have a managed runtime • all pointer types are known at runtime • can do reference counting in CLR • Garbage Collection is program analysis • figure out properties of code automatically • two types of analysis: dynamic and static

  7. Soundness and Completeness • For any program analysis • Sound? • are the operations always correct? • usually an absolute requirement • Complete? • does the analysis capture all possible instances? • For Garbage Collection • sound = does it never delete memory that is still in use? • complete = does it delete all unused memory?

  8. Reference Counting • As in COM, keep count of references. How? • on assignment, increment and decrement • when removing variables, decrement • eg. local variables being removed from stack • know where all objects live • at ref count 0, reclaim object space • Advantage: incremental (don’t stop) • Is this safe? • Yes: not referenced means not reachable
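
The counting rules on this slide can be sketched in C as below. This is an illustrative toy, not COM's `AddRef`/`Release` interface or the CLR's implementation: the `Obj` type, the `obj_*` helpers, and the `live_objects` counter are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Obj {
    int refcount;
    int value;
} Obj;

static int live_objects = 0;   /* bookkeeping for the demo only */

/* A new object starts with one reference: the variable that receives it. */
Obj *obj_new(int value)
{
    Obj *o = malloc(sizeof *o);
    if (o == NULL) {
        return NULL;
    }
    o->refcount = 1;
    o->value = value;
    live_objects++;
    return o;
}

Obj *obj_retain(Obj *o)
{
    if (o != NULL) {
        o->refcount++;
    }
    return o;
}

/* At ref count 0, reclaim the object's space. */
void obj_release(Obj *o)
{
    if (o != NULL && --o->refcount == 0) {
        live_objects--;
        free(o);
    }
}

/* On assignment: increment the new target, decrement the old one.
   Retaining first keeps self-assignment safe. */
void obj_assign(Obj **slot, Obj *new_target)
{
    obj_retain(new_target);
    obj_release(*slot);
    *slot = new_target;
}

int demo_refcount(void)
{
    Obj *a = obj_new(1);             /* count 1 */
    Obj *b = NULL;
    obj_assign(&b, a);               /* count 2 */
    obj_release(a);                  /* variable a goes away: count 1 */
    int still_live = live_objects;   /* the object is still alive here */
    obj_assign(&b, NULL);            /* count 0: reclaimed immediately */
    return still_live * 10 + live_objects;
}
```

Reclamation happens at the moment the last reference disappears, which is the incremental behaviour the slide lists as the advantage.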

  9. Reference Counting • Disadvantages • constant cost, even when lots of space • optimize the common case! • can’t detect cycles • Has fallen out of favor. [Figure: objects annotated with reference counts 1, 2, 1, 1; label “Reachable”]
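
Why cycles defeat reference counting can be seen with two objects that point at each other. The sketch below only models the counts (the nodes live on the stack, so nothing is really leaked); `Node`, `set_next`, and `demo_cycle` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    int refcount;
    struct Node *next;
} Node;

/* Pointer store with the usual increment-new / decrement-old bookkeeping. */
static void set_next(Node *from, Node *to)
{
    if (to != NULL) {
        to->refcount++;
    }
    if (from->next != NULL) {
        from->next->refcount--;
    }
    from->next = to;
}

/* Returns how many nodes still have a non-zero count after every
   outside reference has been dropped. */
int demo_cycle(void)
{
    Node a = {1, NULL};   /* 1 = the reference held by a local variable */
    Node b = {1, NULL};

    set_next(&a, &b);     /* b's count: 2 */
    set_next(&b, &a);     /* a's count: 2 */

    a.refcount--;         /* the local variables go away */
    b.refcount--;

    /* Neither count reached 0, although nothing outside the cycle
       refers to a or b any more. */
    return (a.refcount != 0) + (b.refcount != 0);
}
```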

  10. Trees • Instead of counting references • keep track of some top-level objects • and trace out the reachable objects • only clean up heap when out of space • much better for low-memory programs • Two major types of algorithm • Mark and Sweep • Copy Collectors

  11. Trees • Top-level objects • managed by CLR • local variables on stack • registers pointing to objects • Garbage collector starts from the top-level objects • builds a graph of the reachable objects

  12. Mark and Sweep • Two-pass algorithm • First pass: walk the graph and mark all objects • everything starts unmarked • Second pass: sweep the heap, remove unmarked • not reachable implies garbage • Soundness? • Yes: any object not marked is not reachable • Completeness? • Yes, since any object unreachable is not marked • but only complete eventually
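
The two passes can be sketched over a tiny fixed-size heap. Everything here is an assumption made for illustration (the `Cell` layout, a heap of 8 cells, at most two references per cell, recursive marking); a production collector would differ in all of these.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HEAP_SIZE 8
#define MAX_REFS 2

typedef struct Cell {
    bool in_use;
    bool marked;
    struct Cell *refs[MAX_REFS];
} Cell;

static Cell heap[HEAP_SIZE];

/* Hands out the first unused cell, or NULL if the heap is full. */
Cell *cell_alloc(void)
{
    for (size_t i = 0; i < HEAP_SIZE; i++) {
        if (!heap[i].in_use) {
            heap[i].in_use = true;
            heap[i].marked = false;
            for (size_t r = 0; r < MAX_REFS; r++) {
                heap[i].refs[r] = NULL;
            }
            return &heap[i];
        }
    }
    return NULL;
}

/* First pass: walk the graph from a root and mark everything reachable. */
static void mark(Cell *c)
{
    if (c == NULL || c->marked) {
        return;
    }
    c->marked = true;
    for (size_t r = 0; r < MAX_REFS; r++) {
        mark(c->refs[r]);
    }
}

/* Second pass: sweep the whole heap, free unmarked cells, and clear the
   marks for the next collection. Returns the number of cells freed. */
size_t collect(Cell **roots, size_t nroots)
{
    for (size_t i = 0; i < nroots; i++) {
        mark(roots[i]);
    }
    size_t freed = 0;
    for (size_t i = 0; i < HEAP_SIZE; i++) {
        if (heap[i].in_use && !heap[i].marked) {
            heap[i].in_use = false;
            freed++;
        }
        heap[i].marked = false;
    }
    return freed;
}

size_t demo_mark_sweep(void)
{
    Cell *a = cell_alloc();
    Cell *b = cell_alloc();
    Cell *c = cell_alloc();
    Cell *d = cell_alloc();
    a->refs[0] = b;       /* a -> b is reachable from the root */
    c->refs[0] = d;       /* c <-> d is an unreachable cycle */
    d->refs[0] = c;
    Cell *roots[] = { a };
    return collect(roots, 1);
}
```

Unlike reference counting, the unreachable cycle `c <-> d` is reclaimed, because only reachability from the roots matters.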

  13. Mark and Sweep • Can be expensive • eg. emacs • everything stops and collection happens • this is a general problem for garbage collection • at end of first phase, know all reachable objects • should use this information • how could we use it?

  14. Copy Collectors • Instead of just marking as we trace • copy each reachable object to new part of heap • needs to have enough space to do this • no need for second pass • Advantages • one pass • compaction • Disadvantages • higher memory requirements

  15. Fragmentation • Common problem in memory schemes • Enough memory but not enough contiguous • consider allocator in OS [Figure: memory layout with blocks labeled 10, 10?, 10, 5, 15, 10, 5]

  16. Unmanaged algorithms • best-fit • search the heap for the closest fit • takes time • causes external fragmentation (as we saw) • first-fit • choose the first fit found • starts from beginning of heap • next-fit • first-fit with a pointer to last place searched

  17. Unmanaged algorithms • worst-fit • put the object in the largest possible hole • under what workload is this good? • objects need to grow • eg. database construction • eg. network connection table • different algorithms appropriate in different settings: designed differently • in compiler/runtime, we want access speed
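
The four placement policies differ only in which hole they pick. The sketch below assumes the free list is a plain array of hole sizes and returns the index of the chosen hole (or -1); the function names and that representation are illustrative assumptions, not how any particular allocator stores its free list.

```c
#include <assert.h>
#include <stddef.h>

/* first-fit: take the first hole that is large enough. */
int first_fit(const size_t *holes, size_t n, size_t request)
{
    for (size_t i = 0; i < n; i++) {
        if (holes[i] >= request) {
            return (int)i;
        }
    }
    return -1;
}

/* best-fit: search the whole list for the smallest hole that still fits. */
int best_fit(const size_t *holes, size_t n, size_t request)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (holes[i] >= request &&
            (best < 0 || holes[i] < holes[(size_t)best])) {
            best = (int)i;
        }
    }
    return best;
}

/* worst-fit: pick the largest hole, leaving room for the object to grow. */
int worst_fit(const size_t *holes, size_t n, size_t request)
{
    int worst = -1;
    for (size_t i = 0; i < n; i++) {
        if (holes[i] >= request &&
            (worst < 0 || holes[i] > holes[(size_t)worst])) {
            worst = (int)i;
        }
    }
    return worst;
}

/* next-fit: first-fit that resumes from the last place searched,
   wrapping around the end of the list. */
int next_fit(const size_t *holes, size_t n, size_t request, size_t *cursor)
{
    for (size_t k = 0; k < n; k++) {
        size_t i = (*cursor + k) % n;
        if (holes[i] >= request) {
            *cursor = i;
            return (int)i;
        }
    }
    return -1;
}
```

With hypothetical holes of sizes 10, 5, 15, 7 and a request of 6, first-fit picks index 0, best-fit index 3, and worst-fit index 2.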

  18. Heap Allocation Algorithms • Best for managed heap? • must be usually O(1) • so not best or first fit • use next fit • walk on the edge of the last chunk • General idea • allocate contiguously • allocate forwards until out of memory
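
Allocating forwards from the edge of the last chunk is often called bump allocation. The sketch below is a minimal version under stated assumptions: a 1024-byte arena, 8-byte alignment, and the names `bump_alloc` and `next_free` are all chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_SIZE 1024
#define ALIGNMENT 8   /* assumed alignment; real runtimes choose their own */

/* The union only exists to give the byte array 8-byte alignment. */
static union {
    unsigned char bytes[ARENA_SIZE];
    uint64_t force_align;
} arena;

static size_t next_free = 0;   /* offset of the next unused byte */

/* O(1) allocation: hand out the bytes at next_free and move it forward.
   Returns NULL when the arena is exhausted (where a collector would run). */
void *bump_alloc(size_t size)
{
    if (size > ARENA_SIZE) {
        return NULL;
    }
    size_t rounded = (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    if (rounded > ARENA_SIZE - next_free) {
        return NULL;
    }
    void *p = &arena.bytes[next_free];
    next_free += rounded;
    return p;
}
```

There is no search at all, which is why this only works well when a collector later compacts the heap and resets the pointer.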

  19. Compacting Copy Collector • Move live objects to bottom of heap • leaves more free space on top • contiguous allocation allows faster access • cache works better with locality • Must then modify references • recall: references are really pointers • must update location in each object • Can be made very fast

  20. Compacting Copy Collector • Another possible collector: • divide memory into two halves • fill up one half before doing any collection • on full: • walk the trees and copy to other side • work from new side • Need twice memory of other collectors • But don’t need to find space in old side • contiguous allocation is easy
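
The two-halves scheme can be sketched as a small semispace collector. The version below uses a forwarding pointer and a scan index in the style usually attributed to Cheney; that choice, the fixed-size `Obj` with two fields, and all names are assumptions for illustration rather than the collector the slides describe.

```c
#include <assert.h>
#include <stddef.h>

#define SEMI_SIZE 16
#define NFIELDS 2

typedef struct Obj {
    struct Obj *forward;           /* set once the object has been copied */
    struct Obj *fields[NFIELDS];
    int value;
} Obj;

static Obj space_a[SEMI_SIZE], space_b[SEMI_SIZE];
static Obj *from_space = space_a;  /* half currently used for allocation */
static Obj *to_space = space_b;    /* half that live objects are copied into */
static size_t alloc_top = 0;       /* next free slot in from_space */

Obj *gc_alloc(int value)
{
    if (alloc_top == SEMI_SIZE) {
        return NULL;               /* a real runtime would collect here */
    }
    Obj *o = &from_space[alloc_top++];
    o->forward = NULL;
    for (size_t f = 0; f < NFIELDS; f++) {
        o->fields[f] = NULL;
    }
    o->value = value;
    return o;
}

/* Copies o into to_space on first visit; afterwards returns the copy. */
static Obj *copy(Obj *o, size_t *top)
{
    if (o == NULL) {
        return NULL;
    }
    if (o->forward == NULL) {
        Obj *n = &to_space[(*top)++];
        *n = *o;
        n->forward = NULL;
        o->forward = n;
    }
    return o->forward;
}

/* Copies everything reachable from the roots to the other half, fixes up
   the references, and swaps the halves. Returns the number of live objects. */
size_t gc_collect(Obj **roots, size_t nroots)
{
    size_t top = 0, scan = 0;
    for (size_t i = 0; i < nroots; i++) {
        roots[i] = copy(roots[i], &top);
    }
    while (scan < top) {           /* copied objects not yet scanned */
        Obj *o = &to_space[scan++];
        for (size_t f = 0; f < NFIELDS; f++) {
            o->fields[f] = copy(o->fields[f], &top);
        }
    }
    Obj *tmp = from_space;
    from_space = to_space;
    to_space = tmp;
    alloc_top = top;               /* allocation continues after the survivors */
    return top;
}

int demo_copy(void)
{
    Obj *a = gc_alloc(1);
    Obj *garbage = gc_alloc(2);    /* never referenced from a root */
    Obj *b = gc_alloc(3);
    (void)garbage;
    a->fields[0] = b;
    b->fields[0] = a;              /* cycles are handled by the forwarding pointer */
    Obj *roots[] = { a };
    size_t live = gc_collect(roots, 1);
    return (int)live * 100 + roots[0]->value * 10 + roots[0]->fields[0]->value;
}
```

After the collection the survivors sit next to each other at the start of the new half, so the free space is one contiguous region and allocation stays a pointer bump.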

  21. C# Memory management • Related to next-fit, copy-collector • keep a NextObjPointer to next free space • use it for new objects until no more space • Keep knowledge of Root objects • global and static object pointers • all thread stack local variables • registers pointing to objects • maintained by JIT compiler and runtime • eg. JIT keeps a table of roots

  22. C# Memory management • On traversal: • walk from roots to find all good objects • linear pass through heap • on gap, compact higher objects down • fix object references to make this work • very fast in general • Speedups: • assume different types of objects

  23. Generations • Current .NET uses 3 generations: • 0 – recently created objects: yet to survive GC • 1 – survived 1 GC pass • 2 – survived more than 1 GC pass • Assumption: longer lived implies live longer • Is this a good assumption? • good assumption for many applications • and for many systems (eg. P2P) • Put lower objects lower in heap

  24. Generations • During compaction, promote generations • eg. Gen 1 reachable object goes to Gen 2 • Eventually: [Figure: heap divided into regions labeled Generation 0, Generation 1, Generation 2]

  25. More Generation Optimization • Don’t trace references in old objects. Why? • speed improvement • but could refer to young objects • Use Write-Watch support. How? • note if an old object has some field set • then can trace through references
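
The idea of noting writes to old objects can be sketched with a software write barrier. This is a stand-in technique chosen for illustration: it is not the write-watch mechanism the slide refers to, and `GObj`, `store_field`, and `must_trace` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct GObj {
    int generation;        /* 0 = youngest, 2 = oldest */
    bool written;          /* set when a field of an old object is stored to */
    struct GObj *field;
} GObj;

/* Every pointer store goes through this barrier, which records
   that an old object may now refer to a young one. */
void store_field(GObj *owner, GObj *target)
{
    owner->field = target;
    if (owner->generation > 0) {
        owner->written = true;
    }
}

/* During a young-generation collection, old objects are skipped
   unless they have been written to since the last collection. */
bool must_trace(const GObj *o)
{
    return o->generation == 0 || o->written;
}

int demo_write_watch(void)
{
    GObj young = {0, false, NULL};
    GObj old_clean = {2, false, NULL};
    GObj old_dirty = {2, false, NULL};
    store_field(&old_dirty, &young);   /* old object now points at a young one */
    return must_trace(&young) * 100
         + must_trace(&old_clean) * 10
         + must_trace(&old_dirty);
}
```

Only the old object that was actually modified has to be traced, which keeps the speed benefit of skipping the rest of the old generation.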

  26. Large Objects Heap • Area of the heap dedicated to large objects • never compacted. Why? • copy cost outweighs any locality • automatic generation 2 • rarely collected • large objects likely to have long lifetime • Commonly used for DataGrid objects • results from database queries • 20k or more

  27. Object Pinning • Can require that an object not move • could hurt GC performance • useful for unsafe operation • in fact, needed to make pointers work • syntax: • fixed(…) { … } • will not move objects in the declaration in the block

  28. Finalization • Recall C++ destructors: ~MyClass() { /* cleanup */ } • called when object is deleted • does cleanup for this object • Don’t do this in C# (or Java) • similar construct exists • but only called on GC • no guarantees when

  29. Finalization • More common idiom: public void Finalize() { base.Finalize(); Dispose(false); } • maybe needed for unmanaged resources • slows down GC significantly • Finalization in GC: • when object with Finalize method created • add to Finalization Queue • when about to be GC’ed, add to Freachable Queue

  30. Finalization • images from MSDN Nov 2000
