CPSC 388 – Compiler Design and Construction Heap Management
Areas of Memory Used by Program • Program Code • Static Data • Heap • Stack
Heap • Used for dynamically allocated memory • Important operations include allocation and de-allocation • In C++, Pascal, and Java allocation is done via “new” (an operator in C++ and Java, a procedure in Pascal) • In C allocation is done via the “malloc” function call • De-allocation is done either automatically or explicitly by the programmer: • Pascal – dispose • C++ – delete • C – free • Java – garbage collection
Managing the Heap • Available memory is managed using a free list: a list of available “chunks” • Each chunk includes: • Size of chunk • Address of the next item on the free list • The chunk itself
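Free List Example: Initial Heap • Heap occupies addresses 0-103 • Free list (First Free) holds one chunk: size field at address 0 holds 100, chunk is addresses 4-103, next is null (\) • Request is made to allocate 20 bytes • Uses first portion of first chunk (after size field) and returns address 4
Free List Example: After Allocating 20 Bytes • Allocated chunk: size field at address 0 holds 20, chunk is addresses 4-23 • Free chunk: size field at address 24 holds 76, chunk is addresses 28-103, next is null • Request is made to allocate 10 bytes
Free List Example: After Allocating 10 Bytes • Allocated chunks: size 20 at addresses 4-23, size 10 at addresses 28-37 (size field at address 24) • Free chunk: size field at address 38 holds 62, chunk is addresses 42-103, next is null • First chunk is freed • Adds chunk to front of free list
Free List Example: After Freeing the First Chunk • Free list now starts with the freed 20-byte chunk (size field at address 0), followed by the 62-byte chunk (size field at address 38) • The 10-byte chunk at addresses 28-37 is still in use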
Operations on Free List • Request space • Find a satisfactory chunk • Free Space • Return to Free List • Goals for Operations • Only fail to satisfy request for n bytes if there are not n bytes available on free list • Do both operations quickly
Questions to Consider • Given a request for n bytes, which n bytes to return? • Given a de-allocation of a chunk, how to coalesce it with neighboring free chunks?
Techniques for Allocation • Best Fit: Find the chunk on the freelist with the smallest size greater than or equal to allocation request • May require search of entire freelist (SLOW!) • Leaves lots of little pieces of free storage on the list
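The best-fit search can be sketched in C as follows. This is only an illustration: the `Chunk` struct and the `best_fit` function are hypothetical names, and the sketch shows the search alone, not unlinking or splitting the chosen chunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-list node: only the fields the search needs. */
typedef struct Chunk {
    size_t size;          /* usable bytes in this chunk */
    struct Chunk *next;   /* next chunk on the free list, or NULL */
} Chunk;

/* Best fit: return the smallest chunk with size >= n, or NULL if none fits.
   The loop always walks the entire list, which is why best fit is slow. */
Chunk *best_fit(Chunk *free_list, size_t n) {
    assert(n > 0);
    Chunk *best = NULL;
    for (Chunk *c = free_list; c != NULL; c = c->next) {
        if (c->size >= n && (best == NULL || c->size < best->size)) {
            best = c;
        }
    }
    return best;
}
```

With free chunks of 32, 64, and 16 bytes, a request for 10 bytes selects the 16-byte chunk and leaves a 6-byte leftover, which is the "little pieces" problem noted above.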
Techniques for Allocation • First Fit: Use the first chunk with size greater than or equal to n. • Faster than best-fit. • Produces little pieces of free storage at the front of the list, which slows later searches
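A minimal first-fit sketch over the 104-byte heap from the earlier free list example. Several details are assumptions made for illustration and are not stated in the slides: the size field is 4 bytes, a free chunk keeps its next pointer in the first 4 bytes of its own storage, requests smaller than 4 bytes are rounded up, and the names `heap_init`, `alloc_first_fit`, and `free_chunk` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HEAP_BYTES 104u
#define HDR 4u                  /* assumed width of the size field */
#define NIL 0xFFFFFFFFu         /* assumed encoding of the null next pointer */

uint8_t heap[HEAP_BYTES];
uint32_t first_free;            /* offset of the first free chunk's size field */

uint32_t rd(uint32_t at) { uint32_t v; memcpy(&v, heap + at, 4); return v; }
void wr(uint32_t at, uint32_t v) { memcpy(heap + at, &v, 4); }

void heap_init(void) {
    first_free = 0;
    wr(0, HEAP_BYTES - HDR);    /* one free chunk of 100 bytes */
    wr(HDR, NIL);               /* next pointer lives in the chunk itself */
}

/* First fit: take the first chunk with size >= n; split off the remainder. */
uint32_t alloc_first_fit(uint32_t n) {
    if (n < 4) n = 4;           /* a freed chunk must be able to hold a next pointer */
    uint32_t prev = NIL, c = first_free;
    while (c != NIL) {
        uint32_t size = rd(c), next = rd(c + HDR);
        if (size >= n) {
            uint32_t replacement = next;
            if (size - n >= HDR + 4) {          /* remainder is big enough to stay free */
                uint32_t rest = c + HDR + n;
                wr(rest, size - n - HDR);
                wr(rest + HDR, next);
                wr(c, n);
                replacement = rest;
            }
            if (prev == NIL) first_free = replacement;
            else wr(prev + HDR, replacement);
            return c + HDR;                     /* address just after the size field */
        }
        prev = c;
        c = next;
    }
    return NIL;
}

/* Free: push the chunk on the front of the free list (no coalescing here). */
void free_chunk(uint32_t addr) {
    assert(addr >= HDR && addr < HEAP_BYTES);
    wr(addr, first_free);       /* addr is the chunk's first data byte */
    first_free = addr - HDR;
}
```

Calling `heap_init()`, then `alloc_first_fit(20)` and `alloc_first_fit(10)`, reproduces the addresses in the example: 4 and 28, with a 62-byte free chunk whose size field is at 38.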
Techniques for Allocation • Circular First Fit: Make the freelist circular (i.e. have last item point back to the first item). • Satisfy requests using the first chunk with size greater than or equal to n. • Change the freelist pointer to point to next chunk after allocated one.
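Circular first fit differs from plain first fit only in where the search starts. The sketch below assumes a hypothetical `Chunk` struct and a `rover` variable standing in for the freelist pointer; it shows the search and the pointer update, not the unlinking of the chosen chunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node of a circular free list. */
typedef struct Chunk {
    size_t size;
    struct Chunk *next;   /* last chunk points back to the first */
} Chunk;

/* Search once around the ring starting at *rover. On success, move the
   rover to the chunk after the chosen one so the next search starts there. */
Chunk *circular_first_fit(Chunk **rover, size_t n) {
    assert(rover != NULL);
    if (*rover == NULL) return NULL;      /* empty free list */
    Chunk *start = *rover;
    Chunk *c = start;
    do {
        if (c->size >= n) {
            *rover = c->next;
            return c;
        }
        c = c->next;
    } while (c != start);
    return NULL;                          /* went all the way around */
}
```

Because each search resumes where the previous one stopped, small leftover pieces are spread around the list instead of piling up at the front.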
Techniques for De-allocation • Use a doubly-linked list • Each chunk has a previous and next pointer • One bit of size field reserved to indicate if chunk is “free” or “in-use”. • Check free bit of storage after chunk • If following chunk is free then coalesce • Follow Example on Board
Techniques for De-allocation • Can also coalesce with preceding chunk if you keep the size of chunk at beginning and end of chunk • Follow example on board • Note that NO pointers need to be updated
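Since the board example is not part of these notes, here is a small sketch of coalescing with a size tag at both ends of each chunk. It makes several assumptions: the heap is an array of 32-bit words, sizes are counted in words, the low bit of the tag is the free bit, and chunks are located purely by their sizes (the free list links are omitted). The names `set_chunk` and `free_and_coalesce` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WORDS 32u
uint32_t heap[WORDS];

/* Tag = payload size in words, shifted left, with the free bit in bit 0. */
uint32_t make_tag(uint32_t payload, int free_bit) { return (payload << 1) | (free_bit ? 1u : 0u); }
uint32_t payload_of(uint32_t tag) { return tag >> 1; }
int tag_is_free(uint32_t tag) { return (int)(tag & 1u); }

/* Write the same tag at the beginning (header) and end (footer) of a chunk. */
void set_chunk(uint32_t h, uint32_t payload, int free_bit) {
    heap[h] = make_tag(payload, free_bit);
    heap[h + 1 + payload] = make_tag(payload, free_bit);
}

/* Free the in-use chunk whose header is at index h, merging it with a free
   following chunk and a free preceding chunk. Returns the merged header index. */
uint32_t free_and_coalesce(uint32_t h) {
    assert(!tag_is_free(heap[h]));
    uint32_t payload = payload_of(heap[h]);
    uint32_t next = h + payload + 2;               /* header of the following chunk */
    if (next < WORDS && tag_is_free(heap[next])) {
        payload += payload_of(heap[next]) + 2;     /* absorb it and its two tags */
    }
    if (h > 0 && tag_is_free(heap[h - 1])) {       /* heap[h-1] is the previous footer */
        uint32_t prev_payload = payload_of(heap[h - 1]);
        h -= prev_payload + 2;
        payload += prev_payload + 2;
    }
    set_chunk(h, payload, 1);
    return h;
}
```

The footer is what makes the backward merge possible: the word just before a chunk's header tells both whether the preceding chunk is free and how far back its header is.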
Automatic or Explicit De-allocation • In C++ and C de-allocation must be done explicitly • In Java de-allocation is done automatically (by the garbage collector) • Making it Automatic reduces burden on the programmer (and eliminates some types of errors)
Errors of Explicit De-allocation • Storage Leaks: some storage is never freed even though it is inaccessible
    Listnode *p = malloc( sizeof(Listnode) );
    .
    .   // no copy from p in this code
    .
    p = ...;   // the only pointer to the node is overwritten, so the node can never be freed
Errors of Explicit De-allocation • Dangling pointers • A pointer that points to memory that has been freed • May read garbage • May mess up free list • May corrupt other variables
Example Dangling Pointers
    Listnode *p, *q;
    p = malloc( sizeof(Listnode) );
    q = p;
    .
    .   // no assignment to q in this code
    .
    free(p);
    .
    .   // no assignment to q in this code
    .
    *q = ...;   // q is dangling: it still points to the freed storage
Detecting Dangling Pointers • Add a new field, the lock, to every allocated chunk (like the size field) • Add a new field, the key, to every pointer (in addition to the address it stores) • If lock does not match key then throw an error
Detecting Dangling Pointers • Each free chunk’s lock is set to 0 • When allocated both lock and key assigned a new value (always increasing) • When storage is freed set lock back to zero • When pointer is dereferenced, compiler generates code to first match key to lock, otherwise cause error
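The lock and key scheme can be sketched in C by hand, although in practice the check would be generated by the compiler. Everything here is illustrative: the two-chunk `pool`, the `CheckedPtr` struct, and the function names are assumptions, and the pool stands in for the heap so that a freed chunk's lock can still be read safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    unsigned lock;      /* 0 while the chunk is free */
    int payload;
} Chunk;

typedef struct {
    Chunk *addr;        /* the address, as in an ordinary pointer */
    unsigned key;       /* must match the chunk's lock on every dereference */
} CheckedPtr;

#define POOL 2
Chunk pool[POOL];       /* zero-initialized, so every chunk starts free */
unsigned next_key = 1;  /* always increasing, never 0 */

CheckedPtr checked_alloc(void) {
    for (int i = 0; i < POOL; i++) {
        if (pool[i].lock == 0) {
            pool[i].lock = next_key;
            CheckedPtr p = { &pool[i], next_key };
            next_key++;
            return p;
        }
    }
    CheckedPtr none = { NULL, 0 };
    return none;
}

void checked_free(CheckedPtr p) {
    assert(p.addr != NULL && p.addr->lock == p.key);
    p.addr->lock = 0;   /* every outstanding key for this chunk now mismatches */
}

/* The test the compiler would insert before each dereference. */
bool deref_ok(CheckedPtr p) {
    return p.addr != NULL && p.addr->lock == p.key;
}
```

Because keys only increase, a stale pointer keeps failing the check even after its chunk is reallocated: the chunk's new lock differs from the old key.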
Automatic De-allocation • Determine if a chunk of storage is no longer accessible to the program • Make de-allocation efficient, avoid long pauses in program’s execution during de-allocation • Two Approaches: • Reference Counting • Garbage Collection
Reference Counting • Include invisible field in every chunk of storage: its reference count field. • Value of field is the number of pointers that point to the chunk. • Value is initialized to 1 when chunk is allocated and updated: • When a pointer is copied, a new reference is created, so the reference count of chunk must be incremented • When a non-null pointer’s value is over-written, a reference is removed, so the reference count of the chunk (before the over-write) must be decremented. • When a reference count becomes zero, it means nothing points to the chunk, so it can be de-allocated and added to free list. If the chunk contains pointers to other chunks, then their reference counts must be decremented.
Problems with Reference Counting • Slows Program Execution • Every write into a pointer must test to see if old value is null. • Requires updates to reference counts • Cyclic Structures cannot be deallocated
    var p: Nodeptr;    { p is a pointer to a node }
    new(p);            { p points to new storage, reference count is 1 }
    p^.next := p;      { next field of node points to node, so now reference count is 2 }
    p := nil;          { p's value is over-written, so node's reference count is decremented (from 2 to 1).
                         In fact, the node is inaccessible (it points to itself, no other pointer points to it),
                         but we can't tell that just from the reference count. }
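The counting rules and the cycle problem can be tried out with a hand-written C sketch. A compiler would insert these updates automatically; here `node_new`, `assign`, `release`, and the `live_nodes` counter are hypothetical helpers written out by hand.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int refcount;          /* the "invisible" reference count field */
    struct Node *next;
} Node;

int live_nodes = 0;        /* bookkeeping for the demonstration only */

/* Allocation: the count starts at 1 for the pointer that receives the result. */
Node *node_new(void) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->refcount = 1;
    n->next = NULL;
    live_nodes++;
    return n;
}

/* Remove one reference; at zero, release what the node points to and free it. */
void release(Node *n) {
    if (n == NULL) return;             /* the null test on every pointer write */
    if (--n->refcount == 0) {
        release(n->next);
        free(n);
        live_nodes--;
    }
}

/* Counted version of "*slot = value". */
void assign(Node **slot, Node *value) {
    if (value != NULL) value->refcount++;   /* pointer copied: new reference */
    Node *old = *slot;
    *slot = value;
    release(old);                           /* old value over-written */
}
```

Running the same steps as the Pascal fragment (`p = node_new(); assign(&p->next, p); assign(&p, NULL);`) leaves `live_nodes` at 1: the node's count never drops below 1, so it is never freed.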
Garbage Collection • Wait until no storage is left, then: • Find all accessible objects • Free all other (inaccessible) objects • Several Approaches to Garbage Collection • Mark and Sweep • Stop and Copy
Mark and Sweep • Two Phases • Mark phase finds and marks all accessible objects • Sweep phase sweeps through the heap, collecting all of the garbage and putting it back on the freelist • Another “invisible” value in each chunk called mark bit • Initialized to 0 • Set to 1 if the chunk is reached during mark phase
Mark Phase
    Put all “active” pointers on a worklist (“active” means pointer is on stack or static data area)
    While worklist is not empty do:
        p = select_pointer(worklist)
        if p’s object’s mark-bit is zero:
            change it to one
            put all pointers in p’s object on worklist
Sweep Phase • Looks at every chunk of storage in heap • How? • If mark-bit for chunk is 0 add to freelist • If mark-bit for chunk is 1 change to 0 • When adding to freelist coalesce neighbor chunks • See example on board
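Both phases can be sketched in C over a toy heap. The simplifications are assumptions for illustration: objects are fixed-size slots in an array, pointers are slot indices, each object has at most two pointer fields, and an `allocated` flag replaces the free list, so no coalescing is shown. It also suggests one possible answer to "How?": with fixed-size slots (or a size field in every chunk) the sweep can step from one chunk to the next.

```c
#include <assert.h>
#include <stdbool.h>

#define NOBJ 8
#define MAX_ROOTS 4
#define NIL (-1)

typedef struct {
    bool allocated;   /* stands in for "not on the free list" */
    bool mark;        /* the mark bit */
    int ptr[2];       /* indices of other objects, or NIL */
} Obj;

Obj heap[NOBJ];

/* Mark phase: worklist algorithm from the previous slide. */
void mark_phase(const int *roots, int nroots) {
    int worklist[MAX_ROOTS + 2 * NOBJ];   /* each object pushes its fields at most once */
    int top = 0;
    assert(nroots <= MAX_ROOTS);
    for (int i = 0; i < nroots; i++) {
        if (roots[i] != NIL) worklist[top++] = roots[i];
    }
    while (top > 0) {
        int p = worklist[--top];
        if (!heap[p].mark) {
            heap[p].mark = true;
            for (int k = 0; k < 2; k++) {
                if (heap[p].ptr[k] != NIL) worklist[top++] = heap[p].ptr[k];
            }
        }
    }
}

/* Sweep phase: visit every slot; free unmarked ones, reset marked ones. */
int sweep_phase(void) {
    int freed = 0;
    for (int i = 0; i < NOBJ; i++) {
        if (!heap[i].allocated) continue;
        if (heap[i].mark) {
            heap[i].mark = false;         /* ready for the next collection */
        } else {
            heap[i].allocated = false;    /* garbage: return to the allocator */
            freed++;
        }
    }
    return freed;
}
```

Unlike reference counting, this reclaims unreachable cycles: two objects that point only at each other are never marked, so the sweep frees both.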
Stop and Copy Garbage Collection • Heap is divided into two parts: • Old space used for allocation of new chunks • New space used for garbage collection • First-free pointer points to first free space in old space • When allocation request is made for n bytes, if space is available in old space then make allocation, otherwise perform garbage collection
Stop and Copy Garbage Collection • Find all accessible objects (following same method as mark and sweep) • Copy the object from old space to new space (no mark bit) • After making all copies, reverse role of old and new space • First-free pointer points to beginning of the “new” old space
Stop and Copy Garbage Collection • When chunk is copied from old to new, ALL pointers to chunk must be updated • A forwarding pointer is left behind in old space and used to update other pointers to same object • Follow example on board
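As a stand-in for the board example, the sketch below shows copying with forwarding pointers. It makes assumptions beyond the slides: objects are fixed-size array slots, pointers are slot indices, and the worklist is replaced by a scan index that walks the new space (a Cheney-style traversal, chosen here for brevity). The names `gc_alloc`, `copy_obj`, and `collect` are hypothetical.

```c
#include <assert.h>

#define SPACE 8
#define NIL (-1)

typedef struct {
    int forward;      /* NIL, or this object's index in the new space */
    int value;
    int ptr[2];       /* indices of other objects, or NIL */
} Obj;

Obj space_a[SPACE], space_b[SPACE];
Obj *old_space = space_a;   /* allocation happens here */
Obj *new_space = space_b;   /* used only during collection */
int first_free = 0;         /* first free slot in old space */

/* Copy object p unless already copied; return its address in new space. */
int copy_obj(int p, int *new_free) {
    if (p == NIL) return NIL;
    if (old_space[p].forward == NIL) {
        int dst = (*new_free)++;
        new_space[dst] = old_space[p];
        old_space[p].forward = dst;      /* leave the forwarding pointer behind */
    }
    return old_space[p].forward;
}

void collect(int *roots, int nroots) {
    int new_free = 0;
    for (int i = 0; i < nroots; i++) roots[i] = copy_obj(roots[i], &new_free);
    for (int scan = 0; scan < new_free; scan++) {        /* update copied objects */
        for (int k = 0; k < 2; k++) {
            new_space[scan].ptr[k] = copy_obj(new_space[scan].ptr[k], &new_free);
        }
    }
    Obj *tmp = old_space; old_space = new_space; new_space = tmp;  /* reverse roles */
    first_free = new_free;
}

/* Allocation just advances first_free; collect only when old space is full. */
int gc_alloc(int value, int *roots, int nroots) {
    if (first_free == SPACE) collect(roots, nroots);
    if (first_free == SPACE) return NIL;                 /* everything is live */
    int p = first_free++;
    old_space[p] = (Obj){ NIL, value, { NIL, NIL } };
    return p;
}
```

The forwarding pointer is what keeps shared and cyclic structures intact: the second time an object is reached, `copy_obj` returns the existing copy instead of making another.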
Advantages of Stop and Copy • Allocation is cheaper (no need for searching free list, just advance first-free pointer) • No freelist, just one chunk of free memory, no need to coalesce chunks • Cheaper than mark and sweep – no need to scan entire heap • Compacting places live objects closer together (fewer cache misses, fewer page faults)
Identifying Pointers • Automatic deallocation requires the ability to find all pointers on the stack • Possible approaches: • Every word has a one-bit tag (0 for not-pointer, 1 for pointer) • Maintain separate bit-map of tags • Associate with each variable and each object a type tag.
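The separate bit-map option can be sketched as follows. The 16-word `stack_words` array and the function names are hypothetical; a real implementation would have the compiler or runtime maintain the tags on every store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STACK_WORDS 16

uintptr_t stack_words[STACK_WORDS];
uint8_t tag_bits[(STACK_WORDS + 7) / 8];   /* bit i is 1 when word i holds a pointer */

/* Every store updates the tag along with the word. */
void store_word(int i, uintptr_t value, bool is_pointer) {
    assert(i >= 0 && i < STACK_WORDS);
    stack_words[i] = value;
    if (is_pointer) {
        tag_bits[i / 8] |= (uint8_t)(1u << (i % 8));
    } else {
        tag_bits[i / 8] &= (uint8_t)~(1u << (i % 8));
    }
}

bool word_is_pointer(int i) {
    return ((tag_bits[i / 8] >> (i % 8)) & 1u) != 0;
}

/* What a collector would do to build its initial worklist:
   write the indices of all pointer words into out, return how many. */
int find_pointers(int *out) {
    int n = 0;
    for (int i = 0; i < STACK_WORDS; i++) {
        if (word_is_pointer(i)) out[n++] = i;
    }
    return n;
}
```

Keeping the tags in a separate bit-map leaves every data word at full width, at the cost of one extra bit of storage per word and an extra update on each store.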
Summary • Two methods of Storage De-allocation • Programmer controlled • Automatic • Programmer controlled errors include: • Storage leaks • Corrupted memory via dangling pointers • Automatic De-allocation • Reference counting • High space and time overhead • Cannot free cyclic structures • Cost is distributed over the execution of program • Garbage collection • Mark and Sweep • Stop and Copy