230 likes | 258 Views
Explore the history, challenges, and solutions of dynamic storage allocation in programming. Learn about heap management, fragmentation, allocators, testing methods, and strategies for optimal data storage.
E N D
Dynamic Storage Allocation Bradley Herrup CS 297 Security and Programming Languages
Storing Information • Information needs to be stored somewhere • Two Areas of Storage • The Stack – Program Storage • Organized and Structured storage mechanism with a relative standardized layout • The Heap – Data Storage • Random allocation of space for the storage of Data maintained by the Operating System • Can exist in many different parts of memory it is a virtual construct that exists simultaneously in Cache and RAM
A Little History • 1960-1970 • Interested in the mapping of logical systems to physical memory • Developing a area to maintain data and separate the the OS from other programs • 1970-1980 • Looking at the ability to refine Heap storage algorithms • Ability to trace the heap using programs allows scientists to create heuristics and statistics to examine the heap • 1980-1990 • Redefinition of Heap structure to incorporate more higher-level data structures and Objects • Focus moves away from fine-tuning the Heap and the Stack to increasing OSes using different methodologies, namely increasing the density of hardware such that if the heap runs out, the OS can just request that more memory be allocated from RAM
New vs. Old Views • Old Views • Worry about present heap management • Considered to be a solved or an unsolvable problem • New Views • Look at the effect of the Heap over long periods of time • Computers are on significantly longer than 20 years ago and are running much larger programs • Determine scenarios in which the heap acts in certain ways to best be able to handle the heap in those scenarios • Don’t solve one BIG problem, solve a bunch of little ones
The Allocator • The process repsonsible for maintaining the Heap • Keeps track of: • Used Memory • Free Memory • Passes available blocks of memory to programs on the Stack where actual information can be stored • What the Allocator can NOT do: • Compact memory • Reallocate memory without a programs direct consent • Move data around to free up necessary space
Fragmentation • Allocator creates and deletes information wherever in the heap wherever enough free space exists causing Fragmentation • Internal Fragmentation • Occurs when the available block size is larger than the data needing to be assigned to that area • External Fragmentation • Occurs when free memory exists but cannot be used because object is larger than anyone particular block
Faults and Fixes • Majority of programs testing heap allocator are simulations that do not emulate real life • Results in false positives • Do not test long term processors, i.e. many OS processes • Can lead to a lot of fragmentation over long period of time • Allocators create/destroy at random • In Real Life Programs creation and destruction of data is not necessarily random • Most programs either: • Create Lots of Data Structures at the beginning of a program and manipulate said structures • Continuously create new structures and then destroy them but in groups
Methods of Testing the Heap • Tracing the Heap • Running Benchmark and simulations to create probabilities and statistics to fine-tune heap allocation • “A single death is a tragedy. A million deaths is a statistic” – Joseph Stalin • Relevant because it only goes to show that a statistic does not tell us what exactly is causing the problem • Theses Statistics are on short term programs cannot account for long term heap stability
Examination of Real Life Programs led to three distinct families of heap usage Enables developers to create general allocators that will be able to handle different types of programs in different ways It is more important to cover the manipulation of the heap at the peaks then in the troughs Example of Plateau Heap Allocation Perl Script Example of Ramp Heap Allocation Grobner Software Example of Peak Heap Allocation GCC Compiler Ramps, Peaks, and Plateaus
Strategies and Policies • Optimize allocations to minimize wait cycles • Authors believe that over a billion cycles are sacrificed and squandered hourly due to lack of efficiency in Allocation • Placement choice is key to determining the optimal location algorithm for data in the heap • Have to be worried about memory overhead • Also time overhead • “Put blocks where they won’t cause fragmentation later”
Splitting and Coalescing • Two main ways of manipulating space in the heap • Splitting: if a memory block is too big, split it to fit necessary data • Coalescing: if a memory block becomes empty and either of its neighbors are empty join both blocks together to provide larger block
Profiling of Allocators • Minimize Time overhead • Minimize Holes and Fragments • Exploitation of Common Patterns • Use of Splitting and Coalescence • Fits: when a block of a size is reused are block of the same size used preferentially • Splitting Thresholds
Taxonomy of Allocating Algorithms • Algorithms are classified by the mechanism they use for keeping track of areas: • Sequential Fits • Buddy Systems • Indexed Fits • Bitmapped Fits
Low-Level Tricks • Header Fields and Alignment • Used to store block related information • Typically the Size of block of memory • Relationship with neighboring blocks • In Use Bit • Typically add 10% overhead to memory usage • Boundary tags (aka Footer) • In Use Bit • Neighboring cells information • 10% overhead as well • Lookup Tables • Instead of indexing blocks by address, index by pages of blocks that are the same size
Sequential Fits • Best Fit • Find the smallest free block large enough to satisfy a request • An exhaustive search O(Mn) • If there are equal blocks which to use? • Good Memory Usage in the end • First Fit • Find the first block large enough to satisfy mem allocation • Question as to what order to look in memory • Next Fit • Optimization of First Fit: using roving pointer to keep track of last block used to cut down on execution time
Segregated Free Lists • Simple Segregated Storage • No splitting of free blocks is done • All heap broken down into size blocks of Power-two • All blocks of same size indexed by size and not by address • No Header overhead(all blocks same size) • Segregated Fits • Array of Free Lists • If block does not exist take blocks from other list and split or coalesce empty blocks to fit • Three Types: • Exact Lists • Strict Size with Rounding • Size Classes with Range Lists
Buddy System • Variant of segregated lists • Split entire heap into two • Split each half into two(does not need to be power of two) • And so on…if two blocks need to coalesce to make room requires that binary pair both be empty • Log(n) time • Can also be done using Self-Balancing Binary tree or Fibonacci sequence tree
Indexed Fits • More abstract algorithm that fits more to a policy • Policy being a rule set like all blocks at 8 bytes • One example uses a cartesian tree sorted by block size and address • Has log(n) search time
Bitmapped Fits • Bitmap used to record which parts of the heap are in us • Bitmap being a simple vector of 1-bit flags with one bit corresponding to each word of the heap area • Not used conventionally • On a 3% overhead vs 20% • Search times are linear but using heuristics can get it down to approximatly O(log N)
Heap Allocators for Multiprocessor Architectures • Things to consider: • Is there a Shared Cache? • Need a global heap that is accessible to all of the processors or an OS that can handle: • Contention • False Sharing • Space
Applying and Using this for Garbage Collection • The efficiency of the garbage collector is directly related to the efficiency of the allocator algorithm • Additionally if some of the low-level tricks are used it enables the garbage collector to even more effectively decipher what is used and what is not • Can also participate in maintaining the heap by coalescing neighboring blocks together to free up space
Conclusions • Heap organization is still an area worth researching • Because there is no defined organization to the heap it is not nearly as easy enough to exploit the heap as it is the stack • By optimizing the heap’s efficiency it is possible to speed up the average computer without having to increase the size of the hardware
Works Cited • Emery Berger. Scalable Memory Management. University of Massachusetts, Dept of Computer Science, 2002. • Wilson, Paul, Johnstone, Mark, Neely, Michael, and Boles, David. Dynamic Storage Allocation: A Survey and Critical Review. University of Texas at Austin, Department of Computer Science, Austin, TX. 1995. • Wilson, Paul. Uniprocessor Garbage Collection Techniques. ACM.