CS 321 Spring 2013 > Lecture Notes for Wednesday, February 27, 2013
Dynamic allocation refers to memory allocation done by a program while it is running: a C++ new or a C malloc.
The low-level memory-abstraction hardware is generally not directly involved. Rather, each process has a large area in its address space, called its heap, and this is used for dynamic allocation.
Note: This usage of the word “heap” is unrelated to the data structure known as a (Binary) Heap.
With modern virtual memory, we could use the VM functionality directly to do dynamic allocation, but this would be inefficient; it would also lack the fine-grained control we prefer.
Since the hardware-level memory handling is not directly involved in dynamic allocation, there is no protection of one portion of a process’s memory against access by a different portion of the program. For example, the C++ “private” access specifier does not actually prohibit access to the memory holding certain class members; it merely makes the identifiers (names) for those members unusable at compile time by code outside the class.
When writing code that supports dynamic allocation, there are two main issues: which hole to use when an allocation is requested, and what data structures to use to keep track of the heap.
At any point in the execution of a program, its heap consists of allocated blocks that may be scattered throughout the heap. The spaces between the allocated blocks, which are available for new allocations, are free blocks, or holes.
When an allocation is requested, we need to determine which hole to use. We will say a hole is adequate if it is large enough to hold a requested allocation. A number of strategies have been devised for choosing a hole; for example, the First Fit strategy uses the first adequate hole found. Note that there may be no adequate hole. In this case, we must either enlarge the heap or flag an error.
Now we discuss the kinds of data structures used to store a process’s heap. Again, our usage of “heap” here has no connection to the data structure called a (Binary) Heap, which is used in the Heapsort algorithm, etc.
There are three requirements that a Heap data structure must meet. First, given a pointer to an allocated block, we must be able to determine the block’s size. (Remember that C++ delete and C free each have just one parameter: a pointer.) Second, when an allocation is requested, we must be able to find an adequate hole. Third, we must be able to merge adjacent free blocks, so that small holes can combine into larger ones. And of course we want to minimize use of time and space.
The first requirement is easily met. We store the size of an allocated block just before the start of the block. This means that, when we do an allocation, we grab just a few bytes more than we need, and return a pointer a few bytes past the beginning of the memory we grabbed.
The second requirement can be met in one of two ways: a bitmap or a linked list.
First, we can use a bitmap. We divide memory into chunks of some fixed size, and have essentially an array of bits, one for each chunk, indicating whether it is allocated or not. Say the bit is 1 for allocated and 0 for free. Then we can find adequate holes by looking for a certain number of contiguous zeroes. Note that the third requirement is automatically met using this structure.
A bitmap works well for small memory sizes, but it does not scale well; modern systems will generally not use it.
Second, we can make a linked list out of the free blocks. Each free block can have a pointer to the next free block in the list. Since the block is free, the application is not using it; we can simply store the pointer at the start of the block. This allows for fast insertion of free blocks at the beginning of the list, fast removal of a free block when we allocate, and fast traversal of the list, looking for an adequate hole. Most modern systems will use this method.
If we place free blocks in a linked list, then we need additional data to help us merge adjacent free blocks. We need to be able to find the next and previous block and determine whether these are free. We can already find the next block, since the size of our block is stored just before it (see above). An indication of whether a block is free needs to be stored somewhere—perhaps just before the block, in the small area that also contains the block’s size. And then we need some indication of where the previous block starts: either its size or a pointer to its start is sufficient. This can be stored in that same small area, or inside our block, since it is free.
We looked at the data structures used to handle the heap on my computer. We found that a block’s size is stored just before the block, free blocks are made into a linked list, and there is sufficient data to allow for merging of adjacent free blocks.
We also found that the First Fit strategy is generally used, and adjacent free blocks are merged, although some aspects of this are avoided for small allocations.
ggchappell@alaska.edu