Memory Allocation

CS 321 Lecture, Dr. Lawlor, 2006/02/24

So now we know how to ask the OS for raw pages of memory.  How do you break those pages up into usable chunks?  You use a smaller-than-page-granularity allocator like "new" (the C++ allocator) or "malloc" (the C allocator).

Summary

When you ask for n bytes, the memory allocator has to use more than n bytes of memory because:
The "housekeeping" data associated with an allocated piece of memory is invariably stored right before the start of the memory, like this:
stuff
size: 8+n+1
<- User returned pointer here.                            -- n bytes of user data go here --

So on almost every 32-bit machine (Linux, Windows, MacOS), this code prints the size of an allocated block of memory:
int getBlockSize(void *block) {
int *p=(int *)block;
int *sz=p-1; /* points to the "size" field */
return (*sz - 8)&~1; /* subtract off the 8-byte header, and mask off the "present" flag */
}
int foo(void) {
int *buf=new int[1234];
return getBlockSize(buf);
}

On Windows, the "stuff" field above seems to store the size of the *next* block.  E.g., after allocating blocks of size 0xa0, 0xb0, 0xc0, ....
                                  stuff     size
p[0]- 407D1B50: 73737373 000000b1 000000a1 p->73737373 73737373 73737373
p[1]- 407D1AA0: 73737373 000000c1 000000b1 p->73737373 73737373 73737373
p[2]- 407D19E0: 73737373 000000d1 000000c1 p->73737373 73737373 73737373
p[3]- 407D1910: 73737373 000000e1 000000d1 p->73737373 73737373 73737373
p[4]- 407D1830: 73737373 000000f1 000000e1 p->73737373 73737373 73737373
p[5]- 407D1740: 00000000 00000730 000000f1 p->73737373 73737373 73737373

On Linux, the "stuff" field above seems to store the size of the *previous* block, but only if the previous block is empty.  The same picture above is now:
                                   stuff     size
p[0]- 0x804a008: 00000000 00000000 000000a1 p->73737373 73737373 73737373
p[1]- 0x804a0a8: 73737373 00000000 000000b1 p->73737373 73737373 73737373
p[2]- 0x804a158: 73737373 00000000 000000c1 p->73737373 73737373 73737373
p[3]- 0x804a218: 73737373 00000000 000000d1 p->73737373 73737373 73737373
p[4]- 0x804a2e8: 73737373 00000000 000000e1 p->73737373 73737373 73737373
p[5]- 0x804a3c8: 73737373 00000000 000000f1 p->73737373 73737373 73737373

Gory Linux Details

The Linux glibc malloc (see glibc-2.3.4/malloc/malloc.c line 1700) has this big comment:

    Chunks of memory are maintained using a `boundary tag' method as
    described in e.g., Knuth or Standish.  (See the paper by Paul
    Wilson ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for a
    survey of such techniques.)  Sizes of free chunks are stored both
    in the front of each chunk and at the end.  This makes
    consolidating fragmented chunks into bigger chunks very fast.  The
    size fields also hold bits representing whether chunks are free or
    in use.

    An allocated chunk of memory looks like this:
    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of previous chunk, if NOT allocated |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of chunk, in bytes |M|P|
mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| User data starts here... .
. .
. (malloc_usable_space() bytes) .
. |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of chunk |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    Where "chunk" is the front of the chunk for the purpose of most of
    the malloc code, but "mem" is the pointer that is returned to the
    user.  "Nextchunk" is the beginning of the next contiguous chunk.

    Chunks always begin on even word boundries, so the mem portion
    (which is returned to the user) is also on an even word boundary, and
    thus at least double-word aligned.

The 'M' bit indicates that this block is allocated directly using mmap().
The 'P' bit indicates that the previous chunk is: 1 present and allocated; or 0 free.

    Free chunks are stored in circular doubly-linked lists, and look like this:
    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of previous chunk |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
`head:' | Size of chunk, in bytes |P|
mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Forward pointer to next chunk in free list |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Back pointer to previous chunk in free list |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Unused space (may be 0 bytes long) .
. .
. |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
`foot:' | Size of chunk, in bytes |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    The P (PREV_INUSE) bit, stored in the unused low-order bit of the
    chunk size (which is always a multiple of two words), is an in-use
    bit for the *previous* chunk.  If that bit is *clear*, then the
    word before the current chunk size contains the previous chunk
    size, and can be used to find the front of the previous chunk.
    The very first chunk allocated always has this bit set,
    preventing access to non-existent (or non-owned) memory. If
    prev_inuse is set for any given chunk, then you CANNOT determine
    the size of the previous chunk, and might even get a memory
    addressing fault when trying to do so.