Malloc Performance Details
CS 301:
Assembly Language Programming Lecture, Dr. Lawlor
Dynamic allocation via new or malloc is a performance-limiting
factor in many languages, including C++ and JavaScript. What's
inside malloc?
We can run some experiments in C++ to find out. This program
dumps malloc's magic length field, which starts right *before* the
pointer it returns to you, at array index [-1].
/* Dump malloc's special index [-1] data for different byte sizes: */
for (int size=0;size<50;size++) {
char *buf=(char *)malloc(size); // <- buffer of size bytes
long *magic=(long *)buf; // read like long
std::cout<<"Buffer of "<<size<<" bytes: "<<magic[-1]<<" at pointer "<<(long)magic<<"\n";
}
(Try
this in NetRun now!)
On my 64-bit Linux machine, this returns:
Buffer of 0 bytes: 33 at pointer 6348832 ; pointer is moving by 32 bytes each time
Buffer of 1 bytes: 33 at pointer 6348864
Buffer of 2 bytes: 33 at pointer 6348896
Buffer of 3 bytes: 33 at pointer 6348928
Buffer of 4 bytes: 33 at pointer 6348960
Buffer of 5 bytes: 33 at pointer 6348992
Buffer of 6 bytes: 33 at pointer 6349024
Buffer of 7 bytes: 33 at pointer 6349056
Buffer of 8 bytes: 33 at pointer 6349088
Buffer of 9 bytes: 33 at pointer 6349120
Buffer of 10 bytes: 33 at pointer 6349152
Buffer of 11 bytes: 33 at pointer 6349184
Buffer of 12 bytes: 33 at pointer 6349216
Buffer of 13 bytes: 33 at pointer 6349248
Buffer of 14 bytes: 33 at pointer 6349280
Buffer of 15 bytes: 33 at pointer 6349312
Buffer of 16 bytes: 33 at pointer 6349344
Buffer of 17 bytes: 33 at pointer 6349376
Buffer of 18 bytes: 33 at pointer 6349408
Buffer of 19 bytes: 33 at pointer 6349440
Buffer of 20 bytes: 33 at pointer 6349472
Buffer of 21 bytes: 33 at pointer 6349504
Buffer of 22 bytes: 33 at pointer 6349536
Buffer of 23 bytes: 33 at pointer 6349568
Buffer of 24 bytes: 33 at pointer 6349600
Buffer of 25 bytes: 49 at pointer 6349632 ; 25 bytes for you, 8 bytes for malloc > 32
Buffer of 26 bytes: 49 at pointer 6349680
Buffer of 27 bytes: 49 at pointer 6349728
Buffer of 28 bytes: 49 at pointer 6349776
Buffer of 29 bytes: 49 at pointer 6349824
Buffer of 30 bytes: 49 at pointer 6349872
Buffer of 31 bytes: 49 at pointer 6349920
Buffer of 32 bytes: 49 at pointer 6349968
Buffer of 33 bytes: 49 at pointer 6350016
Buffer of 34 bytes: 49 at pointer 6350064
Buffer of 35 bytes: 49 at pointer 6350112
Buffer of 36 bytes: 49 at pointer 6350160
Buffer of 37 bytes: 49 at pointer 6350208
Buffer of 38 bytes: 49 at pointer 6350256
Buffer of 39 bytes: 49 at pointer 6350304
Buffer of 40 bytes: 49 at pointer 6350352
Buffer of 41 bytes: 65 at pointer 6350400 ; length changes by multiple of 16 bytes
Buffer of 42 bytes: 65 at pointer 6350464
Buffer of 43 bytes: 65 at pointer 6350528
Buffer of 44 bytes: 65 at pointer 6350592
Buffer of 45 bytes: 65 at pointer 6350656
Buffer of 46 bytes: 65 at pointer 6350720
Buffer of 47 bytes: 65 at pointer 6350784
Buffer of 48 bytes: 65 at pointer 6350848
Buffer of 49 bytes: 65 at pointer 6350912
We can immediately see:
- Malloc rounds up your allocation size to at least 32
bytes. This leaves space for 24 bytes of your data, plus 8
bytes for malloc's magic to keep track of the buffer length.
- Longer allocations are rounded to the nearest multiple of 16
bytes.
- The magic length field's low bit always seems to be set.
If we free the previous buffer, this low bit is cleared.
This is only one possible implementation--I've seen several
different implementations on different machines. For example,
OS X at one point always rounded up the buffer size to a power of
two.
(See more details
of malloc internals, although this touches on lots of things
we haven't talked about.)