Malloc Performance Details

CS 301: Assembly Language Programming Lecture, Dr. Lawlor

Dynamic allocation via new or malloc is a performance-limiting factor in many languages, including C++ and JavaScript.  What's inside malloc?

We can run some experiments in C++ to find out.  This program dumps malloc's magic length field, which starts right *before* the pointer it returns to you, at array index [-1].
/* Dump malloc's special index [-1] data for different byte sizes: */
for (int size=0;size<50;size++) {
	char *buf=(char *)malloc(size); // <- buffer of size bytes
	long *magic=(long *)buf; // read like long
	std::cout<<"Buffer of "<<size<<" bytes: "<<magic[-1]<<" at pointer "<<(long)magic<<"\n";
}

(Try this in NetRun now!)

On my 64-bit Linux machine, this returns:
Buffer of 0 bytes: 33 at pointer 6348832 ; pointer is moving by 32 bytes each time
Buffer of 1 bytes: 33 at pointer 6348864
Buffer of 2 bytes: 33 at pointer 6348896
Buffer of 3 bytes: 33 at pointer 6348928
Buffer of 4 bytes: 33 at pointer 6348960
Buffer of 5 bytes: 33 at pointer 6348992
Buffer of 6 bytes: 33 at pointer 6349024
Buffer of 7 bytes: 33 at pointer 6349056
Buffer of 8 bytes: 33 at pointer 6349088
Buffer of 9 bytes: 33 at pointer 6349120
Buffer of 10 bytes: 33 at pointer 6349152
Buffer of 11 bytes: 33 at pointer 6349184
Buffer of 12 bytes: 33 at pointer 6349216
Buffer of 13 bytes: 33 at pointer 6349248
Buffer of 14 bytes: 33 at pointer 6349280
Buffer of 15 bytes: 33 at pointer 6349312
Buffer of 16 bytes: 33 at pointer 6349344
Buffer of 17 bytes: 33 at pointer 6349376
Buffer of 18 bytes: 33 at pointer 6349408
Buffer of 19 bytes: 33 at pointer 6349440
Buffer of 20 bytes: 33 at pointer 6349472
Buffer of 21 bytes: 33 at pointer 6349504
Buffer of 22 bytes: 33 at pointer 6349536
Buffer of 23 bytes: 33 at pointer 6349568
Buffer of 24 bytes: 33 at pointer 6349600
Buffer of 25 bytes: 49 at pointer 6349632 ; 25 bytes for you, 8 bytes for malloc > 32
Buffer of 26 bytes: 49 at pointer 6349680
Buffer of 27 bytes: 49 at pointer 6349728
Buffer of 28 bytes: 49 at pointer 6349776
Buffer of 29 bytes: 49 at pointer 6349824
Buffer of 30 bytes: 49 at pointer 6349872
Buffer of 31 bytes: 49 at pointer 6349920
Buffer of 32 bytes: 49 at pointer 6349968
Buffer of 33 bytes: 49 at pointer 6350016
Buffer of 34 bytes: 49 at pointer 6350064
Buffer of 35 bytes: 49 at pointer 6350112
Buffer of 36 bytes: 49 at pointer 6350160
Buffer of 37 bytes: 49 at pointer 6350208
Buffer of 38 bytes: 49 at pointer 6350256
Buffer of 39 bytes: 49 at pointer 6350304
Buffer of 40 bytes: 49 at pointer 6350352
Buffer of 41 bytes: 65 at pointer 6350400 ; length changes by multiple of 16 bytes
Buffer of 42 bytes: 65 at pointer 6350464
Buffer of 43 bytes: 65 at pointer 6350528
Buffer of 44 bytes: 65 at pointer 6350592
Buffer of 45 bytes: 65 at pointer 6350656
Buffer of 46 bytes: 65 at pointer 6350720
Buffer of 47 bytes: 65 at pointer 6350784
Buffer of 48 bytes: 65 at pointer 6350848
Buffer of 49 bytes: 65 at pointer 6350912
We can immediately see:
This is only one possible implementation--I've seen several different implementations on different machines.  For example, OS X at one point always rounded up the buffer size to a power of two.

(See more details of malloc internals, although this touches on lots of things we haven't talked about.)