Memory Allocation and Array Access

Allocating Space for an Array

There are many ways to allocate space for an array in C or C++:

int *arr=new int[n]; /* C++ "new" operator. Doesn't exist in plain old C. */
int *arr=(int *)malloc(n*sizeof(int)); /* C/C++ "malloc" routine-- allocates given number of bytes. */
int arr[n]; /* C/C++ local variable--actually allocates space on the stack. Standard C++ requires "n" to be a compile-time constant; C99 and gcc/g++ also accept a variable. */

There are also many ways to do this in assembly. The most common way is to call malloc and free:

push 4*10; Number of bytes to allocate
extern malloc
call malloc
add esp,4; Undo "push"
; Malloc's return value, a pointer, comes back in eax.
... ; use eax as array here
; Eventually, you'll need to call "free" to release that memory:
push eax
extern free
call free
add esp,4; Undo "push"

You can also allocate space on the stack, by just moving the stack pointer down to make room:

sub esp,4*10; Make space on the stack
mov eax, esp; Point eax to the start of that space
... ; Use eax as array here
add esp,4*10; Restore stack

You can allocate space in the initialized (".data") or uninitialized (".bss") section of the executable file:

mov eax,thingy; Point eax to thingy's data
...

section .bss; <- Uninitialized data area
thingy:
	resb 4*10; Reserve this many bytes of space here

Any of these will work, and the pointer you end up with points to just plain old memory. But there are tradeoffs in deciding which memory allocation scheme to use. Malloc is slow and painful to use (you must remember to call free), but offers the most possible memory (gigabytes). Stack allocation is very fast, but the space must be given back at the end of your subroutine, and the stack is limited to a few megabytes on most machines. Initialized or uninitialized data has a fixed size and stays allocated for the whole run of the program; this is a waste of space unless you're storing small objects of a few kilobytes.
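To make the tradeoffs concrete, here is a minimal C sketch that fills the same 10-int array in each of the three kinds of memory. The function names (fill_and_sum, sum_on_heap, sum_on_stack, sum_in_static) are made up for this illustration and are not from any library.

```c
#include <stdlib.h>

#define N 10

/* Static storage: reserved in the executable's uninitialized data area,
   like "resb" in section .bss. Fixed size, allocated for the whole run. */
static int static_arr[N];

/* Hypothetical helper: store 0,1,2,... into arr[0..n-1] and return the sum. */
static int fill_and_sum(int *arr, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        arr[i] = i;
        sum += arr[i];
    }
    return sum;
}

/* Heap: malloc can fail, so check the pointer; every malloc needs a free. */
int sum_on_heap(int n) {
    int *arr = (int *)malloc((size_t)n * sizeof(int));
    if (arr == NULL) return -1; /* allocation failed */
    int sum = fill_and_sum(arr, n);
    free(arr);
    return sum;
}

/* Stack: the space is given back automatically when the function returns. */
int sum_on_stack(void) {
    int arr[N];
    return fill_and_sum(arr, N);
}

/* Static data: nothing to allocate or free, but the size is fixed at build time. */
int sum_in_static(void) {
    return fill_and_sum(static_arr, N);
}
```

All three functions hand fill_and_sum an ordinary "int *"; only where the memory comes from, and who has to give it back, differs.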

Accessing an Array Element

Array element i is stored starting at "i*sizeof(element)" bytes away from the start of the array. Hence in C/C++, if you really wanted to write nasty code, you could actually access element i of an int array (that is, do "x=arr[i]") like this:

    unsigned char *c=(unsigned char *)arr; // Make char pointer out of int array pointer
    x = *(int *)(c + i*sizeof(int)); // Move down by i*sizeof(int) bytes, so we're pointing at arr[i], and load up that value as an int.
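As a quick check of that byte arithmetic, the sketch below wraps the same cast in a function and also measures how far element i sits from the start of the array. The names load_by_bytes and byte_offset are hypothetical, chosen just for this example.

```c
#include <stddef.h>

/* Load arr[i] by doing the address arithmetic by hand, in bytes. */
int load_by_bytes(const int *arr, size_t i) {
    const unsigned char *c = (const unsigned char *)arr; /* byte pointer to the array start */
    return *(const int *)(c + i * sizeof(int));          /* step i*sizeof(int) bytes, load an int */
}

/* Distance in bytes from the start of the array to element i. */
size_t byte_offset(const int *arr, size_t i) {
    return (size_t)((const unsigned char *)&arr[i] - (const unsigned char *)arr);
}
```

For any int array, load_by_bytes(arr, i) should return the same value as arr[i], and byte_offset(arr, i) should equal i*sizeof(int).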

In assembly, if edi points to the start of an array, you can load up arr[2] into esi like this:
    mov esi, [edi+2*4]
since 4 is "sizeof(int)". You can load up the ecx'th element of the array like this:
    mov esi, [edi + ecx*4]

Remember,
    mov esi, edi
copies edi itself into esi, like "x=arr"; while
    mov esi, [edi]
copies what edi points to into esi, like "x=*arr;".
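The same distinctions can be written as three tiny C functions, one per addressing form. This is only a sketch for comparison; the function names are invented here, and a compiler is free to generate different instructions than the ones named in the comments.

```c
#include <stddef.h>

/* Like "mov esi, edi": copy the pointer itself (x=arr). */
const int *copy_pointer(const int *arr) {
    return arr;
}

/* Like "mov esi, [edi]": load what the pointer points to (x=*arr). */
int load_first(const int *arr) {
    return *arr;
}

/* Like "mov esi, [edi + ecx*4]" when sizeof(int) is 4: load element i (x=arr[i]). */
int load_indexed(const int *arr, size_t i) {
    return arr[i];
}
```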

Sizes of Elements

32-bit x86 (little endian)    32-bit PowerPC (big endian)    64-bit x86                 Java / C#
sizeof(char)==1               sizeof(char)==1                sizeof(char)==1            sizeof(byte)==1
sizeof(short)==2              sizeof(short)==2               sizeof(short)==2           sizeof(short)==2
sizeof(int)==4                sizeof(int)==4                 sizeof(int)==4             sizeof(int)==4
sizeof(long)==4               sizeof(long)==4                sizeof(long)==8            sizeof(long)==8
sizeof(long long)==8          sizeof(long long)==8           sizeof(long long)==8       /* no need for long long */
sizeof(void *)==4             sizeof(void *)==4              sizeof(void *)==8          /* no pointers in Java */
sizeof(float)==4              sizeof(float)==4               sizeof(float)==4           sizeof(float)==4
sizeof(double)==8             sizeof(double)==8              sizeof(double)==8          sizeof(double)==8
sizeof(long double)==12       sizeof(long double)==8         sizeof(long double)==16    /* no long double in Java */
                                                                                        sizeof(char)==2

Note the deciding difference between "32 bit machines" and "64 bit machines" is the size of a pointer: 4 or 8 bytes. "int" is 4 bytes on nearly all modern machines. "long" is 8 bytes in Java and on most 64-bit machines (64-bit Windows is the exception, where "long" stays at 4 bytes), and just 4 bytes on 32-bit machines.
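The sizes on your own machine may not match any one column exactly, so it is worth printing them. The sketch below does that and reports the pointer width in bits; print_sizes and pointer_bits are hypothetical names used only for this example.

```c
#include <limits.h>
#include <stdio.h>

/* Pointer width in bits: 32 on a "32 bit machine", 64 on a "64 bit machine". */
int pointer_bits(void) {
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Print the size in bytes of each basic type, in the table's format. */
void print_sizes(void) {
    printf("sizeof(char)==%zu\n", sizeof(char));
    printf("sizeof(short)==%zu\n", sizeof(short));
    printf("sizeof(int)==%zu\n", sizeof(int));
    printf("sizeof(long)==%zu\n", sizeof(long));
    printf("sizeof(long long)==%zu\n", sizeof(long long));
    printf("sizeof(void *)==%zu\n", sizeof(void *));
    printf("sizeof(float)==%zu\n", sizeof(float));
    printf("sizeof(double)==%zu\n", sizeof(double));
    printf("sizeof(long double)==%zu\n", sizeof(long double));
}
```

Only sizeof(char)==1 is guaranteed by the C standard; every other line depends on the compiler and platform.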