The basic idea with machine code is to use binary bytes to represent a computation. Different machines use different bytes, but Intel x86 machines use "0xc3" to represent the "ret" instruction, and "0xb8" to represent the "load a 32-bit constant into eax" instruction.
   0:  b8 05 00 00 00    mov eax,0x5
   5:  c3                ret

"mov" is an instruction, encoded with the operation code or "opcode" 0xb8. Since mov takes an argument, the next 4 bytes are the constant to move into eax, stored least-significant byte first.
(Try this in NetRun now!)
The destination register is folded into the opcode byte, which is 0xb8 plus the register number:

   0:  b8 05 00 00 00    mov eax,0x5
   5:  b9 05 00 00 00    mov ecx,0x5
   a:  ba 05 00 00 00    mov edx,0x5

x86 register numbering is a bit bizarre:
Number | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
Int Register | eax | ecx | edx | ebx | esp | ebp | esi | edi |
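
The "opcode plus register number" pattern and the byte order of the constant can be sketched in C++. This is only an illustration: the enum `Reg32` and the function `encode_mov_imm32` are hypothetical names, not part of any real assembler.

```cpp
#include <array>
#include <cstdint>

// Register numbers as used in x86 machine code (same order as the table above).
enum Reg32 : std::uint8_t { EAX = 0, ECX = 1, EDX = 2, EBX = 3, ESP = 4, EBP = 5, ESI = 6, EDI = 7 };

// Hypothetical helper: encode "mov reg, constant" as 5 bytes.
// Byte 0 is the opcode 0xB8 plus the register number;
// bytes 1..4 are the 32-bit constant, least-significant byte first.
std::array<std::uint8_t, 5> encode_mov_imm32(Reg32 reg, std::uint32_t value) {
    std::array<std::uint8_t, 5> out{};
    out[0] = static_cast<std::uint8_t>(0xB8 + reg);
    for (int i = 0; i < 4; ++i) {
        out[i + 1] = static_cast<std::uint8_t>(value >> (8 * i));
    }
    return out;
}
```

Calling `encode_mov_imm32(ECX, 5)` should give the bytes b9 05 00 00 00, matching the second line of the listing above.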
In assembly, a function name is just a label for the address of the function's first instruction:

    call lilfunc
    ret

lilfunc:
    xor eax,eax
    ret

This means you
can just as easily copy the address of a function to a register as
any other value:
    mov rcx,lilfunc ; rcx = address of lilfunc
    call rcx
    ret

In C++, the same concept exists, but you need the special ugly "function pointer" syntax.
int bar(void) { return 3; }

int foo(void) {
    typedef int (*fnptr)(void); // fnptr: returns int, parameters void
    fnptr f=(fnptr)bar; // f points to bar
    return f(); // calls bar
}

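
Since a function pointer is an ordinary value, it can also be passed to another function, much like the address in rcx above. A small sketch, assuming a C++11 compiler for the `using` alias; `baz` and `call_it` are hypothetical names added for illustration.

```cpp
int bar(void) { return 3; }
int baz(void) { return 7; } // hypothetical second function, for comparison

// Same type as the typedef version: pointer to a function taking no
// parameters and returning int.
using fnptr = int (*)(void);

// Hypothetical helper: call whatever function the pointer refers to.
int call_it(fnptr f) {
    return f();
}
```

Here `call_it(bar)` calls bar and `call_it(baz)` calls baz; the only thing that changes between the two calls is the address passed in.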
Because a
pointer to a function is just a pointer to bytes, you can manually
declare bytes of machine code, and then run them:
const unsigned char bytes[]={
    0x33,0xC0, // xor eax,eax
    0xc3 // ret
};

int foo(void) {
    typedef int (*fnptr)(void); // fnptr: returns int, parameters void
    fnptr f=(fnptr)bytes; // f points to bytes
    return f(); // run the bytes
}

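
On many current systems, the memory holding a const array is not marked executable, so calling into it directly may crash. One common workaround is to copy the bytes into a page that you explicitly make executable. The following is a minimal sketch, assuming a POSIX system (for `mmap` and `mprotect`) and an x86 CPU; `make_executable` is a hypothetical helper name, and the mapped page is deliberately never freed to keep the example short.

```cpp
#include <cstddef>
#include <cstring>
#include <sys/mman.h>

typedef int (*fnptr)(void); // fnptr: returns int, parameters void

const unsigned char bytes[] = {
    0x33, 0xC0, // xor eax,eax
    0xc3        // ret
};

// Hypothetical helper: copy machine code into a freshly mapped page,
// then switch that page from writable to executable.
// Returns nullptr if the mapping or the permission change fails.
fnptr make_executable(const unsigned char *code, std::size_t len) {
    void *mem = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return nullptr;
    std::memcpy(mem, code, len);
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return nullptr;
    }
    return reinterpret_cast<fnptr>(mem);
}

int foo(void) {
    fnptr f = make_executable(bytes, sizeof(bytes));
    if (f == nullptr) return -1; // could not get executable memory
    return f();                  // run the bytes (x86 only)
}
```

Converting a data pointer to a function pointer with `reinterpret_cast` is only conditionally supported by the C++ standard, though common x86 compilers accept it.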
x86 specifies register sizes using prefix bytes. For
example, the same "0xb8" instruction that loads a 32-bit constant
into eax can be used with a "0x66" prefix to load a 16-bit
constant, or a "0x48" REX
prefix to load a 64-bit constant.
Here we're loading the same constant 0x12 into all the different
sizes of eax. Note that the 8-bit al version uses its own opcode, 0xb0, rather than a prefix:
   0:  48 b8 12 00 00 00 00 00 00 00    mov rax,0x12
   a:  b8 12 00 00 00                   mov eax,0x12
   f:  66 b8 12 00                      mov ax,0x12
  13:  b0 12                            mov al,0x12
  15:  c3                               ret

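
The four encodings in this listing differ only in the prefix, the opcode, and the number of constant bytes. A rough sketch of that rule in C++, assuming 64-bit mode (the 0x48 REX prefix only has this meaning there); `encode_mov_a` is a hypothetical name.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: encode "mov rax/eax/ax/al, constant" for an
// operand size of 64, 32, 16, or 8 bits.
// Returns an empty vector for any other size.
std::vector<std::uint8_t> encode_mov_a(int bits, std::uint64_t value) {
    std::vector<std::uint8_t> out;
    switch (bits) {
        case 64: out.push_back(0x48); out.push_back(0xB8); break; // REX prefix + opcode
        case 32: out.push_back(0xB8); break;                      // no prefix
        case 16: out.push_back(0x66); out.push_back(0xB8); break; // 16-bit prefix + opcode
        case 8:  out.push_back(0xB0); break;                      // separate 8-bit opcode
        default: return out;
    }
    // The constant follows, least-significant byte first, one byte per 8 bits.
    for (int i = 0; i < bits / 8; ++i) {
        out.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
    }
    return out;
}
```

For the constant 0x12 this should reproduce the byte sequences in the listing above, from the 10-byte rax form down to the 2-byte al form.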
asm | machine code | Description |
add | 0x03 ModR/M | Add one 32-bit register to another. |
mov | 0x8B ModR/M | Move one 32-bit register to another. |
mov | 0xB8 DWORD | Move a 32-bit constant into register eax. |
ret | 0xc3 | Returns from current function. |
xor | 0x33 ModR/M | XOR one 32-bit register with another. |
xor | 0x34 BYTE | XOR register al with this 8-bit constant. |

Instructions marked "ModR/M" above are followed by a ModR/M byte, which packs three fields, from the highest bits to the lowest:

mod | reg/opcode | r/m |
2 bits, selects memory or register access mode: 0: memory at register r/m; 1: memory at register r/m + byte offset; 2: memory at register r/m + 32-bit offset; 3: register r/m itself (not memory) | 3 bits, usually a destination register number. For some instructions, this is actually extra opcode bits. | 3 bits, usually a source register number. Treated as a pointer for mod!=3, treated as an ordinary register for mod==3. If r/m==4 and mod!=3, indicates the real memory source is a SIB byte. |

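
Packing and unpacking these three fields is plain bit shifting. A minimal sketch in C++; `make_modrm`, `split_modrm`, and the struct `ModRM` are hypothetical names used only for this illustration.

```cpp
#include <cstdint>

// Hypothetical helper: pack the three ModR/M fields into one byte.
// mod occupies bits 7..6, reg bits 5..3, and r/m bits 2..0.
constexpr std::uint8_t make_modrm(unsigned mod, unsigned reg, unsigned rm) {
    return static_cast<std::uint8_t>(((mod & 3u) << 6) | ((reg & 7u) << 3) | (rm & 7u));
}

// Hypothetical unpacked form of a ModR/M byte.
struct ModRM {
    unsigned mod; // 2 bits: access mode
    unsigned reg; // 3 bits: register number or extra opcode bits
    unsigned rm;  // 3 bits: register number, or pointer register for mod != 3
};

// Hypothetical helper: split a ModR/M byte back into its fields.
constexpr ModRM split_modrm(std::uint8_t b) {
    return ModRM{ (b >> 6) & 3u, (b >> 3) & 7u, b & 7u };
}
```

For example, `xor eax,eax` is the opcode 0x33 followed by `make_modrm(3, 0, 0)`, which is 0xC0: mod 3 for register-to-register, reg 0 for eax, r/m 0 for eax. Likewise `add ecx,edx` would be 0x03 followed by `make_modrm(3, 1, 2)`, which is 0xCA.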
r32(/r) reg= | | | EAX 000 | ECX 001 | EDX 010 | EBX 011 | ESP 100 | EBP 101 | ESI 110 | EDI 111 |
effective address | mod | R/M | value of mod R/M byte (hex) |
[RAX] | 00 | 000 | 00 | 08 | 10 | 18 | 20 | 28 | 30 | 38 |
[RCX] | 00 | 001 | 01 | 09 | 11 | 19 | 21 | 29 | 31 | 39 |
[RDX] | 00 | 010 | 02 | 0A | 12 | 1A | 22 | 2A | 32 | 3A |
[RBX] | 00 | 011 | 03 | 0B | 13 | 1B | 23 | 2B | 33 | 3B |
[SIB] | 00 | 100 | 04 | 0C | 14 | 1C | 24 | 2C | 34 | 3C |
[RIP + DWORD] | 00 | 101 | 05 | 0D | 15 | 1D | 25 | 2D | 35 | 3D |
[RSI] | 00 | 110 | 06 | 0E | 16 | 1E | 26 | 2E | 36 | 3E |
[RDI] | 00 | 111 | 07 | 0F | 17 | 1F | 27 | 2F | 37 | 3F |
[RAX + BYTE] | 01 | 000 | 40 | 48 | 50 | 58 | 60 | 68 | 70 | 78 |
[RCX + BYTE] | 01 | 001 | 41 | 49 | 51 | 59 | 61 | 69 | 71 | 79 |
[RDX + BYTE] | 01 | 010 | 42 | 4A | 52 | 5A | 62 | 6A | 72 | 7A |
[RBX + BYTE] | 01 | 011 | 43 | 4B | 53 | 5B | 63 | 6B | 73 | 7B |
[SIB + BYTE] | 01 | 100 | 44 | 4C | 54 | 5C | 64 | 6C | 74 | 7C |
[RBP + BYTE] | 01 | 101 | 45 | 4D | 55 | 5D | 65 | 6D | 75 | 7D |
[RSI + BYTE] | 01 | 110 | 46 | 4E | 56 | 5E | 66 | 6E | 76 | 7E |
[RDI + BYTE] | 01 | 111 | 47 | 4F | 57 | 5F | 67 | 6F | 77 | 7F |
[RAX + DWORD] | 10 | 000 | 80 | 88 | 90 | 98 | A0 | A8 | B0 | B8 |
[RCX + DWORD] | 10 | 001 | 81 | 89 | 91 | 99 | A1 | A9 | B1 | B9 |
[RDX + DWORD] | 10 | 010 | 82 | 8A | 92 | 9A | A2 | AA | B2 | BA |
[RBX + DWORD] | 10 | 011 | 83 | 8B | 93 | 9B | A3 | AB | B3 | BB |
[SIB + DWORD] | 10 | 100 | 84 | 8C | 94 | 9C | A4 | AC | B4 | BC |
[RBP + DWORD] | 10 | 101 | 85 | 8D | 95 | 9D | A5 | AD | B5 | BD |
[RSI + DWORD] | 10 | 110 | 86 | 8E | 96 | 9E | A6 | AE | B6 | BE |
[RDI + DWORD] | 10 | 111 | 87 | 8F | 97 | 9F | A7 | AF | B7 | BF |
EAX | 11 | 000 | C0 | C8 | D0 | D8 | E0 | E8 | F0 | F8 |
ECX | 11 | 001 | C1 | C9 | D1 | D9 | E1 | E9 | F1 | F9 |
EDX | 11 | 010 | C2 | CA | D2 | DA | E2 | EA | F2 | FA |
EBX | 11 | 011 | C3 | CB | D3 | DB | E3 | EB | F3 | FB |
ESP | 11 | 100 | C4 | CC | D4 | DC | E4 | EC | F4 | FC |
EBP | 11 | 101 | C5 | CD | D5 | DD | E5 | ED | F5 | FD |
ESI | 11 | 110 | C6 | CE | D6 | DE | E6 | EE | F6 | FE |
EDI | 11 | 111 | C7 | CF | D7 | DF | E7 | EF | F7 | FF |
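
Reading the table in the other direction, from a ModR/M byte back to the operand it selects, can be sketched as follows. This is an illustration only: `describe_rm` is a hypothetical name, it covers just the "effective address" column for 64-bit addressing as printed above, and it does not decode the SIB byte or the offset bytes that follow.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: describe the r/m operand selected by a ModR/M byte,
// using the same notation as the "effective address" column of the table.
std::string describe_rm(std::uint8_t modrm) {
    static const char *const reg32[8] = {"EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"};
    static const char *const base64[8] = {"RAX", "RCX", "RDX", "RBX", "SIB", "RBP", "RSI", "RDI"};
    unsigned mod = (modrm >> 6) & 3u; // top 2 bits
    unsigned rm = modrm & 7u;         // bottom 3 bits

    if (mod == 3) return reg32[rm];                  // plain register, no memory access
    if (mod == 0 && rm == 5) return "[RIP + DWORD]"; // special case in the mod=00 rows

    std::string s = std::string("[") + base64[rm];
    if (mod == 1) s += " + BYTE";  // 8-bit offset follows
    if (mod == 2) s += " + DWORD"; // 32-bit offset follows
    return s + "]";
}
```

For instance, `describe_rm(0x45)` should give "[RBP + BYTE]" and `describe_rm(0xC0)` should give "EAX", matching the rows of the table that contain those byte values.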