Instruction usage breakdown (by popularity):
42.4% mov instructions
5.0% lea instructions
4.9% cmp instructions
4.7% call instructions
4.5% je instructions
4.4% add instructions
4.3% test instructions
4.3% nop instructions
3.7% jmp instructions
2.9% jne instructions
2.9% pop instructions
2.6% sub instructions
2.2% push instructions
1.4% movzx instructions
1.3% ret instructions
...
This makes a little more sense broken into categories. For this piece of code, the most numerically common instructions on x86 are actually just memory loads and stores (mov, push, or pop), followed by branches, and finally arithmetic. This low arithmetic density was a surprise to me! You can get a little more detail by looking at what occurs in each instruction; those register and feature counts follow the category lists.
Load and store: about 50% total
42.4% mov instructions
2.9% pop instructions
2.2% push instructions
1.4% movzx instructions
0.3% xchg instructions
0.2% movsx instructions
Branch: about 25% total
4.9% cmp instructions
4.7% call instructions
4.5% je instructions
4.3% test instructions
3.7% jmp instructions
2.9% jne instructions
1.3% ret instructions
0.4% jle instructions
0.4% ja instructions
0.4% jae instructions
0.3% jbe instructions
0.3% js instructions
Arithmetic: about 15% total
5.0% lea instructions (uses address calculation arithmetic)
4.4% add instructions
2.6% sub instructions
1.0% and instructions
0.5% or instructions
0.3% shl instructions
0.3% shr instructions
0.2% sar instructions
0.1% imul instructions
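The category totals above can be recomputed from dis_instructions.txt, the per-mnemonic file written by the script at the end of this page. What follows is a minimal sketch, not necessarily how the totals above were produced: the function name category_totals is made up for illustration, and the grouping simply copies the three lists above, with every other mnemonic (nop, for example) counted as "other".

```shell
#!/bin/sh
# category_totals FILE: sum per-mnemonic percentages by category.
# FILE is expected to hold lines like " 42.4% mov instructions".
# (Hypothetical helper; the grouping mirrors the three lists above.)
category_totals() {
    awk '
    BEGIN {
        n = split("mov pop push movzx xchg movsx", a, " ")
        for (i = 1; i <= n; i++) group[a[i]] = "load/store"
        n = split("cmp call je test jmp jne ret jle ja jae jbe js", a, " ")
        for (i = 1; i <= n; i++) group[a[i]] = "branch"
        n = split("lea add sub and or shl shr sar imul", a, " ")
        for (i = 1; i <= n; i++) group[a[i]] = "arithmetic"
    }
    {
        pct = $1 + 0    # "42.4%" converts to the number 42.4
        g = ($2 in group) ? group[$2] : "other"
        sum[g] += pct
    }
    END {
        printf("%.1f%% load/store\n", sum["load/store"])
        printf("%.1f%% branch\n", sum["branch"])
        printf("%.1f%% arithmetic\n", sum["arithmetic"])
        printf("%.1f%% other\n", sum["other"])
    }' "$1"
}
```

Running `category_totals dis_instructions.txt` after the main script prints one total per category. Whether compares such as cmp and test belong under "branch" is a judgment call inherited from the lists above; moving a mnemonic to another category only means editing the corresponding split string.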
So, judging from the register and feature counts below, the "typical" x86 instruction would be an int-sized load or store between a register, often eax, and a memory location, often something on the stack referenced by ebp. Something like 50% of instructions are indeed of this form!
Registers used:
30.9% "eax" lines (eax is the return result register, and general scratch)
5.7% "ebx" lines (this register is only used for accessing globals inside DLL code)
10.3% "ecx" lines
15.5% "edx" lines
11.7% "esp" lines (note that "push" and "pop" implicitly change esp, so this should be about 5% higher)
25.9% "ebp" lines (the bread-and-butter stack access base register)
12.0% "esi" lines
8.6% "edi" lines
Features used:
66.0% "0x" lines (immediate-mode constants)
69.6% "," lines (two-operand instructions)
36.7% "+" lines (address calculated as sum)
1.2% "*" lines (address calculated with a scaled index)
48.1% "\[" lines (explicit memory accesses)
2.8% "BYTE PTR" lines (char-sized memory access)
0.4% "WORD PTR" lines (short-sized memory access)
40.7% "DWORD PTR" lines (int or float-sized memory)
0.1% "QWORD PTR" lines (double-sized memory)
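One rough way to test the "typical instruction" claim is to count the lines of dis.txt that are a mov with a DWORD-sized memory operand. This is only a sketch under assumptions: the helper name typical_fraction is hypothetical, it expects one instruction per line in the Intel syntax written to dis.txt by the script at the end of this page, and "mov plus DWORD PTR [" is just one reading of what counts as typical.

```shell
#!/bin/sh
# typical_fraction FILE: percentage of lines that are a mov with a
# DWORD-sized memory operand, e.g. "mov eax,DWORD PTR [ebp-0x8]".
# (Hypothetical helper; FILE holds one instruction per line.)
typical_fraction() {
    awk '
    $1 == "mov" && /DWORD PTR \[/ { hits++ }
    END {
        # Skip the report for an empty file to avoid dividing by zero.
        if (NR > 0) printf("%.1f%% of %d lines\n", hits * 100.0 / NR, NR)
    }' "$1"
}
```

Calling `typical_fraction dis.txt` prints a single percentage. Tightening the match to eax or ebp operands only needs an extra regular expression in the awk condition.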
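#!/bin/sh
# Pass the binary to disassemble as the first argument.
file="$1"
d="dis.txt"
# Keep only the instruction text: the field after the address colon,
# past the raw-byte column. Blank lines (labels, byte continuations) are dropped.
objdump -drC -M intel "$file" | \
    awk -F: '{print substr($2,24);}' | \
    grep -v "^$" > "$d"
tot=$(awk 'END {print NR}' "$d")
echo "$tot instructions total"
echo "Instruction usage breakdown:"
# Count each mnemonic (the first word of every line), most common first.
awk '{print $1}' "$d" | sort | uniq -c | sort -n -r | \
    awk -v tot="$tot" '{printf(" %.1f%% %s instructions\n",$1*100.0/tot,$2);}' \
    > dis_instructions.txt
head -15 dis_instructions.txt
echo "Register and feature usage:"
# Each pattern is a grep basic regular expression.
# [^DQ] keeps DWORD PTR and QWORD PTR lines out of the WORD PTR count.
for reg in eax ebx ecx edx esp ebp esi edi \
    "0x" "," "+" "*" "\[" \
    "BYTE PTR" "[^DQ]WORD PTR" "DWORD PTR" "QWORD PTR"
do
    c=$(grep -c -e "$reg" "$d")
    pct=$(awk -v c="$c" -v tot="$tot" 'BEGIN {printf("%.1f", c*100.0/tot);}')
    printf ' %s%% "%s" lines\n' "$pct" "$reg"
done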