To represent fractional values before fast floating-point hardware existed, programmers used fixed-point arithmetic, where you keep track of the decimal point's location yourself, at compile time. Floating point lets the decimal point move at runtime, making it far more flexible than fixed point.

If you need a review of floats, see these CS 301 lecture notes:

- 11_12_fp_speed, Speed of Floating-Point Operations and Weird Floats
- 11_07_fp_bits, Bits used to Implement Floating-Point Numbers
- 11_05_floating_point, Floating-Point Arithmetic

Recall the steps the hardware takes to add two floating-point numbers:

- Shift the two input numbers so their decimal points line up.
- Add the shifted numbers.
- Renormalize the sum: count off leading zero bits until you hit a one, and shift the significant digits back up.

x86 older (1980's) interface: the x87 floating-point register stack

fldpi                ; Push "pi" onto floating-point stack
fld DWORD [my_float] ; Push our 1.0 constant (below)
faddp                ; Add one and pi, popping the x87 stack
sub esp,8            ; Make room on the stack for an 8-byte double
fstp QWORD [esp]     ; Store printf's double parameter, popping the x87 stack
push my_string       ; Push printf's string parameter (below)
extern printf
call printf          ; Print string
add esp,12           ; Clean up stack (8-byte double + 4-byte pointer)
ret                  ; Done with function

my_string: db "Yo! Here's our float: %f",0xa,0
my_float: dd 1.0 ; floating-point DWORD

x86 newer (1990's) interface: SSE registers

movups xmm0,[my_arr] ; load up array (unaligned 16-byte load)
addps xmm0,xmm0      ; add array to itself, four floats at once
movups [my_arr],xmm0 ; store back to memory
push 4               ; number of values to print
push my_arr          ; array to print
extern farray_print
call farray_print    ; Print the array
add esp,8            ; Clean up stack
ret                  ; Done with function

section .data
my_arr: dd 1.0, 2.0, 3.0, 4.0 ; four floating-point DWORDs

We can count a "float"'s mantissa bits experimentally: keep halving x until adding it to 1.0 makes no difference. On the CPU:

float x=1.0+1.0e-9*(rand()%2); /* FEAR ME, OPTIMIZER!!! */
int itcount=0;
while (x+1.0f!=1.0f) {
	x=x*0.5;
	itcount++;
}
std::cout<<"itcount=="<<itcount<<"\n";

We can't do the cout on the GPU, but we can print out different colors. Green is right on the money; blue is too small, red is too big.

float x=1.0+texcoords.x*1.0e-10; /* FEAR ME, OPTIMIZER!!! */
int itcount=0;
while (x+1.0!=1.0) {
	x=x*0.5;
	itcount++;
}
if (itcount>24) gl_FragColor=vec4(1,0,0,0); /* red */
else if (itcount==24) gl_FragColor=vec4(0,1,0,0); /* green */
else if (itcount<24) gl_FragColor=vec4(0,0,1,0); /* blue */

This outputs green, indicating that "float" on the GPU has exactly the same number of mantissa bits (23) as on the CPU.

Similarly, we can keep shrinking an "x" until it gets rounded off to zero. Starting from 1.0, we use up the normal exponent range (126 halvings, down to 2^-126), then the 23 mantissa bits as denormals (down to 2^-149); the 150th halving "underflows" to zero, since 2^-150 rounds off to 0.

float x=1.0+1.0e-9*(rand()%2); /* FEAR ME, OPTIMIZER!!! */
int itcount=0;
while (x!=0.0f) {
	x=x*0.5;
	itcount++;
}
std::cout<<"itcount=="<<itcount<<"\n";

Now on the GPU:

float x=1.0+texcoords.x*1.0e-10; /* FEAR ME, OPTIMIZER!!! */
int itcount=0;
while (x!=0.0) {
	x=x*0.5;
	itcount++;
}
int expected=150;
if (itcount>expected) gl_FragColor=vec4(1,0,0,0); /* red */
else if (itcount==expected) gl_FragColor=vec4(0,1,0,0); /* green */
else if (itcount<expected) gl_FragColor=vec4(0,0,1,0); /* blue */

Wait--the GPU is returning blue, indicating fewer iterations than we expected! Experimentally, we can find that the GPU hits underflow at 2^-127: right where denormals should kick in, just below the smallest normal number, 2^-126. So the GPU doesn't do denormals, it just flushes them to zero--and indeed you can measure that there is zero performance penalty for using values near them.