# Circuit-Level Floating Point Implementation

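To understand the circuit-level operations that happen during floating-point computations, you first need to understand the bit representation of floating-point numbers (read up on it!). A 32-bit float packs a 1-bit sign, an 8-bit exponent biased by 127, and a 23-bit fraction with an implicit leading 1. If you're interested in why these values were chosen, the design rationale behind the IEEE floating-point format is worth reading.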

We'll be using this code to show what's inside floats:

```
#include <iostream>
#include <string>

// Bit-field layout is compiler-specific; this order matches typical
// little-endian compilers (e.g., gcc on x86).
struct float_bits {
	unsigned int frac:23; // fraction bits, except for implicit leading 1
	unsigned int exp:8; // exponent bits, biased by 127
	unsigned int sign:1; // sign bit: 0 +, 1 -
};

union float_dissector {
	float f; // the float
	float_bits bits; // its bits
};

// Convert this integer into a string of binary 1's and 0's.
std::string dump_bits(long value,long bitcount)
{
	std::string ret="";
	for (int bit=bitcount-1;bit>=0;bit--)
		if ((1L<<bit)&value)
			ret+="1";
		else	ret+="0";
	return ret;
}

// Show the contents of this float
void dump(float f) {
	float_dissector ds; ds.f=f;
	std::cout<<" float	"<<f<<
		"	sign "<<ds.bits.sign<<
		"	exp "<<ds.bits.exp-127<<
		"	frac (1)."<<dump_bits(ds.bits.frac,23)<<
		std::endl;
}

void foo(void) {
	dump(1.0);
	dump(2.0);
	dump(0.5);
	dump(1.125);
	dump(1.25);
	dump(1.5);
	dump(1.625);
	dump(3.0);
	dump(0.0); // special case: exponent field 0, and no implicit leading 1
}
```

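
The bit-field struct above relies on how the compiler happens to order bit-fields, which the C++ standard leaves up to the implementation. As a cross-check, here is a minimal sketch that pulls out the same three fields with shifts and masks instead. The names `float_fields` and `dissect` are made up for this example and are not part of the code above; it assumes `float` is a 32-bit IEEE single.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper type (not from the notes): the three fields of a float.
struct float_fields {
	std::uint32_t sign; // 1 bit: 0 +, 1 -
	std::uint32_t exp;  // 8 bits, still biased by 127
	std::uint32_t frac; // 23 bits, implicit leading 1 not included
};

// Split a float into its fields using only integer shifts and masks.
float_fields dissect(float f) {
	static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes a 32-bit float");
	std::uint32_t raw;
	std::memcpy(&raw, &f, sizeof raw); // copy the raw bits into an integer
	float_fields out;
	out.sign = raw >> 31;           // top bit
	out.exp  = (raw >> 23) & 0xFFu; // next 8 bits
	out.frac = raw & 0x7FFFFFu;     // low 23 bits
	return out;
}
```

For example, `dissect(1.0f)` gives sign 0, exp 127, frac 0, and `dissect(-1.5f)` gives sign 1, exp 127, frac `0x400000` (only the top fraction bit set).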

The trick to doing floating-point addition is preconditioning the inputs. It's easy enough to add two numbers with the same sign and exponent fields--just integer add their fraction fields (with the implicit leading 1s put back), and bump the exponent if the sum carries out. If they don't have the same exponent, you can shift the smaller number's fraction down to line up with the bigger number's exponent. If they don't have the same sign, you're really doing subtraction, not addition.
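
To make the alignment step concrete before the full routine, here is a minimal sketch that works on bare integers. The helper `aligned_add` is hypothetical (it is not part of the notes' code): it takes each input as a 24-bit significand with the implicit 1 already included at bit 23, plus an unbiased exponent, and assumes both values are positive with `ea >= eb`. It uses `std::ldexp` only to turn the integer result back into a float for checking.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: add two positive values given as significand*2^(exp-23).
// Assumes ea >= eb; bits shifted off the bottom of b are simply dropped.
float aligned_add(std::uint32_t sa, int ea, std::uint32_t sb, int eb) {
	int shift = ea - eb;                   // distance between exponents
	sb = (shift < 32) ? (sb >> shift) : 0; // line b up with a's exponent
	std::uint32_t sum = sa + sb;           // plain integer addition
	if (sum & (1u << 24)) {                // carry out of the 24-bit significand
		sum >>= 1;                         // renormalize back to 24 bits
		ea += 1;                           // and bump the exponent
	}
	return std::ldexp(static_cast<float>(sum), ea - 23); // sum * 2^(ea-23)
}
```

With 1.5 as `0xC00000` at exponent 0 and 0.25 as `0x800000` at exponent -2, the shift is 2, the aligned sum is `0xE00000`, and there is no carry, so `aligned_add(0xC00000, 0, 0x800000, -2)` returns 1.75. Adding 1.5 to itself carries into bit 24, so the sum is shifted down and the exponent goes up by one, giving 3.0.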

```
// Add two floats, without touching the floating point hardware
#ifndef CHATTY
#define CHATTY(x) x /* debug printouts; define as empty to silence them */
#endif
float add(float a,float b) {
	float_dissector ad; ad.f=a;
	float_dissector bd; bd.f=b;

	// Precondition the inputs
	if (a<b) return add(b,a); // swap so a>=b
	if (a<0.0) return -add(-a,-b); // crude handling of negative numbers
	// if (b<0.0) return sub(a,-b); // FIXME: need subtract for negative b

	// Now a and b are non-negative, with a>=b
	// (FIXME: zero and denormals have no implicit 1)
	CHATTY(	dump(a); dump(b); )
	unsigned long afrac=(1<<23)+ad.bits.frac; // include the implicit 1
	unsigned long bfrac=(1<<23)+bd.bits.frac;
	int expshift=ad.bits.exp - bd.bits.exp; // distance between exponents
	// line up b with a's exponent (FIXME: rounding?); a huge shift leaves nothing of b
	bfrac=(expshift<32)?(bfrac>>expshift):0;
	CHATTY(	std::cout<<"Exponent shift "<<expshift<<" bit\n"; )

	// Now that the fraction fields are aligned, do integer addition
	unsigned long sfrac = afrac + bfrac;
	float_dissector sd; // sum
	sd.bits.sign=0; // positive result
	if (sfrac&(1<<24)) { // carry!
		CHATTY(		std::cout<<"Carry!\n";  )
		sd.bits.exp=ad.bits.exp+1; // sum reached 2.0 x 2^exp: bump the exponent
		sd.bits.frac=sfrac>>1; // lose precision (rounding mode?)
	} else { // no carry, use a's exponent in output
		CHATTY(		std::cout<<"No carry\n"; )
		sd.bits.exp=ad.bits.exp;
		sd.bits.frac=sfrac; // exact result
	}
	CHATTY(	std::cout<<" sum: "<<sd.f<<"\n\n"; )
	return sd.f;
}

void foo(void) {