CF--The "carry flag". Set to the bit that carries out of an addition or subtraction. For signed numbers, this doesn't really indicate a problem, but for unsigned numbers, a set carry flag means the result overflowed. Set by most arithmetic instructions, including "add" and "sub" (but "inc" and "dec" leave it unchanged). Can be tested with the "jc" (jump if carry flag is set) instruction. Can be added into another arithmetic operation with "adc" (add with carry), which computes:

output register = output register + input register + CF

new CF = carry out of the above addition

For example, you can capture the bit that overflows out of a big add (here, ecx added to itself) like this:

mov ecx, 0x8000ff00

add ecx, ecx

mov eax,0

adc eax,eax ; Adds eax, eax, and the carry flag together: eax = 0 + 0 + CF = 1
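The effect of "adc" can also be modeled in portable C++. The sketch below is only an illustration: the names AdcResult and adc32 are made up here, and it widens to 64 bits so that the carry shows up as bit 32 instead of reading the real CPU flag.

```cpp
#include <cassert>
#include <cstdint>

// Result of one 32-bit add-with-carry step (hypothetical type for this sketch).
struct AdcResult {
    std::uint32_t sum;    // low 32 bits of a + b + carry_in
    std::uint32_t carry;  // 1 if the true sum did not fit in 32 bits, else 0
};

// Models "adc": output = a + b + CF, new CF = carry out of that addition.
inline AdcResult adc32(std::uint32_t a, std::uint32_t b, std::uint32_t carry_in) {
    assert(carry_in <= 1);  // a carry flag is a single bit
    // Do the addition in 64 bits, so the carry out lands in bit 32.
    std::uint64_t wide = static_cast<std::uint64_t>(a) + b + carry_in;
    return { static_cast<std::uint32_t>(wide),
             static_cast<std::uint32_t>(wide >> 32) };
}
```

Calling adc32(0x8000ff00, 0x8000ff00, 0) gives sum 0x0001fe00 with carry 1, and feeding that carry into adc32(0, 0, 1) gives 1, matching the ecx/eax example above.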

Here's a C++ example:

volatile int lo,hi;

int foo(void) {

lo=0xffffffff;

hi=0xabc;

/* One asm statement, so the compiler can't disturb the carry flag between the add and the adc */

__asm__(" addl $1,lo \n"

" adcl $0,hi \n");

std::cout<<"lo="<<std::hex<<lo<<"\n";

std::cout<<"hi="<<std::hex<<hi<<"\n";

return 0;

}

"adc" is used in the compiler's implementation of the 64-bit "long long" datatype on 32-bit machines.

long long v;

void add_to_v(void) {

v+=3;

}

On a 64-bit machine, this is just an "add". On a 32-bit machine, it's an add followed by an add with carry:

000000c4 <add_to_v()>:

c4: 83 05 00 00 00 00 03 add DWORD PTR ds:0x0,0x3

c6: R_386_32 v

cb: 83 15 04 00 00 00 00 adc DWORD PTR ds:0x4,0x0

cd: R_386_32 v

d2: c3 ret

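The same trick extends to bigger numbers: split each number into low and high pieces, add the low pieces normally, then add the high pieces with carry:

add loA,loB

adc hiA,hiB

This way, the carry bit coming out of the low addition is read by the higher addition.

To do multi-precision arithmetic "for real", there are several excellent libraries, such as GMP.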
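To see how the add/adc chain generalizes beyond two pieces, here is a minimal C++ sketch. It assumes numbers are stored as vectors of 32-bit pieces, least significant first; the function name add_limbs and that storage layout are choices made for this illustration, not how GMP or any other library does it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Adds b into a, piece by piece, passing the carry upward like a chain of adc's.
// Both numbers are stored least-significant piece first and must be the same length.
// Returns the carry out of the topmost piece (1 if the sum didn't fit, else 0).
inline std::uint32_t add_limbs(std::vector<std::uint32_t>& a,
                               const std::vector<std::uint32_t>& b) {
    assert(a.size() == b.size());
    std::uint32_t carry = 0;  // plays the role of CF
    for (std::size_t i = 0; i < a.size(); i++) {
        // 64-bit intermediate, so the carry out of this piece shows up in bit 32.
        std::uint64_t wide = static_cast<std::uint64_t>(a[i]) + b[i] + carry;
        a[i] = static_cast<std::uint32_t>(wide);
        carry = static_cast<std::uint32_t>(wide >> 32);
    }
    return carry;
}
```

With a = {0xffffffff, 0xabc} and b = {1, 0}, this leaves a = {0, 0xabd}, the same lo/hi result as the add/adc pair.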

The obvious way to compute x raised to the power y is really quite slow for big values of y:

int prod=1;

for (int i=1;i<=y;i++) prod*=x;

return prod;

But it's *exponentially* faster to compute x raised to successive powers of two by repeated squaring, then multiply together the powers you need to get x to the y. For example, you can compute x to the 16th by squaring x four times, like so:

int x2=x*x;

int x4=x2*x2;

int x8=x4*x4;

int x16=x8*x8;

Harking back to our bitwise operators, the bits of y tell us exactly which of these repeated squares of x to multiply together:

/* Return x raised to the power y (assumes y >= 0) */

double mypow(double x,int y)

{

double prod=1; /* will hold x to the y power */

double xpow=x; /* will take powers of x */

for (unsigned int bit=0;bit<sizeof(int)*8-1;bit++) { /* stop before the sign bit: 1<<31 overflows int */

int mask=(1<<bit);

if (y&mask) prod=prod*xpow; /* include this power of x */

if (y<mask) break; /* no higher powers of x included in y */

xpow=xpow*xpow; /* find next higher square */

}

return prod;

}

For example, raising 2 to the 100th power takes only 8 iterations with this "fast exponentiation" method, versus 100 iterations with slow exponentiation.
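The iteration counts can be checked directly. The sketch below instruments both loops with a counter; PowResult, slow_pow, and fast_pow are names invented for this illustration, both functions assume y >= 0, and the fast loop stops one bit short of the sign bit so the shift never overflows.

```cpp
#include <cassert>

// A power plus the number of loop iterations it took (hypothetical type for this sketch).
struct PowResult {
    double value;
    int iterations;
};

// Slow exponentiation: one multiply per unit of y.
inline PowResult slow_pow(double x, int y) {
    assert(y >= 0);
    PowResult r = {1.0, 0};
    for (int i = 1; i <= y; i++) {
        r.value *= x;
        r.iterations++;
    }
    return r;
}

// Fast exponentiation: one iteration per bit of y, using repeated squaring.
inline PowResult fast_pow(double x, int y) {
    assert(y >= 0);
    PowResult r = {1.0, 0};
    double xpow = x;  // x raised to the power 2^bit
    for (unsigned int bit = 0; bit + 1 < sizeof(int) * 8; bit++) {
        r.iterations++;
        int mask = 1 << bit;
        if (y & mask) r.value *= xpow;  // include this power of x
        if (y < mask) break;            // no higher bits set in y
        xpow *= xpow;                   // next higher square
    }
    return r;
}
```

For x = 2 and y = 100, fast_pow reports 8 iterations and slow_pow reports 100, and both return the same value, since powers of two are exact in a double.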

The fast exponentiation trick applies to all sorts of stuff, but for multi-precision numbers, that exponential speedup gets even bigger. There's also an exact analog for multiplication!
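One way to read the multiplication analog: replace "square and multiply" with "double and add". The sketch below is an assumption about what that analog looks like, not code from these notes. The name mymul is made up, y must be non-negative, and the result wraps modulo 2^64 like ordinary unsigned arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Return x times y using only adds and shifts (assumes y >= 0).
inline std::uint64_t mymul(std::uint64_t x, int y) {
    assert(y >= 0);
    std::uint64_t sum = 0;      // will hold x times y
    std::uint64_t xshift = x;   // x times successive powers of two
    while (y != 0) {
        if (y & 1) sum += xshift;  // include this multiple of x
        xshift += xshift;          // doubling takes the place of squaring
        y >>= 1;                   // move on to the next bit of y
    }
    return sum;
}
```

Like fast exponentiation, the loop runs once per bit of y instead of once per unit of y, which is what matters when each "add" is itself a slow multi-precision operation.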

class float_hilo {

public:

float hi; /* big bits */

float lo; /* small bits (that would have been rounded off from hi) */

/* The magnitude of a must be bigger than the magnitude of b. */

void from_add(float a,float b) {

hi = a + b; /* high bits are easy--just do add normally */

float should_b = hi - a; /* in a perfect world, this would be b... */

lo = b - should_b; /* ... but our arithmetic isn't perfect: calculate the difference. */

}

};

int foo(void) {

float a=10000000.0, b=0.1234567;

float f=a+b;

float_hilo h; h.from_add(a,b);

std::cout<<std::setprecision(20);

std::cout<<" ordinary: "<<f<<"\n";

std::cout<<" hi: "<<h.hi<<" lo: "<<h.lo<<"\n";

return 0;

}

This results in:

ordinary: 10000000

hi: 10000000 lo: 0.12345670163631439209

Program complete. Return 0 (0x0)

Note that for this example, we've split up the integer and fraction parts. This isn't always the case; for example, changing the above to set a to 1 million leaves a few bits below the decimal point:

ordinary: 1000000.125

hi: 1000000.125 lo: -0.0015432983636856079102

Program complete. Return 0 (0x0)

Setting a to 10000 leaves more bits after the decimal point:

ordinary: 10000.123046875

hi: 10000.123046875 lo: 0.00040991604328155517578

Program complete. Return 0 (0x0)
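The from_add method above needs the bigger-magnitude value passed first. A different, slightly longer formulation (usually credited to Knuth as "TwoSum") drops that requirement at the cost of a few extra operations. The sketch below swaps in that technique; FloatHiLo and two_sum are names invented here, and it assumes the compiler evaluates float arithmetic in plain float precision (no fast-math style reordering).

```cpp
#include <cassert>
#include <cmath>

// hi is the rounded sum; lo is the rounding error left over (hypothetical type for this sketch).
struct FloatHiLo {
    float hi;
    float lo;
};

// Error-free addition that works for either ordering of a and b.
inline FloatHiLo two_sum(float a, float b) {
    assert(std::isfinite(a) && std::isfinite(b));
    float hi = a + b;            // high bits: just add normally
    float b_seen = hi - a;       // the part of b that actually made it into hi
    float a_seen = hi - b_seen;  // the part of a that actually made it into hi
    float lo = (a - a_seen) + (b - b_seen);  // whatever got rounded off
    return {hi, lo};
}
```

Calling two_sum(0.1234567f, 10000000.0f), with the small value first, still gives hi = 10000000 and lo equal to the float nearest 0.1234567, the same split as the first run above.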