There are quite a few different ways to do multiplication. The simplest is to use the compiler's "*" operator, but there are a number of situations where you might want to build something yourself. For example, you might want finite field multiplication (which has a different "add" operation), big integer arithmetic (calling an "add big integers" function), or modular arithmetic (reducing by the modulus each time).

var mul_slow(var A,var B) { var product=0; for (var i=0;i<B;i++) product+=A; return product; }This has terrible performance, since it uses O(2

If you replace the "+=" with "*=", and initialize the product at 1, you can transform this slow multiply into a slow exponential.

var exp_slow(var A,var B) { var product=1; for (var i=0;i<B;i++) product*=A; return product; }It's still exponentially slow, though. (Nonetheless, a surprising number of people use this algorithm).

var mul_binary(var A,var B) { var product=0; for (var bit=0;B>=(1<<bit);bit++) { if (B&(1<<bit)) product+=A; // include A in product A+=A; // left shift A by 1 bit } return product; }This is now O(B) additions for a number with B bits, which is exponentially better than before.

Once again, replacing "+" with "*" gives the fast exponentiation by squaring trick:

var exp_binary(var A,var B) { var product=1; for (var bit=0;B>=(1<<bit);bit++) { if (B&(1<<bit)) product*=A; // include A in product A*=A; // double exponent on A } return product; }

Here, we build the mask using a clever trick with signed numbers: negation converts 0,1 to 0,0xffffffff...

var mul_AND(var A,var B) { var product=0; var Bbit=B; // == B>>bit for (var bit=0; Bbit!=0; bit++) { var mask=-(Bbit&1); // -0 == 0; -1 == all ones product+=A&mask; // optionally include in product Bbit=Bbit>>1; // move to next bit of B A+=A; // move to next bit of A } return product; }There is one difficulty in converting to exponentiation: if the mask is zero, the above will eliminate the entire product. We can use the famous(?) bitwise selection technique to multiply by 1 if the mask is zero:

var exp_AND(var A,var B) { var product=1; var Bbit=B; // == B>>bit for (var bit=0; Bbit!=0; bit++) {

var mask=-(Bbit&1); // -0 == 0; -1 == all ones product*=(A&mask)|(1&~mask); // A if mask, 1 if not mask Bbit=Bbit>>1; // move to next bit of B A*=A; // move to next bit of A } return product; }

var multiply_hardware(var A,var B) { return A*B; }The only real downside is the hardware multiply circuit can only handle 32 or 64 bit numbers, and we often need bigger numbers than this.

Karatsuba multiplication takes O(N

x*y = b^2*

By contrast, Karatsuba calculates:

x*y = (b^2+b)*

This is algebraically equivalent, but the surprising fact about this is it breaks down one N-digit multiplication into just three N/2 digit multiplications, one fewer than you might expect. However, there are several additional additions and subtractions required. For numbers longer than 300-600 bits, Karatsuba is faster than either the binary approach above or a straightforward recursive approach.

Really long numbers, on the order of 64K bits, can benefit from the Schönhage–Strassen algorithm, which uses a Fast Fourier transform to reach O(N log N log log N) time. However, numbers this big are rarely used in crypto.

Method mul slow: 3.83 ns for 2,1 9.15 ns for 2,10 48.37 ns for 2,100 314.58 ns for 2,1000 2968.24 ns for 2,10000 29560.18 ns for 2,100000 Method mul binary: 4.13 ns for 2,1 7.65 ns for 2,10 12.01 ns for 2,100 15.58 ns for 2,1000 20.93 ns for 2,10000 25.01 ns for 2,100000 Method mul AND: 4.13 ns for 2,1 5.93 ns for 2,10 8.58 ns for 2,100 10.71 ns for 2,1000 13.98 ns for 2,10000 16.52 ns for 2,100000 Method mul HW: 3.54 ns for 2,1 3.54 ns for 2,10 3.55 ns for 2,100 3.54 ns for 2,1000(Try this in NetRun now!)

3.55 ns for 2,10000 3.54 ns for 2,100000

Note how the slow multiply is bad and getting worse, the fast algorithms grow logarithmically slowly, and the hardware is constant time.