Algorithms Modulo a Prime Field

CS 463/480 Lecture, Dr. Lawlor

Quite a few abstract mathematical objects are used in crypto work, and particularly in the discrete logarithm encryption schemes (Diffie-Hellman, RSA, DSA), and in elliptic curves.  

You'd need to take Math 405--"Abstract Algebra: Theory of groups, rings, and fields."--to get the full story, but here's the cheat sheet.  Essentially, the trick to abstract algebra is to generalize the recurring aspects of specific operators like + and *, so you can use the minimum number of properties to prove theorems useful in a bunch of different domains, sort of like mathematical operator overloading.

Name
Properties (for all a, b, and c)
Examples
Magma Closure: a  b exists Quaternion multiplication.
Semigroup (all of the above, plus)
Associativity: 
      (a b) c = a (b c)
The positive integers under addition ( is +).
Monoid (all of the above, plus)
Identity: 0 exists, where a 0 = a
Non-negative integers under addition.
Group*
(all of the above, plus)
Inverse: -a exists, where a -a = 0
All integers under addition.
The integers under modular addition.
Reals under multiplication.
Integers under addition, modulo a prime.
Multiplication of invertible matrices.
Abelian 
Group

(all of the above, plus)
Commutativity: a b = b a
All of the above, except matrix multiplication.
Ring
With "addition", forms an abelian group.
Also has an associative "multiplication" operation.
Distributivity: a*(b+c) = a*b+a*c
The integers under addition and multiplication.
The reals under addition and multiplication.
Field*
(all properties of a commutative ring, plus)
Nonzero elements can be "divided": multiplicative inverses exist.
There are quite a few infinite fields (reals, complex, rationals).  
Yet every finite field of the same size is equivalent (isomorphic), a surprising result.


* Denotes names you might hear about again in this class (you can forget the rest!).

This table gets less general (with more requirements added) as you go downward.  

Groups and Rings

For crypto work, we're usually interested in finite groups.  One way to get finite integer groups is via modular arithmetic, where we wrap around to the size of the group.  If we wrap the integers, Z, around by n, we have Z%n, more commonly written as Z/nZ.

Addition on modular integers always makes a group, but there's a problem with multiplication.  The first is that zero has no multiplicative inverse, but we can cure this by not worrying about zero.  Second, look at the multiplication table for Z%4:
0 0 0 0 
0 1 2 3 
0 2 0 2 
0 3 2 1 
The problem is that multiplying by 2 doesn't have an inverse--not only does 2*2 result in the excluded zero element, 2*1=2*3=2, which means we don't have multiplicative inverses, so Z%4 is not a field.

By contrast, Z%5 works fine as a group under multiplication, with this multiplication table:
0 0 0 0 0 
0 1 2 3 4 
0 2 4 1 3 
0 3 1 4 2 
0 4 3 2 1 
(Try this in NetRun now!)

As it happens, the nonzero elements of Z%n act as a group under ordinary integer multiplication if and only if n is prime.  Because it has working addition and multiplication, Z%n is a "field".

Modular Inverse

Since Z%n is a group under multiplication, and groups have inverses, it's possible to find the multiplicative inverse of a number, where the product of a number and its inverse gives 1.  For example, in Z%5, the multiplicative inverse of 3 is 2, since 3*2=1 mod 5.  This is also called the modular inverse, since 3*2=6 without the modulus.

It is possible to find the modular inverse via brute force, by checking every number between 1 and p, but this takes 2N checks for an N-bit number.  It is much more efficient to use the extended euclidean greatest common denominator algorithm (EGCD) to find modular inverses.  The EGCD algorithm finds the constants x and y such that:
     gcd(a,b) = ax + by

We run the EGCD algorithm on the multiplier a and the prime P.  Because P is prime, the gcd of anything smaller and P is 1, so we'll get constants x and y such that:
     1 = ax + Py

Since we're operating modulo the prime, we don't care about y, and we have:
     1 = ax mod P
This indicates x is the modular inverse of a.

The EGCD algorithm operates by maintaining the invariant that ax + by is our estimate of the gcd.  Initially, our estimate is
that the gcd is either a or b, so x and y are 0 and 1.  As we reduce the estimate by repeatedly subtracting off copies of a and b, we adjust x and y to compensate.

For example, if we want to calculate the modular inverse of 10 mod 13, the EGCD would calculate:

gcd = a * x  +  b * y
13 = 10 * 0 + 13 * 1 10 = 10 * 1 + 13 * 0 3 = 10 * -1 + 13 * 1 1 = 10 * 4 + 13 * -3

Here's Z%13.  10's modular inverse is indeed 4  (4*10=1 mod 13).

mod13	0	1	2	3	4	5	6	7	8	9	10	11	12	13	
0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	
1	0	1	2	3	4	5	6	7	8	9	10	11	12	0	
2	0	2	4	6	8	10	12	1	3	5	7	9	11	0	
3	0	3	6	9	12	2	5	8	11	1	4	7	10	0	
4	0	4	8	12	3	7	11	2	6	10	1	5	9	0	
5	0	5	10	2	7	12	4	9	1	6	11	3	8	0	
6	0	6	12	5	11	4	10	3	9	2	8	1	7	0	
7	0	7	1	8	2	9	3	10	4	11	5	12	6	0	
8	0	8	3	11	6	1	9	4	12	7	2	10	5	0	
9	0	9	5	1	10	6	2	11	7	3	12	8	4	0	
10	0	10	7	4	1	11	8	5	2	12	9	6	3	0	
11	0	11	9	7	5	3	1	12	10	8	6	4	2	0	
12	0	12	11	10	9	8	7	6	5	4	3	2	1	0	
13	0	0	0	0	0	0	0	0	0	0	0	0	0	0	

Here's a Python translation of the EGCD algorithm.  The commas make tuples, used to perform multiple assignments in one step.

from time import time;

# From http://rosettacode.org/wiki/Modular_inverse#Python
def extended_gcd(aa, bb):
	lastrem, rem = abs(aa), abs(bb)
	x, lastx, y, lasty = 0, 1, 1, 0
	while rem:
		lastrem, (quotient, rem) = rem, divmod(lastrem, rem)
		x, lastx = lastx - quotient*x, x
		y, lasty = lasty - quotient*y, y
	return lastrem, lastx * (-1 if aa < 0 else 1), lasty * (-1 if bb < 0 else 1)
 
def modinv(a, m):
	g, x, y = extended_gcd(a, m)
	if g != 1:
		raise ValueError
	return x % m

# This is the prime modulus used by SECG secp256k1, the BitCoin elliptic curve
P=0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F;
G=2;
E=0xabcdef1234559daaaaaaaaaaaaaadeadbeefaaaaaaaaaaaaaaaaaaaaffff0008;

for repeat in range(3):
	start=time();
	Ei=modinv(E,P);
	print("Product = ",(E*Ei)%P);
	elapsed=time()-start;
	print("Elapsed=",elapsed," seconds");

(Try this in NetRun now!)

This runs in 0.1 milliseconds, and produces a product of 1.

Comparing Python to my hastily coded C++ big number library:

#include "osl/bignum.h"
typedef bignum<256> integer;

// This is the prime modulus used by SECG secp256k1, the BitCoin elliptic curve
integer P("0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F");
integer G=2;
integer E("0xabcdef1234559daaaaaaaaaaaaaadeadbeefaaaaaaaaaaaaaaaaaaaaffff0008");

long foo(void) {
	integer Ei=E.modInverse(P);
	return (Ei*E).mod(P)==1;
}

(Try this in NetRun now!)

This runs in 3.4 milliseconds, 34 times slower than Python!  I suspect the problem is my divide implementation doesn't use 32-bit machine integers, but operates one bit at a time.  It's still embarrassingly slow!

Comparing modular exponentiation, the built-in Python "pow" takes a third parameter indicating the modulus.
from time import time;

# This is the prime modulus used by SECG secp256k1, the BitCoin elliptic curve
P=0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F;
G=2;
E=0xabcdef1234559daaaaaaaaaaaaaadeadbeefaaaaaaaaaaaaaaaaaaaaffff0008;

for repeat in range(3):
	start=time();
	res=pow(G,E,P);
	elapsed=time()-start;
	print("Elapsed=",elapsed," seconds");

(Try this in NetRun now!)
This runs in 0.14 milliseconds.  My library runs in 28 milliseconds.