If you benchmark a naive ECDH key
exchange, it's actually quite slow, taking 0.120 seconds (!)
per exchange. This would limit a server to about 8
connections per second, which is far too slow.
The limiting factor here is the many modular inverse operations,
performed at every divide. An interesting solution is to use
projective coordinates, the "Jacobian Point Representation", to
store (X,Y,Z) such that x=X/Z2, y=Y/Z3.
This representation does not require any divisions during its
modified point addition function, although you still need to
divide at the very end to get the non-Jacobian (Z=1) value back
out.
The math here is interesting--similar to doing calculations in fixed-point arithmetic, where you need to keep track of the scale factor and do a big divide at the end, we keep the scale factor in the Jacobian's new Z coordinate.
In non-Jacobi coordinates, we're trying to compute the usual
secant line point addition scheme:
Substituting in the Jacobi coordinates x=X/Z2 and y=Y/Z3, we have
m = (Y2/Z23 - Y1/Z13) /
(X2/Z22 - X1/Z12)
To cancel out those inner fractions, harkening back to Algebra 1
we multiply the top by Z13 Z23, and the
bottom by Z12 Z22, for a net change to m of
Z1 Z2, giving us:
Z1 Z2 m = (Y2 Z13 - Y1 Z23)
/ (X2 Z12 - X1 Z22)
Following the notation used in 2001
Bernstein, we define some constants
A = X1 Z22
B = X2 Z12 - A
// denominator above
c = Y1 Z23
d = Y2 Z13 - c
// numerator above
Exactly as before, if B==0 we have a special case, either point
doubling if d==0, or infinity if d!=0.
Using the freshly defined d and B we have simply
Z1 Z2 m = d / B
or
m = d / (Z1 Z2 B)
There's no more avoiding that denominator, so we just stuff it
all directly into the output Z, making m=d/Z3
Z3 = Z1 Z2 B
Recall x3 = m2 - x1 - x2, so:
X3/Z32 = m2 - X1/Z12 - X2/Z22
Multiplying both sides by Z32, we get:
X3 = m2 Z32 - Z32 (X1 /
Z12 + X2 / Z22)
But recall m=d/Z3, and Z3=B Z1 Z2, so expanding we have
X3 = d2 - B2 Z12 Z22
(X1 / Z12+ X2 / Z22)
Once again, the fractions cancel, leaving
X3 = d2 - B2 (X1 Z22+
X2 Z12)
We can efficiently calculate the term in parenthesis as B + 2 A,
giving
X3 = d2 - B2
(B + 2 A)
# Return the "sum" of these elliptic curve points # Directly from 2001 Bernstein "add-2001-b", # http://cr.yp.to/nistp224.html opt-idea53.c ecadd. # http://www.hyperelliptic.org/EFD/g1p/auto-shortw-jacobian.html#addition-add-2001-b def add(self,Q1,Q2): # Identity special cases if (Q1.x==self.p): # Q1 is identity return Q2 if (Q2.x==self.p): # Q2 is identity return Q1 ZZ1=(Q1.z*Q1.z)%self.p ZZZ1=(ZZ1*Q1.z)%self.p ZZ2=(Q2.z*Q2.z)%self.p ZZZ2=(ZZ2*Q2.z)%self.p A=(Q1.x*ZZ2)%self.p B=(Q2.x*ZZ1-A)%self.p c=(Q1.y*ZZZ2)%self.p d=(Q2.y*ZZZ1-c)%self.p # Equality special cases if (B==0): if (d==0): # adding point to itself return self.double(Q1) else: # vertical pair--result is the identity return self.identity()
# Ordinary case e=(B*B)%self.p f=(B*e)%self.p g=(A*e)%self.p h=(Q1.z*Q2.z)%self.p f2g=2*g+f X3=(d*d-f2g)%self.p Z3=(B*h)%self.p gx=g-X3 Y3=(d*gx-c*f)%self.p return ECpoint(self,X3,Y3,Z3)
Download this full Python3 jacobi elliptic curve point multiplication code here.
Many many other formulations are possible; see hyperelliptic's
automated collection of elliptic curve formula.
# Return a doubled version of this elliptic curve point # Closely follows Gueron & Krasnov 2013 figure 2 def double(self,Q): if (Q.x==self.p): # doubling the identity return Q S=(4*Q.x*Q.y*Q.y)%self.p Z2=Q.z*Q.z Z4=(Z2*Z2)%self.p M=(3*Q.x*Q.x+self.a*Z4) x=(M*M-2*S)%self.p Y2=Q.y*Q.y y=(M*(S-x)-8*Y2*Y2)%self.p z=(2*Q.y*Q.z)%self.p return ECpoint(self,x,y,z) # Return the "sum" of these elliptic curve points # Closely follows Gueron & Krasnov 2013 figure 2 def add(self,Q1,Q2): # Identity special cases if (Q1.x==self.p): # Q1 is identity return Q2 if (Q2.x==self.p): # Q2 is identity return Q1 Q1z2=Q1.z*Q1.z Q2z2=Q2.z*Q2.z xs1=(Q1.x*Q2z2)%self.p xs2=(Q2.x*Q1z2)%self.p ys1=(Q1.y*Q2z2*Q2.z)%self.p ys2=(Q2.y*Q1z2*Q1.z)%self.p # Equality special cases if (xs1==xs2): if (ys1==ys2): # adding point to itself return self.double(Q1) else: # vertical pair--result is the identity return self.identity() # Ordinary case xd=(xs2-xs1)%self.p # caution: if not python, negative result? yd=(ys2-ys1)%self.p xd2=(xd*xd)%self.p xd3=(xd2*xd)%self.p x=(yd*yd-xd3-2*xs1*xd2)%self.p y=(yd*(xs1*xd2-x)-ys1*xd3)%self.p z=(xd*Q1.z*Q2.z)%self.p return ECpoint(self,x,y,z)
# "Multiply" this elliptic curve point Q by the integer m # Often the point Q will be the generator G def mul(self,Q,m): R=self.identity() # return point while m!=0: # binary multiply loop if m&1: # bit is set R=self.add(R,Q) m=m>>1 if (m!=0): Q=self.double(Q) return R # A point on an elliptic curve: (x,y) # Using special Jacobian point representation class ECpoint: """A point on an elliptic curve (x/z^2,y/z^3)""" def __init__(self,curve, x,y,z): self.curve=curve self.x=x self.y=y self.z=z
# Extract non-projective X and Y coordinates # This is the only time we need the expensive modular inverse def get_x(self): return self.curve.field_div(self.x,(self.z*self.z)%self.curve.p); def get_y(self): return self.curve.field_div(self.y,(self.z*self.z*self.z)%self.curve.p);
This takes
0.017 seconds per ECDH operation, over 7x faster than a non-Jacobi
implementation!
You can
improve performance even more by building a static table of the
powers of two of the curve generator point, which more than
doubles the speed of the initial multiplications compared to
computing them one by one, as we do here.
Download my full Python3
jacobi elliptic curve point multiplication code here.