Bill Daly on Thu, 08 Apr 1999 19:30:24 -0400 |
[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]
ECM code in PARI |
I've been looking at the implementation of ECM factorization in ifactor1.c, and I have a couple of comments. 1. Computing (n)P The code uses Peter Montgomery's PRAC algorithm to create a Lucas chain for the purpose of computing (n)P. However, this algorithm was designed to handle calculations based on a recursion of the form x[m+n] = f(x[m], x[n], x[m-n]) on the assumption that all three parameters are required to calculate f. For an elliptic curve, the calculation of (m+n)P requires only the coordinates of (m)P and (n)P, (m-n)P not being needed. Therefore, a Lucas chain is not necessary for this purpose, and an ordinary addition chain will do. Since in general an addition chain is slightly shorter than a Lucas chain, it would make sense to switch to an addition chain approach. Also, since it is as easy to subtract two points as it is to add them, it would make sense, when computing (n)P, if n = 3 mod 4, to calculate (n)P = (n+1)P - P. This ensures that there will be at least twice as many doubling steps as adding steps, so that the length of the addition chain will not exceed 1.5*lg(n). This is certainly better than could be obtained in general from an ordinary addition chain or a Lucas chain. Note that there is no advantage in this case to using ellmult to multiply a point by a prime. For addition chains, as opposed to Lucas chains, factorization is not all that helpful. It would make more sense to multiply by the product of several small primes. 2. Avoiding invmod In principle, it is possible to carry out the entire algorithm without using invmod at all until the very end. Essentially, this can be done by clearing the denominators in the formulas for doubling and adding points, with the denominators kept in a separate array. If a point is represented by a 3-vector [x,y,d], corresponding to the usual 2-vector [x/d^2,y/d^3], it is possible to write versions of elladd() and elldouble() which compute [x2,y2,d2] = (2)[x1,y1,d1] and [x3,y3,d3] = [x1,y1,d1]+[x2,y2,d2] without requiring division mod N, i.e. without calling invmod. Unfortunately, this doesn't work well for adding two points, since the new code would require 4*sqri + 13*mulii calls, while the current implementation uses 1*sqri + 5*mulii calls. However, the new code for doubling a point would require only 6*sqri + 3*mulii calls instead of the current 2*sqri + 5*mulii calls, and assuming that a sqri is about twice as fast as a mulii, the code should about break even, except that the invmod can be avoided. Here is a PARI/GP implementation of elldouble which does this. It takes as input a 3-vector u = [ux,uy,ud], and returns a 3-vector v = [vx,vy,vd] such that [vx/vd^2,vy/vd^3] is equal to (2)[ux/ud^2,uy/ud^3]. I have assumed a curve of the form y^2 = x^3 + A*x + B, but of course one can always assume that A = 1, as is the case in ifactor1.c. I did not count A*t2 as a mulii for this reason. {elldouble(u/*,...local variables*/) = ux = u[1]; uy = u[2]; ud = u[3]; t1 = ud*ud; t2 = t1*t1; t3 = 3*ux*ux + A*t2; t4 = t3*t3; t5 = 2*uy*uy; t6 = 2*ux*t5; vx = t4 - 2*t6; vy = -(2*t5*t5 + t3*(t4 - 3*t6)); vd = 2*uy*ud; [vx,vy,vd]; } This is also going to require an extra array to hold the denominators. It is also possible to define an elladd along the same lines. The following PARI/GP implementation will show why this is not a good idea. The input is two 3-vectors [ux,uy,ud] and [vx,vy,vd], and the output is a 3-vector [wx,wy,wd]. {elladd(u,v/*,...local variables*/) = ux = u[1]; uy = u[2]; ud = u[3]; vx = v[1]; vy = v[2]; vd = v[3]; r1 = ud*ud; s1 = vd*vd; r2 = ux*s1; s2 = vx*r1; t2 = r2 - s2; r3 = uy*s1*vd; s3 = vy*r1*ud; t3 = r3 - s3; t4 = t2*t2; t5 = t3*t3; t6 = r2 + s2; wx = -t4*t6 + t5; wy = t4*(r3*(t6+s2)-s3*(t6+r2)) - t3*t5; wd = ud*vd*t2; [wx,wy,wd]; } To use the elldouble logic but not the elladd logic, it will be necessary to do an invmod any time that an elldouble step is followed by an elladd step. Essentially, this should convert [x,y,d] to [x/d^2,y/d^3,1], which is equivalent. If the proposed change to the addition chain logic is implemented, this should eliminate at least 2/3 of the invmod calls. Note that the 3-vector [x,y,d] effectively defines a point [x,y] on the curve y^2 = x^3 + (A*d^4)*x + (B*d^6), which is of course equivalent to the original curve. Regards, Bill ********************************************************************** This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. This footnote also confirms that this email message has been swept by MIMEsweeper for the presence of computer viruses. **********************************************************************