Karim Belabas on Fri, 01 Aug 2008 02:43:59 +0200 |
[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]
[arndt@jjj.de: Re: bugfix for charpoly with finite fields] |
Hi pari-dev, [ sorry, looong mail ] I received an interesting test-case from Joerg Arndt (simplified version attached). "The used function gauss_poly2() calls only pari/gp internals, else it does arithmetic with binary polynomials no more." Some timings: ver 2.3.4: 11.9 s. / 11.8 s [GMP] ver 2.4.2: 360 s. / 356 s. [ GMP ] \\ Anticipating a little bit on the semi-happy ending (the solution is awkward) current svn: 12.5s / 11.9s [ GMP ] A simpler example: { a = x^1771 + x^82 + x^8 + x; b = x^1449 + x^1448 + x^1446 + x^1412 + x^1411 + x^1408 + x^1406 + x^1403 + x^1366 + x^1362 + x^1360 + x^1317 + x^537 + x^500 + x^496 + x^494 + x^460 + x^459 + x^456 + x^454 + x^451 + x^411 + x^410 + x^408; T = x^1861 + 1; z = Mod(Mod(1,2), Mod(1,2)*T); } (z * a) * (z * b) is about 50 times slower in 2.4.2 (and onwards) than in 2.3.* It's easy to identify the cause, but it can't be reverted: 1) historically, the t_INTMOD and t_POLMOD types were introduced as a (reasonably) convenient user interface to construct "arbitrary" base rings in an otherwise typeless language. They were very slow (and still are), but did the job. All the basic polynomial arithmetic and linear algebra was written purely in terms of generic elementary operations and no explicit notion of a "base ring". 2) with time a number of "heuristic" improvements (read "incorrect" but OK in most practical circumstances) were added to improve functions like Euclidean division in a polynomial ring K[X]. In that case, we basically ignore zeroes and use some kind of sparse representation for the divisor. Nobody cared too much about "minor" problems like (01:09) gp > (x^2+Mod(1,3)*x+Mod(1,3)) % (x + Mod(0,2)) %1 = Mod(1, 3) (01:09) gp > (x^2 + 100) % (x + Mod(0,2)) %2 = 100 (01:09) gp > Mod(0,2)*x %3 = 0 \\ this is the integer 0 (t_INT). Base ring is lost (01:09) gp > Mod(0,2)/x %4 = 0 Some zeroes do carry information, even "exact zeroes". 3) in parallel, a large number of dedicated functions were written in libpari to re-implement polynomial arithmetic over specific simple base rings (like Z/nZ or Fp[x]/(T(x))), in a much more efficient way than could be done using generic operations only, i.e. the function has a predefined notion of a common base ring containing all coefficients it will handle. Much simpler to program and optimize. Unsurprisingly, all of libpari is now written in terms of these functions ( except the part implementing generic arithmetic, which are hardly ever used in libpari itself ). Unfortunately, it is hard for GP users to access these functions: the simple example above would be written \\ a,b,T as above install(Flxq_mul,GGGL); install(Flx_to_ZX,G); install(ZX_to_Flx,GL); Flx_to_ZX(Flxq_mul(ZX_to_Flx(a,2), ZX_to_Flx(b,2), ZX_to_Flx(T,2), 2)) Timings do improve: 539ms --> 16ms in 2.4.2, vs 12ms for the original code in 2.3.4 (stable remains a little more efficient because Flx_rem does not use sparse representation but Newton/FFT-based arithmetic in this example; conversions are negligible) 4) In version 2.4.2, PARI (both libpari and GP) became much stricter with the zeroes it would actually ignore, restricting severely the old improvements / hacks such as 2). Fixing a large number of bugs at the same time of course: \\ current svn (01:29) gp > (x^2+Mod(1,3)*x+Mod(1,3)) % (x + Mod(0,2)) %1 = Mod(0, 1) (01:29) gp > (x^2 + 100) % (x + Mod(0,2)) %2 = Mod(0, 2) (01:29) gp > Mod(0,2)*x %3 = Mod(0, 2) (01:29) gp > Mod(0,2)/x %4 = Mod(0, 2)/x 5) In our case, this slows down immensely Euclidean division in (Z/2Z)[X] since all the zeroes are actually Mod(0,2), which we are no longer free to ignore, and we no longer get a useful sparse representation. 6) Of course it would be trivial to solve 5) if our routines knew the base ring from the start and could answer precisely the question "what is zero ?". As an experiment, I modified a few generic functions to do just that (gdiv, gmul, gsqr operating on t_POLMODs) and try to determine a (very) simple base ring before starting. In most cases we fail immediately and go on. When we hit the jackpot and recognize Z/nZ, we call specialized functions [ and unfortunately have to convert between representations, twice: to get rid of t_INTMODs, then re-introduce them ] 7) This may slow down a little the library: most polynomials we use are in Z[X], and we have to scan them to the end to make sure no coefficient belongs to some Z/nZ [ for nothing: this will never happen in libpari: we don't use INTMODs... ] In libpari proper, this is not the case: no critical function uses the generic operations anymore, and certainly not with t_POLMOD arguments, in any case. 8) This is not a general solution, in the sense that Euclidean division in slightly more complicated base rings and sparse polynomials of huge degrees will still be abysmally slower than 2.3.*. [ As far as I can see, division is the worse case by far: other operations are much less sensitive. ] -- I do not want to revert to the old happy-go-lucky behaviour (always algebraically acceptable, but very often semantically incorrect), -- It would be very, very nice if polynomials (and matrices) carried with them their coefficient ring, without our trying to guess in each single function where exactly we are working (often giving up). Knowing precisely "what is 0" would be useful in many contexts: our current approach is too conservative and misses simplifications, as above. But I do not want to force users to specify base rings in advance or use complicated install-ed functions from libpari; so I'm stuck there. Maybe adding a "domain" tag to each object whenever we can safely do so would be doable. But can't be done without wrecking backward compatibility. (Or introducing completely new types.) -- Catering for more rings "by hand" will complicate the code, for probably marginal improvements. Any bright idea ? K.B. -- Karim Belabas, IMB (UMR 5251) Tel: (+33) (0)5 40 00 26 17 Universite Bordeaux 1 Fax: (+33) (0)5 40 00 69 50 351, cours de la Liberation http://www.math.u-bordeaux.fr/~belabas/ F-33405 Talence (France) http://pari.math.u-bordeaux.fr/ [PARI/GP] `
gauss_poly2(n, t)= { /* return field polynomial for type-t Gaussian normal basis */ /* All computations over GF(2) */ /* NOTE: slower than the algorithm using complex numbers */ local(p, M, r, F, t21, t2, Z); \\ if ( 0==gauss_test(n,t), return(0) ); p = t*n + 1; \\ print(" n=", n, " t=", t, ": p=", p); r = znprimroot(p)^n; \\ element of order t mod p \\ print(" :: r=", r, " ord(r)=", znorder(r), " == t=", t); \\ M = sum(k=0, p-1, 'x^k); \\ The polynomial modulus M = 'x^p - 1; \\ Use redundant modulus M *= Mod(1,2); \\ ... over GF(2) if ( 1==t, return( sum(k=0, p-1, 'x^k) ) ); \\ for type 1 \\ print(" :: M =", lift(M)); F = Mod(1, M); t21 = Mod(2,p); t2 = Mod(1,p); for (k=1, n, \\ print(" :: ------- k=", k); \\ for(j=0, t-1, print( lift(Mod('x^lift(t2*r^j), M) ) ) ); \\ Z = sum(j=0, t-1, 'x^lift(t2*r^j)); \\ faster (but unclean?) Z = Mod( sum(j=0, t-1, 'x^lift(t2*r^j)), M); \\ faster \\ Z = sum(j=0, t-1, Mod('x^lift(t2*r^j), M) ); \\ fast \\ Z = sum(j=0, t-1, Mod('x, M)^lift(t2*r^j) ); \\ slower \\ print(" :: Z =", lift(component(Z, 2))); F = ('x+Z)*F; \\ print(" :: w=", sum(j=0, t-1, 'x^lift(t2*r^j) ) ); \\ print(" :: w%=", lift(Mod(sum(j=0, t-1, 'x^lift(t2*r^j) ), sum(k=0, p-1, 'x^k) ) ) ); \\ print(" :: F =", lift(component(F,2)) ); t2 *= t21; ); \\ final reduction for redundant modulus: \\ M = sum(k=0, p-1, 'x^k); \\ The polynomial modulus \\ F = lift( Mod( lift(F), M) ); \\ final reduction for redundant modulus (simplified): F = lift(F); if ( 0==polcoeff(F,0), F=sum(k=0, n, (1-polcoeff(F,k))*'x^k) ); return ( F ); } /* ----- */ gauss_poly2(620, 3);