Function: forsquarefree
Section: programming/control
C-Name: forsquarefree
Prototype: vV=GGI
Help: forsquarefree(N=a,b,seq): the sequence is evaluated, N is of the form
 [n, factor(n)], n going through squarefree integers from a up to b.
Doc: evaluates \var{seq}, where the formal variable $N$ is $[n,
 \kbd{factor}(n)]$ and $n$ goes through squarefree integers from $a$ to $b$;
 $a$ and $b$ must be integers. Nothing is done if $a>b$.

 \bprog
 ? forsquarefree(N=-3,9,print(N))
 [-3, [-1, 1; 3, 1]]
 [-2, [-1, 1; 2, 1]]
 [-1, Mat([-1, 1])]
 [1, matrix(0,2)]
 [2, Mat([2, 1])]
 [3, Mat([3, 1])]
 [5, Mat([5, 1])]
 [6, [2, 1; 3, 1]]
 [7, Mat([7, 1])]
 @eprog
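
 As a small illustration (the bounds $10$ and $15$ are arbitrary choices for
 this sketch), the two components of $N$ can be read as \kbd{N[1]} and
 \kbd{N[2]}. Since $n$ is positive and squarefree here, the number of rows of
 the factorization matrix is the number of prime divisors of $n$:
 \bprog
 ? forsquarefree(N = 10, 15, print(N[1], ": ", matsize(N[2])[1]))
 10: 2
 11: 1
 13: 1
 14: 2
 15: 2
 @eprog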

 This function is only implemented for $|a|, |b| < 2^{64}$ ($2^{32}$ on a 32-bit
 machine). It uses a sieve and runs in time $O(\sqrt{|b|} + b-a)$. It should
 be at least 5 times faster than regular factorization as long as the interval
 length $b-a$ is much larger than $\sqrt{|b|}$, and it gets relatively faster
 as the bounds increase. The function slows down somewhat
 if $\kbd{primelimit} < \sqrt{|b|}$. It is comparable to \kbd{forfactored}, but
 about $\zeta(2) = \pi^{2}/6$ times faster due to the relative density
 of squarefree integers. With a \kbd{primelimit} of $10^6$ (the default), we
 have

 \bprog
 ? B = 10^9; for (N = B, B+10^6, factor(N))
 time = 1,603 ms.
 ? forfactored (N = B, B+10^6, [n,fan] = N)
 time = 1,080 ms.
 ? forsquarefree (N = B, B+10^6, [n,fan] = N)
 time = 602 ms.

 ? B = 10^12; for (N = B, B+10^6, factor(N))
 time = 5,785 ms.
 ? forfactored (N = B, B+10^6, [n,fan] = N)
 time = 1,173 ms.
 ? forsquarefree (N = B, B+10^6, [n,fan] = N)
 time = 628 ms.

 ? B = 10^15; for (N = B, B+10^6, factor(N))
 time = 23,819 ms.
 ? forfactored (N = B, B+10^6, [n,fan] = N)
 time = 1,386 ms.
 ? forsquarefree (N = B, B+10^6, [n,fan] = N)
 time = 775 ms.

 ? B = 10^18; for (N = B, B+10^6, factor(N))
 time = 1min, 2,915 ms.
 ? forfactored (N = B, B+10^6, [n,fan] = N)
 time = 5,327 ms.
 ? forsquarefree (N = B, B+10^6, [n,fan] = N)
 time = 3,708 ms.
 @eprog\noindent In the last two timings, \kbd{primelimit} is respectively
 less than and much less than $\sqrt{B}$. Starting gp with a \kbd{primelimit}
 larger than $\sqrt{10^{18} + 10^6}$, say $2\cdot 10^9$, we obtain
 \bprog
 ? B = 10^15; forsquarefree (N = B, B+10^6, [n,fan] = N)
 time = 688 ms. \\ little change
 ? B = 10^18; forsquarefree (N = B, B+10^6, [n,fan] = N)
 time = 1,839 ms. \\ noticeable speedup
 @eprog\noindent
 For both of these values of $B$, $\sqrt{B+10^{6}}$ is much larger than the
 interval length $10^{6}$, so \kbd{forsquarefree} gets relatively slower for
 that reason as well.
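
 The density statement can be checked on a small range (a sketch only; the
 bound $100$ is an arbitrary choice): there are $61$ squarefree integers up to
 $100$, close to $100/\zeta(2) \approx 60.8$.
 \bprog
 ? c = 0; forsquarefree(N = 1, 100, c++); c
 %1 = 61
 @eprog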

 Note that all PARI multiplicative functions accept the \kbd{[n,fan]}
 argument natively:
 \bprog
 ? s = 0; forsquarefree(N = 1, 10^7, s += moebius(N)*eulerphi(N)); s
 time = 3,675 ms.
 %1 = 6393738650
 ? s = 0; for(N = 1, 10^7, s += moebius(N)*eulerphi(N)); s
 time = 8,399 ms. \\ slower, we must factor N. Twice.
 %2 = 6393738650
 @eprog
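
 In the same spirit, since \kbd{moebius} vanishes at integers that are not
 squarefree, a partial sum of \kbd{moebius} only needs the squarefree terms.
 A minimal sketch, with an arbitrary small bound:
 \bprog
 ? s = 0; forsquarefree(N = 1, 100, s += moebius(N)); s
 %1 = 1
 @eprog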

 The following loops over the positive fundamental discriminants not
 exceeding $X$:
 \bprog
 ? X = 3*10^8;
 ? for(d=1,X, if (isfundamental(d),))
 time = 2min, 21,177 ms.
 ? forfactored(d=1,X, if (isfundamental(d),))
 time = 1min, 54,361 ms.
 ? forsquarefree(d=1,X, D = quaddisc(d); if (D <= X, ))
 time = 1min, 43,358 ms.
 @eprog\noindent Note that in the last loop, the fundamental discriminants
 $D$ are not evaluated in order (since \kbd{quaddisc(d)} for squarefree $d$
 is either $d$ or $4d$) but the set of numbers we run through is the same.
 This variant is not worth the complication, since it runs at roughly the same
 speed as testing \kbd{isfundamental}. A faster, more complicated approach
 uses two loops. For simplicity, assume $X$ is divisible by $4$:
 \bprog
 ? forsquarefree(d=1,X/4, D = quaddisc(d))
 time = 19,744 ms.
 ? forsquarefree(d=X/4+1,X, if (d[1] % 4 == 1,))
 time = 53,662 ms.
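 @eprog\noindent The two loops take about 73 seconds in total, against about
 103 seconds for the single \kbd{forsquarefree} loop; the extra complication
 is the price we pay for a faster evaluation.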
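
 To see on a small scale that the single loop and the two-loop variant run
 through the same set, here is a sketch with $X = 20$ (a value chosen only for
 illustration). The first loop prints the discriminants out of order; the
 last two loops together print the same six values:
 \bprog
 ? X = 20;
 ? forsquarefree(d=1,X, D = quaddisc(d); if (D <= X, print1(D, " ")))
 1 8 12 5 13 17
 ? forsquarefree(d=1,X/4, print1(quaddisc(d), " "))
 1 8 12 5
 ? forsquarefree(d=X/4+1,X, if (d[1] % 4 == 1, print1(d[1], " ")))
 13 17
 @eprog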

 We can run through negative fundamental discriminants in the same way:
 \bprog
 ? forfactored(d=-X,-1, if (isfundamental(d),));
 @eprog
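
 For instance, with the small bound $X = 20$ (an arbitrary choice for this
 sketch), the same loop can print the discriminants it finds:
 \bprog
 ? forfactored(d=-20,-1, if (isfundamental(d), print1(d[1], " ")))
 -20 -19 -15 -11 -8 -7 -4 -3
 @eprog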
