Pascal Molin on Wed, 05 Feb 2014 13:29:44 +0100


[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

Re: handling several substacks ?


I was about to report exactly the same experiment (the range N=50000, K=1000 is actually
the target range): the problem was only my low_stack error.

Interestingly, the pari size does not have any influence

(13:21) gp > default(parisize,"8M")
  ***   Warning: new stack size = 8000000 (7.629 Mbytes).
(13:21) gp > cloop1(50000,700)
time = 14,841 ms.

(13:21) gp > default(parisize,"128M")
  ***   Warning: new stack size = 128000000 (122.070 Mbytes).
(13:21) gp > cloop1(50000,700)
time = 14,659 ms.

which confirms that the gerepilecopy is cheap.

Thank you very much for the answers,

-- 
Pascal



2014-02-05 Bill Allombert <Bill.Allombert@math.u-bordeaux1.fr>:
On Wed, Feb 05, 2014 at 12:20:04PM +0100, Karim Belabas wrote:
> * Bill Allombert [2014-02-05 11:49]:
> > On Wed, Feb 05, 2014 at 10:52:13AM +0100, Pascal Molin wrote:
> > > I have to deal with the following kind of loops:
> > >   {
> > >     /* main loop */
> > >     pari_sp av1 = avma, lim1 = stack_lim(avma, 1);
> > >     long n;
> > >     for (n = 1; n <= N; ++n)
> > >     {
> > >         long i, j, k;
> > >         i = random_Fl(L)+1;
> > >         j = random_Fl(L)+1;
> > >           for (k = 1; k <= K; ++k)
> > >           {
> > >             gel(gel(a, i), k) = gmul( gel(gel(a, j), k), gel(gel(m, i), k));
> > >             if (avma > lim1)
> > >               a = gerepilecopy(av1, a);
> [...]
> > Try the new C file in attachment.
> >
> > ? gploop()
> > time = 2,280 ms.
> > %4 = 0
> > ? cloop()
> > time = 1,028 ms.
> > %5 = 0
>
> To compare with Bill's implementation of the recommanded approach,
> here's the complete version using clones (cloop2).
>
> (12:00) gp > cloop(50000,1000,8)
> time = 15,912 ms.
>
> (12:00) gp > cloop2(50000,1000,8)
> time = 23,688 ms.
>
> (12:00) gp > gploop(50000,1000,8)
> time = 35,580 ms.
>
> So the recommanded way is a clear winner in ordinary situations. As expected,
> clones are less efficient (here we use them in the simplest
> possible way, and do little useful work, so their overhead is large).
>
> But thay have a much better worse case behaviour: in "desperation mode,
> when too little memory was available to start with, low_stack will be
> triggered frequently; in the worse case, each scalar operation triggers
> a gerepilecopy of the full array...

But this assumes there is extra memory available for the clone,
in which case it would be possible to use a larger stack.

Cheers,
Bill.