Ewan Delanoy on Tue, 26 Aug 2025 15:29:16 +0200



Re: Problem with storage of large data


Dear Aurel,

>You should
>1/ use readvec for large data
>2/ not perform operations in data files (in particular, do not store polynomials as sums of products coef * x^i)

Indeed, that is very helpful, thanks a lot!

In my case, the data took a few seconds to write and about a minute to read back (as you say, this probably comes from reducing the fractions and from copying).
For the record, here is the code I wrote, inspired by your suggestions:


store_list_of_univariates_in_readvec_form(list_of_polys, vaar, common_degree, filename_prefix) = {
   my(file_for_numers, file_for_denoms, common_size, poly, coeff);
   file_for_numers = concat(filename_prefix, "_numerators");
   file_for_denoms = concat(filename_prefix, "_denominators");
   common_size = common_degree + 1;
   for(j = 1, length(list_of_polys),
      poly = list_of_polys[j];
      \\ store coefficients in decreasing degree order, matching Pol()'s convention
      for(k = 1, common_size,
         coeff = simplify(polcoeff(poly, common_size - k, vaar));
         write(file_for_numers, numerator(coeff));
         write(file_for_denoms, denominator(coeff));
      );
   );
}

retrieve_list_of_univariates_using_readvec(vaar, common_degree, filename_prefix) = {
   my(file_for_numers, file_for_denoms, numers, denoms, nbr_of_coeffs, coeffs,
      nbr_of_polys, common_size);
   file_for_numers = concat(filename_prefix, "_numerators");
   file_for_denoms = concat(filename_prefix, "_denominators");
   numers = readvec(file_for_numers);
   denoms = readvec(file_for_denoms);
   nbr_of_coeffs = length(numers);
   coeffs = vector(nbr_of_coeffs, j, numers[j] / denoms[j]);
   common_size = common_degree + 1;
   nbr_of_polys = nbr_of_coeffs / common_size;
   return(vector(nbr_of_polys, j, Pol(coeffs[(j-1)*common_size+1 .. j*common_size], vaar)));
}



/*

Test on a very small example

original_list=[Pol([5/6,-2/3,7,11,-1/57],x),Pol([8,1],x),Pol([2,-1,0,0,0],x)]
store_list_of_univariates_in_readvec_form(original_list,x,4,"amy");
retrieved_list=retrieve_list_of_univariates_using_readvec(x,4,"amy");
compare_lists=(retrieved_list==original_list)


*/

---- On Mon, 25 Aug 2025 11:59:17 +0200 Aurel Page <aurel.page@normalesup.org> wrote ---

Dear E.D.,

You should
1/ use readvec for large data
2/ not perform operations in data files (in particular, do not store polynomials as sums of products coef * x^i)

Example:

\\fake data
nbpol = 700;
dgpol = 71;
lgpol = dgpol+1;
ncoefs = nbpol*lgpol;
vN = vector(ncoefs,i,random(10^10000));
vD = vector(ncoefs,i,random(10^10000));
for(i=1,ncoefs,write("fileN",vN[i]));
for(i=1,ncoefs,write("fileD",vD[i]));

\\read data
vN = readvec("fileN");
vD = readvec("fileD");
vrat = vector(ncoefs,i,vN[i]/vD[i]);
vpol = vector(nbpol,i,Pol(vrat[lgpol*(i-1)+1 .. lgpol*i]));

\\basic sanity check
Set(apply(poldegree,vpol))

Maybe a file in readvec format, with each line being Pol([ list of coefficients as rational numbers ]), would also work; I have not tried it.
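That untried alternative could be sketched as follows (the filename "filePols" is hypothetical; `list_of_polys` stands for any vector of univariate polynomials). The point is to write the literal text `Pol([...])` on each line, so that reading back performs one Pol() call per line rather than many multiplications and powers:

```gp
\\ write each polynomial as a literal Pol([coefficients]) expression, one per line;
\\ Vec(p) lists the coefficients in decreasing degree order, which is what Pol() expects
for(j = 1, length(list_of_polys),
   write("filePols", Str("Pol(", Vec(list_of_polys[j]), ")")));

\\ readvec evaluates each line, returning the vector of polynomials directly
vpol = readvec("filePols");
```

Note that Vec(p) only lists coefficients up to the actual degree of p, so polynomials of differing degrees are handled naturally, but each line's parse still has to evaluate the rational coefficients.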

This could also be made more efficient in library mode (avoid copies, do not compute gcd between numerator and denominator, etc).

Cheers,
Aurel

On 25/08/2025 11:12, Ewan Delanoy wrote:


I've got a large (1.2 GB, around 100000 lines) gp file that the GP interpreter cannot read: when I call `read` on that file, the interpreter, after a few minutes, prints "Killed" and exits abruptly.

The issue is purely one of file size, because the operations in the file are very elementary: it only uses addition, multiplication and accessing/setting elements of an array.

My intent was to store the results of somewhat long computations. The values involved are all univariate polynomials, and the heavy core of the data is a set of about 700 polynomials of degree 71, whose coefficients are rationals whose numerators and denominators have up to 10000 digits.

When I simply and naively use the `write` function to put all this data into a file, I get this large GP-unreadable file.

Any ideas on how to make this storage work?

E. D.