Karim Belabas on Sun, 24 Dec 2017 15:57:45 +0100


Re: t_QUAD printed representation


* Sam Steingold [2017-12-22 19:59]:
>> * Karim Belabas [2017-12-22 16:21:18 +0100]:
>>> * Sam Steingold [2017-12-22 15:37]:
>>> 
>>> For clarity's sake: the issue is read/print consistency. I should be able to
>>> save and recover an object using its text representation.
>>
>> This is sadly not the case. The only way to "save and recover" an
>> *identical* object is to use writebin / read.
> 
> I assume you mean gidentical()-level of equivalence.

Yes.

> For now I assume "functional equivalence" in the spirit of CL `equal`
> (http://clhs.lisp.se/Body/f_equal.htm).
> 
> As a Lisper, I think that human-readable read/print consistency is a
> critical feature.  It would be nice if it were implemented.
> 
> > ( And even then, this
> > depends on a sane session context : variable ordering, etc... )
> 
> One could argue about relevance of "session context" at length. ;-)
> However, the basic situation: copy/paste the last textual output into
> the current prompt to get gidentical return value - seems like a
> no-brainer.

It's impossible in full generality: floating-point numbers are stored in
binary (64-bit words, say) and output in base 10. Lossless conversion
between base-10 and base-2^64 floating point is impossible in general;
every such conversion involves a rounding error. E.g.

\\ a simple float
(15:00) gp > sqrt(2)
%1 = 1.4142135623730950488016887242096980786
(15:00) gp > \x
[&=00007f93a0671300] REAL(lg=4,CLONE):0500000000000004 (+,expo=0):6000000000000000 b504f333f9de6484 597d89b3754abe9f 
                                                                                                                  ^^----- see there
\\ copy-pasted from the output above
(15:00) gp > 1.4142135623730950488016887242096980786
%2 = 1.4142135623730950488016887242096980786
(15:01) gp > \x
[&=00007f93a06713f0] REAL(lg=4,CLONE):0500000000000004 (+,expo=0):6000000000000000 b504f333f9de6484 597d89b3754abea4 
                                                                                                                  ^^----- see there
Those real numbers are "close" but certainly not functionally equivalent.
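
To make the arithmetic concrete, here is a minimal Python sketch (not gp, and not PARI's actual conversion code) that redoes this round trip with exact integer arithmetic. It assumes a 128-bit mantissa, as in the two 64-bit words of the \x dump, and the 38 significant digits shown in the transcript; the names to_decimal and from_decimal are made up for the illustration.

```python
from fractions import Fraction
from math import isqrt

BITS = 128    # assumed mantissa size: two 64-bit words, as in the \x dump
DIGITS = 38   # significant decimal digits, as printed in the transcript


def to_decimal(mantissa: int) -> str:
    """Print mantissa * 2^-(BITS-1), a value in [1, 2), rounded to DIGITS digits."""
    scaled = Fraction(mantissa, 2 ** (BITS - 1)) * 10 ** (DIGITS - 1)
    digits = str(round(scaled))
    return digits[0] + "." + digits[1:]


def from_decimal(text: str) -> int:
    """Read a decimal in [1, 2) back, rounding to the nearest BITS-bit mantissa."""
    return round(Fraction(text) * 2 ** (BITS - 1))


original = isqrt(2 ** 255)        # floor(sqrt(2) * 2^127): the binary mantissa
printed = to_decimal(original)    # base-10 text, rounded once
reread = from_decimal(printed)    # back to binary, rounded a second time

print(printed)
print(hex(original))
print(hex(reread))
print(reread - original)          # distance in units of the last bit
```

Under these assumptions the mantissa read back lies 5 units in the last place above the original, so its final byte changes from 9f to a4, which agrees with the two dumps above.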

> >> without special cases for t_QUAD (and what other types?)
> >
> > t_QUAD is the worst but there are other caveats.
> 
> Are they listed anywhere?

No, sorry. The main showstopper is inexact floating-point data. Besides
that, it would be possible BUT detrimental to readability. Here are some
further examples:

1) t_QUAD is a relatively unimportant type, and its problem could very easily
be solved by printing 'w' as 'quadgen(D)' instead. E.g. instead of

(15:01) gp > quadunit(8)
%1 = 1 + w

we would print

(15:01) gp > quadunit(8)
%1 = 1 + quadgen(8)

Less readable but possible.

2) t_FFELT for finite field elements

(15:02) gp > ffgen(8,'t)
%2 = t

(15:02) gp > ffgen(8)
%2 = ffgen(8)

Here the readability problem is more acute:

(15:03) gp > random(ffgen(8,'t)*x^10)
%3 = (t^2 + t + 1)*x^10 + (t + 1)*x^9 + t^2*x^8 + (t + 1)*x^7 + (t^2 + t)*x^6 + x^5 + (t^2 + t + 1)*x^4 + (t^2 + t)*x^3 + (t^2 + 1)*x^2 + t^2*x

would become

%3 = (ffgen(8)^2 + ffgen(8) + 1)*x^10 + (ffgen(8) + 1)*x^9 + ffgen(8)^2*x^8 + (ffgen(8) + 1)*x^7 + (ffgen(8)^2 + ffgen(8))*x^6 + x^5 + (ffgen(8)^2 + ffgen(8) + 1)*x^4 + (ffgen(8)^2 + ffgen(8))*x^3 + (ffgen(8)^2 + 1)*x^2 + ffgen(8)^2*x
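
The size of that blow-up is easy to estimate with a purely textual sketch. The Python snippet below is an illustration only: expand_generator is a hypothetical helper, not a PARI function, and it simply rewrites each standalone generator name into the constructor call that would be printed instead.

```python
import re


def expand_generator(text: str, name: str, constructor: str) -> str:
    """Replace each standalone occurrence of the variable `name` by `constructor`."""
    return re.sub(rf"\b{re.escape(name)}\b", constructor, text)


# First three terms of the random polynomial shown above.
short = "(t^2 + t + 1)*x^10 + (t + 1)*x^9 + t^2*x^8"
expanded = expand_generator(short, "t", "ffgen(8)")

print(expanded)
print(len(short), "->", len(expanded))   # character count before and after

# The same substitution applied to the t_QUAD example.
print(expand_generator("1 + w", "w", "quadgen(8)"))
```

Every occurrence of a one-character name turns into an eight-character call, so the growth is proportional to the number of coefficients that involve the generator.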

3) Further minor issues that can easily be avoided, e.g.

(15:22) gp > matrix(0,5)
%4 = [;]  \\ all 0 x n matrices are printed the same with default output format

(15:22) gp > \o0
   output = 0 (raw)
(15:22) gp > matrix(0,5)
%5 = matrix(0,5)

>>> IOW, for any object x and any string s, the following should hold:
>>> 
>>> gequal(gp_read_str(GENtostr_raw(x)),x) = 1
>>> strcmp(s,GENtostr_raw(gp_read_str(s))) = 0
>>
>> It doesn't hold in general
> 
> I noticed :-(
> However, do you agree that, in general, it _should_ hold?
> ("normatively", not "positively").

Normatively, yes. But it must not make the output harder to read or understand
(for the majority of users...).
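
As a rough analogy outside PARI, the two identities quoted above can be tried with Python's repr and ast.literal_eval standing in for GENtostr_raw and gp_read_str. This stand-in is an assumption made purely for illustration, and the helper names are hypothetical. The sketch suggests that the second identity can only hold for strings that are already in canonical printed form.

```python
import ast


def value_roundtrips(x) -> bool:
    """First identity: reading back the printed form yields an equal object."""
    return ast.literal_eval(repr(x)) == x


def text_roundtrips(s: str) -> bool:
    """Second identity: printing the object read from s yields s again."""
    return repr(ast.literal_eval(s)) == s


print(value_roundtrips([1, 2, (3, 4)]))     # object -> text -> object
print(text_roundtrips("[1, 2, (3, 4)]"))    # canonical spelling
print(text_roundtrips("[1,2,(3,4)]"))       # same object, different spacing
print(text_roundtrips("0x10"))              # same integer, different base
```

Only the first two checks succeed: the last two strings denote perfectly readable objects, but the printer has no way of knowing which spelling was used on input.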

>> the only way to reliably transmit a completely general object is to
>> serialize it and use binary data (we use this intensively, e.g. for the MPI
>> interface), not convert to string and back.
> 
> Of course.
> Some objects (e.g., an open stream) cannot be serialized.
> However, if you CAN serialize an object, you should.
> And if you cannot, the serialization (=== printed representation) should
> indicate that this string cannot be read back.
> E.g., Lisp uses the "#<...>" syntax for objects that cannot be read back.
> 
>> What is your exact use case ?
> 
> I do not have a specific use case.  To offer a straw man argument, if I
> observed "1+4=7", would you ask me for a use case too? ;-)
> Or, in reference to the recent Pandas havoc, if GP raised an error on
> "sum(i=1,0,1/i)" -- would you ask me for a use case before fixing to
> return 0?  ;-)

No, but that is a question of mathematical correctness, not a
backward-incompatible change to the user interface.

Anyway, I see the "copy/paste doesn't always work" problem as a bug of
the 'minor annoyance' class. "Fixing" it in all cases is impossible;
fixing specific instances involves some work and might result in a
bigger annoyance for more users.

So I ask what you actually want to achieve, besides general design
principles I can agree with :-).

For instance, we can make the 'raw output format' (\o0 or default(output,0),
which is NOT the default) output t_QUAD using quadgen() and t_FFELT using
ffgen(). Would that be useful? Sufficient? Annoying?

Cheers,

    K.B.
--
Karim Belabas, IMB (UMR 5251)  Tel: (+33) (0)5 40 00 26 17
Universite de Bordeaux         Fax: (+33) (0)5 40 00 21 23
351, cours de la Liberation    http://www.math.u-bordeaux.fr/~kbelabas/
F-33405 Talence (France)       http://pari.math.u-bordeaux.fr/  [PARI/GP]