Karim Belabas on Sun, 24 Dec 2017 15:57:45 +0100
Re: t_QUAD printed representation

* Sam Steingold [2017-12-22 19:59]:
>> * Karim Belabas [2017-12-22 16:21:18 +0100]:
>>> * Sam Steingold [2017-12-22 15:37]:
>>>
>>> For clarity sake: the issue is read/print consistency. I should be able to
>>> save and recover an object using its text representation
>>
>> This is sadly not the case. The only way to "save and recover" an
>> *identical* object is to use writebin / read.
>
> I assume you mean gidentical()-level of equivalence.

Yes.

> For now I assume "functional equivalence" in the spirit of CL `equal`
> (http://clhs.lisp.se/Body/f_equal.htm).
>
> As a Lisper, I think that human-readable read/print consistency is a
> critical feature. It would be nice if it were implemented.
>
> > ( And even then, this
> > depends on a sane session context : variable ordering, etc... )
>
> One could argue about relevance of "session context" at length. ;-)
> However, the basic situation: copy/paste the last textual output into
> the current prompt to get gidentical return value - seems like a
> no-brainer.

It's impossible in full generality: floating point numbers are stored in
binary (64-bit words, say) and output in base 10. Lossless conversion between
floating point base-10 and base-2^64 is impossible: every such conversion
involves a rounding error. E.g.

  \\ a simple float
  (15:00) gp > sqrt(2)
  %1 = 1.4142135623730950488016887242096980786
  (15:00) gp > \x
  [&=00007f93a0671300] REAL(lg=4,CLONE):0500000000000004 (+,expo=0):6000000000000000 b504f333f9de6484 597d89b3754abe9f
                                                                                                                    ^^----- see there

  \\ copy-pasted from the output above
  (15:00) gp > 1.4142135623730950488016887242096980786
  %2 = 1.4142135623730950488016887242096980786
  (15:01) gp > \x
  [&=00007f93a06713f0] REAL(lg=4,CLONE):0500000000000004 (+,expo=0):6000000000000000 b504f333f9de6484 597d89b3754abea4
                                                                                                                    ^^----- see there

Those real numbers are "close" but certainly not functionally equivalent.

> >> without special cases for t_QUAD (and what other types?)
> >
> > t_QUAD is the worst but there are other caveats.
>
> Are they listed anywhere?

No, sorry. The main showstopper is inexact floating point data. Besides that
it would be possible BUT detrimental to readability. Here are some further
examples

1) t_QUAD is a relatively unimportant type, and the problem could be solved
very easily by printing 'w' as 'quadgen(D)' instead. E.g. instead of

  (15:01) gp > quadunit(8)
  %1 = 1 + w

we would print

  (15:01) gp > quadunit(8)
  %1 = 1 + quadgen(8)

Less readable but possible.

2) t_FFELT for finite field elements

  (15:02) gp > ffgen(8,'t)
  %2 = t

  (15:02) gp > ffgen(8)
  %2 = ffgen(8)

Here the readability problem is more acute:

  (15:03) gp > random(ffgen(8,'t)*x^10)
  %3 = (t^2 + t + 1)*x^10 + (t + 1)*x^9 + t^2*x^8 + (t + 1)*x^7 + (t^2 + t)*x^6 + x^5 + (t^2 + t + 1)*x^4 + (t^2 + t)*x^3 + (t^2 + 1)*x^2 + t^2*x

would become

  %3 = (ffgen(8)^2 + ffgen(8) + 1)*x^10 + (ffgen(8) + 1)*x^9 + ffgen(8)^2*x^8 + (ffgen(8) + 1)*x^7 + (ffgen(8)^2 + ffgen(8))*x^6 + x^5 + (ffgen(8)^2 + ffgen(8) + 1)*x^4 + (ffgen(8)^2 + ffgen(8))*x^3 + (ffgen(8)^2 + 1)*x^2 + ffgen(8)^2*x

3) Further minor stuff, that can be easily avoided, e.g.

  (15:22) gp > matrix(0,5)
  %4 = [;]
  \\ all 0 x n matrices are printed the same with default output format
  (15:22) gp > \o0
     output = 0 (raw)
  (15:22) gp > matrix(0,5)
  %5 = matrix(0,5)

>>> IOW, for any object x and any string s, the following should hold:
>>>
>>> gequal(gp_read_str(GENtostr_raw(x)),x) = 1
>>> strcmp(s,GENtostr_raw(gp_read_str(x))) = 0
>>
>> It doesn't hold in general
>
> I noticed :-(
> However, do you agree that, in general, it _should_ hold?
> ("normatively", not "positively").

Normatively, yes. But it must not make the output harder to read or
understand (for the majority of users...).

>> the only way to reliably transmit a completely general object is to
>> serialize it and use binary data (we use this intensively e.g for MPI
>> interface), not convert to string and back.
>
> Of course.
> Some objects (e.g., an open stream) cannot be serialized.
> However, if you CAN serialize an object, you should.
> And if you cannot, the serialization (=== printed representation) should
> indicate that this string cannot be read back.
> E.g., Lisp uses the "#<...>" syntax for objects that cannot be read back.
>
>> What is your exact use case ?
>
> I do not have a specific use case. To offer a straw man argument, if I
> observed "1+4=7", would you ask me for a use case too? ;-)
> Or, in reference to the recent Pandas havoc, if GP raised an error on
> "sum(i=1,0,1/i)" -- would you ask me for a use case before fixing to
> return 0? ;-)

No, but this is a question of mathematical correctness, not a backward
incompatible change to user interface.

Anyway, I see the "copy/paste doesn't always work" problem as a bug of
the 'minor annoyance' class. "Fixing" it in all cases is impossible; fixing
specific instances involves some work and might result in a bigger annoyance
for more users.

So I ask what you actually want to achieve, besides general design principles
I can agree with :-).

For instance we can make 'raw output format' (\o0 or default(output,0),
NOT the default) output t_QUAD using quadgen() and t_FFELT using ffgen().

Would that be useful ? sufficient ? annoying ?

Cheers,

    K.B.
--
Karim Belabas, IMB (UMR 5251)  Tel: (+33) (0)5 40 00 26 17
Universite de Bordeaux         Fax: (+33) (0)5 40 00 21 23
351, cours de la Liberation    http://www.math.u-bordeaux.fr/~kbelabas/
F-33405 Talence (France)       http://pari.math.u-bordeaux.fr/  [PARI/GP]
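
The sqrt(2) example in this message can be replayed outside gp with exact integer arithmetic. The sketch below is Python, not GP, and it is only a simulation: it assumes a 128-bit mantissa (the two 64-bit words in the \x dump), 38 printed significant digits (as in the gp output), and round-to-nearest when reading the text back. The names `to_text` and `from_text` are hypothetical helpers, not PARI functions, and nothing here claims to mirror PARI's internal conversion routines.

```python
from decimal import Decimal, getcontext
from fractions import Fraction
from math import isqrt

BITS = 128    # assumed mantissa size: two 64-bit words, as in the \x dump
DIGITS = 38   # assumed number of printed significant digits

def to_text(mantissa: int) -> str:
    """Print mantissa / 2^(BITS-1), rounded to DIGITS significant digits."""
    getcontext().prec = DIGITS
    return str(Decimal(mantissa) / Decimal(1 << (BITS - 1)))

def from_text(text: str) -> int:
    """Read a decimal string back as the nearest BITS-bit mantissa."""
    return round(Fraction(text) * (1 << (BITS - 1)))

# Mantissa of sqrt(2) truncated to 128 bits: floor(sqrt(2) * 2^127).
m1 = isqrt(1 << (2 * BITS - 1))

s = to_text(m1)     # decimal text, like the %1 line in the gp session
m2 = from_text(s)   # mantissa recovered from that text

print(s)
print(hex(m1))      # low hex digits end in ...9f
print(hex(m2))      # low hex digits end in ...a4
print(m2 - m1)      # distance in units of the last mantissa bit
```

Under these assumptions the two mantissas differ in their low bits, which is the same discrepancy the two \x dumps show: the 38-digit text is a rounded value, and reading it back lands on a neighbouring binary number rather than the original one.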