Re: Why Mathematica does not issue a warning when the calculations
- To: mathgroup at smc.vnet.net
- Subject: [mg117649] Re: Why Mathematica does not issue a warning when the calculations
- From: Richard Fateman <fateman at eecs.berkeley.edu>
- Date: Tue, 29 Mar 2011 06:53:29 -0500 (EST)
On 3/26/2011 10:22 PM, Daniel Lichtblau wrote:

> Richard, if you choose to follow up, please either use an email client
> that respects maintaining of line spacing or else manually insert line
> spaces. I've been spending time disentangling our various remarks. I
> send them out well spaced, only to see them elided in replies.

I find this annoying too. I'm using Mozilla Thunderbird 3.1.9, and Googling about line wrap reveals other complaints. Either it wraps after a preset number of characters (I found in the advanced settings that it should wrap at 72) or sometimes it doesn't. But if a quoted section adds a character or two to the left, then the line wrapping happens extra times. I suspect this has to do with non-HTML formatting. At this point I think I have inherited HTML formatting from the mail you sent me, in which the line above (that is, yours) is all one line. As is this line. I ordinarily prefer fixed-width text lines so I can be assured of the spacing of math text. Sometimes I find other people's mail just runs off the right side of the page, and sometimes it is annoyingly wrapped, as mine is. So I'm not doing it deliberately, though exactly how to fix it in Thunderbird 3.1.9 has eluded me. As I said, this is all being typed on one line. I hope you can read it.

>> (RJF) We've discussed this before. If you have a true function f, that
>> is a procedure whose results depend only on its argument(s), then when
>> A is equal to B, f(A) should be equal to f(B). This is fundamental to
>> programming and to mathematics, but is violated by Mathematica.

>> (DL) Probably by other mathematical programming languages as well.
>> Especially in the operating world of finite precision arithmetic. It's
>> something people often get used to (of necessity, if you work on this
>> stuff).

>> (RJF...[Why is it I need to reformat to show who wrote what?--dl])

I think this is text mode vs. HTML mode. My mailer now has decided to use a variable-width font. This too is annoying, and I know how to change it back, but only temporarily.

>> I disagree that this is a common problem, although it would probably
>> not be allowed to discuss particular other mathematical programming
>> languages in this forum. So far as I can tell, it has nothing to do
>> with finite precision arithmetic unless you have some notion that
>> "equality" means something like "close enough for government work".

> Numeric conditioning combined with truncation or cancellation error can
> have the effect of messing up the proposed equivalence. You actually
> show that quite clearly below, in analyzing the Mathematica behavior of
> the original example at machine precision.

This is simply not my understanding of any language (an exception: Iverson's APL had "approximately equal", I think). If two numbers x, y in Fortran are equal, then any arithmetic function f applied to x will be equal to the same function f applied to y. (There are exceptions to this if the function f is not arithmetical, or if x, y are perhaps not numbers. Examples: if you can ask "what is the address in memory of x" and "what is the address in memory of y", the answers might be equal or not, but that is not an arithmetic function. A second exception is that in IEEE 754 standard binary floats, +0 and -0 are equal, yet you can distinguish them with a function that looks at their sign; however, ordinary non-trapping arithmetic on +0 should produce exactly the same bit configuration as on -0. Also, not-a-numbers (NaNs) are supposed to compare not-equal EVEN IF THEY HAVE THE SAME BITS.)
If you have two numbers that are equal and you subtract them, then cancellation removes all the bits and you get zero; you don't have any cancellation error. If you have two numbers that are equal and you truncate them both to the same number of bits, they are still equal. If you have two numbers that are equal and you truncate them to different numbers of bits, then they may be different or the same. If you have two numbers that have different numbers of bits and you compare them, you have to decide whether to discard the extra bits on the longer number, or extend the shorter one (presumably with zeros). If x and y have the same initial n bits, but x has an additional non-zero string of m bits, then the two numbers are not the same identical number, except Mathematica says so. Not just numerically equal (==) but identical. Languages that support both single and double floats (or additional float formats) as well as various lengths of integers have to address this problem, say if x and y were of different types. They could simply give a type-mismatch error and object that one cannot compare for numerical equality integers of differing lengths 8, 16, 32, 64 ... Or in comparing 8- and 16-bit quantities they could compare only the first 8 bits and say they were equal. Or extend the 8-bit quantity with zeros, or extend it with the random 8 bits following it in memory. And similarly for doing arithmetic. But if two quantities have different bit configurations, I doubt that it makes sense to say they are IDENTICAL.

There is another set of objects, somewhat related here: Intervals.

  x = Interval[{-1,1}]
  y = Interval[{-1,1}]

There are several questions that can be asked. Which of the following makes sense mathematically, and does it agree with what Mathematica returns?

  x==x ?  x===x ?  x==y ?  x===y ?  x-y==0 ?  x-x==0 ?

The answers to these depend on whether Mathematica recognizes dependent intervals or not. It apparently does not. There is a substantial literature on "Reliable Computation"; Mathematica's overloading of "==" for Interval comparison is probably risky. Oh, Infinity==Infinity, but ... And perhaps I found a bug: Max[Indeterminate,Infinity] --> Indeterminate ???

> Significance arithmetic might exacerbate this in different cases, but
> the issue is quite present with fixed precision. Maybe an example here
> of another language or system where this issue occurs would help?

> (RJF)
>> In Mathematica there are two kinds of equality: == and === . The
>> latter means IDENTICAL. Consider a=SetPrecision[2,40],
>> b=SetPrecision[2,50]. I could sympathize with a==b being true, but
>> certainly not a===b, which says they are IDENTICAL, and in my book,
>> f[a] must equal f[b] for all functions f. Yet let f=Precision, and
>> you see we get different answers.

> SameQ for numeric inputs has semantics as thus: if both are
> approximate, compare to last bit (or possibly next-to-last, I do not
> recall that specific detail). You may not care for those semantics, but
>   RJFDislike =!= SemanticsAreWrongOrEvenNecessarilyBad
> (Just evaluate that in Mathematica if you don't believe me...)

It says except for the last bit. Why? Why not require them to be identical: same type, same precision, same accuracy? It seems to me that Mathematica owes us some OTHER test for really, really identical. In Michael Trott's Mathematica Guidebook for Numerics he suggests that Experimental`$SameQTolerance would affect this, but apparently not in my version of Mathematica, where it is missing.
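For concreteness, here is a short transcript of both complaints, the ==/===/Precision one and the Interval one (results are as described in this thread; details may vary by version):

  a = SetPrecision[2, 40]; b = SetPrecision[2, 50];
  a == b          (* True *)
  a === b         (* True: SameQ treats approximate numbers that agree
                     to (about) the last bit as identical, regardless of
                     their stated precision *)
  Precision[a]    (* 40. *)
  Precision[b]    (* 50. *)
  (* so a === b, yet f[a] and f[b] differ for f = Precision *)

  x = Interval[{-1, 1}]; y = Interval[{-1, 1}];
  x - x           (* Interval[{-2, 2}]: the two occurrences of x are
                     treated as independent intervals, i.e. there is no
                     dependent-interval tracking *)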
Could it be that other people thought that this last-bit "semantics" was maybe not something to be set in concrete, at least at some previous time?

> (RJF)
>> 2.000000000000000001000 == 2 is False, but
>> 2.00000000000000000100 == 2 is True.
>>
>> Such rules of the road lead to situations in which a==b but a-b is
>> not zero,

> True. As you are aware from numerics literature, there will always be
> anomalies of that sort (and I also am not thrilled with that one).

Perhaps you could point to a situation in a major programming language outside of Mathematica in which a and b are legitimate finite-value representable floating-point scalar numbers that are equal and yet a-b is not zero?

> It is a matter of implementing semantics that we deem preferable from
> amongst suboptimal choices. Some undesirable features will manifest
> regardless. I think you know that. I agree that there are threads
> connecting significance arithmetic to equality testing.
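Spelled out with the second of the two numbers quoted above (behavior as reported in this thread; version-dependent):

  a = 2.00000000000000000100;   (* a 21-digit bignum, equal to 2 + 10^-18 *)
  a == 2       (* True, per the tolerance built into Equal *)
  a - 2        (* a nonzero bignum of roughly 1.*10^-18 *)
  a - 2 == 0   (* False: so a == 2 and yet a - 2 is not zero *)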
> (RJF)
>> This is not the understanding in Mathematica, where (for software
>> bigfloats) 2.000000000000000000 is more like shorthand for "all
>> numbers between 1.9999999999999999995 and 2.0000000000000000005"
>> (DL may be willing and able to explain what Mathematica REALLY does)

>> (DL, Once again inserting demarcation...)
>> Something along the lines you suggest. Your alternative approach is
>> fine for handling of integers. It does not work so well for inputs
>> such as 1.3 which do not translate exactly to finite binary
>> representations. But Mathematica semantics would still be violated by
>> upping precision, and that in no way contradicts the documentation
>> remarks about ratcheting precision internally, as they refer to
>> handling of exact input (N[2,10] et al are not exact, and they are
>> what FractionalPart sees).

>> (RJF, Flailing at a badly maimed equine)
>> The simplest way of having consistent mathematical semantics in a
>> programming language is to have the programming language obey the
>> rules of mathematical semantics.

>> (DL interjecting)
> Gimme a break. (That's what mathematicians say when computer
> scientists lecture them about the dark interstices between the fields.)

> (RJF, continuing)
>> Each step away from this, as imposed by the reality of finiteness of
>> computers etc., tends to add hazards. That's why floating-point
>> arithmetic is cluttered with underflow/overflow/rounding etc. that
>> does not occur with rational arithmetic. There has to be a tradeoff
>> in utility vs deviation from mathematics for each such step. In my
>> view Mathematica adds more steps and therefore more hazards.

> This I might agree with, in a literal sense. But those hazards, in my
> view, are less than what we gain by departing from fixed precision
> arithmetic. That's because I rather like error control features.

> (RJF)
>> ... snip ... about Significance arithmetic.

>> (DL) I think there may be some since then. No matter. We find it
>> works well for what we need.

Apparently.

>> Your view, roughly, is that we picked it up off the garbage heap of
>> numerics literature.

>> (RJF) No, I speculate that Wolfram took the idea that is taught in
>> Physics 101 lab about carrying significant figures when you are
>> combining measurements, and thought that it would be a great idea to
>> put this into a programming language. He might even have thought that
>> he invented it. (That is, a programming language with significance
>> arithmetic). I truly doubt that Wolfram studied any serious numerical
>> analysis literature, and I would certainly not expect him to study
>> its garbage heap :)

> This is a few years before my time, but I think it was more Jerry
> Keiper's idea than Stephen Wolfram's that Mathematica use significance
> arithmetic.

>> (DL) Perhaps so. In all the conferences I've attended, I've only once
>> heard it disparaged (not derided, but disparaged) as a bad method.

> (RJF)
>> For disparagement, one need not look any further than the wikipedia
>> article, at least last time I looked.

> Well, there you have it. Right in print, on the Internet. My apologies.

You can find disparagement in other places (with authors) too. The current article seems substantially less critical than some past versions. I wonder why :)

>> (RJF) I thought you were claiming that Mathematica had the lion's
>> share of bigfloat usage.

>> (DL) No. Though I suspect it does, once you get past the 200 digit
>> mark or so.

> (RJF)
>> Past the 200 (decimal) digit mark, I think most usage would be in
>> trying to prove/disprove number-theory conjectures, though there are
>> some poorly formulated geometry problems that might require this.
>> Unless Mathematica gives really direct access to libraries, it would
>> not be a good choice. But I think it (eventually) calls GMP and such.

> Computation at higher precision does show up in useful applications.
> One is in numeric solving a la Mathematica's NSolve (uses relatively
> high precision for intermediate computations). Another is, as you
> suggest, in handling of ill-conditioned problems. I think there are
> more applications, using large approximate bignums in lieu of even
> more unwieldy exact numbers.

> (RJF)
>> OK, if we are going to talk about the original post, the answer looks
>> absurd. To remind the patient readers, if there are any left:
>>
>>   N[FractionalPart[(6 + Sqrt[2])^20]]
>>
>> Result: -160. (Though I get -64.)
>>
>> Why does this appear absurd? For a starter, that expression is
>> non-negative, so the fractional part of it would have to be
>> non-negative. Next, the fractional part p would have to obey the
>> relation 0 <= p < 1.0. So the answer -160 is crazy, and
>> FractionalPart would seem to be nonsense. Actually, it is not total
>> nonsense, just somewhat silly. It is apparently computed as
>>
>>   FractionalPart[x] = x - IntegerPart[x]
>>
>> where IntegerPart works pretty hard to work for arbitrary
>> expressions, in this case resulting in
>> (6 + Sqrt[2])^20 - 251942729309018542, which is correct if
>> inconvenient.
>>
>> Now it appears that the next part of the computation, N[%], is the
>> guilty party, and that it might say "uh oh, there are too many digits
>> there for me to do that calculation in hardware floats without
>> catastrophic loss of accuracy", but instead of saying that, it just
>> burps out an answer. Doing N[%,3] invokes the software floats and
>> indeed uses enough precision to get 3 digits right IN THE ANSWER,
>> which is nice. Clearly 3 digits internally is insufficient.
>>
>> It seems to me that a cure for all this is to allow IntegerPart and
>> FractionalPart to operate only on explicit rational and float
>> expressions. Any float expression that has more digits to the left of
>> the decimal (actually binary) point than precision is equal to a
>> particular integer, and has zero fractional part.
>>
>> Among other advantages: the result can be explained.

> Reasonable semantics, though not what I'd want to have in Mathematica
> (which, to reiterate, I think is doing what it ought in this example).
> This proposed handling could be something to suggest for another
> (present or future) language though.
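For anyone retracing the example, the sequence is roughly as follows (outputs as reported in this thread; they may differ by version):

  FractionalPart[(6 + Sqrt[2])^20]
    (* the exact symbolic residue (6 + Sqrt[2])^20 - 251942729309018542 *)

  N[%]
    (* a large negative machine number (reported above as -160. or -64.);
       the subtraction cancels about 18 digits, more than machine floats
       carry, and no warning is issued *)

  N[FractionalPart[(6 + Sqrt[2])^20], 3]
    (* a value in [0, 1) with 3 correct digits; the software bigfloats
       raise the working precision until the requested digits survive *)

And a minimal sketch of the restricted semantics proposed above, written as a hypothetical user-level function (the name fractionalPartStrict is mine, not anything built in):

  (* accept only explicit numbers; a float whose digits all lie to the
     left of the point is treated as an integer with zero fractional part *)
  fractionalPartStrict[x : (_Integer | _Rational)] := FractionalPart[x]
  fractionalPartStrict[x_Real] :=
    If[x != 0 && Floor[Log[10, Abs[x]]] + 1 > Precision[x],
       0,
       FractionalPart[x]]
  fractionalPartStrict[_] := $Failed   (* refuse symbolic expressions *)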
> (RJF)
>> My impression is that the actual primary use of Grobner basis
>> computations is to write papers on Grobner basis computations, and
>> the other "uses" are "in principle" "demonstrations". This may have
>> changed with a more numerical approach, however.

> (DL)
>> Your impression is off insofar as industrial usage goes. For example,
>> we use them in NSolve, exact equation solving, some polynomial
>> algebra, and probably a few places I am forgetting.

> (RJF)
>> You are right. I should have expanded my statement to include:
>> "primary use of Grobner basis computations is to write papers AND
>> PROGRAMS ...". Unfortunately, including GB in "NSolve" (etc.) does
>> not, to me, really represent an application of GB. It represents a
>> situation in which a well-intentioned programmer presents an
>> opportunity for someone to use GB to solve a problem.

> Interesting view. I had not considered this, but maybe you are right.

> (RJF)
>> With luck the program may be applied in simple circumstances that
>> permit a simple solution. But a so-called application in "commutative
>> algebra" does not represent an application. (Here, I am perhaps going
>> out on a limb saying that I won't count "an application to pure
>> mathematics" as a true application.)

> I'm fine with that. But there have been uses of NSolve in non-math R&D
> settings. At least a handful that have been brought to my attention.
> And if it works fine, why would anyone bring it to my attention.

> (RJF)
>> The rather extensive wikipedia article on the topic lists many many
>> applications, most of which are of the type I do not count, or are of
>> the form "in principle we could do this with GB, and so it is an
>> application", though in fact no one could ever realistically use GB
>> for it except on exceedingly toy-like instances. (Occasionally one
>> can let a problem run for a very long time if you only have to solve
>> it one time, as in some robotics problems.)

> Again, there are non-toy applications that use this technology. To
> what extent they represent the spectrum of desired uses I do not know.

> (RJF)
>> I have only limited personal experience with trying to use GB, on 2
>> kinds of problems: oil reservoir simulation and the inverse problem
>> of imaging via CAT scan or similar.

> These are interesting areas, but very likely not in the scope of what
> GB-based methods can handle. That does not make such methods a bad
> tool, just one that is not appropriate for these uses, or at least not
> in the formulation you tried (probably not usable in any formulation,
> if the problem size was spectacularly large: global equation solving
> has exponential dependency on dimension).

> (RJF)
>> A recent (2008) PhD thesis by M.-L. Torrente on applications to the
>> oil industry seems to show the situation has not changed. That is,
>> casting a problem in terms of GB is unlikely to be a useful step in
>> realistically solving that problem. This does not prevent the writing
>> of a paper exploiting a vastly simplified formulation.

> Nor does it mean that GB-based methods are generally bad. Just not a
> good tool for this class of problems.
> (RJF)
>> While the numbers may change somewhat, consider that you could solve
>> a problem characterized by 6 or 10 parameters and 3 variables using
>> GB. Increasing the number of variables may increase the computation
>> cost exponentially. In practice there will be 10,000 parameters and
>> 200 variables.
>>
>> I am not, incidentally, saying there is an alternative symbolic
>> method to replace GB that is much faster. Instead, I am suggesting
>> that if you can reframe and simplify your question as a numerical
>> procedure, you stand a much better chance of getting a result.

> We are in heated agreement here.
>
> Sometimes I think it is a Good Thing when a methodology completely
> fails to scale up, as it lets the person know it would be wise to
> reframe the problem.
>
> On the subject of monster polynomial systems, lemme tellyaz a story.
> Once upon a time I was on a conference program committee. A post-doc
> submitted an article about some very practical tools developed to find
> all real solutions (thousands) for a class of polynomial systems of
> modest dimension (10 or so) that arose in a very applied field in
> which said post-doc was working. This problem was known to be beyond
> the scope of all existing polynomial system solver technologies.
>
> The submission was not passed my way, and was eventually rejected.
> This was primarily due to the really negative (read: outright caustic)
> report of an Esteemed Member of the Computer Algebra Community (one
> with Specific Authority in all things Groebner-based). Best I can
> tell, this young researcher, whose work was quite nice, has been
> absent from computer algebra venues and literature ever since.
>
> My point, which I readily admit is distant from the remarks that led
> me into this segue, is this. When someone comes along with good ideas
> for practical problems of the sort you may have encountered, chances
> are that person will not get that work into any symbolic computation
> literature. Me, I think we took a big loss in that particular case. As
> I've heard similar complaints from others (and had similar
> experiences, though perhaps of less consequence to the field at large)
> I think this may be a common occurrence.
>
> Daniel Lichtblau
> Wolfram Research

This last thought, that the computer algebra community especially tramples newcomers or applications, is hard to calibrate. The US National Science Foundation has, at least in the not-too-distant past, pointed out that computer science reviewers are generally much harsher on proposals than reviewers in other areas. This is a problem when someone at a higher level in NSF claims that a much higher percentage of proposals in (say) physics have high ratings. I think that applications of computer algebra to field X should be published, if possible, in a journal about X, in order to plant a stake in the ground that computer algebra is good for X. I've published in Int'l J. of Modern Physics-C, Celestial Mechanics, Particle Accelerators, and some applied math places. Running a huge program to do an X problem in Mathematica per se is not a contribution to Mathematica, though building tools in Mathematica to do problems in the field of X may very well be worthwhile as a contribution to computer algebra and Mathematica. As an example, LevinIntegrate is neat. Solving 200 integrals of interest to field X using LevinIntegrate might be a contribution to (say) gravitational lensing.
I can imagine the review, though, in DL's story: the reviewer simply says "This could be solved more easily by using <some system known to reviewer>". In some instances this is a legitimate reason to reject a paper. Sometimes not. I personally have had papers rejected that way, and I sometimes agree.

I think that harsh criticisms are not reserved for newcomers. As for why, I can only conjecture. Perhaps there is some alternative world view that comes into play when one has committed opinions into articles and, even more so, into programs, and yet further, commercial programs. That is, if you / your students / and especially your company have put X person-years into doing something in a certain way, then someone who proposes a different way will be subjected to additional scrutiny (if not outright scorn :) :) ).

Someone who writes a paper for a peer-reviewed journal (not a magazine dedicated to a particular computer algebra system) can expect any "system"-oriented paper to be attacked by advocates of any system not explicitly compared. Similarly, though not as extreme, for conferences. This is one, though not the only, reason that the Journal of Symbolic Computation cannot publish systems-y papers. (A note for other readers, if there are any still there: DL and RJF have both been on the editorial board of JSC at various times, and I think we both would push for more systems papers.)

RJF