MathGroup Archive 2011

Re: Why Mathematica does not issue a warning when the calculations

  • To: mathgroup at smc.vnet.net
  • Subject: [mg117654] Re: Why Mathematica does not issue a warning when the calculations
  • From: Daniel Lichtblau <danl at wolfram.com>
  • Date: Tue, 29 Mar 2011 06:54:23 -0500 (EST)

Richard, if you choose to follow up, please either use an email client that preserves line spacing or else manually insert blank lines. I've been spending time disentangling our various remarks. I send them out well spaced, only to see them run together in replies.


> (RJF) We've discussed this before. If you have a true function f, that
> is a procedure whose results depend only on its argument(s), then when
> A is equal to B, f(A) should be equal to f(B). This is fundamental to
> programming and to mathematics, but is violated by Mathematica.

> (DL) Probably by other mathematical programming languages as well.
> Especially in the operating world of finite precision arithmetic. It's
> something people often get used to (of necessity, if you work on this
> stuff).

> (RJF...[Why is it I need to reformat to show who wrote what?--dl]) 
> I disagree that this is a common problem, although discussing
> particular other mathematical programming languages would probably
> not be allowed in this forum.
> So far as I can tell, it has nothing to do with finite precision
> arithmetic unless you
> have some notion that "equality" means something like "close enough
> for government work".

Numeric conditioning combined with truncation or cancellation error can have the effect of messing up the proposed equivalence. You actually show that quite clearly below, in analyzing the Mathematica behavior of the original example at machine precision.

Significance arithmetic might exacerbate this in different cases, but the issue is quite present with fixed precision.


(RJF)
> In Mathematica  there are two kinds of equality:  ==  and === . The
> latter means IDENTICAL.
> Consider a=SetPrecision[2,40] and b=SetPrecision[2,50]. I could
> sympathize with a==b being true, but certainly not a===b, which says
> they are IDENTICAL, and in my book, f[a] must equal f[b] for all
> functions f. Yet let f=Precision, and you see we get different
> answers.

SameQ for numeric inputs has the following semantics: if both are approximate, compare to the last bit (or possibly next-to-last, I do not recall that specific detail). You may not care for those semantics, but
RJFDislike =!= SemanticsAreWrongOrEvenNecessarilyBad
(Just evaluate that in Mathematica if you don't believe me...)
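
To make the point concrete, a sketch (outputs as described in this thread; exact behavior can vary by version):

    a = SetPrecision[2, 40];
    b = SetPrecision[2, 50];
    a == b          (* True *)
    a === b         (* True: SameQ compares digits, not precision *)
    Precision[a]    (* 40. *)
    Precision[b]    (* 50. *)

So f = Precision does give different answers for SameQ-identical inputs, which is exactly the complaint above.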


(RJF)
> 2.000000000000000001000 == 2  is False, but
> 2.00000000000000000100  == 2  is True.
> 
> Such rules of the road lead to situations in which a==b  but a-b is
> not zero,

True. As you are aware from numerics literature, there will always be anomalies of that sort (and I also am not thrilled with that one). It is a matter of implementing semantics that we deem preferable from amongst suboptimal choices. Some undesirable features will manifest regardless. I think you know that.
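
In concrete terms, per the examples above (illustrative; Equal's tolerance is version-dependent):

    a = 2.00000000000000000100;   (* a 21-digit bigfloat *)
    a == 2        (* True: the difference lies within Equal's tolerance *)
    a - 2         (* about 1.*10^-18, manifestly not zero *)
    a - 2 == 0    (* False *)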


(RJF)
> This is not the understanding in Mathematica, where (for software
> bigfloats) 2.000000000000000000 is more like shorthand for "all
> numbers between 1.9999999999999999995 and 2.0000000000000000005" (DL
> may be willing and able to explain what Mathematica REALLY does)

> (DL, Once again inserting demarcation...)
> Something along the lines you suggest. Your alternative approach is fine for
> handling of integers. It does not work so well for inputs such as 1.3
> which do not translate exactly to finite binary representations. But
> Mathematica semantics would still be violated by upping precision, and
> that in no way contradicts the documentation remarks about ratcheting
> precision internally, as they refer to handling of exact input
> (N[2,10] et al are not exact, and they are what FractionalPart sees).
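
A sketch of that interval reading (illustrative; the exact comparison tolerance is an implementation detail):

    x = SetPrecision[2, 19];   (* the bigfloat 2.000000000000000000 *)
    x == 2 + 10^-20            (* True: inside x's implied uncertainty *)
    x == 2 + 10^-15            (* False: well outside it *)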

> (RJF, Flailing at a badly maimed equine)
> The simplest way of having consistent mathematical semantics in a
> programming language is to  have the programming language obey the
> rules of mathematical semantics.

(DL interjecting)
Gimme a break. (That's what mathematicians say when computer scientists lecture them about the dark interstices between the fields).

(RJF, continuing)
> Each step away from this, as imposed
> by the reality of finiteness of computers etc, tends to add hazards.
> That's why floating-point arithmetic is cluttered with
> underflow/overflow/rounding etc., which do not occur with rational
> arithmetic.
> There has to be a tradeoff in utility vs deviation from mathematics
> for each such step. In my view
> Mathematica adds more steps and therefore more hazards.

This I might agree with, in a literal sense. But those hazards, in my view, are less than what we gain by departing from fixed precision arithmetic. That's because I rather like error control features.
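
To show what I mean by error control, a minimal sketch (the reported precision is approximate):

    a = 1.00000000000000000001`25;   (* 25 digits, very close to 1 *)
    Precision[a - 1]    (* roughly 5: cancellation consumed ~20 digits,
                           and significance arithmetic reports the loss *)
    N[1 + 10^-20] - 1   (* 0.: fixed machine precision silently rounds
                           the difference away instead *)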


(RJF)
> ...snip... about significance arithmetic.
> 
> (DL) I think there may be some since then. No matter. We find it
> works well for what we need, apparently.
> 
> Your view, roughly, is that we picked it up off the garbage heap of
> numerics literature.

> (RJF) no, I speculate that Wolfram took the idea that
> is taught in Physics 101 lab about carrying significant figures when
> you are combining measurements, and thought that it would be a great
> idea to put this into a programming language.  He might even have
> thought that he invented it. (That is, a programming language with
> significance arithmetic).  I truly doubt that Wolfram studied any
> serious numerical analysis literature, and I would certainly not
> expect him to study its garbage heap :)

This is a few years before my time, but I think it was more Jerry Keiper's idea than Stephen Wolfram's that Mathematica use significance arithmetic.

> (DL) Perhaps so. In all the conferences I've attended, I've only once heard
> it disparaged (not derided, but disparaged) as a bad method.

(RJF)
> For disparagement, one need not look any further than the Wikipedia
> article, at least last time I looked.

Well, there you have it. Right in print, on the Internet.


> (RJF) I thought you were claiming that Mathematica had the lion's share of
> bigfloat usage.

> (DL) No. Though I suspect it does, once you get past the
> 200 digit mark or so.

(RJF)
> Past the 200 (decimal) digit mark, I think most
> usage would be in trying to prove/disprove number-theory conjectures,
> though there are some poorly formulated geometry problems that might
> require this. Unless Mathematica gives really
> direct access to libraries, it would not be a good choice.  But I
> think it (eventually) calls GMP and such.

Computation at higher precision does show up in useful applications. One is numerical solving a la Mathematica's NSolve (which uses relatively high precision for intermediate computations). Another is, as you suggest, handling of ill-conditioned problems. I think there are more applications, using large approximate bignums in lieu of even more unwieldy exact numbers.
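
For instance, the user-level knob looks like this (illustrative only; it says nothing about NSolve internals beyond the remark above):

    (* carry 50 digits through the computation; intermediate arithmetic
       uses bigfloats rather than exact algebraic numbers *)
    NSolve[x^5 - x + 1 == 0, x, WorkingPrecision -> 50]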


(RJF)
> OK, if we are going to talk about the original
> post, the answer looks absurd.
> To remind the patient readers, if there are any left..
> N[FractionalPart[(6 + Sqrt[2])^20]]
> 
> Result:
> 
> -160.   (Though I get -64).
> 
> Why  does this appear  absurd?  For a starter, that expression is
> non-negative, so the fractional part of it
> would have to be non-negative.  Next, the fractional part p would have
> to obey the relation
>  0<=p<1.0
> 
> So the answer -160 is crazy, and FractionalPart would seem to be
> nonsense. 
> Actually, it is not total nonsense, just somewhat silly.  It is
> apparently computed as
> 
>   FractionalPart[x] = x - IntegerPart[x]
> 
> where IntegerPart works pretty hard to handle arbitrary expressions,
> in this case resulting in (6 + Sqrt[2])^20 - 251942729309018542,
> which is correct if inconvenient.
> 
> Now it appears that the next part of the computation, N[%], is the
> guilty party, and that it might say "uh oh, there are too many digits
> there for me to do that calculation in hardware floats without
> catastrophic loss of accuracy..."  but instead of saying that, it
> just burps out an answer.  Doing N[%,3] invokes the software floats
> and indeed uses enough precision to get 3 digits right IN THE ANSWER,
> which is nice. Clearly 3 digits internally is insufficient.
> 
> It seems to me that a cure for all this is to allow IntegerPart and
> FractionalPart to operate only on explicit rational and float
> expressions. Any float expression that has more digits to the left
> of the decimal (actually binary) point than precision is equal to a
> particular integer, and has zero fractional part.
> 
> Among other advantages: the result can be explained.

Reasonable semantics, though not what I'd want to have in Mathematica (which, to reiterate, I think is doing what it ought in this example). This proposed handling could be something to suggest for another (present or future) language though.
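
For the record, here are the example and a sketch of the proposal side by side. The fractionalPart definition below is hypothetical, my reading of the suggestion, and not a Mathematica builtin:

    (* the thread's example *)
    expr = (6 + Sqrt[2])^20;      (* exact, roughly 2.5*10^17 *)
    N[FractionalPart[expr]]       (* machine precision: garbage such as
                                     -160. or -64., per the posts above *)
    N[FractionalPart[expr], 3]    (* software bigfloats raise internal
                                     precision; correct to 3 digits *)

    (* hypothetical fractionalPart along the proposed lines: only
       explicit numbers are accepted, and a float whose integer digits
       exhaust its precision has zero fractional part *)
    fractionalPart[x_Integer]  := 0
    fractionalPart[x_Rational] := x - IntegerPart[x]
    fractionalPart[x_Real]     :=
      If[RealExponent[x] >= Precision[x], 0, x - IntegerPart[x]]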


> (RJF) My impression is that the actual primary use of Groebner basis
> computations is to write papers on Groebner basis computations, and
> the other "uses" are "in principle" "demonstrations". This may have
> changed with a more numerical approach, however.

(DL)
> Your impression is
> off insofar as industrial usage goes. For example, we use them in
> NSolve, exact equation solving, some polynomial algebra, and probably
> a few places I am forgetting.

(RJF)
> You are right.  I should have expanded my statement to include
> "primary use of Groebner basis computations is to write papers AND
> PROGRAMS ...".
> Unfortunately, including GB in "NSolve" (etc.)  does not, to me,
> really represent an application of GB.
> It represents a situation in which a well-intentioned programmer
> presents an opportunity for
> someone to use GB to solve a problem.

Interesting view. I had not considered this, but maybe you are right.
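
Still, for a concrete (toy) picture of the internal use I had in mind:

    (* a lexicographic Groebner basis triangularizes the system, after
       which a solver can back-substitute one variable at a time *)
    GroebnerBasis[{x^2 + y^2 - 1, x - y}, {x, y}]
    (* e.g. {-1 + 2 y^2, x - y}: the first element is univariate in y *)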


(RJF)
>  With luck the program may be
> applied in simple
> circumstances that permit a simple solution.
> But a so-called application in "commutative algebra" does not
> represent an application.  (Here, I am
> perhaps going out on a limb saying that I won't count "an application
> to pure mathematics"
> as a true application.)

I'm fine with that. But there have been uses of NSolve in non-math R&D settings, at least a handful that have been brought to my attention. And if it works fine, why would anyone bring it to my attention?


(RJF)
> The rather extensive Wikipedia article on the topic lists many, many
> applications, most of which are of the type I do not count, or are
> of the form "in principle we could do this with GB, and so it is an
> application", though in fact no one could ever realistically use GB
> for it except on exceedingly toy-like instances.  (Occasionally one
> can let a problem run for a very long time if it only has to be
> solved once, as in some robotics problems.)

Again, there are non-toy applications that use this technology. To what extent they represent the spectrum of desired uses I do not know.


(RJF)
> I have only limited personal experience with trying to use GB, on
> two kinds of problems: oil reservoir simulation and the inverse
> problem of imaging via CAT scan or similar.

These are interesting areas, but very likely not in the scope of what GB-based methods can handle. That does not make such methods a bad tool, just one that is not appropriate for these uses, or at least not in the formulation you tried (probably not usable in any formulation, if the problem size was spectacularly large: global equation solving has exponential dependency on dimension).


(RJF)
> A recent (2008) PhD thesis by M.-L. Torrente on Applications to the
> Oil Industry seems to show the situation has not changed.
> That is, casting a problem in terms of GB is unlikely to be a useful
> step in realistically
> solving that problem.  This does not prevent the writing of a paper
> exploiting a vastly simplified formulation.

Nor does it mean that GB-based methods are generally bad. Just not a good tool for this class of problems.


(RJF)
> While the numbers may change somewhat, consider that you could solve a
> problem characterized
> by 6 or 10 parameters and 3 variables using GB. Increasing the number
> of variables may
> increase the computation cost exponentially.  In practice there will
> be 10,000 parameters and
> 200 variables.
> 
> I am not, incidentally, saying there is an alternative symbolic method
> to replace GB that is much faster.
> Instead, I am suggesting that if you can reframe and simplify your
> question as a numerical procedure, you stand
> a much better chance of getting a result.  

We are in heated agreement here.

Sometimes I think it is a Good Thing when a methodology completely fails to scale up, as it lets the person know it would be wise to reframe the problem.

On the subject of monster polynomial systems, lemme tellyaz a story. Once upon a time I was on a conference program committee. A post-doc submitted an article about some very practical tools developed to find all real solutions (thousands) for a class of polynomial systems of modest dimension (10 or so) that arose in a very applied field in which said post-doc was working. This problem was known to be beyond the scope of all existing polynomial system solver technologies.

The submission was not passed my way, and was eventually rejected. This was primarily due to the really negative (read: outright caustic) report of an Esteemed Member of the Computer Algebra Community (one with Specific Authority in all things Groebner-based). Best I can tell, this young researcher, whose work was quite nice, has been absent from computer algebra venues and literature ever since.

My point, which I readily admit is distant from the remarks that led me into this segue, is this. When someone comes along with good ideas for practical problems of the sort you may have encountered, chances are that person will not get that work into any symbolic computation literature. Me, I think we took a big loss in that particular case. As I've heard similar complaints from others (and had similar experiences, though perhaps of less consequence to the field at large), I think this may be a common occurrence.

Daniel Lichtblau
Wolfram Research

