Re: Numerical accuracy/precision - this is a bug or a feature?
- To: mathgroup at smc.vnet.net
- Subject: [mg120277] Re: Numerical accuracy/precision - this is a bug or a feature?
- From: Andrzej Kozlowski <akoz at mimuw.edu.pl>
- Date: Sat, 16 Jul 2011 05:42:00 -0400 (EDT)
- References: <ius5op$2g7$1@smc.vnet.net> <ius7b6$30t$1@smc.vnet.net> <iv9ehj$dct$1@smc.vnet.net> <ivjgfp$2b9$1@smc.vnet.net> <201107140921.FAA15620@smc.vnet.net> <D0766879-A410-445C-96D1-0A946AEB43EC@mimuw.edu.pl> <4E1F2DA7.1060302@eecs.berkeley.edu> <C364D57F-0F11-4D40-8B83-F7623F814647@mimuw.edu.pl> <4E20537F.7090503@eecs.berkeley.edu>
On 15 Jul 2011, at 16:49, Richard Fateman wrote:

> On 7/14/2011 11:55 PM, Andrzej Kozlowski wrote:
>
> Gee, Andrzej, all I can think of is the childhood playground chant (I don't know where you might have been a child, so this may not bring back memories...)
>
> "I'm rubber, you're glue; everything You say sticks to YOU!"

Yes, I can also think of a few playground chants that would apply nicely, but unfortunately you would not understand them. So let's have something English instead, like "Sticks and stones...".

Now, I really have no time or patience to go over all your misrepresentations in detail. But let me point out (or remind you, perhaps) that the already mentioned paper of Sofroniou and Spaletta makes the following remark:

> The choice of significance arithmetic as the default in Mathematica has not been universally favorable (see for example [9]), although some of these criticisms relate to early deficiencies in the implementation.

Reference [9] is, of course:

[9] R. J. Fateman, A review of Mathematica, J. Symbolic Comput. 13 (1992) 545-579.

In fact the "early" version was much more like the "significant digits convention", while the current version is quite a lot more sophisticated, as described in the article. Somehow you don't seem to have noticed this, since it is the "early" version that you always seem to be describing. (Significant digits is something that used to be taught in primary school, if I recall correctly.)

As for the difference between theoretical analysis of the propagation of round-off error and practical scientific computations, I am well aware of it, but was it not you who gleefully quoted the statement:

> The technique of propagating the uncertainty from step to step throughout the calculation is a very bad technique. It might sometimes work for super-simple textbook problems but it is unlikely to work for real-world problems.

So if you agree with this (as you seem to), are you implying that the entire chapter 16 of Henrici's book is concerned with something that applies only to "super-simple textbook problems"? The point is, of course, that you never make clear what exactly you are talking about, probably because you find it convenient for rhetorical purposes (90% of these discussions are really just rhetoric).

I should add that I myself wrote on this forum, long ago and more than once, that I do not think Mathematica's "approximate numbers" are a good model for empirical errors, but they are very good tools for solving certain purely mathematical problems. Often this can be accomplished in a 100% provably correct way (barring bugs, of course).

As for things like numerical Groebner bases etc., you keep arguing that they could probably be done in some other way (without significance arithmetic), and in a certain, rather trivial, sense you are undoubtedly right. But the fact is, as Daniel has pointed out, that in practice nobody seems to have done it another way, and not for want of trying. So again here we have the distinction between "theory" and "practical application", except that this time the shoe is on the other foot.

Andrzej

>> You are clearly confusing significant digit arithmetic, which is not what Mathematica uses, with significance arithmetic, which is a first order approximation to interval arithmetic or a distribution based approach.
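A minimal illustration of what "first order" buys here (a sketch, assuming default significance arithmetic; exact figures may vary by version):

    x = N[10, 20];       (* a 20-digit 10; Accuracy[x] is about 19 *)
    Precision[Exp[x]]    (* about 19.: to first order, the relative
                            error of Exp[x] equals the absolute error
                            of x, so Precision[Exp[x]] ~ Accuracy[x] *)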
> I view significant digit arithmetic, if by digit you allow binary digit or "bit", to be a crude version of interval arithmetic where the endpoints of the interval are not numbers of (essentially) arbitrary precision, but where the endpoints must be represented by the central point plus or minus some power of two. It is an approximation to interval arithmetic. I think it is a reasonable idea to restrict the number of digits carried in the value of the interval endpoints (making them "cruder" than you might really know) in the interests of efficiency in time and space. Thus using properly rounded machine floats is appropriate, at least when the range is suitable and the difference between upper and lower bounds is large. Using just one bit makes the width of the interval grow faster than using more bits...
>
> Since you do not understand the context of the paragraphs you have quoted, nor do you seem to acknowledge the difference between theoretical analysis of the propagation of round-off error in well-chosen standard examples, and somehow running interval arithmetic all the time, it is hard to know how to educate you. How comforting is your proud quote, below? Mathematica's arithmetic is "not ... completely reliable". Rasputinov says, further, "it is not almost always wrong". I'm not sure why he says the significant digit convention is almost always wrong, unless he means almost always too pessimistic. I'm also not aware of why Rasputinov's opinion should sway the discussion especially (sorry Oleksandr).
>
> My point remains: WRI could have a much more direct notion of number, and equality, that would make it easy to implement my choice of arithmetic or theirs or yours. They didn't. The default is not completely reliable. People write to this newsgroup, periodically, saying "Huh? What's going on? I found a bug!"
>
> As for the experimental and computational notions of (im)precision, I think there is an issue of using the same words with different meanings. Similar but unfortunately different. The precision of a floating point number F is simply the number of bits in the fraction. If you impute some uncertainty to F, you can store that in another piece of data D in the computer. If you want to assert that F and D together represent a distribution of a certain kind, you can also compute with that.
>
> RJF
>
>> Obviously you don't read the posts you reply to and confuse both the posters and the contents of what they post. Here is a quote from Oleksandr Rasputinov that makes this completely clear:
>>
>> Unfortunately so, given that it is severely erroneous: see e.g. <http://www.av8n.com/physics/uncertainty.htm>. However, Mathematica's approximation of how these uncertainties propagate is first-order, not zeroth-order. This does not make it completely reliable, of course, but certainly it is not almost always wrong as is the significant digits convention. Within the bounds of its own applicability, Mathematica's approximation is reasonable, although it would still be a mistake to apply it to experimental uncertainty analysis given the much broader scope of the latter.
>>
>> Note the "first-order not zeroth-order". Also, do take a look at Sofroniou and Spaletta and then you may perhaps understand what "order" means and how "significance arithmetic" differs from the "significant digits" convention. Good grief, did you ever learn about the Taylor series? It must have been a long time ago, I take it.
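To make the F-and-D picture above concrete in Mathematica's terms (a sketch; the identity shown is approximate, up to rounding):

    x = SetPrecision[123.456, 20];
    Precision[x]                 (* 20.: roughly, digits of relative error *)
    Accuracy[x]                  (* about 17.9: digits of absolute error *)
    Precision[x] - Accuracy[x]   (* about Log10[Abs[x]] ~ 2.09; the two
                                    numbers carry the same error estimate,
                                    just differently scaled *)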
>> Andrzej Kozlowski
>>
>> On 14 Jul 2011, at 19:55, Richard Fateman wrote:
>>
>>> On 7/14/2011 6:27 AM, Andrzej Kozlowski wrote:
>>>> On 14 Jul 2011, at 11:21, Richard Fateman wrote:
>>>>> On 7/13/2011 12:11 AM, Noqsi wrote:
>>>>> ..
>>>>>>> see e.g. <http://www.av8n.com/physics/uncertainty.htm>.
>>>>> ..
>>>>> Learning mathematics from a physicist is hazardous. Learning computer science from a physicist is hazardous too. Numbers in a computer are different from experimental measurements.
>>>>>
>>>>> Nevertheless, I like this article. It says, among other things:
>>>>>
>>>>> The technique of propagating the uncertainty from step to step throughout the calculation is a very bad technique. It might sometimes work for super-simple "textbook" problems but it is unlikely to work for real-world problems.
>>>>
>>>> Well, here is a quote from a very well known book on numerical analysis by a mathematician (Henrici, "Elements of Numerical Analysis"):
>>>>
>>>>> It is plain that, on a given machine and for a given problem, the local rounding errors are not, in fact, random variables. If the same problem is run on the same machine a number of times, there will result always the same local rounding errors, and therefore also the same accumulated error. We may, however, adopt a stochastic model of the propagation of rounding error, where the local errors are treated as if they were random variables. This stochastic model has been applied in the literature to a number of different numerical problems and has produced results that are in complete agreement with experimentally observed results in several important cases.
>>>>
>>>> The book then describes the statistical method of error propagation, of which Mathematica's approach can be regarded as a first order approximation (as pointed out by Oleksandr Rasputinov, who should not be confused with the OP of this thread), so:
>>> So are we to conclude that Henrici recommends this as a general numerical computational method?
>>>
>>> I don't see that here. I think what he is saying is that if you do some mathematics (see below), then you will get results consistent with what you will get if you actually run the experiment on the computer. This is not surprising. It is a result that says that "theoretical" numerical analysis agrees with "computer experiments" in arithmetic. It doesn't say Henrici recommends running a computation this way.
>>>
>>> When Henrici says "adopt a stochastic model ..." he doesn't mean to write a program. He means to think about each operation like this (I show it for multiplication of numbers P and Q with errors a and b respectively):
>>>
>>> P*(1+a) times Q*(1+b) = P*Q*(1+a)*(1+b)*(1+c), where c is a new "error" bounded by roundoff, e.g. half a unit in the last place.
>>>
>>> For each operation in the calculation, make up another error letter: a, b, c, d, e, f, g, ... and assume they are uncorrelated.
>>>
>>> The fact that this theoretical approach agrees with numerically running "several important cases" is a statement about the correlation of roundoff in those cases, not a statement of the advisability of adopting it as a model of how to write a computer system.
>>>
>>> By the way, I think that Henrici was an extremely fine theoretical numerical analyst, and a fine writer too.
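The stochastic model just described is easy to simulate. A small sketch (the data, the trial count, and the deliberately exaggerated unit round-off u are illustrative choices, made so the model's effect stands out above genuine machine rounding):

    SeedRandom[1];
    data = RandomReal[{1/2, 2}, 100];   (* fixed multiplicands *)
    u = 10.^-6;                         (* exaggerated half-ulp *)
    trial := Fold[#1 #2 (1 + RandomReal[{-u, u}]) &, 1., data];
    vals = Table[trial, {10^4}];
    StandardDeviation[vals]/Mean[vals]
    (* comes out near u Sqrt[100/3] ~ 5.8*10^-6, the first-order
       prediction for 100 independent uniform relative errors *)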
>>>>> Clearly Rasputinov thinks that if they are not equal they should not be Equal. Thus the answer is False.
>>>> is *clearly* False. In fact Oleksandr expressed something closer to the opposite view.
>>> This thread is too long. I don't know at this point if you are agreeing that it is false or contradicting that "is False" is false.
>>>> And if one quote is not enough, here is another, from another text on numerical analysis (Conte and de Boor, "Elementary Numerical Analysis"). It describes 4 approaches to error analysis: interval arithmetic, significant-digit arithmetic, the "statistical approach" and backward error analysis. Here is what it says about the second and the third one:
>>> Huh, if we are talking about the second and third one, why does he say third and fourth? Are you using 0-based indexing and deBoor is using 1-based indexing????
>>>
>>>>> A third approach is significant-digit arithmetic. As pointed out earlier, whenever two nearly equal machine numbers are subtracted, there is a danger that some significant digits will be lost. In significant-digit arithmetic an attempt is made to keep track of digits so lost. In one version only the significant digits in any number are retained, all others being discarded. At the end of a computation we will thus be assured that all digits retained are significant. The main objection to this method is that some information is lost whenever digits are discarded, and that the results obtained are likely to be much too conservative. Experimentation with this technique is still going on, although the experience to date is not too promising.
>>>
>>> OK, so deBoor (who is retired and therefore not likely to revise the "experience to date") says this method "is not too promising". This sounds to me like he is not endorsing what Mathematica does.
>>>
>>>>> A fourth method which gives considerable promise of providing an adequate mathematical theory of round-off-error propagation is based on a statistical approach. It begins with the assumption that round-off errors are independent. This assumption is, of course, not valid, because if the same problem is run on the same machine several times, the answers will always be the same. We can, however, adopt a stochastic model of the propagation of round-off errors in which the local errors are treated as if they were random variables. Thus we can assume that the local round-off errors are either uniformly or normally distributed between their extreme values. Using statistical methods, we can then obtain the standard deviation, the variance of distribution, and estimates of the accumulated round-off error. The statistical approach is considered in some detail by Hamming [1] and Henrici [2]. The method does involve substantial analysis and additional computer time, but in the experiments conducted to date it has obtained error estimates which are in remarkable agreement with experimentally available evidence.
>>>
>>> deBoor is essentially quoting Henrici, and this statistical approach is to say that all those error terms I mentioned above, a, b, c, d, e, f, ..., can be chosen from some distribution (the way I've written it, a, ..., z, ... would essentially be chosen from {-u, u}, where u = 2^(-W) and W is the number of bits in the fraction part of the floating-point number), and you can compute the final expression as ANSWER + <somehorrendousfunctionof>(a, b, c, ...).
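That "horrendous function" can at least be inspected to first order with a few lines of symbolic bookkeeping (a toy illustration of the idea, not anything Mathematica does internally): scale every error letter by t and truncate the Taylor series in t.

    errs = {a, b, c, e1, e2};
    expr = Times @@ (1 + t # & /@ errs);    (* (1+t a)(1+t b)... *)
    Normal[Series[expr, {t, 0, 1}]] /. t -> 1
    (* 1 + a + b + c + e1 + e2: to first order, the relative errors
       of a product simply add, as in Henrici's model *)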
>>> What deBoor says is that this (theoretical numerical analysis) "method" promises to provide a theory of round-off error propagation. He is not saying this is a practical method for scientific computing. When he uses the word "method" he means a mathematical method for analyzing roundoff.
>>>
>>> In any case, Mathematica does not do this. I would further argue that Mathematica makes it hard to carry out the experiments that might be done to demonstrate that this theory applies in any particular sample computation.
>>>> The fundamental paper on Mathematica's error propagation is "Precise numerical computation" by Mark Sofroniou and Giulia Spaletta in The Journal of Logic and Algebraic Programming 64 (2005) 113-134. This paper describes Mathematica's "significance arithmetic" as a first order approximation to Interval Arithmetic. It makes no mention of distributions.
>>> I thought I read this paper in some Mathematica documentation or conference proceedings.
>>>> Oleksandr Rasputinov, in an earlier post here, interpreted "significance arithmetic" as a first order approximation to the fourth method above.
>>> Huh? First of all, the original poster was slawek. Rasputinov seems to think that Mathematica numbers are like Intervals (basically a good intuition until you think about equality) and refers to them as distributions. This is not deBoor's 4th "method" of theoretically analyzing round-off. In fact it is deBoor's 1st method, interval arithmetic. This has the advantage of being maybe 4 to 8 times slower than regular arithmetic, and also has a huge literature (see "Reliable Computation") describing variations, advantages, disadvantages, etc.
>>>
>>>> I have not considered this very carefully, but it seems pretty clear that he is right, and that the two "first order" approximations are in fact isomorphic.
>>> Uh, this is unclear, unless you mean that Mathematica's number system is essentially interval arithmetic, but with a confusing front end.
>>>> The first order approach is, of course, justified on grounds of performance. It is perfectly "rigorous" in the same sense as any "first order" approach is (i.e. taking a linear approximation to the Taylor series of a non-linear function). It works fine under certain conditions and will produce nonsense when these conditions do not hold.
>>>>
>>>> The fact that significance arithmetic is "useful" needs no justification other than the fact that it is used successfully by NSolve and Reduce in achieving validated symbolic results by numerical methods which are vastly faster than purely symbolic ones.
>>> Really?
>>>
>>> 1. This does not mean that other methods, e.g. validated methods for accurately evaluating polynomials (etc.) NOT based on significance arithmetic, would not be faster and better. DanL claims it is used and useful there, so that's nice. BUT...
>>>
>>> 2. This does not mean that this arithmetic should be used by default for user arithmetic.
>>>> It is also useful for users, such as myself, who sometimes need fast first order error analysis.
>>> OK, if you find it useful yourself, fine. I daresay you are not a typical user.
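For reference, deBoor's first method mentioned above is available directly as Interval objects, which also make the classic dependency problem visible (a small sketch; printed endpoints are rounded outward):

    x = Interval[{1.9, 2.1}];
    x + x    (* Interval[{3.8, 4.2}] *)
    x - x    (* Interval[{-0.2, 0.2}], not {0, 0}: interval arithmetic
                forgets that the two x's are the same quantity, one
                reason naive interval bounds come out pessimistic *)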
>>>> I have lots of posts by Richard on this topic (or perhaps it was the same post lots of times, it's so hard to tell), but I have never understood what his main point is.
>>> Main points: Mathematica's number system is non-standard, peculiar, hard to explain, capable of returning mysterious non-answers without indications of error to innocent users, and a bad foundation for building higher-level functionality.
>>>
>>> I have other criticisms of Mathematica, but I think that the sentence above is enough for you to process today.
>>>
>>>> It seems to me that that is because he himself has not yet decided this, although he has been posting on this topic for over 20 years (I think). Sometimes he seems to be disparaging significance arithmetic itself.
>>> I think deBoor did that.
>>>
>>>> When Daniel points out how effective it is in his implementation of numerical Groebner bases, or in Adam Strzebonski's work on Reduce, he either ignores this altogether or claims that Groebner bases, etc. are themselves not "useful".
>>> I think Reduce is a very nice program when it works. If I am not mistaken, all work on numerical Groebner bases should be reducible to the evaluation of polynomials, for which there are faster and more accurate methods available not using significance arithmetic. On the other hand, I might be mischaracterizing DanL's work, since I admit to not having studied it.
>>>
>>>> On other occasions he takes on the role of defender of the interests of the "naive user" (presumably like the OP, who however would be better described as "intentionally naive") who is going to be confused by the "quirky" nature of significance arithmetic (at low precision).
>>> No, I don't think that slawek was a "troll". I think he was genuinely confused, as might anyone be who has some prior computer arithmetic exposure and for the first time encounters a system with exact rational arithmetic.
>>>> In doing so he conveniently ignores the fact of the existence of thousands of "naive users" who never become confused (sometimes because they always work with machine precision numbers and only use significance arithmetic unknowingly, e.g. when using Reduce).
>>> Most naive users don't tread near the dangerous spots. Some naive users never notice that their answers are nonsense. Every so often we get a naive user who DOES notice, and he/she sends email to this newsgroup.
>>>> And moreover, for those who do find a need for some sort of error analysis he offers no alternative, except perhaps to learn backward error analysis.
>>> Actually, there's a whole bunch of libraries of methods with error estimates for common computations, where backward error analysis or some other method was used to provide extra information.
>>>
>>>> Except, of course, that should they do so they would no longer be "naive" and thus would be outside of Richard's area of concern.
>>> They would probably not be writing in Mathematica.
>>>> And in any case, anyone who needs and understands backward error analysis can use it now, and I can't imagine that even Richard would claim that reducing users' options is a good thing.
>>> I have no problem with Mathematica providing, as an option, some other arithmetic. It is WRI that has made it rather hard for the user to figure out how to "use the arithmetic of the rest of the world".
>>>
>>>> Finally, perhaps all that Richard is so upset about is simply Mathematica's habit of defining numbers as "fuzz balls" or "distributions".
>>> I'm not sure that "upset" is the right term. After all, I don't have to use Mathematica for numerical computation. And I rarely do.
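As it happens, there is one documented escape hatch of roughly that kind: clamping $MinPrecision and $MaxPrecision gives fixed-precision bignum arithmetic with no significance tracking (a sketch; Block is just one way to scope the settings):

    Block[{$MinPrecision = 50, $MaxPrecision = 50},
      x = N[1, 50];
      Precision[x - (x + 10^-40)]   (* stays 50.; under the default
         significance arithmetic this difference would have lost
         roughly 40 digits of precision *)
    ]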
>>>> In other words, if Mathematica used a more usual "definition" of number, and significance arithmetic for "error" propagation were applied by some separate function or turned on by an option, then everything would be fine?
>>> Actually, that's pretty close to correct.
>>>> If that is all, then it seems to me that Richard has for years been making mountains out of molehills.
>>> So why are you (and WRI) so opposed to this notion? Note that it would also have to affect other parts of the system including, of course, Equal and friends.
>>>
>>>> Andrzej Kozlowski
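To see the Equal behavior being argued about here (a minimal example; behavior as of the versions discussed in this thread):

    z = 2 - 2.0000000000000000000;   (* an approximate zero with no
                                        significant digits, displayed
                                        as something like 0.*10^-19 *)
    z == 0         (* True *)
    z == 10^-30    (* also True: anything lying inside the "fuzz
                      ball" tests Equal to it *)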