Re: Root Finding Methods Guaranteed to Find All Roots Between (xmin, xmax)

*To*: mathgroup at smc.vnet.net
*Subject*: [mg112749] Re: Root Finding Methods Guaranteed to Find All Roots Between (xmin, xmax)
*From*: Daniel Lichtblau <danl at wolfram.com>
*Date*: Wed, 29 Sep 2010 04:13:54 -0400 (EDT)

[Finally got sick of subject misspelling; apologies if this messes up
message threading. --dl]

Ingolf Dahl wrote:

> It would be very good if the Semenov method could be transferred into a
> program that is able to find all roots of multi-dimensional non-linear
> equation systems. I have now also looked a little into this, but now have
> three questions/statements:
>
> 1. The test T1 checks if the rectangle is free of roots. This is
> implemented by comparing the center function values in the rectangle with
> upper limits for the spatial derivatives times half the length of the
> rectangle. The upper limits of the spatial derivatives are calculated by
> interval analysis. Why is it done in that way? Why not apply interval
> analysis directly on the functions as such? If this derivative method
> gives better estimates than a direct interval analysis, then I think it
> would be a good idea to incorporate such a formula in the interval
> analysis of functions.

Short answer: you are on target. In more detail, this is a case where the
underlying math is fine but the implementation might be sloppy. That is to
say, if one had a simple way to get tight bounds on the derivatives
without resorting to interval methods, then that is what one should do.
Using intervals to bound the derivatives, and hence the function values,
makes little sense, because one could (as you observe) just use intervals
to bound the functions to begin with.

There is a different but related situation in which it might pay to do
both: the search for global optima. In that case you want intervals where
the function value might be small and, moreover, the derivatives might be
zero. Stan Wagon used this idea in a chapter of a SIAM book about Nick
Trefethen's 100-digit challenge of several years ago. (I think it was
problem 4, if memory serves.)

> 2. The corresponding question is relevant for the T2 test.
> Here the task seems to be to check if the Jacobian determinant is
> non-zero, and this test is done by comparing the center value in the
> rectangle with upper limits for the spatial derivatives of the Jacobian
> determinant times half the length of the rectangle. The upper limits of
> the spatial derivatives are calculated by interval analysis. Why is it
> done in that way? Why not apply interval analysis directly on the
> Jacobian determinant as such? Then the program does not need to evaluate
> the second-order derivatives of the functions in the equations.

Same answer, but with a twist. As best I can tell, doing the bounding in
this (overly conservative) manner makes it very unlikely that one will get
an incorrect result. Indeed, if you compute the derivatives symbolically,
then substitute intervals, then evaluate the determinant, you won't get a
failure. This is because such a numerical test will in fact bound what
Andrzej refers to as a generalized determinant (more on that below).

> 3. The reason to check if the Jacobian determinant is non-zero is that
> this is said to check if there is at most a single root in the rectangle.
> However, I looked in my textbook from my first year at the university
> ("Matematisk analys II" by Hyltén-Cavallius and Sandgren, Lund 1965, in
> Swedish), and found a discussion of this and a counterexample.

Yes, this was the part that was in error.

> The authors emphasize that a non-zero Jacobian implies that the function
> values around a point are unique in a small enough neighborhood, but we
> might have several identical function values in a larger area. Then we
> have several roots to the corresponding equation. I will repeat the
> counterexample here, slightly modified, since I think that it is of
> interest to several of the readers of MathGroup.
> Suppose that we have -1/4 <= x <= 1/4 and 0 <= y <= 5*Pi/2 and the
> equations
>
>   Exp[x]*Sin[y] - 1/Sqrt[2] == 0
>   Exp[x]*Cos[y] - 1/Sqrt[2] == 0
>
> Then the Jacobian is Exp[2x], which is larger than zero. The rectangle
> also passes Test2, in spite of the two roots {x -> 0, y -> Pi/4} and
> {x -> 0, y -> 9 Pi/4}. Maybe this is the error Daniel Lichtblau has
> noticed, but anyhow, the paper by Semenov now seems a bit less
> impressive. And a plus score for the mathematicians of the previous
> century.

I used the function of a complex variable z*(z-1)*(1+I*z), then split it
into x and y terms:

  ff = z*(z-1)*(1+I*z) /. z -> x + I*y;
  {u, v} = ComplexExpand[{Re[ff], Im[ff]}];

This gives {u,v} as a function of {x,y}, so we go from R^2 to R^2. There
are roots at {0,0} and {1,0}. Say we restrict interest to the segment
between them.

  jac = {{D[u, x], D[u, y]}, {D[v, x], D[v, y]}};
  Solve[{Det[jac] == 0, 0 <= x <= 1, y == 0}, {x, y}]

  Out[132]= {}

This means the test will claim there is at most one root in that segment,
even though it contains two.

> We thus need a better test, and a valid mathematical proof of it, to be
> able to search for all roots in a guaranteed way. My feeling is that one
> might test the elements of the Jacobian matrix - if none of them changes
> sign within the rectangle (or hyper-rectangle), and if the Jacobian
> determinant is non-zero and of one sign, then I guess that multiple roots
> are impossible. But that I have not proved, and there might be better
> tests. Any suggestions? And what is meant by "generalized jacobian" in
> this context?
>
> Best regards
>
> Ingolf Dahl
> ingolf.dahl at telia.com
> [...]

I see Andrzej has responded to the question of what is (in his
terminology) a generalized determinant. Let me elaborate a bit. We assume
we are given a region of interest and a function from R^n to R^n.
Formulate the Jacobian matrix of partial derivatives. For an ordinary
Jacobian, one evaluates these at a single point and takes the determinant.
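[Editorial aside: the failure of this single-point test on the complex-variable example can be double-checked outside Mathematica. The Python sketch below is my own illustration, not code from the thread; it relies on the standard fact that for a holomorphic f, the map (x, y) -> (Re f, Im f) has Jacobian determinant |f'(z)|^2, by the Cauchy-Riemann equations.]

```python
# Illustrative check (not from the thread): for f(z) = z*(z-1)*(1+I*z),
# the Jacobian determinant of (x, y) -> (Re f, Im f) equals |f'(z)|^2.

def f(z):
    return z * (z - 1) * (1 + 1j * z)

def fprime(z):
    # Product rule: d/dz[z*(z-1)] * (1 + i z)  +  z*(z-1) * d/dz[1 + i z]
    return (2 * z - 1) * (1 + 1j * z) + z * (z - 1) * 1j

# Two distinct roots on the segment y = 0, 0 <= x <= 1:
print(abs(f(0.0)), abs(f(1.0)))       # 0.0 0.0

# Yet the ordinary Jacobian determinant never vanishes on that segment,
# so the single-point T2-style test wrongly suggests at most one root.
min_det = min(abs(fprime(x / 100.0)) ** 2 for x in range(101))
print(min_det > 0.0)                  # True
```

A grid check is of course not a proof; an interval bound of |f'|^2 over the segment would be the rigorous analogue, but it reaches the same conclusion here since |f'|^2 attains its minimum 1/16 at x = 1/2.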
For the generalized case, we allow the use of n points anywhere in the
region in question, evaluating the first column (that is to say, the
gradient of the first component function) at the first point, the second
column at the second point, and so on. This suffices to extend the
standard first-order Taylor formula to functions with n components.

Daniel Lichtblau
Wolfram Research
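[Editorial aside: a small Python illustration (my own, with illustrative names) of this multi-point determinant, applied to Ingolf's counterexample F1 = Exp[x]*Sin[y] - 1/Sqrt[2], F2 = Exp[x]*Cos[y] - 1/Sqrt[2]. At a single point the determinant is, up to sign, Exp[2x] and never vanishes; but once the two gradient columns may be evaluated at different points of the rectangle, it can vanish, which is why a correct T2-style test declines to certify a unique root there.]

```python
import math

# Illustrative sketch (not code from the thread): "generalized
# determinant" for F1 = e^x sin y - 1/sqrt(2), F2 = e^x cos y - 1/sqrt(2).
# Column j (the gradient of F_j) is evaluated at its own point p_j.

def grad_F1(x, y):
    return (math.exp(x) * math.sin(y), math.exp(x) * math.cos(y))

def grad_F2(x, y):
    return (math.exp(x) * math.cos(y), -math.exp(x) * math.sin(y))

def generalized_det(p1, p2):
    """det of the 2x2 matrix whose columns are grad_F1(p1), grad_F2(p2)."""
    a, c = grad_F1(*p1)   # first column
    b, d = grad_F2(*p2)   # second column
    return a * d - c * b

# With p1 == p2 this is the ordinary determinant, -e^(2x), never zero:
print(round(generalized_det((0.0, 0.7), (0.0, 0.7)), 12))      # -1.0

# With two points whose y-values differ by Pi/2 (both fit easily inside
# 0 <= y <= 5 Pi/2), it vanishes, so no single-sign certificate exists:
g = generalized_det((0.0, math.pi / 4), (0.0, 3 * math.pi / 4))
print(abs(g) < 1e-9)                                           # True
```

Note that bounding the gradient entries by intervals over the whole rectangle automatically bounds every such multi-point determinant at once, which fits the observation above that the conservative interval bounding in the implementation is unlikely to give an incorrect result.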