MathGroup Archive: August 2004 [00184]

Re: Re: Re: 'NonlinearFit` confusion
To: mathgroup at smc.vnet.net
Subject: [mg49980] Re: [mg49975] Re: [mg49895] Re: [mg49844] 'NonlinearFit` confusion
From: DrBob <drbob at bigfoot.com>
Date: Mon, 9 Aug 2004 04:29:18 -0400 (EDT)
References: <200408041446.KAA20105@smc.vnet.net> <200408051321.JAA05885@smc.vnet.net> <6.1.2.0.1.20040807114330.04939518@pop.hfx.eastlink.ca> <Pine.OSX.4.58.0408071231350.28763@heisenberg> <6.1.2.0.1.20040807151006.049bd348@pop.hfx.eastlink.ca> <200408080938.FAA21936@smc.vnet.net>
Reply-to: drbob at bigfoot.com
Sender: owner-wri-mathgroup at wolfram.com
>> In any case, let's go back to the original problem for the moment, bearing
>> in mind that I'm not interested in what might or might not happen, but
>> given a problem, what's the solution?

Not all problems HAVE solutions. Realistic problems, in fact, NEVER have good, efficiently identifiable solutions. We spend all our time, in mathematics, solving idealized problems that (hopefully) capture enough of the real problem to be useful. But even the idealized problems are often too hard.

>> Even with n>1 dimensions as you say, if there is only one valley
>> and you roll down the hill you will end up in the valley!!

How do you know how many valleys are in the problem?

It's a four-dimensional space of parameters (a, c, d, e), and each combination of values defines a function over a fifth dimension (x), which you're trying to fit to the data. The fit metric is itself a surrogate for what you really want--a good fit--whose characteristics often don't reduce well to a single number.

To count hills and valleys, you might take the metric that's being minimized, compute its gradient with respect to {a,c,d,e}, and ponder how many zeroes that gradient has. Several, I'll bet. A gradient search method can converge upon any of the local minima, depending on what initial values are used. Solving the Kuhn-Tucker conditions (which, with no constraints, amount to setting the gradient to zero) likewise gives a lot of solutions, some of which will be better than others.
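
As a rough sketch of what I mean (my own illustration, not something that was run in this thread): rebuild the synthetic data from the original post, look at the sum of squared residuals along the e direction with a, c and d held at their true values, and then run local fits from several starting values of e. The names xs, ys and sse and the particular starting values are my own choices, and the code assumes a version of Mathematica that has FindFit.

```mathematica
Clear[a, c, d, e, x]

(* Synthetic data from the original post: y = 6 + 2 Sin[3 + 5 x] *)
data = Table[{x, 6 + 2 Sin[3 + 5 x]}, {x, -10 Pi, 10 Pi, 0.05}];
xs = data[[All, 1]];
ys = data[[All, 2]];

(* The metric being minimized: sum of squared residuals ("sse" is my name) *)
sse[a_?NumericQ, c_?NumericQ, d_?NumericQ, e_?NumericQ] :=
  Total[(ys - (c + a Sin[d + e xs]))^2];

(* One-dimensional slice: vary e only, other parameters at their true values *)
Plot[sse[2, 6, 3, e], {e, 4, 6}, PlotRange -> All]

(* Local fits from different starting values of e; compare the results *)
Table[
  {e0, FindFit[data, c + a Sin[d + e x],
     {{a, 2}, {c, 6}, {d, 3}, {e, e0}}, x]},
  {e0, {1., 4.7, 5., 5.3}}]
```

If the plot shows more than one dip, a local method started in the wrong dip has no reason to end up at e = 5, and this is only a slice through one of the four directions.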

To do better, we'd have to prove the problem is convex, so that any local minimum is also the global one. I seriously doubt that's the case in this problem.
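
One cheap way to test that doubt, sketched under my own assumptions (the helper name sseE and the sample points 5, 5.5 and 6 are mine): a convex function never rises above the chord joining two of its points, so a single violation along any line in parameter space rules convexity out.

```mathematica
(* Same synthetic data as the original post: y = 6 + 2 Sin[3 + 5 x] *)
xs = Table[x, {x, -10 Pi, 10 Pi, 0.05}];
ys = 6 + 2 Sin[3 + 5 xs];

(* Sum of squared residuals along e, with a = 2, c = 6, d = 3 held fixed *)
sseE[e_?NumericQ] := Total[(ys - (6 + 2 Sin[3 + e xs]))^2];

(* Midpoint value versus the average of the endpoint values *)
{sseE[5.5], (sseE[5.] + sseE[6.])/2}

(* True here would contradict convexity along this line *)
sseE[5.5] > (sseE[5.] + sseE[6.])/2
```

If the last line returns True, as I'd expect, the metric isn't convex even along this one line, so it can't be convex in {a, c, d, e}.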

Bobby

On Sun, 8 Aug 2004 05:38:05 -0400 (EDT), Yasvir Tesiram <tesiramy at omrf.ouhsc.edu> wrote:

> Hi,
> Yes OK, I agree with your last sentence, but I haven't skipped any data
> points. I have only used what's relevant. On the other hand you have now
> introduced n>1, hills, valleys, a ball, friction and gravity. This sounds
> like an enjoyable discussion but I'd prefer to do that over a beer or two.
> In any case, let's go back to the original problem for the moment, bearing
> in mind that I'm not interested in what might or might not happen, but
> given a problem, what's the solution?
> I would like to see FindFit come to a solution with so many
> points as posed in the original problem. A computer is a fantastic GIGO
> device. I had a short go, using Simulated Annealing, NelderMead methods,
> Gradient, etc., but came to realise that the problem as posed needed some
> thought. What am I fitting to?! Again, the point here remains. I don't
> need to make FindFit start somewhere in the middle. The first point is a
> good starting point, so as long as the number of points used to fit to the
> function are representative of the function! I haven't left out data
> points. I have simply chosen what is physically relevant and used a
> fitting algorithm to fit the model function to the data. The
> Levenberg-Marquardt algorithm is good enough. And it leads to the expected
> solution. Even with n>1 dimensions as you say, if there is only one valley
> and you roll down the hill you will end up in the valley!! You don't need
> a million points to work out that you are in the valley. There is of
> course more to it than that, and of course some common sense must come
> into play some time.
>
> Cheers
> Have a good one
> Yas
>
>
>
>
> On Sat, 7 Aug 2004, Janos D. Pinter wrote:
>
>>
>> Yas,
>>
>> in everyday terms, multi-extremality can be visualized by a 'hilly' region
>> where different starting points will lead to different valleys. (Imagine a
>> rolling ball started from different points in this region, assuming gravity
>> and friction). FindFit will end up at the right solution only if started in
>> the right valley (region of attraction). In n>1 dimensions it is a problem
>> to take a good look at the error fct, for most of us. :-)
>>
>> To skip points could (will) bring in another element of subjectivity,
>> leading to point subset-dependent results.
>>
>> Cheers,
>>
>> Janos Pinter
>> www.pinterconsulting.com
>>
>>
>>
>> At 02:47 PM 8/7/2004, you wrote:
>> >G'day,
>> >Yes, but I have no idea what multi-extremality means in English. You can
>> >use FindFit as well. But here, one just needs to take a good look at the
>> >model function to quickly realise where the sample points should be and
>> >how many.
>> >It is always an assumption by most that when fitting data that more
>> >points are better. And it's usually forgotten that all you need
>> >are sufficient data points to smoothly sample the model function.
>> >Obviously in this case the model function can be smoothly sampled through
>> >a single period. Perhaps that in itself is too much.
>> >I think Dr Bob has already pointed out something about fitting, non-exact
>> >and science. Which leaves the physical interpretation of the function used
>> >to model the data!
>> >You may have also noticed that I didn't skip data points. I simply took
>> >the first 40 or so.
>> >
>> >Best Regards
>> >Yas
>> >
>> >
>> >  On Sat, 7 Aug 2004, Janos D. Pinter wrote:
>> >
>> > >
>> > > Yas,
>> > >
>> > > skipping some of the data will not eliminate the potential
>> > > multi-extremality of a nonlinear model-fitting problem (but can make the
>> > > solution less stable). As Paul Abbott noted, either one needs a
>> > > 'sufficiently good' starting point for successful local
>> > > fitting/optimization, or needs to apply global optimization methods to find
>> > > the best numerical fit.
>> > >
>> > > Regards,
>> > > Janos Pinter
>> > >
>> > >
>> > > At 10:21 AM 8/5/2004, you wrote:
>> > > >Hi,
>> > > >
>> > > >Why not take fewer points, rather than more??
>> > > >
>> > > >In[480]:=
>> > > >Clear[datay,datax,data1]
>> > > >datay = Table [6 + 2Sin[3 + x], {x, -10Pi, 10Pi, 0.05}];
>> > > >datax = Table [x, {x, -10Pi, 10Pi, 0.05}];
>> > > >data1 = Table[{datax[[i]], datay[[i]]}, {i, 1, Length[datay]}];
>> > > >Length[data1]
>> > > >First[data1]
>> > > >
>> > > >Out[484]=
>> > > >1257
>> > > >
>> > > >Out[485]=
>> > > >{-31.4159,6.28224}
>> > > >
>> > > >In[511]:=
>> > > >Clear[a,b,c,d,e]
>> > > >sinFit1=NonlinearFit[Take[data1,40], c + a *Sin[d + e *x], x, {a, c, d, e}]
>> > > >
>> > > >Out[512]=
>> > > >6. + 2. Sin[3. + 1. x]
>> > > >
>> > > >
>> > > >In[513]:=
>> > > >Clear[y,datay,datax,a,b,c,d,e]
>> > > >datay = Table [6 + 2Sin[3 + 5 x], {x, -10Pi, 10Pi, 0.05}];
>> > > >datax = Table [x, {x, -10Pi, 10Pi, 0.05}];
>> > > >data2 = Table[{datax[[i]], datay[[i]]}, {i, 1, Length[datay]}];
>> > > >Length[data2]
>> > > >First[data2]
>> > > >
>> > > >Out[517]=
>> > > >1257
>> > > >
>> > > >Out[518]=
>> > > >{-31.4159,6.28224}
>> > > >
>> > > >In[519]:=
>> > > >y=a*Sin[d+e x]+c;
>> > > >sinFit2=NonlinearFit[Take[data2,40],y, x, {a,c, d,e}]
>> > > >
>> > > >Out[520]=
>> > > >6. + 2. Sin[122.381 + 5. x]
>> > > >
>> > > >
>> > > >DisplayTogether[
>> > > >    Plot[sinFit1, {x, -10Pi, -8 Pi}, PlotStyle -> Blue],
>> > > >    Plot[sinFit2, {x, -10Pi, -8 Pi}, PlotStyle -> Red],
>> > > >
>> > > >    ListPlot[Take[data1, 40], PlotStyle -> {Blue, PointSize[0.011]}],
>> > > >    ListPlot[Take[data2, 30], PlotStyle -> {Red, PointSize[0.011]}],
>> > > >    Prolog -> {
>> > > >        {
>> > > >          Blue,
>> > > >          Text["sinFit1", {-27.5, 7}, {-1, 0},
>> > > >            TextStyle -> {FontFamily -> "Times",
>> > > >                FontWeight -> "Bold",
>> > > >                FontSize -> 18
>> > > >                }
>> > > >            ]
>> > > >          },
>> > > >
>> > > >        {
>> > > >          Red,
>> > > >          Text["sinFit2", {-26.9, 6}, {-1, 0},
>> > > >            TextStyle -> {FontFamily -> "Times",
>> > > >                FontWeight -> "Bold",
>> > > >                FontSize -> 18
>> > > >                }
>> > > >            ]
>> > > >          }
>> > > >        },
>> > > >    ImageSize -> 800,
>> > > >    Frame -> False,
>> > > >    AxesOrigin -> {-31.5, 4.0}
>> > > >
>> > > >    ]
>> > > >
>> > > >Cheers
>> > > >Yas
>> > > >
>> > > >
>> > > >
>> > > >On Aug 4, 2004, at 9:46 AM, Klingot wrote:
>> > > >
>> > > > > I'm trying to fit a sinusoidal function to my data using
>> > > > > 'NonlinearFit' but it's exhibiting rather odd behaviour. Please see my
>> > > > > examples below:
>> > > > >
>> > > > > EXAMPLE (1).
>> > > > >
>> > > > > **** As a test, I created a list of data from a function of the form y
>> > > > > = 6 + 2Sin[3 + x] with:
>> > > > >
>> > > > > datay = Table [6 + 2Sin[3 + x], {x, -10Pi, 10Pi, 0.05}];
>> > > > > datax = Table [x, {x, -10Pi, 10Pi, 0.05}];
>> > > > > data = Table[{datax[[i]], datay[[i]]}, {i, 1, Length[datay]}];
>> > > > >
>> > > > > **** Then tested to see whether NonlinearFit would correctly deduce
>> > > > > the equation's parameters with:
>> > > > >
>> > > > > NonlinearFit[data, c + a Sin[d + e x], x, {a, c, d,e}]
>> > > > >
>> > > > > **** As expected, it gave me '6.+ 2. Sin[3. + 1. x]' ... exactly as
>> > > > > one would expect :)
>> > > > >
>> > > > >
>> > > > > EXAMPLE (2).
>> > > > >
>> > > > > **** Second test, I modified the equation by multiplying x by 5, ie.
>> > > > > y = 6 + 2Sin[3 + 5x]:
>> > > > >
>> > > > > datay = Table [6 + 2Sin[3 + 5 x], {x, -10Pi, 10Pi, 0.05}];
>> > > > > datax = Table [x, {x, -10Pi, 10Pi, 0.05}];
>> > > > > data = Table[{datax[[i]], datay[[i]]}, {i, 1, Length[datay]}];
>> > > > >
>> > > > > ***** and applied the NonlinearFit as before:
>> > > > >
>> > > > > NonlinearFit[data, c + a Sin[d + e x], x, {a, c, d,e}]
>> > > > >
>> > > > > ***** but this time I get a wildly inaccurate result:    6.000165 +
>> > > > > 0.025086 Sin[0.0080308 - 0.247967 x]
>> > > > >
>> > > > > Specifically, the parameters 'a', 'd' and 'e' are all completely in
>> > > > > error by orders of magnitude.
>> > > > >
>> > > > > I tried extending the range of the data to include more cycles of the
>> > > > > sinusoid, thereby making it more continuous/monotonic but that made no
>> > > > > difference.
>> > > > >
>> > > > > Am I missing something fundamental here?
>> > > > >
>> > > > > Any assistance would be greatly appreciated.
>> > > > >
>> > > > > PS: I'm using Mathematica 5.0 on a MAC.
>> > > > >
>> > > >Dr. Yasvir A. Tesiram
>> > > >Associate Research Scientist
>> > > >Oklahoma Medical Research Foundation
>> > > >Free Radical Biology and Ageing Research Program
>> > > >825 NE 13th Street, OKC, OK, 73104
>> > > >
>> > > >P: (405) 271 7126
>> > > >F: (405) 271 1795
>> > > >E: yat at omrf.ouhsc.edu
>> > >
>>
>
>
>



-- 
DrBob at bigfoot.com
www.eclecticdreams.net