MathGroup Archive: April 2000 [00367]

Re: Gaussian fit

To: mathgroup at smc.vnet.net
Subject: [mg23187] Re: Gaussian fit
From: Wijnand Schepens <Wijnand.Schepens at rug.ac.be>
Date: Mon, 24 Apr 2000 01:11:58 -0400 (EDT)
Organization: RUG
References: <8dnv7a$hsu@smc.vnet.net>
Sender: owner-wri-mathgroup at wolfram.com

Gaussian fitting is indeed a frequent problem, and not so easy.
A few notes:
1. Nonlinear least-squares fitting is normally a multiple-minima problem, i.e.
the parameters that are found can be a local minimum of the
sum-of-squared-deviations function without being the global minimum. In
general, there is no method guaranteed to find the global minimum of an
arbitrary function.
In other words, NonlinearFit can produce terrible results because
of strange local minima in the minimization.
The only way out is to give good initial guesses for the parameters
and let the fit refine them. Fit takes no starting values: it only handles
models that are linear in the parameters, where the local-minima problem
does not arise.
For NonlinearFit, you can supply them.

3. Try it by hand first. It is very important to know the meaning of the
parameters.
This is a non-normalized gaussian:
gauss[x_, a_, mu_, s_] := a Exp[-((x - mu)/s)^2]
x is the variable.
a is the amplitude (height at the maximum).
mu is the mean position (where the maximum occurs).
s sets the width of the bell curve. In this form s = Sqrt[2] sigma, where
sigma is the standard deviation (sqrt of the variance).

In the case of your data, after some fiddling "by hand", I found that
gauss[x, 32.0, -0.25, 0.2]
wasn't a bad initial guess.

You can try optimizing this using NonlinearFit, and feeding these parameter
values as starting values.
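
As a minimal sketch of this step: later Mathematica versions have a built-in
FindFit that accepts starting values in much the same way, so it is used here
in place of NonlinearFit (my substitution, not checked against the
NonlinearFit package). The lowercase name data is my own; the list is the one
from the quoted question.

```mathematica
(* data points from the quoted question *)
data = {{-0.95,0},{-0.85,0},{-0.75,0},{-0.65,1},{-0.55,2},{-0.45,9},
   {-0.35,26},{-0.25,36},{-0.15,23},{-0.05,15},{0.05,8},{0.15,7},{0.25,7},
   {0.35,1},{0.45,0},{0.55,2},{0.65,1},{0.75,0},{0.85,0},{0.95,0}};

(* non-normalized gaussian as defined above *)
gauss[x_, a_, mu_, s_] := a Exp[-((x - mu)/s)^2]

(* nonlinear least squares, starting from the by-hand guess *)
fit = FindFit[data, gauss[x, a, mu, s],
   {{a, 32.0}, {mu, -0.25}, {s, 0.2}}, x]

(* compare the refined curve with the data *)
Show[ListPlot[data], Plot[Evaluate[gauss[x, a, mu, s] /. fit], {x, -1, 1}]]
```

If the refined curve looks worse than the guess, the minimization has
probably ended up in another local minimum. Changing the starting values
and rerunning is the usual remedy.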

2. If you can use Fit somehow, this is usually a better option than using
NonlinearFit and coping with the initial-parameters problem. Sometimes this
can be achieved by transforming the data.
For example, fitting a gaussian to your data amounts to fitting
a quadratic to the logarithm of the data (y-values), because
Log[a Exp[-((x - mu)/s)^2]] = Log[a] - ((x - mu)/s)^2 is quadratic in x.

A problem which arises here is that some y-values are 0, causing the Log to
diverge. For a real gaussian, these y-values would be small but always > 0.
There are two options: leave these points out, or give them a value of, say,
0.001.

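(* drop the points whose y-value is 0, since Log[0] diverges *)
Data2 = DeleteCases[Data, {_, 0}]
(* replace each y-value by its logarithm *)
tolog[{x_, y_}] := {x, Log[y]}
(* linear least-squares fit of a quadratic to the transformed data *)
f = Fit[tolog /@ Data2, {1, x, x^2}, x]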

This yields
2.683 - 1.02145 x - 6.32196 x^2
The gaussian then is Exp[f]
Try it by
Show[Plot[Exp[f], {x, -1, 1}], ListPlot[Data2, PlotJoined -> True]]

As you can see, the result isn't tremendous, but it's not nonsense either.
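
To compare the Fit result with the by-hand guess, the quadratic coefficients
can be translated back into the parameters of gauss. Writing
f = c0 + c1 x + c2 x^2 and matching terms with Log[a] - ((x - mu)/s)^2 gives
c2 = -1/s^2, c1 = 2 mu/s^2 and c0 = Log[a] - mu^2/s^2. The sketch below
inverts these relations; the names c0, c1, c2, aLog, muLog and sLog are
introduced only for this illustration.

```mathematica
(* the quadratic reported above for the log-transformed data *)
f = 2.683 - 1.02145 x - 6.32196 x^2;

(* coefficients in the order {constant, linear, quadratic} *)
{c0, c1, c2} = CoefficientList[f, x];

(* invert c2 = -1/s^2, c1 = 2 mu/s^2, c0 = Log[a] - mu^2/s^2 ;
   this requires c2 < 0, i.e. a parabola that opens downward *)
sLog = 1/Sqrt[-c2];
muLog = -c1/(2 c2);
aLog = Exp[c0 - c1^2/(4 c2)];

{aLog, muLog, sLog}
```

For the coefficients quoted above this gives roughly a = 15.2, mu = -0.08 and
s = 0.40, a lower and broader peak than the by-hand guess. That is consistent
with the plot: on the Log scale the small y-values in the wings carry as much
influence as the peak.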

4. Judging the quality of a fit is hard. It depends on what you consider most
important. For example, you may want to fit the large peak perfectly and
leave the "wings" as bad as they are, or you may prefer the best fit
over the whole domain.
Either choice can be expressed by weighting the data points, or by excluding
data points from the domain.
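
A sketch of both ideas applied to the Log-transformed fit. It assumes a later
Mathematica version in which LinearModelFit and its Weights option are
available, which is not the Fit function used above. The weights y^2 are my
assumption: they are one common way to undo the extra emphasis that the Log
transform puts on small y-values. The window -0.6 < x < 0.1 is an arbitrary
illustration, and the names data, logData, fWeighted and fWindow are my own.

```mathematica
(* data points from the quoted question *)
data = {{-0.95,0},{-0.85,0},{-0.75,0},{-0.65,1},{-0.55,2},{-0.45,9},
   {-0.35,26},{-0.25,36},{-0.15,23},{-0.05,15},{0.05,8},{0.15,7},{0.25,7},
   {0.35,1},{0.45,0},{0.55,2},{0.65,1},{0.75,0},{0.85,0},{0.95,0}};

(* drop zero counts and take the logarithm of the y-values *)
data2 = DeleteCases[data, {_, 0}];
logData = {#[[1]], Log[N[#[[2]]]]} & /@ data2;

(* weighted quadratic fit: weight y^2 makes the large peak dominate *)
lm = LinearModelFit[logData, {x, x^2}, x, Weights -> data2[[All, 2]]^2];
fWeighted = lm["BestFit"]

(* alternative: unweighted fit restricted to a window around the peak *)
window = Select[logData, -0.6 < #[[1]] < 0.1 &];
fWindow = Fit[window, {1, x, x^2}, x]
```

Exp[fWeighted] and Exp[fWindow] can then be plotted against the data in the
same way as Exp[f].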

5. If you want to fit a sum of two gaussians, then you can't use the trick of
converting the data to Log, because the Log of a sum is not the sum of the
Logs. Probably the best thing to do would be to fit the large peak as well as
possible (possibly by excluding the rightmost data points), then subtract
this gaussian from the data points and fit another gaussian to the remainder.
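
The two-stage procedure could be sketched as follows, again with FindFit from
later Mathematica versions standing in for NonlinearFit. The cutoff x < 0 for
the main peak and the starting values {7, 0.15, 0.2} for the second gaussian
are my own rough guesses read off the data, not values from the original
problem, and convergence of the second fit is not guaranteed.

```mathematica
(* data points from the quoted question *)
data = {{-0.95,0},{-0.85,0},{-0.75,0},{-0.65,1},{-0.55,2},{-0.45,9},
   {-0.35,26},{-0.25,36},{-0.15,23},{-0.05,15},{0.05,8},{0.15,7},{0.25,7},
   {0.35,1},{0.45,0},{0.55,2},{0.65,1},{0.75,0},{0.85,0},{0.95,0}};

gauss[x_, a_, mu_, s_] := a Exp[-((x - mu)/s)^2]

(* stage 1: fit the large peak only, excluding the rightmost points *)
peak = Select[data, #[[1]] < 0 &];
fit1 = FindFit[peak, gauss[x, a, mu, s],
   {{a, 32.0}, {mu, -0.25}, {s, 0.2}}, x];

(* subtract the fitted main peak from every data point *)
rest = {#[[1]], #[[2]] - (gauss[#[[1]], a, mu, s] /. fit1)} & /@ data;

(* stage 2: fit a second gaussian to the remainder *)
fit2 = FindFit[rest, gauss[x, b, nu, t],
   {{b, 7.0}, {nu, 0.15}, {t, 0.2}}, x]
```

The sum (gauss[x, a, mu, s] /. fit1) + (gauss[x, b, nu, t] /. fit2) is then
the two-gaussian model. Its six parameters can serve as starting values for
one final fit of the full sum to all the data.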

Unfortunately, you will have to do a lot of fiddling. There is no magic
solution.

Gordon Smith wrote:

> Greetings,
>
> I am trying to determine the best Gaussian fit to a series of data
> points.
>
> Data={{-0.95,0},{-0.85,0},{-0.75,0},{-0.65,1},{-0.55,2},{-0.45,9},
> {-0.35,26},{-0.25,36},{-0.15,23},{-0.05,15},{0.05,8},{0.15,7},{0.25,7},
> {0.35,1},{0.45,0},{0.55,2},{0.65,1},{0.75,0},{0.85,0},{0.95,0}}
>
> A more detailed histogram of this looks like
>
> {{0,-0.98},{0,-0.94},{0,-0.9},{0,-0.86},{0,-0.82},{0,-0.78},{0,-0.74},
> {0,-0.7},{0,-0.66},{1,-0.62},{0,-0.58},{1,-0.54},{2,-0.5},{5,-0.46},
> {3,-0.42},{10,-0.38},{14,-0.34},{10,-0.3},{13,-0.26},{15,-0.22},
> {10,-0.18},{9,-0.14},{7,-0.1},{3,-0.06},{9,-0.02},{3,0.02},{5,0.06},
> {1,0.1},{4,0.14},{2,0.18},{3,0.22},{4,0.26},{0,0.3},{0,0.34},{1,0.38},
> {0,0.42},{0,0.46},{0,0.5},{1,0.54},{1,0.58},{0,0.62},{1,0.66},{0,0.7},
> {0,0.74},{0,0.78},{0,0.82},{0,0.86},{0,0.9},{0,0.94},{0,0.98}}
>
> I've tried using NonLinearFit, but I can't get the parameters to make
> sense.  There is also the possibility that this may be indicative of two
> processes superimposed upon each other, in which case, the fit would be
> two Gaussian distributions added together.  I've also looked through the
> Archives, but there doesn't seem to be much on this that is directly
> useful.  Either that, or the solution is so simple that I'm just not
> seeing it.
>
> Unfortunately, I don't have the capacity to formally join the MailList,
> but I was hoping that I might still be able to get a nudge in the right
> direction.
>
> Thanks for your help.
>
> --------------------------------------------------------
> Gordon P. Smith
> Journeyman Physicist
> University of Mississippi
> 662.915.5635
