Re: Gaussian fit

*To*: mathgroup at smc.vnet.net
*Subject*: [mg23187] Re: Gaussian fit
*From*: Wijnand Schepens <Wijnand.Schepens at rug.ac.be>
*Date*: Mon, 24 Apr 2000 01:11:58 -0400 (EDT)
*Organization*: RUG
*References*: <8dnv7a$hsu@smc.vnet.net>
*Sender*: owner-wri-mathgroup at wolfram.com

Gaussian fitting is indeed a frequent problem, and not so easy. A few notes:

1. Nonlinear least-squares fitting is normally a multiple-minima problem, i.e. the parameters that are found can be a local minimum of the sum-of-squared-deviations function, but it needn't be the global minimum. In general, there is no method guaranteed to find the global minimum of an arbitrary function. In other words, NonlinearFit can produce terrible results because of strange local minima in the minimization. The only way out is to give good initial guesses for the parameters and let the fit refine them. Fit takes no initial guesses (it does not need them: it only handles models that are linear in the parameters, and that least-squares problem has no local minima other than the global one). For NonlinearFit, you can supply them.

2. Try it by hand first. It is very important to know the meaning of the parameters. This is a non-normalized gaussian:

    gauss[x_, a_, mu_, s_] := a Exp[-((x - mu)/s)^2]

x is the variable.
a is the amplitude (height at maximum).
mu is the mean position (where the maximum occurs).
s sets the width of the bell curve. With this form, s is Sqrt[2] times the standard deviation (the sqrt of the variance).

In the case of your data, after some fiddling "by hand", I found that gauss[x, 32.0, -0.25, 0.2] wasn't a bad initial guess. You can try optimizing this using NonlinearFit, feeding in these parameter values as starting values.

3. If you can use Fit somehow, this is usually a better option than using NonlinearFit and coping with the initial-parameters problem. Sometimes this can be achieved by transforming the data. For example, if you want to fit a gaussian to your data, then this amounts to fitting a quadratic to the logarithm of the data (y-values). A problem which arises here is that some y-values are 0, causing the Log to diverge. For a real gaussian, these y-values will be small but always > 0.
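To spell out why the Log trick turns the gaussian into a quadratic, here is a short derivation using the gauss[x, a, mu, s] form above. The names c_0, c_1, c_2 for the quadratic's coefficients are introduced here for illustration only; they do not appear elsewhere in this post.

```latex
\begin{aligned}
\ln\!\left(a\,e^{-\left((x-\mu)/s\right)^2}\right)
  &= \ln a - \frac{(x-\mu)^2}{s^2} \\
  &= \underbrace{\left(\ln a - \frac{\mu^2}{s^2}\right)}_{c_0}
   + \underbrace{\frac{2\mu}{s^2}}_{c_1}\,x
   + \underbrace{\left(-\frac{1}{s^2}\right)}_{c_2}\,x^2 ,
\\[6pt]
s &= \sqrt{-\frac{1}{c_2}}, \qquad
\mu = -\frac{c_1}{2\,c_2}, \qquad
a = \exp\!\left(c_0 - \frac{c_1^2}{4\,c_2}\right).
\end{aligned}
```

So a fitted quadratic with c_2 < 0 maps straight back to an amplitude, a position and a width. The identity assumes a > 0 and only makes sense for data with y > 0, which is why the zero y-values have to be dealt with before taking the Log.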
There are two options for the zero y-values: leave these points out, or give them a small value of, say, 0.001.

    Data2 = DeleteCases[Data, {x_, 0}]
    tolog[{x_, y_}] := {x, Log[y]}
    f = Fit[tolog /@ Data2, {1, x, x^2}, x]

This yields

    2.683 - 1.02145 x - 6.32196 x^2

The gaussian then is Exp[f]. Try it with

    Show[Plot[Exp[f], {x, -1, 1}], ListPlot[Data2, PlotJoined -> True]]

As you see, this isn't tremendous, but it's not nonsense either.

4. Judging the quality of a fit is hard. It depends on what you find most important. For example, you may want to fit the large peak perfectly and leave the "wings" as bad as they are, or you may choose to take the best fit over the whole domain. This can be achieved by weighting functions, or by excluding datapoints from the domain.

5. If you want to fit a sum of two gaussians, then you can't use the trick of converting the data to Log, because the Log of a sum is not the sum of the Logs. Probably the best thing to do would be to fit the large peak as well as possible (possibly by excluding the rightmost datapoints), then subtract this gaussian from the data points and fit another gaussian to the remainder.

Unfortunately, you will have to do a lot of fiddling. There is no magic solution.

Gordon Smith wrote:

> Greetings,
>
> I am trying to determine the best Gaussian fit to a series of data
> points.
>
> Data={{-0.95,0},{-0.85,0},{-0.75,0},{-0.65,1},{-0.55,2},{-0.45,9},
> {-0.35,26},{-0.25,36},{-0.15,23},{-0.05,15},{0.05,8},{0.15,7},
> {0.25,7},{0.35,1},{0.45,0},{0.55,2},{0.65,1},{0.75,0},{0.85,0},
> {0.95,0}}
>
> A more detailed histogram of this looks like
>
> {{0,-0.98},{0,-0.94},{0,-0.9},{0,-0.86},{0,-0.82},{0,-0.78},
> {0,-0.74},{0,-0.7},{0,-0.66},{1,-0.62},{0,-0.58},{1,-0.54},
> {2,-0.5},{5,-0.46},{3,-0.42},{10,-0.38},{14,-0.34},{10,-0.3},
> {13,-0.26},{15,-0.22},{10,-0.18},{9,-0.14},{7,-0.1},{3,-0.06},
> {9,-0.02},{3,0.02},{5,0.06},{1,0.1},{4,0.14},{2,0.18},{3,0.22},
> {4,0.26},{0,0.3},{0,0.34},{1,0.38},{0,0.42},{0,0.46},{0,0.5},
> {1,0.54},{1,0.58},{0,0.62},{1,0.66},{0,0.7},{0,0.74},{0,0.78},
> {0,0.82},{0,0.86},{0,0.9},{0,0.94},{0,0.98}}
>
> I've tried using NonLinearFit, but I can't get the parameters to make
> sense. There is also the possibility that this may be indicative of two
> processes superimposed upon each other, in which case, the fit would be
> two Gaussian distributions added together. I've also looked through the
> Archives, but there doesn't seem to be much on this that is directly
> useful. Either that, or the solution is so simple that I'm just not
> seeing it.
>
> Unfortunately, I don't have the capacity to formally join the MailList,
> but I was hoping that I might still be able to get a nudge in the right
> direction.
>
> Thanks for your help.
>
> --------------------------------------------------------
> Gordon P. Smith
> Journeyman Physicist
> University of Mississippi
> 662.915.5635