Re: Re: fit a BinomialDistribution to exptl data?
- To: mathgroup at smc.vnet.net
- Subject: [mg80590] Re: [mg80473] Re: [mg80415] fit a BinomialDistribution to exptl data?
- From: DrMajorBob <drmajorbob at bigfoot.com>
- Date: Sun, 26 Aug 2007 04:30:56 -0400 (EDT)
- References: <200708220838.EAA08485@smc.vnet.net> <23066758.1187864982434.JavaMail.root@m35>
- Reply-to: drmajorbob at bigfoot.com
> Floor[n] is the value of n to take. Is there a simple rationale for that? It seems to me the optimal integer n could lie on EITHER side of FindMaximum's or FindFit's optimal Real. Bobby On Thu, 23 Aug 2007 00:08:35 -0500, Darren Glosemeyer <darreng at wolfram.com> wrote: > Gordon Robertson wrote: >> Given a list of data values, or a list of x-y data points for >> plotting the data as an empirical distribution function, how can I >> fit a BinomialDistribution to the data? The help documentation for >> FindFit shows examples in which the user indicates which function >> should be fit (e.g. FindFit[data, a x Log[b + c x], {a, b, c}, x]), >> and I've been unable to find an example in which a statistical >> distribution is being fit to data. Mathematica complains when I try the >> following with an xy list of data that specified an EDF: FindFit >> [xyvals, CDF[BinomialDistribution[n, pp], k], {n, pp}, k]. >> >> G >> -- >> Gordon Robertson >> Canada's Michael Smith Genome Sciences Centre >> Vancouver BC Canada >> >> > > Non-default starting values are needed. By default, FindFit will use a > starting value of 1 for each parameter, which will be problematic in > this case. The starting value for n should be at least as large as the > largest binomial in the sample, and the value for pp should be strictly > between 0 and 1. Here is an example. > > In[1]:= binom = RandomInteger[BinomialDistribution[20, .4], 10] > > Out[1]= {10, 7, 5, 9, 8, 12, 7, 7, 10, 9} > > In[2]:= edf = Sort[Tally[binom]]; > > In[3]:= edf[[All, 2]] = Accumulate[edf[[All, 2]]]/Length[binom]; > > In[4]:= FindFit[edf, > CDF[BinomialDistribution[n, pp], k], {{n, Max[binom] + 1}, {pp, > .5}}, > k] > > Out[4]= {n -> 17.3082, pp -> 0.4773} > > > Floor[n] is the value of n to take. > > Note that FindFit gives a least squares fit of the edf to the cdf. > Alternatively, a maximum likelihood estimate of the parameters can be > obtained by maximizing the log likelihood function (the sum of the logs > of the pdf with unknown parameters evaluated at the data points) with > respect to the parameters. > > > In[5]:= loglike > PowerExpand[ > Total[Log[Map[PDF[BinomialDistribution[n, pp], #] &, > binom]]]]; > > > Constraints should be used to keep the parameters in the feasible range. > > In[6]:= FindMaximum[{loglike, n >= Max[binom] && 0 < pp < 1}, {n, > Max[binom] + 1}, {pp, .5}] > > Out[6]= {-20.6326, {n -> 14.9309, pp -> 0.56259}} > > > Darren Glosemeyer > Wolfram Research > > -- DrMajorBob at bigfoot.com
- Follow-Ups:
- Re: Re: Re: fit a BinomialDistribution
- From: Darren Glosemeyer <darreng@wolfram.com>
- Re: Re: Re: fit a BinomialDistribution
- References:
- fit a BinomialDistribution to exptl data?
- From: Gordon Robertson <agrobertson@telus.net>
- fit a BinomialDistribution to exptl data?