MathGroup Archive 2007

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Re: fit a BinomialDistribution to exptl data?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg80590] Re: [mg80473] Re: [mg80415] fit a BinomialDistribution to exptl data?
  • From: DrMajorBob <drmajorbob at bigfoot.com>
  • Date: Sun, 26 Aug 2007 04:30:56 -0400 (EDT)
  • References: <200708220838.EAA08485@smc.vnet.net> <23066758.1187864982434.JavaMail.root@m35>
  • Reply-to: drmajorbob at bigfoot.com

> Floor[n] is the value of n to take.

Is there a simple rationale for that?

It seems to me the optimal integer n could lie on EITHER side of  
FindMaximum's or FindFit's optimal Real.

Bobby

On Thu, 23 Aug 2007 00:08:35 -0500, Darren Glosemeyer  
<darreng at wolfram.com> wrote:

> Gordon Robertson wrote:
>> Given a list of data values, or a list of x-y data points for
>> plotting the data as an empirical distribution function, how can I
>> fit a BinomialDistribution to the data? The help documentation for
>> FindFit shows examples in which the user indicates which function
>> should be fit (e.g. FindFit[data, a x Log[b + c x], {a, b, c}, x]),
>> and I've been unable to find an example in which a statistical
>> distribution is being fit to data. Mathematica complains when I try the
>> following with an xy list of data that specified an EDF: FindFit
>> [xyvals, CDF[BinomialDistribution[n, pp], k], {n, pp}, k].
>>
>> G
>> --
>> Gordon Robertson
>> Canada's Michael Smith Genome Sciences Centre
>> Vancouver BC Canada
>>
>>
>
> Non-default starting values are needed. By default, FindFit will use a
> starting value of 1 for each parameter, which will be problematic in
> this case.  The starting value for n should be at least as large as the
> largest binomial in the sample, and the value for pp should be strictly
> between 0 and 1. Here is an example.
>
> In[1]:= binom = RandomInteger[BinomialDistribution[20, .4], 10]
>
> Out[1]= {10, 7, 5, 9, 8, 12, 7, 7, 10, 9}
>
> In[2]:= edf = Sort[Tally[binom]];
>
> In[3]:= edf[[All, 2]] = Accumulate[edf[[All, 2]]]/Length[binom];
>
> In[4]:= FindFit[edf,
>          CDF[BinomialDistribution[n, pp], k], {{n, Max[binom] + 1}, {pp,
> .5}},
>           k]
>
> Out[4]= {n -> 17.3082, pp -> 0.4773}
>
>
> Floor[n] is the value of n to take.
>
> Note that FindFit gives a least squares fit of the edf to the cdf.
> Alternatively, a maximum likelihood estimate of the parameters can be
> obtained by maximizing the log likelihood function (the sum of the logs
> of the pdf with unknown parameters evaluated at the data points) with
> respect to the parameters.
>
>
> In[5]:= loglike 
>           PowerExpand[
>            Total[Log[Map[PDF[BinomialDistribution[n, pp], #] &,  
> binom]]]];
>
>
> Constraints should be used to keep the parameters in the feasible range.
>
> In[6]:= FindMaximum[{loglike, n >= Max[binom] && 0 < pp < 1}, {n,
>           Max[binom] + 1}, {pp, .5}]
>
> Out[6]= {-20.6326, {n -> 14.9309, pp -> 0.56259}}
>
>
> Darren Glosemeyer
> Wolfram Research
>
>



-- 

DrMajorBob at bigfoot.com


  • Prev by Date: Switching between open notebooks
  • Next by Date: Re: Improper Integral & Mathematica
  • Previous by thread: Re: fit a BinomialDistribution to exptl data?
  • Next by thread: Re: Re: Re: fit a BinomialDistribution