Re: Re: fit a BinomialDistribution to exptl data?
- To: mathgroup at smc.vnet.net
- Subject: [mg80969] Re: [mg80915] Re: fit a BinomialDistribution to exptl data?
- From: DrMajorBob <drmajorbob at bigfoot.com>
- Date: Thu, 6 Sep 2007 05:35:39 -0400 (EDT)
- References: <11420600.1188458042685.JavaMail.root@m35> <6526955.1189001025956.JavaMail.root@m35>
- Reply-to: drmajorbob at bigfoot.com
Thanks for that! Very instructive. I'd only suggest that

nmax = SortBy[Transpose[{Range[300], nposteriorList}], Last][[-1, 1]]

is more easily accomplished by

nmax = First@Ordering[nposteriorList, -1]

Bobby

On Wed, 05 Sep 2007 01:58:09 -0500, Mark Fisher <particlefilter at gmail.com> wrote:

> On Sep 1, 12:43 am, Ray Koopman <koop... at sfu.ca> wrote:
>> Also, it is possible that the ML estimate of n may not exist because
>> the likelihood, maximized w.r.t. p given n, may be monotone
>> increasing in n, as is the case with the following 200 samples from
>> a Binomial[20,.2] distribution:
>>
>> {x = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11},
>>  f = {3, 21, 28, 37, 43, 28, 22, 8, 6, 2, 1, 1}};
>>
>> which has {mean, variance} = {3.875, 4.02952}. It is not obvious
>> (except perhaps to a Bayesian) what should be done in cases such as
>> this.
>
> It seems this thread, which began as a discussion of fitting
> functional forms, has evolved into a discussion of statistical
> inference. (The two subjects are related, but distinct.) My post is
> intended to contribute to the statistical inference discussion by
> providing a Bayesian perspective. (The code is written using
> Version 6.)
>
> The Bayesian approach produces probability statements about the
> "unknowns" given the data. In this context, the data are those given
> by Ray Koopman and the unknowns are n and p, the parameters of the
> binomial distribution. In order to make probability statements about
> the joint distribution for (n, p) given the data, the rules of
> probability require us to specify a prior distribution that
> characterizes what we knew before we saw the data.
>
> Some notation will be helpful. Let f(n,p) denote the joint prior
> distribution, let f(n,p|D) denote the joint posterior distribution
> given the data D, and let f(D|n,p) denote the sampling distribution
> of the data given the parameters. (In a Bayesian analysis, the
> sampling distribution becomes the "likelihood" of the parameters
> given the data.) Bayes' rule says f(n,p|D) is proportional to
> g(n,p) := f(n,p) * f(D|n,p). The typical procedure is to generate
> draws from f(n,p|D) using MCMC techniques (Markov Chain Monte Carlo).
>
> In this example, I take the following route instead. First, factor
> the prior into marginal and conditional: f(n,p) = f(n) * f(p|n).
> Second, integrate p out of both sides of Bayes' rule, so that f(n|D)
> is proportional to g(n) := f(D|n) * f(n). Finally, generate draws
> from the joint posterior by drawing successively from f(n|D) and then
> from f(p|n,D), the latter of which is proportional to
> g(p) := f(D|n,p) * f(p|n).
>
> For this example, I characterize the prior knowledge about n and p as
> follows: (1) n and p are independent, so that f(n,p) = f(n) * f(p),
> (2) all values of p are equally likely, so that f(p) = 1 for
> 0 <= p <= 1, and (3) the probability of n is given by
> f(n) = .03*(1-.03)^(n-1). (I'll comment on these choices at the end.)
>
> pprior[p_] = 1
> nprior[n_] = .03*(1-.03)^(n-1)
>
> The data are those given by Ray Koopman:
>
> data0 = Transpose[{
>    {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11},
>    {3, 21, 28, 37, 43, 28, 22, 8, 6, 2, 1, 1}}];
> data = Flatten[ConstantArray @@@ data0];
>
> The likelihood:
>
> jointlikelihood[n_, p_] =
>   Times @@ (PDF[BinomialDistribution[n, p], #] & /@ data);
>
> An aside: Ray Koopman's comment about the absence of a maximum refers
> to h(n) := max_p f(D|n,p), which is increasing in n. This is
> illustrated by
>
> ListLinePlot[Table[
>   FindMaximum[Log[jointlikelihood[n, p]], {p, .05, .06}][[1]],
>   {n, 11, 100}], DataRange -> {11, 100}]
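A small aside, not from Mark's post: for a fixed n, the binomial likelihood is maximized at p = Mean[data]/n, so the same profile can be traced without FindMaximum. A sketch using the data and jointlikelihood defined above (pHat is a name introduced here, not part of the original code):

pHat[n_] := N[Mean[data]]/n   (* for fixed n, the ML estimate of p is the sample mean divided by n *)
ListLinePlot[Table[Log[jointlikelihood[n, pHat[n]]], {n, 11, 100}],
  DataRange -> {11, 100}]

This should trace the same monotonically increasing profile as the FindMaximum version above.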
> Now compute f(D|n) by *integrating* p out (rather than *maximizing* p
> out). (Although I specify the assumption n >= 11, the resulting
> expression is valid for n < 11 as well: the likelihood of n for
> n < 11 is zero.) Even ignoring the prior for n (which downweights
> large values of n), the marginal likelihood for n has a maximum
> (at n = 53):
>
> nlikelihood[n_] = Integrate[jointlikelihood[n, p] * pprior[p],
>   {p, 0, 1}, Assumptions -> n >= 11];
>
> Compute f(n|D) from f(D|n) * f(n). (I have chosen the upper bound of
> 300 for convenience; the prior probability of n > 300 is about
> 10^(-4).) From f(n|D), i.e., nposteriorList, the most likely value
> for n is 36 (nmax) and, given n = nmax, the likelihood-maximizing
> value of p is about 0.11 (pmax):
>
> nposteriorList =
>   #/Total[#] & @ Table[nlikelihood[n] nprior[n], {n, 1, 300}];
> ListLinePlot[nposteriorList, PlotRange -> All]
> nmax = SortBy[Transpose[{Range[300], nposteriorList}], Last][[-1, 1]]
> pmax = p /. Last@FindMaximum[Log[jointlikelihood[nmax, p]], {p, .1, .15}]
>
> Now compute f(n,p|D) by drawing from f(n|D) and f(p|n,D). By
> examining f(D|n,p), one can determine that f(p|n,D) =
> PDF[BetaDistribution[776, 200 n - 774], p]: with a flat prior on p,
> the posterior for p given n is a Beta distribution whose parameters
> are 1 plus the total number of successes (Total[data] = 775) and
> 1 plus the total number of failures (200 n - 775). We see that n and
> p are highly negatively correlated and that the most likely value for
> p is noticeably less than pmax:
>
> draws = Table[With[{
>    n = RandomChoice[nposteriorList[[11 ;;]] -> Range[11, 300]]},
>    {n, RandomReal[BetaDistribution[776, 200 n - 774]]}], {10^5}];
> bc = BinCounts[draws, {.5, 300.5, 1}, {0, .35, .005}];
> ListDensityPlot[Transpose[bc],
>   ColorFunction -> (GrayLevel[1 - #^(.5)] &),
>   DataRange -> {{1, 300}, {0, .35}}, PlotRange -> All]
> ListLinePlot[Total[bc, {2}], DataRange -> {1, 300},
>   GridLines -> {{nmax}, {}}, PlotRange -> All]
> ListLinePlot[Total[bc], DataRange -> {0, .35},
>   GridLines -> {{pmax}, {}}]
>
> Now that we have the joint posterior distribution, we can compute the
> posterior distribution of any function of n and p. For example, the
> mean of the binomial distribution, n*p, is tightly concentrated
> around the mean of the data (3.875):
>
> Through[{Mean, StandardDeviation}[Times @@@ draws]]
>
> We can also compute the uncertainty associated with the binomial CDF
> (which connects back to fitting functional forms):
>
> Needs["ErrorBarPlots`"]
> meansd = (Through[{Mean, StandardDeviation}[#]] & /@
>    Transpose[Function[{n, p},
>        Table[CDF[BinomialDistribution[n, p], x], {x, 0, 15}] //
>          Evaluate] @@@ Take[draws, 1000]]);
> ErrorListPlot[Transpose[{
>    Transpose[{Range[0, 15], meansd[[All, 1]]}],
>    ErrorBar[2 #] & /@ meansd[[All, 2]]}]]
>
> In Bayesian analysis, one uses "decision theory" to get a "point
> estimate". A point estimate minimizes the expected loss computed with
> respect to the posterior distribution, where the "loss function"
> characterizes the cost of choosing the point (n1, p1) when in fact
> (n0, p0) is true. For example, quadratic loss leads to choosing the
> posterior mean, while a "zero-one" loss function leads to choosing
> the maximum posterior value.
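To make that last remark concrete, here is a short sketch (again not from the original post; it assumes the draws and nposteriorList built above) that reads off both point estimates:

Mean[draws]   (* posterior means of {n, p}: the point estimate under quadratic loss *)

First@Ordering[nposteriorList, -1]   (* marginal posterior mode of n: the point estimate under zero-one loss *)

The second line returns the same nmax computed above, using Ordering rather than SortBy.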
> The prior I chose above is intended to illustrate the analysis. In a
> real-world situation with real-world data, one should incorporate
> what one knows into the prior. For example, one might know from
> experience that the probability that p is greater than 1/2 is quite
> small; or one might know that n can't be larger than 35. Ultimately,
> if the thing you care about is sensitive to the prior, then you
> should think carefully about the appropriate prior.
>
> --Mark

--
DrMajorBob at bigfoot.com