Re: Second Opinion
- To: mathgroup at smc.vnet.net
- Subject: [mg26386] Re: [mg26373] Second Opinion
- From: Tomas Garza <tgarza01 at prodigy.net.mx>
- Date: Sat, 16 Dec 2000 02:40:15 -0500 (EST)
- Sender: owner-wri-mathgroup at wolfram.com
It appears to me that what you have as P(n) is *not* the Poisson
distribution. It is the upper tail of the Poisson law, which equals the
distribution function of the gamma probability law with parameters N and
n, evaluated at 1 (so you should actually write P(n, N)); the sum you
subtract from 1 is the complement of that distribution function. If you
have access to E. Parzen's book Modern Probability Theory and Its
Applications (J. Wiley, 1960) you'll find there a nice explanation of
this topic (Ch. 6, Sec. 4). Suppose you have a series of events
occurring in time in accordance with a Poisson probability law at the
rate of n events per unit of time; then what you have as P(n, N) is the
probability that the time of occurrence of the N-th event will be less
than or equal to one (one time unit, that is).
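The relation between the Poisson tail and the gamma law can be checked
numerically. What follows is only a sketch, assuming a version of
Mathematica with the built-in GammaDistribution and CDF (version 6 or
later); pTail and pGamma are names made up for the illustration, and k
stands for N, since N is a reserved symbol in Mathematica:

```mathematica
(* pTail: the expression from the original post, 1 minus the first k Poisson terms *)
pTail[n_, k_Integer?Positive] := 1 - Sum[Exp[-n]*n^x/x!, {x, 0, k - 1}]

(* pGamma: probability that the waiting time to the k-th event, taken as
   gamma distributed with shape k and scale 1/n (i.e. rate n), is at most 1 *)
pGamma[n_, k_Integer?Positive] := CDF[GammaDistribution[k, 1/n], 1]

(* the differences should vanish up to rounding if the two agree *)
Table[Chop[N[pTail[n, k] - pGamma[n, k]], 10^-10], {n, {2, 5, 10}}, {k, 1, 4}]
```

The rates 2, 5, 10 and the range of k are arbitrary choices for the
check; any positive rate and positive integer k should do.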
But, in any case, I don't get your point. I understand you would like to
determine the value of the parameter N given the probability P(n, N) and
the value of the parameter n. My question is, if you know the value of
n, why should you want to work with a "surface curve" (whatever that is
meant to be) instead of just the distribution P(n, N)? Bear in mind that
N has to be a nonnegative integer, so that the notion of a surface P(n,
N) doesn't make much sense.
Now, you speak of N running from 1 to 140. What values of n do you use?
I don't get all that many meaningful values of N (i.e., such that P is
different from 1 and greater than, say, 10^-6) for values of n running
from 1 to 100.
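To make "meaningful" concrete, here is a rough sketch that lists, for a
given n, the values of N for which P stays away from both 0 and 1. It
assumes the built-in PoissonDistribution and CDF (version 6 or later);
tailP, meaningfulN, the symmetric threshold eps and the search limit
kmax are all assumptions made for the illustration, and k again stands
for N:

```mathematica
(* tailP: P(n, k) computed from the built-in Poisson CDF *)
tailP[n_, k_Integer] := N[1 - CDF[PoissonDistribution[n], k - 1]]

(* values of k for which P lies strictly between eps and 1 - eps;
   using eps = 10^-6 on both sides is an assumption of this sketch *)
meaningfulN[n_, eps_: 10^-6, kmax_: 400] :=
  Select[Range[1, kmax], eps < tailP[n, #] < 1 - eps &]

(* number of meaningful values of k for a few rates n *)
{#, Length[meaningfulN[#]]} & /@ {1, 10, 50, 100}
```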
And, then, even if you get a fitted model representing P as a function
of n and N (which is not at all straightforward), I don't think this
will be useful to estimate N for given values of P and n. Just to give
an idea of what's going on, suppose you choose only 3 values of n, say n
= 100, 110 and 120. Then evaluate the function P for all meaningful
values of N.
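In[1]:=
(* supplies CumulativeSums; from version 6 on, the built-in Accumulate
   does the same job *)
<< Statistics`DataManipulation`
In[2]:=
(* Poisson terms Exp[-n] n^j/j!, built up from the j = 0 term *)
ef[0, n_] := Exp[-n];
ef[j_, n_] := ef[j - 1, n]*n/j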
This gives you a recursion formula for evaluating the Poisson terms. Now
I construct a table of values for the Poisson terms lying within 6
standard deviations of the mean (the standard deviation being Sqrt[n]),
so as to leave out the negligible ones:
In[4]:=
fProb[n_] :=
Table[{j, ef[j, n]}, {j, Max[0, n - Ceiling[6*Sqrt[n]]],
n + Ceiling[6*Sqrt[n]]}] // N
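and then 1 minus the partial sums to get your function P. (The partial
sums start at the lower cutoff rather than at 0, which neglects a tiny
amount of probability, and the entry paired with j is 1 minus the sum
through j, that is, P(n, j + 1) in your notation.)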
In[5]:=
funP[n_] :=
Transpose[{Transpose[fProb[n]][[1]],
1 - CumulativeSums[Transpose[fProb[n]][[2]]]}]
Now compute the above for three different values of n. By visual
inspection I keep only the range of indices common to the three tables
(54 to 160, i.e., 107 values each), so that the size of each table is
the same:
In[6]:=
ft = Select[
   Join[funP[100], funP[110], funP[120]],
   #[[1]] >= 54 && #[[1]] <= 160 &];
I now get an interpolation function on the probabilities:
In[7]:=
gr = Partition[#[[2]] & /@ ft, 107];
In[8]:=
q = ListInterpolation[gr, {{1, 3}, {0, 106}}]
ListInterpolation::inhr: Requested order is too high; order has been
reduced to {2, 3}.
Out[8]=
InterpolatingFunction[{{1., 3.}, {0., 106.}}, "<>"]
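This function q has the same purpose as a "fitted model", in the sense
that, given values of n and N, it will produce P(n, N). Its arguments
are rescaled, though: x = 1, 2, 3 stands for n = 100, 110, 120, and
y = 0, ..., 106 stands for the tabulated indices 54, ..., 160. You may
now plot the surface: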
In[9]:=
Plot3D[q[x, y], {x, 1, 3}, {y, 0, 106},
Ticks -> {{{1, "100"}, {2, "110"}, {3, "120"}}, Automatic, Automatic}]
The resulting surface is apparently what you were looking for. But,
then, what? Given a value of n and a value of P you can hardly determine
from there the value of N. In fact, you would need some kind of inverse
function to do that numerically. Perhaps some more background on your
problem would be useful.
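For what it's worth, such a numerical inverse need not go through a
surface at all, since for fixed n the probability P(n, N) decreases as N
grows. A minimal sketch, assuming the built-in PoissonDistribution and
CDF (version 6 or later); tailP and inverseN are hypothetical names, and
k stands for N:

```mathematica
(* tailP: P(n, k) computed from the built-in Poisson CDF *)
tailP[n_, k_Integer] := N[1 - CDF[PoissonDistribution[n], k - 1]]

(* largest integer k with tailP[n, k] >= p; a plain upward search works
   because tailP[n, 0] == 1 and tailP decreases towards 0 as k grows *)
inverseN[n_?Positive, p_ /; 0 < p <= 1] :=
  Module[{k = 0},
    While[tailP[n, k + 1] >= p, k++];
    k]

(* example call: rate n = 100, target probability 0.5 *)
inverseN[100, 0.5]
```

A target p will usually fall between two attainable probabilities, so
there is no exact integer solution; the sketch settles for the largest N
whose probability still reaches p.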
Tomas Garza
Mexico City
"John Lai" <john.lai at worldnet.att.net> wrote:
> Hello all,
> I tried to calculate Poisson Distribution in a backdoor way and used
> mathematica to model it. I could not get what I wanted. I don't think
> it is mathematica problem and more than likely my method is flawed. So
> I toss this out to see if some of you may spot my error.
>
> Poisson Distribution, P(n) = 1 - Summation [exp(-n)*(n^x)]/Factorial(x)
> where x goes from 0 to N-1
>
> For given n and N, P(n) can be determined easily. However, I want to
> determine N if P(n) and n are specified and I do not want to get access
> to Poisson lookup table. My idea is to calculate P(n) with a series of
> n and N (essentially generating the tables). Plot a surface curve whose
> variables are n, P(n) and N. The idea was once this surface is
> obtained, with x-axis as n, y-axis as P(n) and z-axis as N, then for a
> given n and P(n) I can obtain N.
>
> I wrote a C program to generate P(n) and use mathematica to plot this
> surface. I have 14 sets of n and in each set of n, I have 139 variables
> (i.e. N runs from 1 to 140 ), so there are 139 corresponding values of
> P(n) for each n. When I tried to use the function Fit to estimate this
> surface, it took about ½ hr for my 500MHz desktop to calculate! And the
> resultant expression is huge!
>
> Then, I cut down the dimension of my data set. For each n, I generated
> 10 values of N and repeated the process again. However, no matter what
> combination of polynomial I used
> (x,x^-1,Exp(-x),Exp(-x^2),Exp(-x-y).), the resulting equation of the
> surface is meaningless. It doesn't look right (at least I expected it
> to resemble some sort of Poisson or even Gaussian shape) and
> substituting P(n) and n back, I got garbage. I have enclosed a .nb file
> for reference. [Contact the author to obtain this file - moderator]
>
> So after all this, does it mean that my scheme of calculating Poisson
> Distribution is fundamentally wrong?
> Any suggestions are appreciated and thanks in advance.