Re: Cumulative probability question
- To: mathgroup at smc.vnet.net
- Subject: [mg120619] Re: Cumulative probability question
- From: Daniel Lichtblau <danl at wolfram.com>
- Date: Sun, 31 Jul 2011 07:26:14 -0400 (EDT)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
----- Original Message ----- > From: "Peter Sisak" <p-kun80 at hotmail.com> > To: mathgroup at smc.vnet.net > Sent: Saturday, July 30, 2011 4:59:41 AM > Subject: Cumulative probability question > I have been experimenting with Mathematica in order to receive answers > to my problem, so far without success. I have tried various forms of > CDF > and HypergeometricDistribution, but I seem to not know the proper > addressing of the parameters for the problem. > > The problem itself is the following (with numbers given for a more > illustrative example): Given an urn containing N(=70) items in total, > of which n(=4) are marked, we want to draw i(=2) or more marked > items. > > a) What is the formula describing the number of draws required to > succeed in drawing at least i items with a probability of at least > 50%? > b) How do you make a graph of the probability (of drawing at least i > items) graphed against the number of draws? > c) What are the equations that need to be solved to get a numerical > answer on a) and b)? > > Thank you for your assistance in advance > P=E9ter Sis=E1k I'll use notation n1 for the marked items, n2 for unmarked (so n1 is your n, and n1+n2 is your N, which is quite different from Mathematica's N). Suppose you do m draws. The probability of drawing from the n1 marked elements, in the first k draws, and drawing from the n2 unmarked on the remaining m-k draws, is Pochhammer[n1 - k + 1, k]* Pochhammer[n2 - (m - k) + 1, m - k]/Pochhammer[n1 + n2 - m + 1, m] If this seems like a bizarre formula, rewrite it in factorials and it should make more sense. Assuming I have it correct. To get the probability of exactly k form the n1 marked elements, appearing in any of the m draws, multiply by Binomial[m,k]. To get the probability of at least k marked elements appearing, sum the result from k to either of m or n1 (does not matter which because all terms after the min are zero). In[40]:= n1 = 4; n2 = 66; p[k_, m_, n1_, n2_] := Binomial[m, k]*Pochhammer[n1 - k + 1, k]* Pochhammer[n2 - (m - k) + 1, m - k]/Pochhammer[n1 + n2 - m + 1, m] In[43]:= prob[k_, m_, n1_, n2_] := Sum[p[j, m, n1, n2], {j, k, n1}] Your example: In[44]:= FindRoot[prob[2, m, 4, 66] == 1/2, {m, 20}] Out[44]= {m -> 26.92171019255432} So it's 27 draws you need to hit an expected 2 or more marked elements at least half the time. One can simulate these draws as follows. Take a random sample of m elements without replacement. See how many lie among the first n1 elements. If i or more, count the sample as passing the test. Then count how many tests pass on average. simulate[n1_, n_, k_, m_, reps_: 100] := Count[Table[ Count[RandomSample[Range[n], m], Alternatives @@ Range[n1]] >= k, {reps}], True]/reps the difference between 26 and 27 draws is noticeable. Not that I know what stats tests to use in order to quantify this. In[61]:= Table[simulate[n1, n1 + n2, 2, 26, 100] // N, {10}] Out[61]= {0.55, 0.41, 0.49, 0.47, 0.46, 0.44, 0.49, 0.49, 0.41, 0.49} In[62]:= Table[simulate[n1, n1 + n2, 2, 27, 100] // N, {10}] Out[62]= {0.59, 0.51, 0.4, 0.48, 0.48, 0.53, 0.5, 0.47, 0.55, 0.43} Here is a neat sort of test. Run this simulation many times. Count the entire run a success if at least half the runs show a success at least half the time. success[n1_, n2_, k_, m_, reps_] := Count[Table[simulate[n1, n1 + n2, k, m, reps], {reps}], aa_ /; aa >= 1/2] >= reps/2 In[68]:= Table[success[n1, n2, 2, 26, 100], {10}] Out[68]= {False, False, False, False, False, False, False, False, \ False, False} In[69]:= Table[success[n1, n2, 2, 27, 100], {10}] Out[69]= {True, True, True, True, True, False, True, True, True, True} These results will have to speak for themselves, since I don't actually know their language. Daniel Lichtblau Wolfram Research