MathGroup Archive 2003

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Random Number Generation : Is there a faster way of doing this ?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg43210] Re: [mg43190] Random Number Generation : Is there a faster way of doing this ?
  • From: Daniel Lichtblau <danl at wolfram.com>
  • Date: Tue, 19 Aug 2003 07:53:14 -0400 (EDT)
  • References: <200308170828.EAA04316@smc.vnet.net>
  • Sender: owner-wri-mathgroup at wolfram.com

"Christos Argyropoulos M.D." wrote:
> 
> Hi,
> 
> My problem is that I want to generate a large number of samples from Poisson
> distributions. Specifically the user has to supply a list of the parameters
> for a number of Poisson distributions and then generate a number of such
> samples. In the function below poissonparams is a list of the parameters of
> the Poisson distributions, and size is the number of samples. In my
> application (Analysis of counts of digital gene expression profiles), I
> supply a list of 10000 Poisson Parameters and have to generate a big sample
> of numbers drawn from the 10000 distributions. Typically > 1k such samples
> are required in order to establish a Log Likelihood statistic cutoff down
> the road.
> For the application at hand the poisson parameter can be zero (in this case
> the only point I can sample is the number zero) that is way I do not Map .
> My implementation shown below is ok, except for a small problem; it takes
> too long. In order to generate 5000 samples from each of the 10000
> distributions (this had to be repeated thrice, since I had to analyse 3
> profiles) i.e. total 150 million random numbers my system took close to 13
> hours. The required overhead to periodically save the data to the HD is a
> very small fraction of the total processing time and given limitations of
> available RAM cannot be improved.
> The question is, is there a faster way of doing this in Mathematica or
> should I be prepared to move outside the kernel for the RNG part?
> 
> I am using Mathematica 4.2 on a P4 at 2.2 GHz with 256 DDR RAM running
> Windows XP
> 
> Simulate[poissonparams_List,size_Integer]:=
> Module[{},
> fnd[x_]:=
> Map[If[#\[Equal]0,0,Random[PoissonDistribution[#]]]&,
> poissonparams];
> Array[fnd,{size}]
> ]
> 
> example of using the code :
> 
> In[40]:=
> Simulate[{1,2,3,0.5,0},20]
> 
> Out[40]=
> {{1,1,3,1,0},{1,1,3,1,0},{0,3,1,0,0},{3,4,5,0,0},{2,0,3,0,0},{4,4,3,0,0},{0,
> 2,
> 
> 5,0,0},{2,1,2,1,0},{2,1,1,0,0},{2,2,2,2,0},{0,2,4,0,0},{0,2,1,0,0},{2,0,2,
> 
> 1,0},{2,4,1,0,0},{0,2,6,2,0},{1,0,4,3,0},{0,2,1,0,0},{0,0,5,0,0},{1,0,3,0,
>     0},{1,1,3,0,0}}
> 
> I.e. generates  20 samples from the 5 Poisson Distributions with Parameter
> lamda 1, 2, 3, 0.5, and zero
> 
> Thanks In advance,
> 
> Christos Argyropoulos

I think something along the lines below may help. The underlying
assumptions are as follows.

(1) You want to generate random arrays, of reasonable length, with
respect to given values of the Poisson parameter 'mu'.
(2) Either you have "reasonable" values for 'mu' or else you know what
will suffice to modify the code below.
(3) You have sufficient expertise (since I do not) to figure out whether
I made any mistakes below, and correct them if so.

I did a few things. First, after looking at the code in
Statistics`DiscreteDistributions.m, I decided that a slow step was
incessant repetition of numeric evaluation of CDF values. Next I decided
that provided we can fix in advance a maximum set of such required for a
given 'mu' (I use 32 below), then we can use Compile to advantage. Below
is code to do this. You will notice that it is similar to code in the
above mentioned package but I have stripped out a few things such as
parameter value checking. Hence it may have bad consequences if used
with inappropriate values.

myPoissonRandomArrayC = Compile[{{mu,_Real},{len,_Integer}}, Module[
  {tabl, high, low, mid, res, q},
  If[mu==0, Return[Table[0,{len}]]];
  tabl = Table[GammaRegularized[1.+j,mu],{j,32}];
  res = Table[
    q = Random[];
    If [Exp[-mu] < q,
      high = 1;
      While[tabl[[high]] < q, high *= 2];
      low = Round[high/2];
      While[high - low > 1,
        mid = Round[(high+low)/2];
        If[tabl[[mid]] < q,
          low = mid,
          high = mid
          ]
        ];
	  high,
      0
      ], {len}
	];
	res
  ]];

To run an example similar to yours I need to find lists of random values
for each given 'mu', then transpose.

simulate2[pp_List, size_Integer] :=
  Transpose[Map[myPoissonRandomArrayC[#, size]&, pp]]

In[137]:= Timing[ss2 = simulate2[{1,2,3,0.5,0.}, 2^12];]
Out[137]= {0.06 Second, Null}

To compare timings:

simulate[poissonparams_List,size_Integer] := Module[
  {},
  fnd[x_]:=
    Map[If[#==0,0,Random[PoissonDistribution[#]]]&, poissonparams];
  Array[fnd,{size}]
  ]

In[140]:= Timing[ss = simulate[{1,2,3,0.5,0.001}, 2^12];]
Out[140]= {7.71 Second, Null}

This roughly corresponds to your 13 hour result for generating 150
million, given that I am working on a 1.5 GHz machine. So using Compile
and a table to store CDF values we can obtain a bit over two orders of
magnitude improvement in speed.


Daniel Lichtblau
Wolfram Research


  • Prev by Date: Re: Random Number Generation : Is there a faster way of doing this?
  • Next by Date: Leibniz Definition Of Pi Not In 5.0.0?
  • Previous by thread: Re: Random Number Generation : Is there a faster way of doing this ?
  • Next by thread: Re: Random Number Generation : Is there a faster way of doing this ?