Re: Random Number Generation : Is there a faster way of doing this ?
- To: mathgroup at smc.vnet.net
- Subject: [mg43210] Re: [mg43190] Random Number Generation : Is there a faster way of doing this ?
- From: Daniel Lichtblau <danl at wolfram.com>
- Date: Tue, 19 Aug 2003 07:53:14 -0400 (EDT)
- References: <200308170828.EAA04316@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
"Christos Argyropoulos M.D." wrote: > > Hi, > > My problem is that I want to generate a large number of samples from Poisson > distributions. Specifically the user has to supply a list of the parameters > for a number of Poisson distributions and then generate a number of such > samples. In the function below poissonparams is a list of the parameters of > the Poisson distributions, and size is the number of samples. In my > application (Analysis of counts of digital gene expression profiles), I > supply a list of 10000 Poisson Parameters and have to generate a big sample > of numbers drawn from the 10000 distributions. Typically > 1k such samples > are required in order to establish a Log Likelihood statistic cutoff down > the road. > For the application at hand the poisson parameter can be zero (in this case > the only point I can sample is the number zero) that is way I do not Map . > My implementation shown below is ok, except for a small problem; it takes > too long. In order to generate 5000 samples from each of the 10000 > distributions (this had to be repeated thrice, since I had to analyse 3 > profiles) i.e. total 150 million random numbers my system took close to 13 > hours. The required overhead to periodically save the data to the HD is a > very small fraction of the total processing time and given limitations of > available RAM cannot be improved. > The question is, is there a faster way of doing this in Mathematica or > should I be prepared to move outside the kernel for the RNG part? > > I am using Mathematica 4.2 on a P4 at 2.2 GHz with 256 DDR RAM running > Windows XP > > Simulate[poissonparams_List,size_Integer]:= > Module[{}, > fnd[x_]:= > Map[If[#\[Equal]0,0,Random[PoissonDistribution[#]]]&, > poissonparams]; > Array[fnd,{size}] > ] > > example of using the code : > > In[40]:= > Simulate[{1,2,3,0.5,0},20] > > Out[40]= > {{1,1,3,1,0},{1,1,3,1,0},{0,3,1,0,0},{3,4,5,0,0},{2,0,3,0,0},{4,4,3,0,0},{0, > 2, > > 5,0,0},{2,1,2,1,0},{2,1,1,0,0},{2,2,2,2,0},{0,2,4,0,0},{0,2,1,0,0},{2,0,2, > > 1,0},{2,4,1,0,0},{0,2,6,2,0},{1,0,4,3,0},{0,2,1,0,0},{0,0,5,0,0},{1,0,3,0, > 0},{1,1,3,0,0}} > > I.e. generates 20 samples from the 5 Poisson Distributions with Parameter > lamda 1, 2, 3, 0.5, and zero > > Thanks In advance, > > Christos Argyropoulos I think something along the lines below may help. The underlying assumptions are as follows. (1) You want to generate random arrays, of reasonable length, with respect to given values of the Poisson parameter 'mu'. (2) Either you have "reasonable" values for 'mu' or else you know what will suffice to modify the code below. (3) You have sufficient expertise (since I do not) to figure out whether I made any mistakes below, and correct them if so. I did a few things. First, after looking at the code in Statistics`DiscreteDistributions.m, I decided that a slow step was incessant repetition of numeric evaluation of CDF values. Next I decided that provided we can fix in advance a maximum set of such required for a given 'mu' (I use 32 below), then we can use Compile to advantage. Below is code to do this. You will notice that it is similar to code in the above mentioned package but I have stripped out a few things such as parameter value checking. Hence it may have bad consequences if used with inappropriate values. myPoissonRandomArrayC = Compile[{{mu,_Real},{len,_Integer}}, Module[ {tabl, high, low, mid, res, q}, If[mu==0, Return[Table[0,{len}]]]; tabl = Table[GammaRegularized[1.+j,mu],{j,32}]; res = Table[ q = Random[]; If [Exp[-mu] < q, high = 1; While[tabl[[high]] < q, high *= 2]; low = Round[high/2]; While[high - low > 1, mid = Round[(high+low)/2]; If[tabl[[mid]] < q, low = mid, high = mid ] ]; high, 0 ], {len} ]; res ]]; To run an example similar to yours I need to find lists of random values for each given 'mu', then transpose. simulate2[pp_List, size_Integer] := Transpose[Map[myPoissonRandomArrayC[#, size]&, pp]] In[137]:= Timing[ss2 = simulate2[{1,2,3,0.5,0.}, 2^12];] Out[137]= {0.06 Second, Null} To compare timings: simulate[poissonparams_List,size_Integer] := Module[ {}, fnd[x_]:= Map[If[#==0,0,Random[PoissonDistribution[#]]]&, poissonparams]; Array[fnd,{size}] ] In[140]:= Timing[ss = simulate[{1,2,3,0.5,0.001}, 2^12];] Out[140]= {7.71 Second, Null} This roughly corresponds to your 13 hour result for generating 150 million, given that I am working on a 1.5 GHz machine. So using Compile and a table to store CDF values we can obtain a bit over two orders of magnitude improvement in speed. Daniel Lichtblau Wolfram Research
- References:
- Random Number Generation : Is there a faster way of doing this ?
- From: "Christos Argyropoulos M.D." <argchris@otenet.gr>
- Random Number Generation : Is there a faster way of doing this ?