MathGroup Archive 2005

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Re: easy question about random numbers

  • To: mathgroup at smc.vnet.net
  • Subject: [mg53547] Re: [mg53516] Re: easy question about random numbers
  • From: DrBob <drbob at bigfoot.com>
  • Date: Sun, 16 Jan 2005 22:24:04 -0500 (EST)
  • References: <crqb7f$cak$1@smc.vnet.net><200501110631.BAA00684@smc.vnet.net> <cs5bag$3t5$1@smc.vnet.net> <200501150644.BAA25355@smc.vnet.net>
  • Reply-to: drbob at bigfoot.com
  • Sender: owner-wri-mathgroup at wolfram.com

Compile is expected to speed up the alias method, of course. It sort of goes without saying!

When I timed the code, however, trials fairly often had 'alii' beating 'calii'--which is really counter-intuitive!--and neither was consistently faster than Mark Fisher's code. More often than not, in fact, Mark's code won by a little. (I'm not testing a wide range of distributions or sample sizes, admittedly.)

Mark's code requires a table lookup based on the PDF for every sample, and the alias method does not--but, on the other hand, the alias method always requries indexing into the 'c' array and frequently requires a second call to Random followed by indexing into the 'a' array. (Both require at least one call to Random.) The number of values in the PDF is the same as the length of 'a' or 'c', so comparing the algorithms theoretically is a wash, in my mind. Hence what matters is how well we take advantage of built-in routines. Table lookup on the PDF used to be slow, but now we have PieceWise, which seems well optimized for this application.

Now that I understand it, I prefer the alias method for its elegance, really--but the other solution is undeniably more straightforward.

Best to have both ideas in my bag of tricks, I suppose; for some distributions (with large numbers of discrete values, perhaps) the alias method may still be wildly superior. (Although both are so fast, it's hard to imagine a case where it would matter very much.)

Bobby

On Sat, 15 Jan 2005 01:44:18 -0500 (EST), Ray Koopman <koopman at sfu.ca> wrote:

> DrBob wrote:
>> Wow, PieceWise and Compile really do the job; that's quite a bit
>> faster than the alias method, even when ranfun is used to get only
>> one sample at a time. For a Histogram of 100000 samples: [...]
>
> The alias method, too, runs faster as compiled vector-generating code.
> (I apologize for the absence of indentation in what follows, but I'm
> posting via Google, which now removes all indentation.)
>
> In[1]:=
> <<Statistics`DiscreteDistributions`
> <<Graphics`Graphics`
>
> In[3]:=
> MakeDiscreteRandomFunction[probs_, vals_] := With[{fun = Block[{x},
> Function @@ {x, Piecewise@Transpose[{vals, x <= # & /@
> (Rest@FoldList[Plus, 0, probs])}]}]},
> Compile[{{n, _Integer}}, fun /@ Table[Random[], {n}]]]
> MakeDiscreteRandomFunction[probs_] :=
> MakeDiscreteRandomFunction[probs,Range[Length[probs]]]
>
> In[5]:=
> k = 7;
> dist = BinomialDistribution[k, 3/10];
> Timing[ranfun = MakeDiscreteRandomFunction@Table[PDF[dist,i],{i,0,k}];]
> Timing@Histogram@Flatten@Array[ranfun@1&,1*^5] (* one sample at a time
> *)
> Timing@Histogram@ranfun@1*^5 (* all the samples at once *)
>
> Out[7]= {0. Second, Null}
> Out[8]= {1.19 Second, -Graphics-}
> Out[9]= {0.29 Second, -Graphics-}
>
> In[10]:=
> setal[p_]:=Block[{n=Length[p],a,b,c,i,j,tol},
> a=Range[n];b=n*p-1;c=Table[1,{n}];tol=2*n*$MachineEpsilon;
> While[{i,j}=Ordering[b][[{1,-1}]];b[[j]]>b[[i]]+tol,
> a[[i]]=j;c[[i]]=1+b[[i]];b[[j]]+=b[[i]];b[[i]]=0]; {n,a,N[c]}]
> Clear@alias
> alias[___] := (-1+If[Random[]>c[[#]],a[[#]],#]&)[Random[Integer,{1,n}]]
> Clear@alii
> alii[samples_] := -1 + Table[(If[Random[] >
> c[[#]],a[[#]],#]&)[Random[Integer,{1,n}]],{samples}]
> Clear@calii
> calii = Compile[{{n,_Integer},{a,_Integer,1},{c,_Real,1},
> {samples,_Integer}}, -1 + Table[(If[Random[] >
> c[[#]],a[[#]],#]&)[Random[Integer,{1,n}]],{samples}]]
>
> In[17]:=
> Timing[{n, a, c} = setal@Table[PDF[dist, i], {i, 0, k}];]
> Timing@Histogram@Array[alias, {1*^5}]  (* one at a time *)
> Timing@Histogram@alii@1*^5           (*    all at once    *)
> Timing@Histogram@calii[n,a,c,1*^5] (* all at once, compiled *)
>
> Out[17]= {0. Second, Null}
> Out[18]= {1.74 Second, -Graphics-}
> Out[19]= {0.31 Second, -Graphics-}
> Out[20]= {0.27 Second, -Graphics-}
>
>
>
>



-- 
DrBob at bigfoot.com
www.eclecticdreams.net


  • Prev by Date: Re: Colored Surface
  • Next by Date: Re: NIntegrate with Minimize
  • Previous by thread: Re: easy question about random numbers
  • Next by thread: Re: easy question about random numbers