Re: newbie is looking for a customDistribution function
- To: mathgroup at smc.vnet.net
- Subject: [mg50477] Re: newbie is looking for a customDistribution function
- From: koopman at sfu.ca (Ray Koopman)
- Date: Mon, 6 Sep 2004 03:59:34 -0400 (EDT)
- References: <200409032122.i83LM3O2023123@rm-rstar.sfu.ca> <chbm9p$rpj$1@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
DrBob <drbob at bigfoot.com> wrote in message news:<chbm9p$rpj$1 at smc.vnet.net>... > I'm sure the Alias Method is brilliant, > but I'd like to see an implementation. This sets up for generating by the Alias method. p is the vector of probabilities, which must be > 0 and add to 1. In[1]:= setal[p_] := Block[{n = Length[p],a,b,c,i,j,tol}, a = Range[n]; b = n*p - 1; c = Table[1,{n}]; tol = 2n*$MachineEpsilon (* or some such value *); While[{i,j} = Ordering[b][[{1,-1}]]; b[[j]] > b[[i]] + tol, a[[i]] = j; c[[i]] = 1 + b[[i]]; b[[j]] += b[[i]]; b[[i]] = 0]; {n,a,N[c]}] n is the number of terms in p. a is the vector of aliases. c is the vector of comparands that are used to determine whether or not to use an alias. This will get a random observation 1...n with the probabilities specified in p: (If[Random[] > c[[#]], a[[#]], #]&)[Random[Integer,{1,n}]] This creates a random p and sets up to generate from it by both the Alias and zero-order-interpolation methods. In[2]:= p = #/Tr@#&[Table[Log[Random[]],{30}]]; {n,a,c} = setal[p]; inverse = Interpolation[Transpose@{FoldList[Plus,0,p], Range[0,n]}, InterpolationOrder -> 0]; This generates a million observations twice: first using the Alias method, then using zero-order interpolation. In[4]:= samples = 1*^6; {First@Timing[x = Table[(If[Random[] > c[[#]], a[[#]],#]&)[Random[Integer,{1,n}]],{samples}]], Max@Abs[Count[x,#]&/@Range@n/samples-p]} {First@Timing[y = Table[inverse@Random[],{samples}]], Max@Abs[Count[y,#]&/@N@Range@n/samples-p]} Out[5]= {2.25 Second, 0.000504687} Out[6]= {6.35 Second, 0.000454016} The Alias method slows some when n is large, but only because Random[Integer,{1,n}] slows down. In[7]:= p = #/Tr@#&[Table[Log[Random[]],{300}]]; {n,a,c} = setal[p]; inverse = Interpolation[Transpose@{FoldList[Plus,0,p], Range[0,n]}, InterpolationOrder -> 0]; In[9]:= samples = 1*^6; {First@Timing[x = Table[(If[Random[] > c[[#]], a[[#]],#]&)[Random[Integer,{1,n}]],{samples}]], Max@Abs[Count[x,#]&/@Range@n/samples-p]} {First@Timing[y = Table[inverse@Random[],{samples}]], Max@Abs[Count[y,#]&/@N@Range@n/samples-p]} Out[10]= {3.62 Second, 0.000231153} Out[11]= {6.4 Second, 0.000240492} In both cases, the Alias times are overstated by ~.10 seconds; see my Aug 28 post "Timing anomaly".