Services & Resources / Wolfram Forums
-----
 /
MathGroup Archive
2004
*January
*February
*March
*April
*May
*June
*July
*August
*September
*October
*November
*December
*Archive Index
*Ask about this page
*Print this page
*Give us feedback
*Sign up for the Wolfram Insider

MathGroup Archive 2004

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: newbie is looking for a customDistribution function

  • To: mathgroup at smc.vnet.net
  • Subject: [mg50477] Re: newbie is looking for a customDistribution function
  • From: koopman at sfu.ca (Ray Koopman)
  • Date: Mon, 6 Sep 2004 03:59:34 -0400 (EDT)
  • References: <200409032122.i83LM3O2023123@rm-rstar.sfu.ca> <chbm9p$rpj$1@smc.vnet.net>
  • Sender: owner-wri-mathgroup at wolfram.com

DrBob <drbob at bigfoot.com> wrote in message news:<chbm9p$rpj$1 at smc.vnet.net>...
> I'm sure the Alias Method is brilliant, 
> but I'd like to see an implementation.

This sets up for generating by the Alias method.
p is the vector of probabilities, which must be > 0 and add to 1.

In[1]:= setal[p_] := Block[{n = Length[p],a,b,c,i,j,tol},
        a = Range[n]; b = n*p - 1; c = Table[1,{n}];
        tol = 2n*$MachineEpsilon (* or some such value *);
        While[{i,j} = Ordering[b][[{1,-1}]]; b[[j]] > b[[i]] + tol,
              a[[i]] = j; c[[i]] = 1 + b[[i]];
              b[[j]] += b[[i]]; b[[i]] = 0];
        {n,a,N[c]}]

n is the number of terms in p. 
a is the vector of aliases.
c is the vector of comparands that are used
to determine whether or not to use an alias.

This will get a random observation 1...n 
with the probabilities specified in p:

   (If[Random[] > c[[#]], a[[#]], #]&)[Random[Integer,{1,n}]]

This creates a random p and sets up to generate from it
by both the Alias and zero-order-interpolation methods.

In[2]:= p = #/Tr@#&[Table[Log[Random[]],{30}]]; {n,a,c} = setal[p];
        inverse = Interpolation[Transpose@{FoldList[Plus,0,p],
                        Range[0,n]}, InterpolationOrder -> 0];

This generates a million observations twice: 
first using the Alias method, then using zero-order interpolation.

In[4]:= samples = 1*^6; 
       {First@Timing[x = Table[(If[Random[] > c[[#]],
             a[[#]],#]&)[Random[Integer,{1,n}]],{samples}]], 
        Max@Abs[Count[x,#]&/@Range@n/samples-p]}
       {First@Timing[y = Table[inverse@Random[],{samples}]],
        Max@Abs[Count[y,#]&/@N@Range@n/samples-p]}

Out[5]= {2.25 Second, 0.000504687}
Out[6]= {6.35 Second, 0.000454016}

The Alias method slows some when n is large,
but only because Random[Integer,{1,n}] slows down.

In[7]:= p = #/Tr@#&[Table[Log[Random[]],{300}]]; {n,a,c} = setal[p];
        inverse = Interpolation[Transpose@{FoldList[Plus,0,p],
                        Range[0,n]}, InterpolationOrder -> 0];

In[9]:= samples = 1*^6; 
       {First@Timing[x = Table[(If[Random[] > c[[#]],
             a[[#]],#]&)[Random[Integer,{1,n}]],{samples}]], 
        Max@Abs[Count[x,#]&/@Range@n/samples-p]}
       {First@Timing[y = Table[inverse@Random[],{samples}]],
        Max@Abs[Count[y,#]&/@N@Range@n/samples-p]}

Out[10]= {3.62 Second, 0.000231153}
Out[11]= {6.4 Second, 0.000240492}

In both cases, the Alias times are overstated by ~.10 seconds;
see my Aug 28 post "Timing anomaly".


  • Prev by Date: Re: HoldPattern & Pattern Matching
  • Next by Date: ParametricPlot and legends
  • Previous by thread: Re: Re: newbie is looking for a customDistribution function
  • Next by thread: Re: Re: Re: Beware of NSolve - nastier example