MathGroup Archive 2000

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Optimizing a simple sampling function for speed

  • To: mathgroup at smc.vnet.net
  • Subject: [mg25387] Re: [mg25343] Optimizing a simple sampling function for speed
  • From: "Rosa Ma. Seco" <ceie at prodigy.net.mx>
  • Date: Fri, 29 Sep 2000 01:06:18 -0400 (EDT)
  • Sender: owner-wri-mathgroup at wolfram.com

It seems to me that your dist has only 100 elements?

In[1]:=
dist = Table[Random[Integer, {1, 100}], {100}];

In[2]:=
Length[dist]
Out[2]=
100

Anyway, one possibility for sampling without replacement is using the
function RandomKSubset from the add-on DiscreteMath`Combinatorica`:

In[3]:=
<<DiscreteMath`Combinatorica`

Take dist2 now with 5000 elements:
In[4]:=
dist2=
Table[Random[Integer,{1,10000}],{5000}];

Take a sample of 1000 without replacement:

In[5]:=
RandomKSubset[dist2,1000];//Timing
Out[5]=
{1.38 Second,Null}

I'm sure this can be improved upon, but I'd like to have a clearer picture
of the problem.

Tomas Garza
Mexico City

Kimberly N. Russell [mailto:norris at cerc.columbia.edu] wrote:

> In a lot of my work I have need for a function that draws a
> sample of size n
> without replacement from a set of objects in bins, where the
> distribution is
> given as a simple list. This function is used in simulations, so speed and
> efficiency are important. Here is my attempt at doing this in Mathematica.
>
> First, define a function that removes one number at random from a list
> without replacement.
>
> In[1]:=
> removeone[l_] := Module[{chosenobs, chosencat},
>     chosenobs = Random[]*Fold[Plus, 0, l];
>     chosencat =
>       Extract[Position[Drop[FoldList[Plus, 0, l], 1],
>           x_ /; x ? chosenobs], {1, 1}];
>     ReplacePart[l, l[[chosencat]] - 1, chosencat]
>     ]
>
> Nesting this allows a sample of size n to be removed, and subtracting that
> from the original gives you the sample itself.
>
> In[2]:=
> randsampcat[l_, n_] := l - Nest[removeone, l, n]
>
> So, create a distribution with approx 5000 objects in 1000 bins...
>
> In[10]:=
> dist = Table[Random[Integer, {1, 100}], {100}];
>
> ...and sample 1000 of them.
>
> In[9]:=
> randsampcat[dist, 1000]; // Timing
>
> Out[9]=
> {3.96667 Second, Null}
>
> I am sure that this can be improved upon. Can any of you
> Mathematica wizards
> show me how? I tried compiling removeone, but got the following
> errors which
> I don't really understand.



  • Prev by Date: Finding all roots of a group of equations
  • Next by Date: LegendreP & Gauss quad bug
  • Previous by thread: Optimizing a simple sampling function for speed
  • Next by thread: Fermat Surfaces