MathGroup Archive 2001

[Date Index] [Thread Index] [Author Index]

Search the Archive

Summary: Random Sampling Without Replacement

  • To: mathgroup at
  • Subject: [mg26649] Summary: Random Sampling Without Replacement
  • From: Jorma Virtamo <Jorma.Virtamo at>
  • Date: Sat, 13 Jan 2001 22:36:23 -0500 (EST)
  • Sender: owner-wri-mathgroup at

"A. E. Siegman" wrote:
> Looking for neat compact way to extract three distinct (i.e., nonequal)
> randomly selected integers k1, k2, k3 from the range 1 to N (N > 3) --
> in other words, random sampling without replacement -- ???

Below is a speed comparisons between four proposed methods (one of them
being the use of a built-in function, pointed out by several people in 
the group). Three runs are made for each method, representing three draws 
without replacement (repeated 10000 times) from a set of numbers in a range 
from 1 to 10, 1 to 100, or 1 to 1000:

Method 1 (by Jorma Virtamo)

rnd[n_] := 
k2=Random[Integer,{1,n-1}]; If[k2>=k1,k2++];
k3=Random[Integer,{1,n-2}]; If[k3>=Min[k1,k2], k3++; If[k3>=Max[k1,k2], k3++]];

8.76632 Second

8.66632 Second

8.79965 Second

Method 2  (by Matt Johnson)

ran[list_List, num_Integer] := Module[{list1},
    list1 = list;
    Do[list1 = Drop[list1, {Random[Integer, {1, Length[list1]}]}], {num}];
    Complement[list, list1]]

7.89968 Second

10.8162 Second

51.4813 Second

Method 3 (built-in function)

<< DiscreteMath`Combinatorica`

15.3161 Second

15.3827 Second

16.766 Second

Method 4 (by Daniel Lichtblau)

SetAttributes[deal, HoldFirst];
deal[deck_, start_, end_] := Module[
        {len=Length[deck], rand},
        Do [
                rand = Random[Integer,{j,len}];
                deck[[{j,rand}]] = deck[[{rand,j}]],
        Take[deck, {start,end}]

deck = Range[10];
Do[deal[deck, 1, 3],{10000}]//Timing//First
10.2663 Second

deck = Range[100];
Do[deal[deck, 1, 3],{10000}]//Timing//First
10.1829 Second

deck = Range[1000];
Do[deal[deck, 1, 3],{10000}]//Timing//First
10.3496 Second


In general, method 1 is the fastest one, followed closely by method 4,
and then by method 3. Method 3 (the built-in function) takes about twice
the time in comparison with method 1. For all these three methods the 
running time is essentially independent of n (the range of values). 
Method 2 works well for small n, even beating method 1, but the running
time depends strongly on n.

Jorma Virtamo

  • Prev by Date: Re: Re: Sum of Squares
  • Next by Date: Re: Random Sampling Without Replacement?
  • Previous by thread: RE: RE: Fast Integration of a product of 2 interpolating function s
  • Next by thread: Finding what is in a package