Summary: Random Sampling Without Replacement
- To: mathgroup at smc.vnet.net
- Subject: [mg26649] Summary: Random Sampling Without Replacement
- From: Jorma Virtamo <Jorma.Virtamo at hut.fi>
- Date: Sat, 13 Jan 2001 22:36:23 -0500 (EST)
- Sender: owner-wri-mathgroup at wolfram.com
"A. E. Siegman" wrote:
> Looking for neat compact way to extract three distinct (i.e., nonequal)
> randomly selected integers k1, k2, k3 from the range 1 to N (N > 3) --
> in other words, random sampling without replacement -- ???
Below is a speed comparisons between four proposed methods (one of them
being the use of a built-in function, pointed out by several people in
the group). Three runs are made for each method, representing three draws
without replacement (repeated 10000 times) from a set of numbers in a range
from 1 to 10, 1 to 100, or 1 to 1000:
Method 1 (by Jorma Virtamo)
--------
rnd[n_] :=
Module[{k1,k2,k3},
k1=Random[Integer,{1,n}];
k2=Random[Integer,{1,n-1}]; If[k2>=k1,k2++];
k3=Random[Integer,{1,n-2}]; If[k3>=Min[k1,k2], k3++; If[k3>=Max[k1,k2], k3++]];
{k1,k2,k3}]
Do[rnd[10],{10000}]//Timing//First
8.76632 Second
Do[rnd[100],{10000}]//Timing//First
8.66632 Second
Do[rnd[1000],{10000}]//Timing//First
8.79965 Second
Method 2 (by Matt Johnson)
--------
ran[list_List, num_Integer] := Module[{list1},
list1 = list;
Do[list1 = Drop[list1, {Random[Integer, {1, Length[list1]}]}], {num}];
Complement[list, list1]]
Do[ran[Range[10],3],{10000}]//Timing//First
7.89968 Second
Do[ran[Range[100],3],{10000}]//Timing//First
10.8162 Second
Do[ran[Range[1000],3],{10000}]//Timing//First
51.4813 Second
Method 3 (built-in function)
--------
<< DiscreteMath`Combinatorica`
Do[RandomKSubset[Range[10],3],{10000}]//Timing//First
15.3161 Second
Do[RandomKSubset[Range[100],3],{10000}]//Timing//First
15.3827 Second
Do[RandomKSubset[Range[1000],3],{10000}]//Timing//First
16.766 Second
Method 4 (by Daniel Lichtblau)
--------
SetAttributes[deal, HoldFirst];
deal[deck_, start_, end_] := Module[
{len=Length[deck], rand},
Do [
rand = Random[Integer,{j,len}];
deck[[{j,rand}]] = deck[[{rand,j}]],
{j,start,end}];
Take[deck, {start,end}]
]
deck = Range[10];
Do[deal[deck, 1, 3],{10000}]//Timing//First
10.2663 Second
deck = Range[100];
Do[deal[deck, 1, 3],{10000}]//Timing//First
10.1829 Second
deck = Range[1000];
Do[deal[deck, 1, 3],{10000}]//Timing//First
10.3496 Second
Summary
-------
In general, method 1 is the fastest one, followed closely by method 4,
and then by method 3. Method 3 (the built-in function) takes about twice
the time in comparison with method 1. For all these three methods the
running time is essentially independent of n (the range of values).
Method 2 works well for small n, even beating method 1, but the running
time depends strongly on n.
Jorma Virtamo