Summary: Random Sampling Without Replacement

*To*: mathgroup at smc.vnet.net*Subject*: [mg26649] Summary: Random Sampling Without Replacement*From*: Jorma Virtamo <Jorma.Virtamo at hut.fi>*Date*: Sat, 13 Jan 2001 22:36:23 -0500 (EST)*Sender*: owner-wri-mathgroup at wolfram.com

"A. E. Siegman" wrote: > Looking for neat compact way to extract three distinct (i.e., nonequal) > randomly selected integers k1, k2, k3 from the range 1 to N (N > 3) -- > in other words, random sampling without replacement -- ??? Below is a speed comparisons between four proposed methods (one of them being the use of a built-in function, pointed out by several people in the group). Three runs are made for each method, representing three draws without replacement (repeated 10000 times) from a set of numbers in a range from 1 to 10, 1 to 100, or 1 to 1000: Method 1 (by Jorma Virtamo) -------- rnd[n_] := Module[{k1,k2,k3}, k1=Random[Integer,{1,n}]; k2=Random[Integer,{1,n-1}]; If[k2>=k1,k2++]; k3=Random[Integer,{1,n-2}]; If[k3>=Min[k1,k2], k3++; If[k3>=Max[k1,k2], k3++]]; {k1,k2,k3}] Do[rnd[10],{10000}]//Timing//First 8.76632 Second Do[rnd[100],{10000}]//Timing//First 8.66632 Second Do[rnd[1000],{10000}]//Timing//First 8.79965 Second Method 2 (by Matt Johnson) -------- ran[list_List, num_Integer] := Module[{list1}, list1 = list; Do[list1 = Drop[list1, {Random[Integer, {1, Length[list1]}]}], {num}]; Complement[list, list1]] Do[ran[Range[10],3],{10000}]//Timing//First 7.89968 Second Do[ran[Range[100],3],{10000}]//Timing//First 10.8162 Second Do[ran[Range[1000],3],{10000}]//Timing//First 51.4813 Second Method 3 (built-in function) -------- << DiscreteMath`Combinatorica` Do[RandomKSubset[Range[10],3],{10000}]//Timing//First 15.3161 Second Do[RandomKSubset[Range[100],3],{10000}]//Timing//First 15.3827 Second Do[RandomKSubset[Range[1000],3],{10000}]//Timing//First 16.766 Second Method 4 (by Daniel Lichtblau) -------- SetAttributes[deal, HoldFirst]; deal[deck_, start_, end_] := Module[ {len=Length[deck], rand}, Do [ rand = Random[Integer,{j,len}]; deck[[{j,rand}]] = deck[[{rand,j}]], {j,start,end}]; Take[deck, {start,end}] ] deck = Range[10]; Do[deal[deck, 1, 3],{10000}]//Timing//First 10.2663 Second deck = Range[100]; Do[deal[deck, 1, 3],{10000}]//Timing//First 10.1829 Second deck = Range[1000]; Do[deal[deck, 1, 3],{10000}]//Timing//First 10.3496 Second Summary ------- In general, method 1 is the fastest one, followed closely by method 4, and then by method 3. Method 3 (the built-in function) takes about twice the time in comparison with method 1. For all these three methods the running time is essentially independent of n (the range of values). Method 2 works well for small n, even beating method 1, but the running time depends strongly on n. Jorma Virtamo