MathGroup Archive 2010

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Just some thousands of combinations...

  • To: mathgroup at smc.vnet.net
  • Subject: [mg109965] Re: Just some thousands of combinations...
  • From: Daniel Lichtblau <danl at wolfram.com>
  • Date: Wed, 26 May 2010 07:11:18 -0400 (EDT)

Scott Hemphill wrote:
> Patrick Scheibe <pscheibe at trm.uni-leipzig.de> writes:
> 
>> Hi,
>>
>> data = RandomSample[Subsets[Range[90], {5}], 3000];
>>
>> works fine here: "7.0 for Linux x86 (64-bit) (February 18, 2009)"
> 
> Yes, I have the same version.  After execution, MathKernel was using
> 3505MB of virtual memory.
> 
> Scott
> 
>> On Sun, 2010-05-23 at 03:17 -0400, Dr. Bruno Campanini wrote:
>>> Subsets[Range[90],{5}]  should print all the
>>> Binomial[90,5] = 43 949 268 combinations.
>>>
>>> On my PC (4gb RAM) Mathematica 7.0 claims
>>> "Not enough memory".
>>>
>>> By the way, I only need to print some thousands
>>> of these combinations, then Subsets[Range[90],{5}, 3000]
>>> works fine.
>>> But it prints the first 3000 combination in lexicographic
>>> order. How to get 3000 random out of 43 949 268, not
>>> ordered?
>>>
>>> Bruno 

I tried to reply to the original but that seems to be lost in the 
aether. Anyway, one can avoid the memory hit by picking a random sample 
of integers up to the size bound (number of subsets), then extracting 
just those subsets.

In[143]:= m = 10;
size = Binomial[90, 5];
randints = RandomInteger[size, m];
sample = Map[Subsets[Range[90], {5}, {#}] &, randints]

Out[145]= {{{16, 17, 20, 23, 84}}, {{8, 11, 13, 22, 28}}, {{6, 23, 34,
     49, 50}}, {{16, 21, 35, 42, 90}}, {{4, 17, 53, 55, 78}}, {{22, 48,
     56, 60, 76}}, {{20, 27, 39, 53, 87}}, {{1, 50, 55, 63, 72}}, {{1,
    14, 18, 38, 57}}, {{10, 14, 24, 36, 74}}}

If you require exactly m subsets, with no repeats, then you would want 
to either check that there were no duplicates, or perhaps generate one 
at a time and discard any duplicates that appear. A Birthday Paradox 
type of computation indicates that for a sample of 3000, the probability 
of no duplicates appearing is around 90%.

In[168]:= m = 3000;
prob = Product[1 - k/size, {k, m}];

In[169]:= N[prob]
Out[169]= 0.902644

Daniel Lichtblau
Wolfram Research


  • Prev by Date: Re: Dot product confusion
  • Next by Date: Some collected information about Dynamic
  • Previous by thread: Re: Just some thousands of combinations...
  • Next by thread: Dot product confusion