MathGroup Archive 2005

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: shuffling 10^8 numbers

  • To: mathgroup at smc.vnet.net
  • Subject: [mg53307] Re: [mg53276] shuffling 10^8 numbers
  • From: DrBob <drbob at bigfoot.com>
  • Date: Wed, 5 Jan 2005 01:21:28 -0500 (EST)
  • References: <200501040812.DAA27147@smc.vnet.net>
  • Reply-to: drbob at bigfoot.com
  • Sender: owner-wri-mathgroup at wolfram.com

If you're shuffling this deck, is that because you want to sample without replacement?

You may be better off setting up a discrete probability function, rather than shuffling the deck at all... especially if you can stand to sample WITH replacement.

Bobby

On Tue, 4 Jan 2005 03:12:52 -0500 (EST), George Szpiro <george at netvision.net.il> wrote:

> Hello:
>
> Daniel's suggestion is exactly what I was looking for. but there is a
> problem:
>
> When I do the shuffle from the integers that are stored in the file GG.doc,
> the time needed to do the shuffle increases by the SQUARE of the input size.
> For 10,000 integers it took 3.3 seconds, for 100,000 integers it took 314
> seconds. So it would be impossible to do it for 10^8 numbers.
>
> Any suggestion?
>
> Why did Daniel's shuffle only take 900 seconds for 10^8 real numbers (see
> below).
>
> Also: can I time intermediate progress, e.g., after every 1000 shuffles?
>
> Thanks,
> George
>
>
> Clear[SSS,aa];
> SSS=100000;
>
> aa=ReadList["c:GG.doc",Number,SSS];
>
> SetAttributes[shuffleSet, HoldFirst];
> shuffleSet[LL_] := Module[{n=SSS, rand},
>     Do[
>         rand = Random[Integer, {j,n}];
>         LL[[{j,rand}]] = LL[[{rand,j}]],
>         {j,n}];
>     ]
>
> Timing[shuffleSet[aa]]
>
> <<Utilities`CleanSlate`
>
>  OpenWrite["c:ShuffledGG.doc"];
> Put[OutputForm[aa],"c:ShuffledGG.doc"]
> Close["c:ShuffledGG.doc"];
>
> Null
> {314.292 Second,Null}
>
>
>>
>> My prior reply I point out that the memory requirements for generating a
>> permutation and keeping both list and shuffled list can be close to the
>> maximum allowable, if the list consists of machine reals. I should also
>> point out that a purely in-place shuffle can manage 10^8 machine reals.
>> Below code will do this. It is slow because we do not want to copy the
>> original list, and hence lose the ability to use Compile.
>>
>> SetAttributes[shuffleSet, HoldFirst];
>> shuffleSet[ll_] := Module[{n=Length[ll], rand},
>>   Do[
>>     rand = Random[Integer, {j,n}];
>>     ll[[{j,rand}]] = ll[[{rand,j}]],
>>     {j, n}];
>>   ]
>>
>> len = 10^8;
>> ll = Table[Random[],{len}];
>>
>> Timing[shuffleSet[ll]]
>> Out[6]= {904. Second, Null}
>>
>> A drawback is that there may be a copy of the original list lying around
>> even though we did the shuffling in place. Use of
>>
>> <<Utilities`CleanSlate`
>>
>> in the AddOns:ExtraPackages directory might help to prevent this.
>>
>>
>> Daniel Lichtblau
>> Wolfram Research
>>
>>
>>
>>
>>
>
>
>
>
>



-- 
DrBob at bigfoot.com
www.eclecticdreams.net


  • Prev by Date: Re: Shuffling 10^8 numbers: memory problems
  • Next by Date: Re: Spherical density visualization
  • Previous by thread: shuffling 10^8 numbers
  • Next by thread: LaplaceTransform TraditionalForm symbol