Re: shuffling 10^8 numbers
- To: mathgroup at smc.vnet.net
- Subject: [mg53307] Re: [mg53276] shuffling 10^8 numbers
- From: DrBob <drbob at bigfoot.com>
- Date: Wed, 5 Jan 2005 01:21:28 -0500 (EST)
- References: <200501040812.DAA27147@smc.vnet.net>
- Reply-to: drbob at bigfoot.com
- Sender: owner-wri-mathgroup at wolfram.com
If you're shuffling this deck, is that because you want to sample without replacement?
You may be better off setting up a discrete probability function, rather than shuffling the deck at all... especially if you can stand to sample WITH replacement.
Bobby
On Tue, 4 Jan 2005 03:12:52 -0500 (EST), George Szpiro <george at netvision.net.il> wrote:
> Hello:
>
> Daniel's suggestion is exactly what I was looking for. but there is a
> problem:
>
> When I do the shuffle from the integers that are stored in the file GG.doc,
> the time needed to do the shuffle increases by the SQUARE of the input size.
> For 10,000 integers it took 3.3 seconds, for 100,000 integers it took 314
> seconds. So it would be impossible to do it for 10^8 numbers.
>
> Any suggestion?
>
> Why did Daniel's shuffle only take 900 seconds for 10^8 real numbers (see
> below).
>
> Also: can I time intermediate progress, e.g., after every 1000 shuffles?
>
> Thanks,
> George
>
>
> Clear[SSS,aa];
> SSS=100000;
>
> aa=ReadList["c:GG.doc",Number,SSS];
>
> SetAttributes[shuffleSet, HoldFirst];
> shuffleSet[LL_] := Module[{n=SSS, rand},
> Do[
> rand = Random[Integer, {j,n}];
> LL[[{j,rand}]] = LL[[{rand,j}]],
> {j,n}];
> ]
>
> Timing[shuffleSet[aa]]
>
> <<Utilities`CleanSlate`
>
> OpenWrite["c:ShuffledGG.doc"];
> Put[OutputForm[aa],"c:ShuffledGG.doc"]
> Close["c:ShuffledGG.doc"];
>
> Null
> {314.292 Second,Null}
>
>
>>
>> My prior reply I point out that the memory requirements for generating a
>> permutation and keeping both list and shuffled list can be close to the
>> maximum allowable, if the list consists of machine reals. I should also
>> point out that a purely in-place shuffle can manage 10^8 machine reals.
>> Below code will do this. It is slow because we do not want to copy the
>> original list, and hence lose the ability to use Compile.
>>
>> SetAttributes[shuffleSet, HoldFirst];
>> shuffleSet[ll_] := Module[{n=Length[ll], rand},
>> Do[
>> rand = Random[Integer, {j,n}];
>> ll[[{j,rand}]] = ll[[{rand,j}]],
>> {j, n}];
>> ]
>>
>> len = 10^8;
>> ll = Table[Random[],{len}];
>>
>> Timing[shuffleSet[ll]]
>> Out[6]= {904. Second, Null}
>>
>> A drawback is that there may be a copy of the original list lying around
>> even though we did the shuffling in place. Use of
>>
>> <<Utilities`CleanSlate`
>>
>> in the AddOns:ExtraPackages directory might help to prevent this.
>>
>>
>> Daniel Lichtblau
>> Wolfram Research
>>
>>
>>
>>
>>
>
>
>
>
>
--
DrBob at bigfoot.com
www.eclecticdreams.net
- References:
- shuffling 10^8 numbers
- From: George Szpiro <george@netvision.net.il>
- shuffling 10^8 numbers