Re: shuffling 10^8 numbers

*To*: mathgroup at smc.vnet.net*Subject*: [mg53307] Re: [mg53276] shuffling 10^8 numbers*From*: DrBob <drbob at bigfoot.com>*Date*: Wed, 5 Jan 2005 01:21:28 -0500 (EST)*References*: <200501040812.DAA27147@smc.vnet.net>*Reply-to*: drbob at bigfoot.com*Sender*: owner-wri-mathgroup at wolfram.com

If you're shuffling this deck, is that because you want to sample without replacement? You may be better off setting up a discrete probability function, rather than shuffling the deck at all... especially if you can stand to sample WITH replacement. Bobby On Tue, 4 Jan 2005 03:12:52 -0500 (EST), George Szpiro <george at netvision.net.il> wrote: > Hello: > > Daniel's suggestion is exactly what I was looking for. but there is a > problem: > > When I do the shuffle from the integers that are stored in the file GG.doc, > the time needed to do the shuffle increases by the SQUARE of the input size. > For 10,000 integers it took 3.3 seconds, for 100,000 integers it took 314 > seconds. So it would be impossible to do it for 10^8 numbers. > > Any suggestion? > > Why did Daniel's shuffle only take 900 seconds for 10^8 real numbers (see > below). > > Also: can I time intermediate progress, e.g., after every 1000 shuffles? > > Thanks, > George > > > Clear[SSS,aa]; > SSS=100000; > > aa=ReadList["c:GG.doc",Number,SSS]; > > SetAttributes[shuffleSet, HoldFirst]; > shuffleSet[LL_] := Module[{n=SSS, rand}, > Do[ > rand = Random[Integer, {j,n}]; > LL[[{j,rand}]] = LL[[{rand,j}]], > {j,n}]; > ] > > Timing[shuffleSet[aa]] > > <<Utilities`CleanSlate` > > OpenWrite["c:ShuffledGG.doc"]; > Put[OutputForm[aa],"c:ShuffledGG.doc"] > Close["c:ShuffledGG.doc"]; > > Null > {314.292 Second,Null} > > >> >> My prior reply I point out that the memory requirements for generating a >> permutation and keeping both list and shuffled list can be close to the >> maximum allowable, if the list consists of machine reals. I should also >> point out that a purely in-place shuffle can manage 10^8 machine reals. >> Below code will do this. It is slow because we do not want to copy the >> original list, and hence lose the ability to use Compile. >> >> SetAttributes[shuffleSet, HoldFirst]; >> shuffleSet[ll_] := Module[{n=Length[ll], rand}, >> Do[ >> rand = Random[Integer, {j,n}]; >> ll[[{j,rand}]] = ll[[{rand,j}]], >> {j, n}]; >> ] >> >> len = 10^8; >> ll = Table[Random[],{len}]; >> >> Timing[shuffleSet[ll]] >> Out[6]= {904. Second, Null} >> >> A drawback is that there may be a copy of the original list lying around >> even though we did the shuffling in place. Use of >> >> <<Utilities`CleanSlate` >> >> in the AddOns:ExtraPackages directory might help to prevent this. >> >> >> Daniel Lichtblau >> Wolfram Research >> >> >> >> >> > > > > > -- DrBob at bigfoot.com www.eclecticdreams.net

**References**:**shuffling 10^8 numbers***From:*George Szpiro <george@netvision.net.il>