MathGroup Archive 2004

Re: shuffling 10^8 numbers

  • To: mathgroup at smc.vnet.net
  • Subject: [mg53199] Re: [mg53180] shuffling 10^8 numbers
  • From: DrBob <drbob at bigfoot.com>
  • Date: Tue, 28 Dec 2004 23:12:20 -0500 (EST)
  • References: <200412281130.GAA26970@smc.vnet.net>
  • Reply-to: drbob at bigfoot.com
  • Sender: owner-wri-mathgroup at wolfram.com

AA, OrigList, p, and ShuffledList each contain 10^8 numbers, so you're potentially using four times as much memory as you need. Table[AA] is just AA, so that's really a wasted statement, and a wasted copy of a very long list. (It's an UNDOCUMENTED form of Table, too!)

Try this instead:

<<DiscreteMath`Combinatorica`
shuffledList = #[[RandomPermutation@Length@#]] &[ReadList["gg.txt"]];
Short[%, 7]

Here are a couple of trials where aa = Range[n] rather than coming from a file.

Quit

<<DiscreteMath`Combinatorica`
n=10^7;
Timing[aa=Range[n];origList=Table[aa];p=RandomPermutation[Length[origList]];\
shuffledList=origList[[p]];]
MemoryInUse[]

{12.203 Second,Null}

124950184

Quit

<<DiscreteMath`Combinatorica`
n=10^7;
Timing[shuffledList=Range[n][[RandomPermutation[n]]];]
MemoryInUse[]

{12.328 Second,Null}

44948736

N[124950184/44948736]

2.77984

So your code uses almost three times as much memory.
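
As a rough check on those two MemoryInUse[] figures, the arithmetic can be sketched as below. The 4 bytes per element is an assumption (a 32-bit kernel storing machine integers in a packed array), not something measured in this thread, and bytesPerInt, oneList and extra are names made up for the illustration.

```mathematica
(* Assumption: 32-bit kernel, packed machine integers at 4 bytes each *)
bytesPerInt = 4;
n = 10^7;

(* Approximate size of one list of n integers, in bytes: 40000000 *)
oneList = bytesPerInt*n

(* Difference between the two MemoryInUse[] results above: 80001448 *)
extra = 124950184 - 44948736

(* Number of extra lists the first version keeps alive; comes out close to 2 *)
N[extra/oneList]
```

Under that assumption, the first version holds about two more lists than the second, and each list would grow to roughly 400 MB at n = 10^8.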

Sadly, however, BOTH codes run out of memory for n=10^8 on my machine.
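
One way to lower the peak further might be to shuffle the list in place, so that neither a permutation vector nor a second output list is ever built. The following is only a sketch of a Fisher-Yates shuffle, not something timed at n = 10^8; shuffleInPlace is a hypothetical name, and an uncompiled Do loop of this length may be slow.

```mathematica
(* Hypothetical helper: hold the argument so the list is modified in place *)
SetAttributes[shuffleInPlace, HoldFirst];

shuffleInPlace[v_Symbol] :=
  Module[{j, tmp},
    Do[
      (* pick a position in 1..i; RandomInteger[{1, i}] in later versions *)
      j = Random[Integer, {1, i}];
      (* swap elements i and j *)
      tmp = v[[i]]; v[[i]] = v[[j]]; v[[j]] = tmp,
      {i, Length[v], 2, -1}]]

(* Small usage example *)
aa = Range[10];
shuffleInPlace[aa];
aa
```

The argument has to be a symbol that holds the list, since the swaps are done by Part assignment on that symbol.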

Bobby

On Tue, 28 Dec 2004 06:30:22 -0500 (EST), George Szpiro <george at netvision.net.il> wrote:

> Hi,
>
> I am trying to shuffle 10^8 numbers stored in the file GG.doc in the root
> directory. (The size of GG.doc is approximately 360 MB.)
>
> According to previous suggestions from this group, I try to shuffle them
> with the following program:
>
> GG=OpenRead["c:\\GG.doc"];
> AA=ReadList[GG];
>   Timing[
>   OrigList=Table[AA];
>   p=RandomPermutation@Length@OrigList;
>   ShuffledList=OrigList[[p]];]
>
>
> But the file is far too big. I can read it but then I get the following
> error message:
>
> <<No more memory available. Mathematica kernel has shut down. Try quitting
> other applications and then retry.>>
>
> No other programs are open, so I guess I am at the limit. Can anybody
> suggest a workaround? Is there a possibility to shuffle numbers without
> loading them all into memory simultaneously?
>
> NEW IDEA: I thought there might be a possibility of just reading one single
> number each time from the file GG.doc, and putting them into a randomly
> chosen slot in a new file.
>
> Any answers would be greatly appreciated, to:
> george at netvision.net.il
>
> Thanks,
> George
>
>
>
>
>
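
On the question of shuffling without holding everything in memory at once, one possible approach is a two-pass shuffle through temporary files: scatter the numbers at random into k bucket files, then shuffle each bucket in memory and append it to the output. The sketch below is untested at this scale and will probably be slow for 10^8 numbers; externalShuffle, the ".part" file names and the bucket count k are all hypothetical, and it assumes the file holds plain integers or simple decimals that Read[..., Number] can parse.

```mathematica
(* Hypothetical helper: shuffle the numbers in file "in" into file "out",
   holding only about 1/k of them in memory at any time *)
externalShuffle[in_String, out_String, k_Integer] :=
  Module[{names, parts, src, dst, x, chunk, j, t},
    names = Table[in <> ".part" <> ToString[b], {b, k}];
    parts = OpenWrite /@ names;
    src = OpenRead[in];
    (* Pass 1: send each number to a uniformly chosen bucket file *)
    While[(x = Read[src, Number]) =!= EndOfFile,
      Write[parts[[Random[Integer, {1, k}]]], x]];
    Close[src];
    Scan[Close, parts];
    (* Pass 2: shuffle each bucket in memory and append it to the output *)
    dst = OpenWrite[out];
    Do[
      chunk = ReadList[names[[b]], Number];
      Do[
        j = Random[Integer, {1, i}];
        t = chunk[[i]]; chunk[[i]] = chunk[[j]]; chunk[[j]] = t,
        {i, Length[chunk], 2, -1}];
      Scan[Write[dst, #] &, chunk];
      DeleteFile[names[[b]]],
      {b, k}];
    Close[dst];]

(* Usage, with made-up file names and bucket count *)
(* externalShuffle["gg.txt", "gg-shuffled.txt", 20] *)
```

Because every number picks its bucket independently and each bucket is then shuffled uniformly, the concatenated output should be a uniformly random ordering, and k can be raised until a single bucket fits comfortably in memory.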



-- 
DrBob at bigfoot.com
www.eclecticdreams.net

