Re: shuffling 10^8 numbers
- To: mathgroup at smc.vnet.net
- Subject: [mg53199] Re: [mg53180] shuffling 10^8 numbers
- From: DrBob <drbob at bigfoot.com>
- Date: Tue, 28 Dec 2004 23:12:20 -0500 (EST)
- References: <200412281130.GAA26970@smc.vnet.net>
- Reply-to: drbob at bigfoot.com
- Sender: owner-wri-mathgroup at wolfram.com
AA, OrigList, p, and ShuffledList each contain 10^8 numbers, so you're potentially using 4 times as much memory as you need. Table[aa] is just aa, so that's really a wasted statement, and a wasted copy of a very long list. (It's an UNDOCUMENTED FORM for Table, too!!)

Try this instead:

shuffledList = #[[RandomPermutation@Length@#]] &[ReadList["gg.txt"]];
Short[%, 7]

Here are a couple of trials where aa = Range[n] rather than coming from a file.

Quit
<<DiscreteMath`Combinatorica`
n=10^7;
Timing[aa=Range[n];origList=Table[aa];p=RandomPermutation[Length[origList]];\
shuffledList=origList[[p]];]
MemoryInUse[]

{12.203 Second,Null}
124950184

Quit
<<DiscreteMath`Combinatorica`
n=10^7;
Timing[shuffledList=Range[n][[RandomPermutation[n]]];]
MemoryInUse[]

{12.328 Second,Null}
44948736

N[124950184/44948736]

2.77984

So your code uses almost three times as much memory.

Sadly, however, BOTH codes run out of memory for n=10^8 on my machine.

Bobby

On Tue, 28 Dec 2004 06:30:22 -0500 (EST), George Szpiro <george at netvision.net.il> wrote:

> Hi,
>
> I am trying to shuffle 10^8 numbers stored in the file GG.doc in the root
> directory. (Size of GG.doc approx. 360 MB)
>
> According to previous suggestions from this group I try to shuffle them
> with the following program:
>
> GG=OpenRead["c:\GG.doc"];
> AA=ReadList[GG];
> Timing[
> OrigList=Table[AA];
> p=RandomPermutation@Length@OrigList;
> ShuffledList=OrigList[[p]];
>
>
> But the file is far too big. I can read it but then I get the following
> error message:
>
> <<No more memory available. Mathematica kernel has shut down. Try quitting
> other applications and then retry.>>
>
> No other programs are open, so I guess I am at the limit. Can anybody
> suggest a workaround? Is there a possibility to shuffle numbers without
> loading them all into memory simultaneously?
>
> NEW IDEA: I thought there might be a possibility of just reading one single
> number each time from the file GG.doc, and putting them into a randomly
> chosen slot in a new file.
>
> Any answers greatly appreciated to:
> george at netvision.net.il
>
> Thanks,
> George
>

--
DrBob at bigfoot.com
www.eclecticdreams.net
- References:
- shuffling 10^8 numbers
- From: George Szpiro <george@netvision.net.il>