MathGroup Archive 2004

Re: shuffling 10^8 numbers

  • To: mathgroup at smc.vnet.net
  • Subject: [mg53207] Re: [mg53180] shuffling 10^8 numbers
  • From: Daniel Lichtblau <danl at wolfram.com>
  • Date: Tue, 28 Dec 2004 23:12:45 -0500 (EST)
  • References: <200412281130.GAA26970@smc.vnet.net>
  • Sender: owner-wri-mathgroup at wolfram.com

George Szpiro wrote:
> Hi,
> 
> I am trying to shuffle 10^8 numbers stored in the file GG.doc in the root 
> directory. (Size of GG.doc: approx. 360 MB)
> 
> According to previous suggestions from this group, I try to shuffle them
> with the following program:
> 
> GG=OpenRead["c:\GG.doc"];
> AA=ReadList[GG];
>   Timing[
>   OrigList=Table[AA];
>   p=RandomPermutation@Length@OrigList;
>   ShuffledList=OrigList[[p]];
> 
> 
> But the file is far too big. I can read it but then I get the following 
> error message:
> 
> <<No more memory available. Mathematica kernel has shut down. Try quitting 
> other applications and then retry.>>
> 
> No other programs are open, so I guess I am at the limit. Can anybody 
> suggest a workaround? Is there a possibility to shuffle numbers without 
> loading them all into memory simultaneously?
> 
> NEW IDEA: I thought there might be a possibility of just reading one single 
> number each time from the file GG.doc, and putting them into a randomly 
> chosen slot in a new file.
> 
> Any answers greatly appreciated to:
> george at netvision.net.il
> 
> Thanks,
> George

On many systems the maximum memory Mathematica can use is about 2 GB.
One can certainly form a random permutation of size 10^8 while staying
inside this memory limit.

shuffleC = Compile[{{n, _Integer}}, Module[
   {res = Range[n], tmp, rand},
   (* in-place shuffle: swap slot j with a random slot in j..n *)
   Do[
     rand = Random[Integer, {j, n}];
     tmp = res[[j]];
     res[[j]] = res[[rand]];
     res[[rand]] = tmp,
     {j, 1, n}];
   res
   ]];

In[2]:= MaxMemoryUsed[]
Out[2]= 3310600

In[3]:= Timing[shuf8 = shuffleC[10^8];]
Out[3]= {114.35 Second, Null}

In[4]:= MaxMemoryUsed[]
Out[4]= 403282424

As expected, the permutation takes around 4*10^8 bytes (10^8 machine
integers at 4 bytes each).
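
A quick sanity check on a small case may be useful here. The sketch
below assumes shuffleC has been defined as above; the names perm, data
and shuffled are only illustrative. It checks that the result is a
permutation of 1..n and then applies it with Part, the same way the
original poster's OrigList[[p]] does.

```mathematica
(* Assumes shuffleC from the definition above has already been evaluated. *)
perm = shuffleC[10];

(* A valid permutation of 1..10 sorts back to Range[10]. *)
Sort[perm] === Range[10]

(* Apply the permutation to a small list of machine reals with Part. *)
data = Table[Random[], {10}];
shuffled = data[[perm]];

(* Shuffling rearranges the elements but does not change them. *)
Sort[shuffled] === Sort[data]
```

Both comparisons should give True.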

Now let's form an array of 10^8 random machine reals.

In[5]:= ll = Table[Random[],{10^8}];

Not surprisingly, this takes up another 8*10^8 bytes (8 bytes per
machine real).

In[6]:= MaxMemoryUsed[]
Out[6]= 1203284792

Forming the permuted array will require another 8*10^8 bytes for the
result, putting us right up against that limit.
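
The arithmetic behind that estimate can be written out directly. This
is only a rough tally under the assumptions used above (4 bytes per
machine integer, 8 bytes per machine real) plus the assumption that the
limit is 2^31 bytes; the variable names are illustrative and list
overhead is ignored.

```mathematica
n = 10^8;
permBytes = 4 n;      (* the permutation: machine integers, 4 bytes each assumed *)
dataBytes = 8 n;      (* the original list: machine reals, 8 bytes each *)
copyBytes = 8 n;      (* the permuted copy produced by Part *)

total = permBytes + dataBytes + copyBytes   (* 2000000000 *)

N[total/2^31]         (* fraction of an assumed 2^31-byte limit, about 0.93 *)
```

Under these assumptions the three arrays alone use roughly 93% of the
available address space, so anything else of significant size held in
the session would push the total over.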

It may be the case that you are on a system that can use more memory in 
Mathematica, or that your 10^8 elements do not take up as much storage 
as ll above (e.g. if there are many repeats, or if it is an array of 
machine integers). All the same, it appears that you are likely to be
near a memory limitation if, say, there is another large list lurking
somewhere.


Daniel Lichtblau
Wolfram Research



