RE: Large-scale enumerations and memory problems
- To: mathgroup at smc.vnet.net
- Subject: [mg35119] RE: [mg35084] Large-scale enumerations and memory problems
- From: "DrBob" <majort at cox-internet.com>
- Date: Tue, 25 Jun 2002 03:42:33 -0400 (EDT)
- Reply-to: <drbob at bigfoot.com>
- Sender: owner-wri-mathgroup at wolfram.com
Mathematica apparently uses more space than you think!
After making a guess as to the rest of Enum's definition, I estimated
byte-counts for the problem as follows:
Enum[lst_List, (k_Integer)?Positive] :=
Block[{minusone = Enum[lst, k - 1], i, n = Length[lst]},
Join @@ Table[(Prepend[#1, lst[[i]]] &) /@ minusone, {i, n}]]
Enum[lst_List, 0] := {{}}
s = N@ByteCount@Enum[{G, A, T, C}, #]/(4^#) & /@ Range[11]
{32., 29.5, 36.375, 36.0938, 44.0234, 44.0059, 52.0015, 52.0004,
 60.0001, 60., 68.}
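To put those per-sequence figures in perspective, the following rough sketch converts them into total list sizes. The numbers are simply copied from the output above; the names perEntry and totalMB are made up for this illustration.

```mathematica
(* Bytes per sequence for k = 1..11, copied from the measured output above *)
perEntry = {32., 29.5, 36.375, 36.0938, 44.0234, 44.0059, 52.0015,
   52.0004, 60.0001, 60., 68.};

(* Total size of the full list in MB (2^20 bytes): bytes per sequence
   times the number of sequences, 4^k *)
totalMB = perEntry*4^Range[11]/2.^20;

Last[totalMB]  (* k = 11: 68 bytes x 4^11 sequences = 272. MB *)
```

In this data the cost per sequence rises by about 8 bytes at every second step in k, on top of the factor of 4 in the number of sequences.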
poly = Fit[Transpose[{Range[11], s}], {1, x, x^2}, x]
27.48417380361847 + 2.244134349422857*x + 0.12561505768960443*x^2
soln = Flatten[Solve[bytes/4^x == poly, bytes]]
{bytes -> 4.^x*(27.48417380361847 + 2.244134349422857*x +
0.12561505768960443*x^2)}
bytes /. soln /. x -> 14
2.2420428828643364*^10
%/(2^30)
20.8807
So... when k=14, I think around 21GB is required, not 1.61GB.
That's bigger than your swap space, and it gets worse quickly:
(bytes GB/(2^30) /. soln /. x -> #) & /@ Range[13, 20]
{4.86793 GB, 20.8807 GB, 89.4096 GB, 382.191 GB, 1631. GB, 6948.98 GB,
29560.3 GB, 125556. GB}
This is only ONE way of extrapolating the data, of course, and it is
EXTRAPOLATION and hence inherently suspect.
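As for writing the results one at a time: a minimal sketch, assuming the goal is only to get one sequence per line on disk, is to never build the full list and instead generate each sequence from its index. The helper names nthSeq and writeEnum are hypothetical (they are not part of the original code), the list must have at least two elements because the index is read in base Length[lst], and I have not checked that the line format matches what Export produces with "List".

```mathematica
(* Hypothetical helper: the j-th sequence (j = 0 .. n^k - 1) of length k
   over lst, read off from the base-n digits of j. Needs Length[lst] >= 2. *)
nthSeq[lst_List, k_Integer?Positive, j_Integer] :=
  lst[[IntegerDigits[j, Length[lst], k] + 1]]

(* Hypothetical helper: write all n^k sequences to file, one per line,
   so that only one sequence is held in memory at a time. *)
writeEnum[file_String, lst_List, k_Integer?Positive] :=
  Module[{strm = OpenWrite[file], j},
    Do[WriteString[strm, ToString[nthSeq[lst, k, j]], "\n"],
      {j, 0, Length[lst]^k - 1}];
    Close[strm]]
```

A call such as writeEnum["file.txt", {G, A, T, C}, 6] should produce the sequences in the same order as Enum, since the first element varies slowest in both. Memory use stays flat, but for k=14 the loop runs about 268 million times and the file itself will still be very large.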
Bobby Treat
-----Original Message-----
From: Mark E. Berres [mailto:markb at ravel.zoology.wisc.edu]
To: mathgroup at smc.vnet.net
Subject: [mg35119] [mg35084] Large-scale enumerations and memory problems
Hi,
I am performing some enumerations (v4.1 under Win-XP with 768 MB
physical memory and 4GB swap space) with the following simple code:
Enum[l_List, (k_Integer)?Positive] :=
 Block[{minusone = Enum[l, k - 1], i, n = Length[l]},
  Join @@ Table[(Prepend[#1, l[[i]]] & ) /@ minusone, {i, n}]]
What this does is to take a supplied list of objects, l, and generate
all possible combinations of these objects in a string of length k.
Thus, even for small l and moderate values of k, the number of
combinations becomes very large.
I am exporting the results to a file by
Export["file.txt", Enum[{G, A, T, C}, 6], "List"]
(I want one entry per line of text), but when k becomes "too large",
for example if l has 4 elements and k=14, so that there are 4^14 (about
268 million) entries, the kernel eventually crashes complaining about
memory. I estimated that approx. 1.61 GB is required to store all
strings in memory before writing to disk. That should leave plenty of
room available on my machine, but still it crashes. It appears that the
calculation first does its thing and then crashes when initiating a
write to disk. Is there a way to write the results one-at-a-time,
rather than the whole-thing-at-once? Or does my problem lie elsewhere?
Sincerely,
Mark