RE: Large-scale enumerations and memory problems
- To: mathgroup at smc.vnet.net
- Subject: [mg35125] RE: [mg35084] Large-scale enumerations and memory problems
- From: "DrBob" <majort at cox-internet.com>
- Date: Tue, 25 Jun 2002 03:42:47 -0400 (EDT)
- Reply-to: <drbob at bigfoot.com>
- Sender: owner-wri-mathgroup at wolfram.com
The following solution uses only about half as much memory as yours (I
changed the storage method: each sequence is stored as a single string
rather than as a list of symbols):
Enum[lst_List, (k_Integer)?Positive] :=
  Block[{minusone = Enum[lst, k - 1], n = Length[lst]},
    Join @@ Table[(lst[[i]] <> #1 &) /@ minusone, {i, n}]]
Enum[lst_List, 0] := {""}
seq = {"G", "A", "T", "C"};
Enum[seq, 3]
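One way to check the memory claim on your own machine is to compare
ByteCount for the two storage methods at a small k. This is only a
sketch: EnumStrings and EnumLists are hypothetical names introduced
here so the two versions can coexist, and the base case {{}} for the
list version is an assumption, since the original post does not show
one. The actual ratio may differ between versions and platforms.

```mathematica
(* String-based storage, same logic as the Enum definition above *)
EnumStrings[lst_List, 0] := {""}
EnumStrings[lst_List, (k_Integer)?Positive] :=
  Block[{minusone = EnumStrings[lst, k - 1], n = Length[lst]},
    Join @@ Table[(lst[[i]] <> #1 &) /@ minusone, {i, n}]]

(* List-based storage, as in the original message; base case assumed *)
EnumLists[lst_List, 0] := {{}}
EnumLists[lst_List, (k_Integer)?Positive] :=
  Block[{minusone = EnumLists[lst, k - 1], i, n = Length[lst]},
    Join @@ Table[(Prepend[#1, lst[[i]]] &) /@ minusone, {i, n}]]

seq = {"G", "A", "T", "C"};

(* Bytes used by each representation of the 4^6 sequences *)
{ByteCount[EnumStrings[seq, 6]], ByteCount[EnumLists[seq, 6]]}
```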
For more algorithm ideas, look at the Stamps solution at
http://www.telospub.com/journal/MIER/Piele/Vol7No3/Piele73.html
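On the question of writing the results one at a time: a minimal
sketch, assuming the alphabet is a list of at least two strings.
EnumToFile is a hypothetical name, and I have not tested this on v4.1.
It maps each integer j from 0 to n^k - 1 to its k base-n digits and
writes the corresponding string immediately, so the full list is never
held in memory.

```mathematica
(* Hypothetical helper: write every length-k string over lst to file,
   one per line, without building the complete list first *)
EnumToFile[file_String, lst_List, (k_Integer)?Positive] :=
  Module[{n = Length[lst], strm},
    strm = OpenWrite[file];
    Do[
      (* IntegerDigits[j, n, k] gives the k base-n digits of j,
         padded with leading zeros; adding 1 turns them into indices *)
      WriteString[strm,
        StringJoin[lst[[IntegerDigits[j, n, k] + 1]]], "\n"],
      {j, 0, n^k - 1}];
    Close[strm]]

seq = {"G", "A", "T", "C"};
EnumToFile["file.txt", seq, 3]
```

The strings come out in the same order as Enum produces them (first
letter varying slowest). Memory use stays roughly constant in k, at
the cost of one WriteString call per sequence.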
Bobby Treat
-----Original Message-----
From: Mark E. Berres [mailto:markb at ravel.zoology.wisc.edu]
To: mathgroup at smc.vnet.net
Subject: [mg35125] [mg35084] Large-scale enumerations and memory problems
Hi,
I am performing some enumerations (v4.1 under Win-XP with 768 MB
physical memory and 4 GB swap space) with the following simple code:
Enum[l_List, (k_Integer)?Positive] :=
  Block[{minusone = Enum[l, k - 1], i, n = Length[l]},
    Join @@ Table[(Prepend[#1, l[[i]]] &) /@ minusone, {i, n}]]
(* base case, not shown in the original post; assumed here *)
Enum[l_List, 0] := {{}}
This takes a supplied list of objects, l, and generates every possible
sequence of length k built from these objects (order matters and
repeats are allowed, so n objects give n^k sequences). Thus, even for
small l and moderate values of k, the number of sequences becomes very
large.
I am exporting the results to a file with
Export["file.txt", Enum[{G, A, T, C}, 6], "List"]
(I want one entry per line of text), but when k becomes "too large",
for example when l has 4 elements and k = 14 (4^14 = 268,435,456,
about 268 million sequences), the kernel eventually crashes,
complaining about memory. I estimated that approx. 1.61 GB is required
to store all strings in memory before writing to disk. There is plenty
of room available on my machine, but it still crashes. It appears that
the calculation runs to completion and the crash occurs when the write
to disk begins. Is there a way to write the results one at a time,
rather than the whole thing at once? Or does my problem lie elsewhere?
Sincerely,
Mark