RE: Large-scale enumerations and memory problems

*To*: mathgroup at smc.vnet.net
*Subject*: [mg35119] RE: [mg35084] Large-scale enumerations and memory problems
*From*: "DrBob" <majort at cox-internet.com>
*Date*: Tue, 25 Jun 2002 03:42:33 -0400 (EDT)
*Reply-to*: <drbob at bigfoot.com>
*Sender*: owner-wri-mathgroup at wolfram.com

Mathematica apparently uses more space than you think! After making a guess as to the rest of Enum's definition, I estimated byte-counts for the problem as follows:

    Enum[lst_List, (k_Integer)?Positive] :=
      Block[{minusone = Enum[lst, k - 1], i, n = Length[lst]},
        Join @@ Table[(Prepend[#1, lst[[i]]] &) /@ minusone, {i, n}]]
    Enum[lst_List, 0] := {{}}

    s = N@ByteCount@Enum[{G, A, T, C}, #]/(4^#) & /@ Range[11]

    {32., 29.5, 36.375, 36.0938, 44.0234, 44.0059, 52.0015, 52.0004, 60.0001, 60., 68.}

    poly = Fit[Transpose[{Range[11], s}], {1, x, x^2}, x]

    27.48417380361847 + 2.244134349422857*x + 0.12561505768960443*x^2

    soln = Flatten[Solve[bytes/4^x == poly, bytes]]

    {bytes -> 4.^x*(27.48417380361847 + 2.244134349422857*x + 0.12561505768960443*x^2)}

    bytes /. soln /. x -> 14

    2.2420428828643364*^10

    %/(2^30)

    20.8807

So... when k=14, I think around 20GB is required, not 1.61GB. That's bigger than your swap space, and it gets worse quickly:

    (bytes GB/(2^30) /. soln /. x -> #) & /@ Range[13, 20]

    {4.86793 GB, 20.8807 GB, 89.4096 GB, 382.191 GB, 1631. GB, 6948.98 GB, 29560.3 GB, 125556. GB}

This is only ONE way of extrapolating the data, of course, and it is EXTRAPOLATION and hence inherently suspect.

Bobby Treat

-----Original Message-----
From: Mark E. Berres [mailto:markb at ravel.zoology.wisc.edu]
To: mathgroup at smc.vnet.net
Subject: [mg35119] [mg35084] Large-scale enumerations and memory problems

Hi,

I am performing some enumerations (v4.1 under Win-XP with 768 MB physical memory and 4GB swap space) with the following simple code:

    Enum[l_List, (k_Integer)?Positive] :=
      Block[{minusone = Enum[l, k - 1], i, n = Length[l]},
        Join @@ Table[(Prepend[#1, l[[i]]] & ) /@ minusone, {i, n}]]

What this does is to take a supplied list of objects, l, and generate all possible combinations of these objects in a string of length k. Thus, even for small l and moderate values of k, the number of combinations becomes very large.
I am exporting the results to a file by

    Export["file.txt", Enum[{G, A, T, C}, 6], "List"]

(I want one entry per line of text) but when k becomes "too large", for example if l has 4 elements and k=14, 4^14 = ~268 million, the kernel eventually crashes complaining about memory. I estimated that approx. 1.61 GB is required to store all strings in memory before writing to disk - plenty of room available on my machine, but still it crashes. It appears that the calculation first does its thing and then, when initiating a write to disk, crashes. Is there a way to write the results one-at-a-time, rather than the whole-thing-at-once? Or does my problem lie elsewhere?

Sincerely,
Mark
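
On the question of writing the results one at a time: the following is only a minimal sketch, not code from either message. It assumes the list has at least two elements and that each output line should be the k symbols run together (e.g. GATTAC), which may differ from what Export writes with "List". EnumToFile is a hypothetical name. Instead of building the full list, it treats each combination as the base-n digits of a counter and writes it to an open stream, so memory use should stay roughly constant in k.

```mathematica
(* Hypothetical helper, not from the thread: write each length-k combination
   to a file as it is generated, so the full list is never held in memory.
   Assumes Length[lst] >= 2. *)
EnumToFile[file_String, lst_List, k_Integer?Positive] :=
  Module[{n = Length[lst], strm = OpenWrite[file]},
    Do[
      (* The base-n digits of i, left-padded to length k, pick one combination;
         adding 1 turns digits 0..n-1 into list positions 1..n. *)
      WriteString[strm,
        StringJoin[ToString /@ lst[[IntegerDigits[i, n, k] + 1]]], "\n"],
      {i, 0, n^k - 1}];
    Close[strm]]

(* Same small case as the Export call in the original message *)
EnumToFile["file.txt", {G, A, T, C}, 6]
```

The lines come out in the same order as Enum produces them (first position varying slowest). Note that for k=14 the file itself would hold 4^14 lines of 14 characters plus a newline each, about 4 GB, so disk space and run time still grow by a factor of 4 with each increment of k.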