Re: IO-Performance of Mathematica 4.1
- To: mathgroup at smc.vnet.net
- Subject: [mg44781] Re: IO-Performance of Mathematica 4.1
- From: "Christos Argyropoulos M.D." <argchris at otenet.gr>
- Date: Thu, 27 Nov 2003 11:38:06 -0500 (EST)
- References: <bphtib$1lu$1@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Hello,

I ran into the same problem with really big datasets (I am using Mathematica to analyze measurements from "DNA chips", which tend to produce huge files). One can speed up Mathematica's I/O by importing numerical data as "Table" and setting

ConversionOptions -> {"DataBlockErrorChecking" -> False}

Christos Argyropoulos MD
Patras, Greece

----- Original Message -----
From: "Marko Kastens" <Kastens at Hamburg.baw.de>
To: mathgroup at smc.vnet.net
Subject: [mg44781] IO-Performance of Mathematica 4.1

> Hi!
>
> I am searching for a good (fast) way to import big datasets in Mathematica 4.1.
>
> What I've done so far:
> - Reading in the _original_ (ASCII) file takes 97 sec. (a 5xxxx x 2
> matrix)
> - Exporting my matrix using Export[] with "List" or "Table" takes
> (nearly) an infinite amount of time :-(
> - Exporting the matrix with HDF (Hierarchical Data Format): 7 seconds
> :-) Great, but...
> - Importing the matrix with HDF: fast but wrong: HDF seems to
> support only Real64. My first column is a 10-digit Integer representing
> the MM time format. By converting this to yyExx the last digits get lost
> and my time is doing funny jumps...
>
> - Using the Experimental functions like this:
>
> << Experimental`
> t0 = AbsoluteTime[];
> strm = OpenWrite["tmp0.bin", DOSTextFormat -> False];
> BinaryExport[strm, Length[data], "Integer32"]
> For[i = 1, i < Length[data] + 1,
> {
> BinaryExport[strm, data[[i]], {"Integer64", "Real64", "Integer64"}]
> }; i++]
> Close[strm];
> Print[AbsoluteTime[] - t0];
>
> it takes 415 seconds to export and 247 seconds to import. Even worse
> than importing the original.
>
> Only the good old ASCII way is better:
>
> t0 = AbsoluteTime[];
> strm = OpenWrite["tmp1.dat"];
> Write[strm, Length[data]]
> For[i = 1, i < Length[data] + 1,
> {
> Write[strm, data[[i, 1]]],
> Write[strm, data[[i, 2]]],
> Write[strm, data[[i, 3]]]
> }; i++]
> Close[strm];
> Print[AbsoluteTime[] - t0];
>
> It takes 32 seconds to export and 16 seconds to import.
>
> Hmm, normally binary import/export is faster than ASCII import/export.
> Does anybody know a proper and fast method? Thanks a lot.
>
> marko
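
For reference, the Import suggestion at the top of this reply might be timed roughly as follows. This is only a sketch: the file name "measurements.dat" is a hypothetical placeholder, the option name is taken as written in the reply and has not been checked against other Mathematica versions, and the timing pattern simply mirrors the AbsoluteTime[] usage in the quoted message.

```mathematica
(* Hypothetical file name; replace with the path to your own ASCII data file. *)
file = "measurements.dat";

(* Record the wall-clock start time, as in the quoted examples. *)
t0 = AbsoluteTime[];

(* Import the numerical data as "Table", switching off the error checking
   named in the reply. The option name is an assumption taken from the post. *)
data = Import[file, "Table",
   ConversionOptions -> {"DataBlockErrorChecking" -> False}];

(* Report the elapsed seconds and the shape of the imported matrix. *)
Print[AbsoluteTime[] - t0];
Print[Dimensions[data]];
```

Running the same lines without the ConversionOptions setting gives a baseline to compare against.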