Re: convert txt to binary file
- To: mathgroup at smc.vnet.net
- Subject: [mg108971] Re: convert txt to binary file
- From: Albert Retey <awnl at gmx-topmail.de>
- Date: Thu, 8 Apr 2010 08:04:21 -0400 (EDT)
- References: <hphbq3$lma$1@smc.vnet.net>
Hi,

> My colleague has done something on another system. You think I could
> run it on mathematica even if the functions are different from a
> software to the other?

No, of course you cannot run code that was written for the other system. But if he converted the text file to a binary file with the other system, then you can probably import the binary file he generated directly, if the format is supported. Since the text representation of a floating point number is usually much longer than the binary representation of a double, the binary file should be considerably smaller, and my guess is that on your 64-bit system the import of the binary file would just work.

If you want or need to import the text file directly, I think the following has a good chance of working: read larger chunks with OpenRead/ReadList/Close, convert each chunk to an array of numbers (probably ensuring they are stored as packed arrays with Developer`ToPackedArray), and join all chunks at the end.

Alternatively, you can first count the lines, then preallocate a list to hold the data with data=Table[0.,{numberoflines},{3}], and then read the data and fill that array. This is probably the most memory-efficient way to import the large file. It will then take a little experimentation with the chunk size to make the code fast...
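The read-in-chunks-and-join idea could be sketched roughly as follows. This is only an illustration under assumptions: the file name chunk_demo.txt, the variable names, and the tiny chunk size are made up for the demo, and the three rows written to the temporary file are copied from the sample quoted in this thread.

```mathematica
(* Write a small demo file; path and file name are assumptions *)
fname = FileNameJoin[{$TemporaryDirectory, "chunk_demo.txt"}];
out = OpenWrite[fname];
WriteString[out,
  "-0.13923438E+00 -0.22521242E+00 0.10765536E+01\n",
  "-0.13928019E+00 -0.22522102E+00 0.10765295E+01\n",
  "-0.13934083E+00 -0.22523038E+00 0.10765673E+01\n"];
Close[out];

(* Rows per chunk; tiny here for the demo, a real file needs far more *)
chunksize = 2;

stream = OpenRead[fname];
chunks = {};
While[
  (chunk = ReadList[stream, {Number, Number, Number}, chunksize]) =!= {},
  (* N forces machine reals so that each chunk can be packed *)
  AppendTo[chunks, Developer`ToPackedArray[N[chunk]]]
];
Close[stream];

(* Join the packed chunks into one array *)
data = Join @@ chunks;
Dimensions[data]
Developer`PackedArrayQ[data]
```

Compared with preallocating, this needs only one pass over the file, but it briefly holds both the chunks and the joined result in memory.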
> Here is a sample of the data: -0.13923438E+00 -0.22521242E+00
> 0.10765536E+01 -0.13928019E+00 -0.22522102E+00 0.10765295E+01
> -0.13934083E+00 -0.22523038E+00 0.10765673E+01 -0.13940084E+00
> -0.22523966E+00 0.10766749E+01 -0.13944457E+00 -0.22524883E+00
> 0.10768325E+01 -0.13946098E+00 -0.22525747E+00 0.10769989E+01
> -0.13944666E+00 -0.22526383E+00 0.10771308E+01 -0.13940556E+00
> -0.22526550E+00 0.10771986E+01 -0.13934693E+00 -0.22526184E+00
> 0.10771959E+01 -0.13928294E+00 -0.22525565E+00 0.10771388E+01
> -0.13922668E+00 -0.22525267E+00 0.10770591E+01 -0.13919051E+00
> -0.22525826E+00 0.10769958E+01 -0.13918413E+00 -0.22527380E+00
> 0.10769836E+01 -0.13921118E+00 -0.22529592E+00 0.10770431E+01
> -0.13926551E+00 -0.22531886E+00 0.10771747E+01 -0.13933089E+00
> -0.22533810E+00 0.10773579E+01
>

Is there a line break after every third number? If yes, the following would work for this data and should be rather memory efficient, but probably slow. For decent speed you will certainly need much larger values for junksize (the number of rows read per chunk):

    fname = ToFileName[{$HomeDirectory, "Desktop"}, "data.txt"];
    junksize = 3;

    (* first pass: count the rows *)
    stream = OpenRead[fname];
    lines = {1};
    numlines = 0;
    While[lines =!= {},
      lines = ReadList[stream, {Number, Number, Number}, junksize];
      numlines += Length[lines];
    ];
    Close[stream];
    Print[numlines];

    (* second pass: fill a preallocated packed array *)
    stream = OpenRead[fname];
    lines = {1};
    numline = 1;
    data = Developer`ToPackedArray[Table[0., {numlines}, {3}]];
    While[lines =!= {},
      lines = ReadList[stream, {Number, Number, Number}, junksize];
      Do[
        data[[numline + i - 1, k]] = lines[[i, k]],
        {i, Length[lines]}, {k, 3}
      ];
      numline += Length[lines];
    ];
    Close[stream];

    (* check memory use and that the array is still packed *)
    ByteCount[data]
    Developer`PackedArrayQ[data]

hth,

albert
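As a follow-up on the binary route: once the numbers are in memory, writing them as raw doubles and reading them back could look roughly like this. It is a sketch under assumptions. The file name data_demo.bin is hypothetical, and the layout (plain 8-byte reals, row by row, native byte order, no header) is my own choice here. A file written by another system may be laid out differently, for example with record markers, and would then need different handling.

```mathematica
(* Two rows taken from the sample data quoted in this thread *)
data = {{-0.13923438, -0.22521242, 1.0765536},
        {-0.13928019, -0.22522102, 1.0765295}};

(* Hypothetical output file in the temporary directory *)
bname = FileNameJoin[{$TemporaryDirectory, "data_demo.bin"}];

(* Write all numbers as 8-byte reals, row by row *)
out = OpenWrite[bname, BinaryFormat -> True];
BinaryWrite[out, Flatten[data], "Real64"];
Close[out];

(* Read everything back and restore the three-column shape *)
in = OpenRead[bname, BinaryFormat -> True];
flat = BinaryReadList[in, "Real64"];
Close[in];
back = Partition[flat, 3];

back == data
```

Each number takes 8 bytes in such a file, compared with about 16 characters per number in the text sample.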