Reading binary files
- To: mathgroup at smc.vnet.net
- Subject: [mg14652] Reading binary files
- From: jenningsj at mail.utexas.edu (Jim Jennings)
- Date: Sat, 7 Nov 1998 02:10:01 -0500
- Organization: University of Texas at Austin
- Sender: owner-wri-mathgroup at wolfram.com
I recently obtained the MathLink applications MathHDF and FastBinary
from MathSource in my quest to get large data files into Mathematica
faster. Both packages needed updating; they did not contain PowerMac
native applications, and MathHDF did not work at all on my system:
104MB RAM, virtual memory set to 105MB
With some tinkering I was able to recompile both packages into PowerMac
native applications using the most recent MathLink tools (the
developer's kit that came on the Mathematica CD and updated mathlink.h
and SAmprep downloaded from the Wolfram Web site) and CodeWarrior Pro 2
(CodeWarrior IDE 2.1). The new MathHDF uses HDF 4.1r1. The new
MathHDF has been submitted to MathSource; the new FastBinary will be
Here are the results of a simple benchmark reading a large array into
Mathematica using various methods on the computer described above:
Function        Method                                    Elapsed time
ReadListBinary  FastBinary (ppc)                            23 seconds
ReadSDS         MathHDF (ppc)                               28 seconds
ReadListBinary  FastBinary (68k)                           189 seconds
ReadList        built-in function reading ASCII text       939 seconds
ReadListBinary  standard package Utilities`BinaryFiles`   9597 seconds
The times are elapsed times, not CPU times. I was careful not to do
anything else with the computer while the benchmarks were running. The
files contained a 21 by 30 by 300 array of 4-byte real numbers. The
result for ReadListBinary from the package Utilities`BinaryFiles` is
actually an estimate; a single 30 by 300 slice of the array was read and
the measured time multiplied by 21.
ReadSDS read from a 741K HDF file containing the 3D array.
ReadListBinary read from a 741K binary file containing 189,000 numbers
in a single list (except for the Utilities`BinaryFiles` test, which read
from a file with 9,000 numbers). ReadList read a 2.5MB ASCII text file
with 189,000 numbers. The ASCII text file was created with
Mathematica; the numbers ranged from 2 to 18 characters long.
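The sizes quoted above can be checked with a little arithmetic. The sketch below is in Python rather than Mathematica, purely as a calculator; the variable names are illustrative and not part of any package mentioned here. It shows that the raw data come to roughly 738K, close to the 741K reported for the files (the small difference is not accounted for in this post), and that the 9597-second estimate corresponds to a single-slice time of about 457 seconds.

```python
# Array dimensions and element size as described in the post.
DIMS = (21, 30, 300)
BYTES_PER_VALUE = 4  # 4-byte reals

n_values = DIMS[0] * DIMS[1] * DIMS[2]   # numbers in the full array
slice_values = DIMS[1] * DIMS[2]         # numbers in one 30 by 300 slice
n_bytes = n_values * BYTES_PER_VALUE     # raw data size, no file header

# The Utilities`BinaryFiles` figure is 21 times a single-slice timing,
# so dividing by 21 recovers the time that was actually measured.
estimated_total_s = 9597
slice_time_s = estimated_total_s / DIMS[0]

print(n_values)        # 189000
print(slice_values)    # 9000
print(n_bytes)         # 756000
print(n_bytes / 1024)  # about 738.3
print(slice_time_s)    # 457.0
```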
The ppc native ReadListBinary and ReadSDS read the file in about the
same time. This is satisfying, but it still seems somewhat slow to me
for files of this size on this computer. Can anyone explain it?
The ppc native ReadListBinary has a factor of 8 advantage over the 68k
version, which makes sense since the 68k version was running in
emulation. It looks like it will be worth the trouble to submit the
updated FastBinary to MathSource.
The ppc native ReadListBinary has a factor of 41 advantage over
ReadList. For those of you wanting to read large files it will be well
worth the trouble to convert your files to binary (or better, make them
that way in the first place) and use FastBinary or MathHDF.
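For anyone wanting to try the conversion, here is a minimal sketch of turning a whitespace-separated ASCII file into a flat file of 4-byte reals. It is written in Python only for illustration and is not how the benchmark files were made. The function names are hypothetical, and the big-endian byte order is an assumption chosen to match PowerMac and 68k hardware; check what your reader expects.

```python
import os
import struct
import tempfile

def ascii_to_binary(text_path, binary_path, byte_order=">"):
    """Write whitespace-separated ASCII numbers as 4-byte IEEE floats.

    byte_order ">" (big-endian) is an assumption, not something the
    readers discussed in this post are known to require.
    """
    with open(text_path, "r") as src:
        values = [float(token) for token in src.read().split()]
    with open(binary_path, "wb") as dst:
        dst.write(struct.pack(f"{byte_order}{len(values)}f", *values))
    return len(values)

def read_binary(binary_path, byte_order=">"):
    """Read the whole file back as a flat list of 4-byte floats."""
    with open(binary_path, "rb") as src:
        data = src.read()
    count = len(data) // 4
    return list(struct.unpack(f"{byte_order}{count}f", data[:count * 4]))

# Round trip on a tiny file; these values are exact in single precision.
with tempfile.TemporaryDirectory() as tmp:
    text_file = os.path.join(tmp, "data.txt")
    binary_file = os.path.join(tmp, "data.bin")
    with open(text_file, "w") as f:
        f.write("1.5 2.25\n-3.0\n")
    written = ascii_to_binary(text_file, binary_file)
    size_on_disk = os.path.getsize(binary_file)
    roundtrip = read_binary(binary_file)

print(written, size_on_disk, roundtrip)  # 3 12 [1.5, 2.25, -3.0]
```

Note that values which are not exactly representable in 4 bytes will come back rounded to single precision.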
The result for ReadListBinary from Utilities`BinaryFiles` is outrageous!
Did I do something wrong? Anyone attempting to get faster reads by
converting their ASCII files to binary and using this package will get a
nasty surprise; it will be more than a factor of 10 slower than reading
the ASCII text with ReadList! Can anyone explain this seemingly absurd
behavior?
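The factors quoted in this post follow directly from the benchmark times. The short Python sketch below recomputes them; the dictionary keys are just labels chosen here. It also works out the implied throughput of the fastest method, assuming the raw data size is 189,000 numbers times 4 bytes.

```python
# Elapsed times in seconds, copied from the benchmark table.
timings_s = {
    "FastBinary (ppc)": 23,
    "MathHDF (ppc)": 28,
    "FastBinary (68k)": 189,
    "ReadList (ascii)": 939,
    "Utilities`BinaryFiles` (estimated)": 9597,
}

baseline = timings_s["FastBinary (ppc)"]

# How many times slower each method is than the ppc-native FastBinary.
ratios = {name: t / baseline for name, t in timings_s.items()}
for name, ratio in ratios.items():
    print(f"{name:36s} {ratio:7.1f}x")

# The standard package compared with reading ASCII text via ReadList.
package_vs_ascii = (
    timings_s["Utilities`BinaryFiles` (estimated)"]
    / timings_s["ReadList (ascii)"]
)
print(f"{package_vs_ascii:.1f}")

# Implied throughput of the fastest read, in bytes per second.
raw_bytes = 189_000 * 4
throughput = raw_bytes / baseline
print(f"{throughput:.0f}")
```

The throughput comes out at roughly 33 thousand bytes per second, which is the figure behind the remark that even the fastest read seems slow for a file of this size.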
Jim Jennings, Research Associate
Bureau of Economic Geology, University of Texas at Austin
jenningsj at mail.utexas.edu
(512) 471-4364 (voice), (512) 471-0140 (fax)