MathGroup Archive 1992

Inputting integer data in binary form

  • To: mathgroup at yoda.physics.unc.edu
  • Subject: Inputting integer data in binary form
  • From: dmg at oceanus.mitre.org (David M Goblirsch)
  • Date: Wed, 20 May 92 07:47:23 EDT

   I'm working on a Sun system with Mathematica 1.2.  I need to input
   a file of integer data which I have in binary form, two bytes per
   integer.  Can anyone make any suggestions about how to do this?

   More generally, are there Mathematica functions or packages for
   inputting other types of binary files, containing, for example,
   four byte integers, four byte floating, eight byte floating,
   etc?  

I needed this too, and couldn't find a direct Mma function for doing it,
so I convert the files to ASCII and then read them using ReadList.
One way to do this is to convert each file once and keep the ASCII copies
on disk.  That doesn't work for me because I have hundreds of megabytes of
speech data, which I want to keep in binary format.  Another way is to
write a Mma function which does the translation for you on the fly.  Here
is the function I use for reading binary short int files:


(* Read short ints from <path>.bin by piping the file through od and awk *)
rdxbin[ fileId_String ] :=
    ReadList[ StringJoin[ "!od -vi ",
                          xpath[fileId],
                          ".bin",
                          " | awk '{$1 = \"\" ; print }'"
              ],
              Number ]

Giving ReadList a string that begins with a ! runs the specified program
(in this case the UNIX program "od") and reads the output from that
program through a pipe.  In the above function, the command is built from
pieces using StringJoin.  xpath[fileId] is another Mma function I wrote
which just returns the full path name of the file I want; fileId is a
string just good enough to identify the file relative to some home
directory.  "od" is the UNIX utility for printing binary files.  Do a man
on "od", and you'll see that -i interprets the bytes as short ints and -v
says to show ALL the data even if that results in repeated lines; without
-v, od replaces a run of identical lines with a single "*", and that data
never reaches Mma.  (Trust me on this, do NOT forget that v flag!!)
Problem: od prints addresses in the first column, so you have to strip
them off, hence the step through awk.  Finally, the second argument to
ReadList is "Number" because what comes out of the pipe is ASCII numbers.

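To see what ReadList receives, the pipeline can be run by hand from a
shell.  The following is only a sketch: short_ints, short_ints_no_v and
the sample file names are made up for illustration, and the format is
spelled -t d2 (the POSIX way to ask for signed 2-byte decimals) rather
than -i, on the assumption that a current od may treat -i as a wider int.
Each sample word has two equal bytes, so the result doesn't depend on
byte order.

```shell
#!/bin/sh
# short_ints is a hypothetical helper, not part of the original function.
# It prints the signed 16-bit integers in a file, with od's address
# column blanked out by awk.
short_ints() {
    od -v -t d2 "$1" | awk '{ $1 = ""; print }'
}

# The same pipeline without -v, to show what the flag protects against.
short_ints_no_v() {
    od -t d2 "$1" | awk '{ $1 = ""; print }'
}

dir=${TMPDIR:-/tmp}

# Three words: 0x0101 (257), 0xFFFF (-1), 0x0000 (0).
printf '\001\001\377\377\000\000' > "$dir/sample.bin"
short_ints "$dir/sample.bin"

# 48 zero bytes give three identical lines of od output.
dd if=/dev/zero of="$dir/zeros.bin" bs=48 count=1 2>/dev/null
short_ints "$dir/zeros.bin" | wc -w        # all 24 numbers
short_ints_no_v "$dir/zeros.bin" | wc -w   # only the first line's 8 numbers
```

The last two commands are the reason for the warning about -v: without
it, two of the three lines collapse into a "*" and awk blanks that out
like any other first column.
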
It probably would be more efficient to write a C program to do the
translation, and perhaps perl could be used instead of awk, but this
version works fast enough for my purposes.
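
If the od on your system supports the POSIX -A n option, the address
column can be suppressed at the source and the awk step dropped
altogether.  A sketch with made-up names (short_ints_noaddr,
sample2.bin), using -t d2 in place of -i as an assumption about a
current od:

```shell
#!/bin/sh
# short_ints_noaddr is a hypothetical name.  -A n tells od not to print
# addresses, -v shows every line, and -t d2 asks for signed 2-byte decimals.
short_ints_noaddr() {
    od -A n -v -t d2 "$1"
}

dir=${TMPDIR:-/tmp}

# Words 0x0202 (514) and 0x8080 (-32640); the two bytes of each word are
# equal, so byte order doesn't matter.
printf '\002\002\200\200' > "$dir/sample2.bin"
short_ints_noaddr "$dir/sample2.bin"
```

In a function like rdxbin this would mean replacing the od and awk
pieces of the command string with the single od call.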


David M. Goblirsch, The MITRE Corporation, McLean VA 22102
dmgob at mitre.org
(703) 883-5450



ions for processing data from UNIX commands,
programs, or files.

                                              Robby Villegas
                                              Knox College
                                              (Villegas at Knox.Bitnet)




