MathGroup Archive 2009

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: importing fixed width data

  • To: mathgroup at smc.vnet.net
  • Subject: [mg97541] Re: importing fixed width data
  • From: Bob F <deepyogurt at gmail.com>
  • Date: Sat, 14 Mar 2009 18:17:24 -0500 (EST)
  • References: <gpg1lt$cps$1@smc.vnet.net>

On Mar 14, 4:42 am, swiftset <swift... at gmail.com> wrote:

> Hi, I'm trying to import data from a file where the fields are stored
> in a fixed width format. The Import["file", "Table"] command is the
> closest I can get to what I'd like, but then the data is divided using
> whitespace, so I get a lot of extraneous fields, and no way to tell
> whether what Mathematica imported as two fields is actually just one
> that contained a whitespace.
>
> Any ideas on how to proceed?

Is the problem because some of your fixed-width data fields are blank
and you would like Mathematica to import a "blank" when the field is
really blank, instead of thinking this "line" has fewer number of
fields?

Well, if you are on a Mac or Linux/Unix system you could edit the file
first in a command shell window with the vi editor and change sets of
continuous spaces to a comma (Mathematica will take a comma separated
set of data very readily).

I assume this is either a mixture of numeric and alphanumeric data or
just alphanumeric data - which is it. So, for example using the vi
editor on a file of data that has a fixed field width of 10, and some
of the fields are all blank (and possibly several in a row) and the
rest of the fields only contain numbers, first need to add a comma at
the end of all non-blank-but-numeric fields with the vi command

   :%s/\([0-9][0-9]*\)/\1,/g

then need to change fields that are all blank (assuming 10 blanks in a
row) with the vi command

   :%s/          /,/g

which would change every set of 10 blanks to a comma to indicate an
empty field. But then need to remove the comma at the end of all lines
(otherwise will end up with one extra field on each line when you
import into Mathematica) with the vi command

   :%s/,$//

Then you could import into Mathematica and it should figure it out and
put null items in the list so that things come out in the correct
order. If you are on Windows, you can download a Windows version of
the "vi" program called "vim" from http://www.vim.org/download.php and
do the same thing. Make sure to edit on a copy of the file in case you
make a mistake.

Don't mean to be so confusing with the weird syntax of the vi
commands, but if you are not familiar with using wild cards and saved
patterns in vi, you might want to consult your closest vi expert or
look at a book or the man pages.

Hope that helps. If not, post a few typical lines from your data file
and I will try and make another suggestion depending on it's
structure. If the data is sensitive or confidential, just dummy up
something that exhibits the same exact pattern (spaces, tabs, special
characters like quotes or others are important so include them in your
example).

For example if the data file looked like this (note that some lines
have blanks at the end and some don't but all have 7 10-wide-fields so
want to make sure that Mathematica ends up with a lists that are each
7 elements in length) and all fields are right-justified:

       111                            44                  66
       211                            44                  66
       311                            44                  66
       411                            44                  66        77

would get changed to this:

       111,,,        44,,        66,
       211,,,        44,,        66,
       311,,,        44,,        66,
       411,,,        44,,        66,        77

and the first, second and third lines that appear to have 3 values to
Mathematica using spaces as a filler in blank fields, in the comma
version look to have the actual 7 values (but some of them are null or
blank) that all lines do.

Just thought of another thing you could do - use Excel (the
spreadsheet program) to import the data, and it has the ability to
parse the data assuming a fixed field length, then export as a comma
separated file and then import directly into Mathematica. See the
Excel manual or help for an explanation of how to import and parse
data.

-Bob


  • Prev by Date: Re: Re: Re: Mathematica 7.0.1.0 and some General
  • Next by Date: Re: How to map a list on function
  • Previous by thread: Re: importing fixed width data
  • Next by thread: Re: importing fixed width data