MathGroup Archive 2010

Re: Importing data

  • To: mathgroup at smc.vnet.net
  • Subject: [mg113450] Re: Importing data
  • From: "Hans Michel" <hmichel at cox.net>
  • Date: Fri, 29 Oct 2010 06:28:14 -0400 (EDT)

David

fileToProcessPath = "your path here";
processedFile = {};

(* Read the whole file as one string, then treat that string as an input stream *)
processingFileToStream = StringToStream[Import[fileToProcessPath, "Text"]];

(* Skip the header row, if there is one; raise the count to skip a longer header area *)
Skip[processingFileToStream, Record, 1];

(* Read returns the symbol EndOfFile (not a string) once the stream is exhausted *)
While[(eachLine = Read[processingFileToStream, String]) =!= EndOfFile,
  (* Non-overlapping 10-character columns; StringTake fails on lines shorter than 50 characters *)
  AppendTo[
   processedFile, {StringTake[eachLine, {1, 10}],
    StringTake[eachLine, {11, 20}], StringTake[eachLine, {21, 30}],
    StringTake[eachLine, {31, 40}], StringTake[eachLine, {41, 50}]}]
  ];

Close[processingFileToStream];

processedFile

Further define your StringTake(s) to fit the schema of your text file.
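
As a sketch of what fitting the schema could look like, the column positions can be kept in one list and passed to StringTake in a single call. The names columnRanges, recordWidth and splitFixedWidth, and the five 10-character columns, are illustrative assumptions and not the layout of the actual file. StringPadRight needs Mathematica 10.1 or later; it is used here only so that short lines do not make StringTake fail.

```mathematica
(* Hypothetical schema: start and end character position of each column *)
columnRanges = {{1, 10}, {11, 20}, {21, 30}, {31, 40}, {41, 50}};

(* Width of a full record, taken from the largest end position in the schema *)
recordWidth = Max[columnRanges];

(* Pad short lines with spaces so every range exists, then cut all columns at once *)
splitFixedWidth[line_String] :=
  StringTake[StringPadRight[line, recordWidth], columnRanges];

(* Example: a line whose last two columns are missing *)
splitFixedWidth["aaaaaaaaaabbbbbbbbbbcccccccccc"]
```

With this approach an unpopulated column comes back as a blank string in its own position, so the remaining fields stay in the right list elements. StringTrim can be applied afterwards if the padding is unwanted.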

I like this process because you can use it as a filter to select the rows you
want, or you can modify the code to insert each row into a (SQL-like)
database. (One can also do that outside Mathematica.) You may also evaluate
fields to expressions or convert them to Mathematica formats as you process
each line, for example converting a datetime field to a DateList[]. Each
conversion adds to the processing time.
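
A minimal sketch of such filtering and conversion, applied to rows that have already been split into fields. It assumes, purely for illustration, that the first field is a date written as year-month-day and the second is a number; convertRow, keepRowQ and the sample rows are hypothetical.

```mathematica
(* Hypothetical row layout: {date string, numeric string, other fields...} *)
convertRow[{date_String, value_String, rest___}] :=
  {DateList[{StringTrim[date], {"Year", "Month", "Day"}}],
   (* A blank numeric field becomes Missing[] instead of being evaluated *)
   If[StringTrim[value] === "", Missing[], ToExpression[StringTrim[value]]],
   rest};

(* Filter: keep only rows whose first field is not blank *)
keepRowQ[row_List] := StringTrim[First[row]] =!= "";

(* Made-up sample rows standing in for lines split by StringTake *)
rows = {{"2010-10-27", "  3.5     ", "x"},
   {"          ", "1", "y"},
   {"2010-10-29", "          ", "z"}};

converted = convertRow /@ Select[rows, keepRowQ]
```

ToExpression evaluates whatever text it is given, so it is only appropriate for fields known to hold plain numbers.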

I don't like the AppendTo method of building up a list, but it is what it is.
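
One possible alternative, sketched here under assumptions, is to collect the rows with Sow and Reap. AppendTo copies the growing list on every call, which gets slow for several hundred thousand lines, while Reap gathers everything passed to Sow in one pass. The in-memory string below is a stand-in for the real file, and the two 10-character columns are illustrative only.

```mathematica
(* Stand-in for the file contents; in practice the stream would come from the file *)
stream = StringToStream["header\naaaaaaaaaabbbbbbbbbb\nccccccccccdddddddddd"];

(* Discard the header row *)
Read[stream, String];

(* Sow each parsed row; Reap returns {result, {list of sown rows}} *)
processed = Flatten[
   Reap[
     While[(line = Read[stream, String]) =!= EndOfFile,
      Sow[StringTake[line, {{1, 10}, {11, 20}}]]]
     ][[2]], 1];

Close[stream];

processed
```

Flatten at level 1 removes the extra wrapper that Reap adds and also yields an empty list when no rows were read.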

Hans
-----Original Message-----
From: David Higgins [mailto:dhi67540 at bigpond.net.au] 
Sent: Wednesday, October 27, 2010 4:19 AM
To: mathgroup at smc.vnet.net
Subject: [mg113450] [mg113415] Importing data

Hi all,

Relatively new user here.  I have a text file containing 540,000 lines of
data in 30 columns, fixed width.  Not all columns are populated for all data
lines.

When I import, Mathematica creates a list and populates the list elements
from 1 to the number of data elements it finds in each line, ignoring the
empty columns. So for data lines with only, say, 20 columns populated, it
fills elements 1 to 20 instead of skipping the missing ones.  Is there a way
to import data that allows me to define a template for the data elements, so
that the right data ends up in the right elements of the list?

Cheers

David




