Re: Importing data
- To: mathgroup at smc.vnet.net
- Subject: [mg113450] Re: Importing data
- From: "Hans Michel" <hmichel at cox.net>
- Date: Fri, 29 Oct 2010 06:28:14 -0400 (EDT)
David
fileToProcessPath = "your path here";
processedFile = {};
processingFileToStream = StringToStream[Import[fileToProcessPath, "Text"]];
(* Skip one record if there is a header row; raise the count to skip a longer header area *)
Skip[processingFileToStream, Record, 1];
(* Read returns the symbol EndOfFile, not the string "EndOfFile", at the end of the stream *)
While[(eachLine = Read[processingFileToStream, String]) =!= EndOfFile,
  AppendTo[
    processedFile, {StringTake[eachLine, {1, 10}],
     StringTake[eachLine, {11, 20}], StringTake[eachLine, {21, 30}],
     StringTake[eachLine, {31, 40}], StringTake[eachLine, {41, 50}]}]
  ];
Close[processingFileToStream];
processedFile
Further define your StringTake(s) to fit the schema of your text file.
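One way to make the schema explicit is to keep the column ranges in a list and map StringTake over it. The following is only a sketch: the names columnRanges, lineWidth and sliceLine are illustrative, and the five 10-character fields are an assumed layout that has to be replaced with the real one. Short lines are padded with spaces first, since StringTake fails when a range runs past the end of the string.

```mathematica
(* Assumed layout: five fields of 10 characters each. Adjust to your file. *)
columnRanges = {{1, 10}, {11, 20}, {21, 30}, {31, 40}, {41, 50}};
lineWidth = Max[columnRanges];

(* Pad a short line with spaces, then cut out each field and trim it. *)
sliceLine[line_String] := Module[{padded},
  padded = StringJoin[line,
    ConstantArray[" ", Max[0, lineWidth - StringLength[line]]]];
  StringTrim[StringTake[padded, #]] & /@ columnRanges
]
```

With this, an unpopulated column comes back as the empty string "" and keeps its position in the row, so later fields do not shift to the left.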
I like this process because you can use it as a filter to select the rows you
want, or you can modify the code to insert each row into a (SQL-like)
database. (One can also do that outside Mathematica.) You may also evaluate
fields to expressions or convert them to Mathematica formats as you process
each line, for example converting a datetime field to a DateList[]. Each
conversion adds to the processing time.
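As a rough sketch of that kind of per-row filtering and conversion, assume (purely for illustration) that each row is a list of trimmed strings whose first field is a date such as 2010-10-27 and whose second field is a number. The names convertRow and keepRowQ are made up for this example.

```mathematica
(* Assumed row layout, for illustration only: {date, number, other fields...} *)
convertRow[{date_String, value_String, rest___}] := {
  (* An empty field becomes Missing[...] instead of being dropped *)
  If[date === "", Missing["NotAvailable"],
    DateList[{date, {"Year", "-", "Month", "-", "Day"}}]],
  If[value === "", Missing["NotAvailable"], ToExpression[value]],
  rest
}

(* Example filter: keep only rows whose second field is populated *)
keepRowQ[row_List] := row[[2]] =!= ""
```

Something like convertRow /@ Select[processedFile, keepRowQ] would then filter first and convert afterwards. Note that ToExpression evaluates whatever the field contains, so it is only appropriate for data you trust.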
I don't like the AppendTo method of building up a list, but it is what it is.
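For what it is worth, one common alternative to AppendTo, which copies the whole list on every call, is to Sow each row and collect the rows with Reap. This is a sketch rather than a tested drop-in replacement: the name readRows and its arguments are illustrative, and it reads straight from the file with OpenRead instead of going through Import.

```mathematica
(* parse is any function that turns one line (a string) into a row *)
readRows[path_String, parse_, headerLines_Integer: 0] :=
 Module[{stream, line, rows},
  stream = OpenRead[path];
  If[headerLines > 0, Skip[stream, Record, headerLines]];
  rows = Reap[
     While[(line = Read[stream, String]) =!= EndOfFile,
      Sow[parse[line]]]
     ][[2]];
  Close[stream];
  (* Reap gives {} as its second part when nothing was sown *)
  If[rows === {}, {}, First[rows]]
 ]
```

For example, readRows["your path here", StringTake[#, {1, 10}] &, 1] would skip one header line and return the first ten characters of every remaining line, provided each line has at least ten characters.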
Hans
-----Original Message-----
From: David Higgins [mailto:dhi67540 at bigpond.net.au]
Sent: Wednesday, October 27, 2010 4:19 AM
To: mathgroup at smc.vnet.net
Subject: [mg113450] [mg113415] Importing data
Hi all,
Relatively new user here. I have a text file containing 540,000 lines of
data in 30 columns, fixed width. Not all columns are populated for all data
lines.
When I import, Mathematica creates a list and populates the list elements
from 1 to the number of data elements it finds in each line, ignoring the
empty columns. So for data lines with only, say, 20 columns populated, it has
populated data elements 1 to 20 instead of skipping the missing ones. Is
there a way to import data that allows me to define a template for the data
elements so that the right data ends up in the right elements in the list?
Cheers
David