Re: Importing data
- To: mathgroup at smc.vnet.net
- Subject: [mg113450] Re: Importing data
- From: "Hans Michel" <hmichel at cox.net>
- Date: Fri, 29 Oct 2010 06:28:14 -0400 (EDT)
David,

    fileToProcessPath = "your path here";
    processedFile = {};
    (* Import the file as a single string and open it as a stream *)
    processingFileToStream = StringToStream[Import[fileToProcessPath, "Text"]];
    (* Do this if there is a header row to skip; for a larger header
       area, increase the count to the number of lines to skip *)
    Skip[processingFileToStream, Record, 1];
    (* Read returns the symbol EndOfFile, not a string, when the stream is exhausted *)
    While[(eachLine = Read[processingFileToStream, String]) =!= EndOfFile,
      (* The ranges must not overlap; each line is assumed to hold at least 50 characters *)
      AppendTo[processedFile,
        {StringTake[eachLine, {1, 10}],
         StringTake[eachLine, {11, 20}],
         StringTake[eachLine, {21, 30}],
         StringTake[eachLine, {31, 40}],
         StringTake[eachLine, {41, 50}]}]
    ];
    Close[processingFileToStream];
    processedFile

Adjust the StringTake ranges to fit the schema of your text file.

I like this process because you can use it as a filter to select the rows you want, or you can modify the code to insert each row into a (SQL-like) database. (One can also do that outside Mathematica.)

You may also evaluate fields to expressions or convert them to Mathematica formats as you process each line, for example converting a datetime field to a DateList[]. Each conversion adds to the processing time.

I don't like the AppendTo method of building up a list, but it is what it is.

Hans

-----Original Message-----
From: David Higgins [mailto:dhi67540 at bigpond.net.au]
Sent: Wednesday, October 27, 2010 4:19 AM
To: mathgroup at smc.vnet.net
Subject: [mg113450] [mg113415] Importing data

Hi all,

Relatively new user here. I have a text file containing 540,000 lines of data in 30 columns, fixed width. Not all columns are populated for all data lines. When I import, Mathematica creates a list and populates the list elements from 1 to the number of data elements it finds in each line, ignoring the empty columns. So for data lines with only, say, 20 columns populated, it has populated data elements 1 to 20 instead of skipping the missing ones.

Is there a way to import data that allows me to define a template for the data elements so that the right data ends up in the right elements in the list?

Cheers
David
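
A note on the AppendTo concern in the reply: one possible variation, offered only as a sketch, reads all lines at once with ReadList and maps a splitter over them, so no list is grown inside a loop. The names fieldRanges, recordWidth, padLine, splitFixedWidth and parseFixedWidthFile are hypothetical, and the five 10-character columns are an assumed layout, not the schema of the actual file. Short lines are padded with spaces so that unpopulated trailing columns come back as blank strings in the right positions.

```mathematica
(* Hypothetical field layout: five 10-character columns. Adjust to the real schema. *)
fieldRanges = {{1, 10}, {11, 20}, {21, 30}, {31, 40}, {41, 50}};
recordWidth = Max[fieldRanges];

(* Pad a short line with spaces so StringTake never runs past the end of the string. *)
padLine[line_String] :=
  line <> StringJoin[ConstantArray[" ", Max[0, recordWidth - StringLength[line]]]];

(* Split one line into its fixed-width fields; an unpopulated column stays as blanks. *)
splitFixedWidth[line_String] := StringTake[padLine[line], fieldRanges];

(* Read every line as a string, drop one header row (an assumption), split each line. *)
parseFixedWidthFile[path_String] :=
  Map[splitFixedWidth, Rest[ReadList[path, String]]];
```

Calling parseFixedWidthFile["your path here"] would then return one list of five strings per data line. Note that ReadList skips completely blank lines by default, so row numbers may not match line numbers in the file if it contains empty lines.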