ReadList question
- To: mathgroup at smc.vnet.net
- Subject: [mg23731] ReadList question
- From: ermi at worldnet.att.net (Ernie)
- Date: Mon, 5 Jun 2000 01:09:23 -0400 (EDT)
- Organization: Cisco Systems Inc
- Sender: owner-wri-mathgroup at wolfram.com
Hi,

I have been using Mathematica 4.0 at work and it is really helpful. However, I am having trouble importing fairly large data files within a reasonable amount of time. The data file is about 50 MB of tab-delimited text in table format with three header rows, which comes to 420,000 lines or 550,000 words in size. I am running a dual Pentium III 500 MHz machine with 384 MB of RAM, and I have shut down all other processes to accomplish these tasks. Yet no matter how I try this, it takes far too long: it runs overnight and still does not complete.

Is there another set of routines (functions) I should be using? I need to read in the data as it is, cut off the header rows, and extract the columns containing numerical data for processing. I will need to process files of this size and larger in the future. Have any of you had experience with this sort of thing in Mathematica?

The code I use is as follows:

***************************************************************************
<< Statistics`DataManipulation`
<< Statistics`DescriptiveStatistics`

ldapResults = ReadList["d:\System Test Data\Bug.Dirquery\Data.out", Word, RecordLists -> True, WordSeparators -> {"\t"}];
mydatasize = Length[ldapResults]
resultsTable = Take[ldapResults, {4, mydatasize}];
fulldatasize = Length[resultsTable]
workingset = ToExpression[Column[resultsTable, {1, 3, 5, 8, 9}]]
****************************************************************************

Thanks all.

-- Ernie