Re: ReadList question
- To: mathgroup at smc.vnet.net
- Subject: [mg23795] Re: [mg23731] ReadList question
- From: "Mark Harder" <harderm at ucs.orst.edu>
- Date: Sat, 10 Jun 2000 02:59:47 -0400 (EDT)
- Sender: owner-wri-mathgroup at wolfram.com
Ernie,

I can't test this right now, but a system with the power you describe should not need all night to read even a file of this length. Right away, I notice some things in the filename passed to ReadList that look suspicious.

First, I have learned (somehow) that punctuation marks, particularly "\", need to be doubled to be interpreted correctly inside a string, like this: "d:\\System Test Data\\........ ". Also, I'm not sure that Windows knows how to interpret a directory name with a "." in it. (You didn't say which operating system you are using, so I'm assuming some Win32 flavor.)

I use ReadList[] to read ASCII files, too, and your method should work; the only difference is that I use the SetDirectory[] command to set a working directory for the entire session. That way, several files can be read from and written to a common project folder without specifying the directory every time.

My suggestion is that you first test whether you are opening the directory you want correctly. After you set a working directory, Directory[] should output its name as Mathematica understands it. I don't understand why your ReadList[] doesn't produce any error messages; maybe these checks will flush them out.

-mark harder

-----Original Message-----
From: Ernie <ermi at worldnet.att.net>
To: mathgroup at smc.vnet.net
Subject: [mg23795] [mg23731] ReadList question

>Hi,
> I have been using Mathematica 4.0 at work and it is really
>helpful. I am having trouble importing fairly large datafiles within a
>reasonable amount of time. The data file is about 50Mb of tab
>delimited text in table format with three header rows, which comes to
>420,000 lines or 550,000 words in size.
> I run PIII 500Mhz Dual Pentium Unit w/384Mb ram and I have
>shut down all other process to accomplish these tasks. Yet no matter
>how I try this it takes entirely too long, as it runs overnight and
>still does not complete.
> Is there another routine (function) set I should be using?
>I need to read in the data as it is, cut off the header rows, and
>extract the columns with the numerical data in them for processing. I
>will be needing to process files of this size and larger in the
>future. Have any of you had experience with this sort of thing in
>Mathematica?
>
> The code I use is as such:
>***************************************************************************
><< Statistics`DataManipulation`
><< Statistics`DescriptiveStatistics`
>
>ldapResults =
> ReadList["d:\System Test Data\Bug.Dirquery\Data.out", Word,
> RecordLists -> True, WordSeparators -> {"\t"}];
>
>mydatasize = Length[ldapResults]
>
>resultsTable = Take[ldapResults, {4, mydatasize}];
>
>fulldatasize = Length[resultsTable]
>
>workingset = ToExpression[Column[resultsTable, {1, 3, 5, 8, 9}]]
>***************************************************************************
>
>Thanks all.
>
>-- Ernie
>
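
For reference, the checks suggested in the reply can be sketched roughly as follows. This is only a sketch and has not been run: the directory and file names are copied from the quoted code, with the backslashes doubled, and the FileNames[] check is an extra step that the reply does not mention.

```mathematica
(* Set the working directory once for the session.
   Backslashes are doubled so that each one is read as a literal "\"
   inside the string. The path is the one from the quoted post. *)
SetDirectory["d:\\System Test Data\\Bug.Dirquery"];

(* Directory[] should echo the directory name as Mathematica understands it. *)
Directory[]

(* Extra check (not in the reply): list the file as seen from the working
   directory. An empty list means the file is not visible from there. *)
FileNames["Data.out"]

(* With the working directory set, the file can be read by name alone,
   using the same options as in the quoted code. *)
ldapResults =
  ReadList["Data.out", Word,
    RecordLists -> True, WordSeparators -> {"\t"}];
```

If Directory[] or FileNames[] does not return what is expected, the problem lies in how the path is written rather than in ReadList[] itself.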