MathGroup Archive 2000

Re: ReadList question

  • To: mathgroup at smc.vnet.net
  • Subject: [mg23795] Re: [mg23731] ReadList question
  • From: "Mark Harder" <harderm at ucs.orst.edu>
  • Date: Sat, 10 Jun 2000 02:59:47 -0400 (EDT)
  • Sender: owner-wri-mathgroup at wolfram.com

Ernie,
    I can't test this out right now, but it does not seem that a system with
the power you describe should take all night to read even a file of this
length.
    Right away, I notice some things in the filename in ReadList that look
suspicious.  First, I have learned (somehow) that the backslash "\" is an
escape character inside a Mathematica string, so it needs to be doubled to be
interpreted literally, like this: "d:\\System Test Data\\........ ". Also, I'm
not sure that Windows knows how to interpret a directory name with a "." in
it. (You didn't say which operating system you were using, so I'm assuming
some Win32 flavor.)
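
A minimal sketch of the escaping point, assuming the full path from the original post is what was intended. The variable names are only illustrative.

```mathematica
(* "\\" inside a string literal stands for a single literal backslash. *)
path = "d:\\System Test Data\\Bug.Dirquery\\Data.out";

(* Print shows the string with single backslashes, as the OS will see it. *)
Print[path]

(* The doubled backslash counts as one character: d, colon, backslash. *)
driveLength = StringLength["d:\\"]

(* By contrast, "\t" is an escape sequence for one tab character,
   which is why it works as a WordSeparators entry. *)
tabLength = StringLength["\t"]
```

Here driveLength is 3 and tabLength is 1, which shows that a backslash followed by a letter is not read as two ordinary characters.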
    I use ReadList[] to read ASCII files, too, and your method should work.
The only difference is that I use the SetDirectory[] command to set a working
directory for the entire session.  That way, several files can be read from
and written to a common project folder without specifying the directory every
time.
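
The working-directory approach might look roughly like the sketch below. To keep it self-contained it writes a tiny tab-delimited file first; the file name "readlist-demo.txt" and the choice of $HomeDirectory as the project folder are assumptions for illustration only.

```mathematica
(* Assumption: $HomeDirectory is writable; any existing folder would do. *)
SetDirectory[$HomeDirectory];

(* Write a small tab-delimited file (hypothetical name) to read back. *)
stream = OpenWrite["readlist-demo.txt"];
WriteString[stream, "a\tb\tc\n1\t2\t3\n"];
Close[stream];

(* Only the bare file name is needed now; it is resolved against
   the working directory. *)
demo = ReadList["readlist-demo.txt", Word,
    RecordLists -> True, WordSeparators -> {"\t"}]

(* Clean up and return to the previous working directory. *)
DeleteFile["readlist-demo.txt"];
ResetDirectory[];
```

The ReadList call uses the same options as in the original post, so the real data file could be substituted once its folder is set as the working directory.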
    My suggestion is that you test setting the directory you want, to see
whether you are doing that correctly. After creating a working directory,
Directory[] should output its name, as Mathematica understands it.  I
don't understand why your ReadList[] doesn't produce any error messages; maybe
these checks will flush them out.
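
One way to run these checks is sketched below. The folder and file name are taken from the original post, and reading only the first five lines is my own addition as a quick sanity check before attempting the whole file.

```mathematica
(* Folder and file name as given in the original post; adjust to your setup. *)
dir = "d:\\System Test Data\\Bug.Dirquery";
file = "Data.out";

(* FileType gives Directory, File, or None (None: nothing found there). *)
If[FileType[dir] === Directory,
  SetDirectory[dir];
  Print["Working directory: ", Directory[]];
  Print["Matching files: ", FileNames["*.out"]];
  If[FileType[file] === File,
    (* Read only the first five lines instead of the whole file. *)
    Print[ReadList[file, String, 5]],
    Print["File not found: ", file]];
  ResetDirectory[],
  Print["Directory not found: ", dir]
]
```

If the directory is reported as not found, the path string is the first thing to look at.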
-mark harder


-----Original Message-----
From: Ernie <ermi at worldnet.att.net>
To: mathgroup at smc.vnet.net
Subject: [mg23795] [mg23731] ReadList question


>Hi,
> I have been using Mathematica 4.0 at work and it is really
>helpful. I am having trouble importing fairly large datafiles within a
>reasonable amount of time. The data file is about 50Mb of tab
>delimited text in table format with three header rows, which comes to
>420,000 lines or 550,000 words in size.
> I run PIII 500Mhz Dual Pentium Unit w/384Mb ram and I have
>shut down all other processes to accomplish these tasks. Yet no matter
>how I try this it takes entirely too long, as it runs overnight and
>still does not complete.
> Is there another routine (function) set I should be using? I
>need to read in the data as it is, cut off the header rows, and
>extract the columns with the numerical data in them for processing. I
>will be needing to process files of this size and larger in the
>future. Have any of you had experience with this sort of thing in
>Mathematica?
>
> The code I use is as such:
>***************************************************************************
><< Statistics`DataManipulation`
><< Statistics`DescriptiveStatistics`
>
>ldapResults =
>    ReadList["d:\System Test Data\Bug.Dirquery\Data.out", Word,
>      RecordLists -> True, WordSeparators -> {"\t"}];
>
>mydatasize = Length[ldapResults]
>
>resultsTable = Take[ldapResults, {4, mydatasize}];
>
>fulldatasize = Length[resultsTable]
>
>workingset = ToExpression[Column[resultsTable, {1, 3, 5, 8, 9}]]
>***************************************************************************
>
>Thanks all.
>
>-- Ernie
>
