MathGroup Archive 2007

Re: Using ReadList to read a string

  • To: mathgroup at smc.vnet.net
  • Subject: [mg83818] Re: [mg83781] Using ReadList to read a string
  • From: "Thomas Dowling" <thomasgdowling at gmail.com>
  • Date: Sat, 1 Dec 2007 05:46:35 -0500 (EST)
  • References: <200711301023.FAA06237@smc.vnet.net>

Hello,

I need to update my previous post.

1.  It is, in fact, quite easy to import a comma-delimited file in the form
{Number, Number}.  The following command, for example, solves the problem
outlined in the previous post (how to input a file containing {x, time} data
where the data are comma-delimited):

Partition[ReadList["/datacom.txt", Number, RecordSeparators -> {","}],
  2]


and gives the following output

{{1.24, 0.00161925}, {1.25, 0.00162431}, {1.26, 0.00161994}, {1.27,
  0.00161719}, {1.28, 0.00161219}, {1.29, 0.00160894}, {1.3,
  0.00161663}, {1.31, 0.00161956}, {1.32, 0.00162194}, {1.33,
  0.00161781}, {1.34, 0.001615}, {1.35, 0.00160962}, {1.36,
  0.00161806}, {1.37, 0.00162575}, {1.38, 0.00162256}, {1.39,
  0.00161581}, {1.4, 0.00161575}, {1.41, 0.00160694}, {1.42,
  0.00161869}, {1.43, 0.00161644}, {1.44, 0.00162231}, {1.45,
  0.00161681}, {1.46, 0.00161812}, {1.47, 0.00160969}, {1.48,
  0.00161875}, {1.49, 0.00162512}, {1.5, 0.00162319}, {1.51,
  0.0016135}, {1.52, 0.00161856}, {1.53, 0.00161231}, {1.54,
  0.00161887}}
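
As a variant of the command above, ReadList can also return one sublist per
line, which avoids the Partition step.  The following is only a sketch under
stated assumptions, not a tested recipe for datacom.txt: it reads from an
in-memory string instead of a file, the sample data are the first two pairs of
the output above, and the names sample, stream, words and pairs are
illustrative.

```mathematica
(* Two sample lines in the assumed "x, time" layout *)
sample = "1.24, 0.00161925\n1.25, 0.00162431";

(* Open the string as an input stream so that no file is needed *)
stream = StringToStream[sample];

(* Commas and blanks both separate words;
   RecordLists -> True returns one sublist per line *)
words = ReadList[stream, Word,
   WordSeparators -> {",", " "}, RecordLists -> True];
Close[stream];

(* Convert each string to a number *)
pairs = Map[ToExpression, words, {2}]
```

For a real file, the stream would be replaced by the file name in the
ReadList call.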


2.  The above will not solve Don's problem, but if the hitch is due to
comma-delimited data, the following might work:


list4 = Partition[
  ToExpression[
   StringReplace[
    ReadList["/EWZ2.txt", Record, RecordSeparators -> {","}],
    Whitespace -> ""]], 7]

gives the following output:


{{20000714, "iSharesMSCIBrazilIndex", EWZ, 250, 1627, 1637,
  1627}, {163720000717, "iSharesMSCIBrazilIndex", EWZ, 100, 1730,
  1735, 1730}, {173520000718, "iSharesMSCIBrazilIndex", EWZ, 100,
  1730, 1730, 1730}, {173020000719, "iSharesMSCIBrazilIndex", EWZ,
  100, 1686, 1686, 1686}, {168620000720, "iSharesMSCIBrazilIndex",
  EWZ, 50, 1724, 1724, 1724}}

and

Map[Head, list4, {2}]

gives the following output:

{{Integer, String, Symbol, Integer, Integer, Integer,
  Integer}, {Integer, String, Symbol, Integer, Integer, Integer,
  Integer}, {Integer, String, Symbol, Integer, Integer, Integer,
  Integer}, {Integer, String, Symbol, Integer, Integer, Integer,
  Integer}, {Integer, String, Symbol, Integer, Integer, Integer,
  Integer}}
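
In the list4 output, values such as 163720000717 look like the last field of
one line (1637) run together with the first field of the next (20000717): with
only "," as a record separator, the line break between them is removed along
with the other whitespace.  A possible variant, sketched here on a hypothetical
comma-delimited sample rather than on the real EWZ2.txt, also treats line
breaks as record separators and groups eight fields per line.  The names
sample, stream, fields and list5 are illustrative.

```mathematica
(* Hypothetical comma-delimited version of the first two records *)
sample = "20000714, \"iShares MSCI Brazil Index\", EWZ, 250, 1627, 1637, 1627, 1637\n20000717, \"iShares MSCI Brazil Index\", EWZ, 100, 1730, 1735, 1730, 1735";

stream = StringToStream[sample];

(* A comma or a line break ends a record, so the last field of one
   line is no longer fused with the first field of the next *)
fields = ReadList[stream, Record,
   RecordSeparators -> {",", "\r\n", "\n", "\r"}];
Close[stream];

(* ToExpression ignores the blanks around each field, so the spaces
   inside the quoted name are preserved; eight fields make one row *)
list5 = Partition[ToExpression /@ fields, 8]
```

As in the original command, ToExpression turns EWZ into a symbol, so this
assumes EWZ has no value assigned in the session.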

I am assuming that EWZ2.txt is comma-delimited.
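
If EWZ2.TXT is instead whitespace-delimited, as the listing in the original
post suggests, another option is to read each line as a String and split it
afterwards.  The following is only a sketch under that assumption: parseLine is
a hypothetical helper, the sample is typed in from the first two records of the
post, and it assumes exactly one quoted name per line.

```mathematica
(* First two records in the whitespace-delimited layout of the post *)
sample = "20000714 \"iShares MSCI Brazil Index\" EWZ      250        1627        1637        1627        1637\n20000717 \"iShares MSCI Brazil Index\" EWZ      100        1730        1735        1730        1735";

(* Hypothetical helper: split on the double quotes, then on whitespace *)
parseLine[line_String] := Module[{parts, rest},
  parts = StringSplit[line, "\""];  (* {date, name, remainder} *)
  rest = StringSplit[parts[[3]]];   (* symbol followed by the numbers *)
  Join[{ToExpression[parts[[1]]], parts[[2]], First[rest]},
   ToExpression /@ Rest[rest]]]

stream = StringToStream[sample];
lines = ReadList[stream, String];   (* one string per line *)
Close[stream];

parseLine /@ lines
```

For the file itself, lines would come from ReadList["EWZ2.TXT", String].  The
ticker is kept as a string here rather than converted to a symbol.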

3.  There is an excellent tutorial, 'How do I read comma-delimited numbers
into Mathematica?', available at:

http://support.wolfram.com/mathematica/kernel/files/csv3.html

I suppose there is a lesson for me in there somewhere!

I'd be interested to know if the above does the job, and in any other
suggestions.

Tom Dowling



On Nov 30, 2007 10:23 AM, Donald DuBois <donabc at comcast.net> wrote:

> Hello,
>
> I am trying to get ReadList to read a string in a text file (filename.txt).
>
> I would like NOT to have to use Import because it is MUCH slower in reading
> a text file than ReadList is.  For example:
>
> (1) a file with 50,000 records can be created
> (2) Exported  to disk
> (3) read by ReadList[...] and
> (4) read by Import[...]
>
>
> dataFile1 =
>  Table[{2001, "nameA", "symbolA",
>   15.5}, {50000}]; Export["out1.txt", dataFile1, "Table"];
>
> AbsoluteTiming[
>  out1ReadList = ReadList["out1.txt", {Number, Word, Word, Number}];]
>
> AbsoluteTiming[out1Import = Import["out1.txt", "Table"];]
>
> {0.1718750, Null}
>
> {2.4375000, Null}
>
> Import takes 14 times longer to read in the same file as compared to
> ReadList.
> So, naturally, I would like to use ReadList whenever I have a .txt file to
> be read in from disk.
>
> However, the file to be read  is slightly more complicated than the one
> above (out1.txt).
> There is a string that is added to the file as the second element of a
> record.
> The first few records of the file (EWZ2.TXT below)  look like the
> following with each record
> consisting of eight elements: a number, string, word followed by five
> integer numbers for each record.
> Each record is on a separate line.
>
> EWZ2.TXT:
>
> 20000714 "iShares MSCI Brazil Index" EWZ      250        1627        1637        1627        1637
> 20000717 "iShares MSCI Brazil Index" EWZ      100        1730        1735        1730        1735
> 20000718 "iShares MSCI Brazil Index" EWZ      100        1730        1730        1730        1730
> 20000719 "iShares MSCI Brazil Index" EWZ      100        1686        1686        1686        1686
> 20000720 "iShares MSCI Brazil Index" EWZ       50        1724        1724        1724        1724
>
> The format of the above file is: {Number, String, Word, Number, Number,
> Number, Number, Number}
>
> If this file on disk is named "EWZ2.TXT" I am not able to use ReadList to
> read it.
> I use two format specifications within ReadList
> and neither of them works:
>
> {Number, String, Word, Number, Number, Number, Number, Number}
> and {Number, Word, Word, Number, Number, Number, Number, Number}.
>
> ReadList["EWZ2.TXT", {Number, String, Word, Number, Number, Number,
>  Number, Number}]
> ReadList["EWZ2.TXT", {Number, Word, Word, Number, Number, Number,
>  Number, Number}]
>
>
>
> {{20000714,
>  " \"iShares MSCI Brazil Index\" EWZ      250        1627        \
> 1637        1627        1637", "20000717", $Failed, EndOfFile,
>  EndOfFile, EndOfFile, EndOfFile}}
>
> {{20000714, "\"iShares", "MSCI", $Failed, EndOfFile, EndOfFile,
>  EndOfFile, EndOfFile}}
>
>
> Using "String" for the format of the second element seems to have more
> success than "Word" but, when
> read, none of the elements is separated by a comma as happened when using
> ReadList to read
> out1.txt above.
>
> "iShares MSCI Brazil Index" should be the second element of a sublist
> within
> the entire list (Table) and EWZ  (with or without quotes) should be the
> third element
> within a sublist.
>
> The definition of a String in the function description for ReadList is
> "string terminated by a newline", which does not describe the above file
> (EWZ2.TXT).  If the string is moved in the file so that it is the last item
> in any record, such as
>
>  20000714  EWZ      250        1627        1637        1627        1637  "iShares MSCI Brazil Index"
>
> then a format of {Number, Word, Number, Number, Number, Number, Number,
> String}
> in ReadList DOES work to read the file correctly.
>
> But, is there any way to get ReadList to read the above file (EWZ2.TXT)
> with the string
> as the second item of a record so that the speed advantage of ReadList
> over Import
> can be retained?
>
> Or is there some other function I should be using other than Import and/or
> ReadList?
>
> Thank you in advance for any help you can give me.
> Don
>
>

