Re: Using ReadList to read a string
- To: mathgroup at smc.vnet.net
- Subject: [mg83818] Re: [mg83781] Using ReadList to read a string
- From: "Thomas Dowling" <thomasgdowling at gmail.com>
- Date: Sat, 1 Dec 2007 05:46:35 -0500 (EST)
- References: <200711301023.FAA06237@smc.vnet.net>
Hello,
I need to update my previous post.
1. It is, in fact, quite easy to import a comma-delimited file in the form
{Number, Number}. The following command, for example, solves the problem
outlined in the previous post (how to input a file containing {x, time}
data where the data are comma-delimited):
    Partition[ReadList["/datacom.txt", Number, RecordSeparators -> {","}], 2]
and gives the following output:
{{1.24, 0.00161925}, {1.25, 0.00162431}, {1.26, 0.00161994}, {1.27,
0.00161719}, {1.28, 0.00161219}, {1.29, 0.00160894}, {1.3,
0.00161663}, {1.31, 0.00161956}, {1.32, 0.00162194}, {1.33,
0.00161781}, {1.34, 0.001615}, {1.35, 0.00160962}, {1.36,
0.00161806}, {1.37, 0.00162575}, {1.38, 0.00162256}, {1.39,
0.00161581}, {1.4, 0.00161575}, {1.41, 0.00160694}, {1.42,
0.00161869}, {1.43, 0.00161644}, {1.44, 0.00162231}, {1.45,
0.00161681}, {1.46, 0.00161812}, {1.47, 0.00160969}, {1.48,
0.00161875}, {1.49, 0.00162512}, {1.5, 0.00162319}, {1.51,
0.0016135}, {1.52, 0.00161856}, {1.53, 0.00161231}, {1.54,
0.00161887}}
2. The above will not solve Don's problem, but if the hitch is due to
comma-delimited data, the following might work:
    list4 = Partition[
      ToExpression[
        StringReplace[
          ReadList["/EWZ2.txt", Record, RecordSeparators -> {","}],
          Whitespace -> ""]], 7]
This gives the following output:
{{20000714, "iSharesMSCIBrazilIndex", EWZ, 250, 1627, 1637,
1627}, {163720000717, "iSharesMSCIBrazilIndex", EWZ, 100, 1730,
1735, 1730}, {173520000718, "iSharesMSCIBrazilIndex", EWZ, 100,
1730, 1730, 1730}, {173020000719, "iSharesMSCIBrazilIndex", EWZ,
100, 1686, 1686, 1686}, {168620000720, "iSharesMSCIBrazilIndex",
EWZ, 50, 1724, 1724, 1724}}
and
    Map[Head, list4, {2}]
gives the following output:
{{Integer, String, Symbol, Integer, Integer, Integer,
Integer}, {Integer, String, Symbol, Integer, Integer, Integer,
Integer}, {Integer, String, Symbol, Integer, Integer, Integer,
Integer}, {Integer, String, Symbol, Integer, Integer, Integer,
Integer}, {Integer, String, Symbol, Integer, Integer, Integer,
Integer}}
I am assuming that EWZ2.txt is comma-delimited.
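
Note that in the output above, the last field of each record appears fused with the date of the next one (for example 163720000717), which is what one would expect if the records are separated only by newlines and all whitespace is stripped. If EWZ2.TXT is in fact whitespace-delimited as Don describes (one record per line, with the name in double quotes), a different approach may be worth trying: read each line as a String and split it at the quotes. The following is only a minimal, untested sketch under those assumptions; parseRecord is a hypothetical helper name, not something from the original thread, and I have not checked whether it keeps the speed advantage over Import.

```mathematica
(* Hypothetical helper: parse one line of the assumed form
   20000714 "iShares MSCI Brazil Index" EWZ 250 1627 1637 1627 1637 *)
parseRecord[line_String] := Module[{parts, rest},
  (* Splitting at the double quotes gives {date, name, remaining fields} *)
  parts = StringSplit[line, "\""];
  (* The remaining fields are whitespace-separated: symbol, then numbers *)
  rest = StringSplit[parts[[3]]];
  Join[
    {ToExpression[parts[[1]]], parts[[2]], First[rest]},
    ToExpression /@ Rest[rest]]]

(* ReadList with String returns each line of the file as one string *)
records = parseRecord /@ ReadList["EWZ2.TXT", String];
```

Here the date and the five trailing fields are converted with ToExpression, while the name and the symbol (EWZ) are kept as strings. A line that does not contain exactly one quoted name would need extra handling.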
3. There is an excellent tutorial, 'How do I read comma-delimited numbers
into Mathematica?', available at:
http://support.wolfram.com/mathematica/kernel/files/csv3.html
I suppose there is a lesson for me in there somewhere!
I'd be interested to know if the above does the job, and in any other
suggestions.
Tom Dowling
On Nov 30, 2007 10:23 AM, Donald DuBois <donabc at comcast.net> wrote:
> Hello,
>
> I am trying to get ReadList to read a string in a text file
> (filename.txt).
>
> I would like NOT to have to use Import because it is MUCH slower in
> reading a text file than ReadList is. For example:
>
> (1) a file with 50,000 records can be created
> (2) Exported to disk
> (3) read by ReadList[...] and
> (4) read by Import[...]
>
>
> dataFile1 = Table[{2001, "nameA", "symbolA", 15.5}, {50000}];
> Export["out1.txt", dataFile1, "Table"];
>
> AbsoluteTiming[
> out1ReadList = ReadList["out1.txt", {Number, Word, Word, Number}];]
>
> AbsoluteTiming[out1Import = Import["out1.txt", "Table"];]
>
> {0.1718750, Null}
>
> {2.4375000, Null}
>
> Import takes 14 times longer to read in the same file as compared to
> ReadList.
> So, naturally, I would like to use ReadList whenever I have a .txt file to
> be read in from disk.
>
> However, the file to be read is slightly more complicated than the one
> above (out1.txt).
> There is a string that is added to the file as the second element of a
> record.
> The first few records of the file (EWZ2.TXT below) look like the
> following with each record
> consisting of eight elements: a number, string, word followed by five
> integer numbers for each record.
> Each record is on a separate line.
>
> EWZ2.TXT:
>
> 20000714 "iShares MSCI Brazil Index" EWZ 250 1627 1637 1627 1637
> 20000717 "iShares MSCI Brazil Index" EWZ 100 1730 1735 1730 1735
> 20000718 "iShares MSCI Brazil Index" EWZ 100 1730 1730 1730 1730
> 20000719 "iShares MSCI Brazil Index" EWZ 100 1686 1686 1686 1686
> 20000720 "iShares MSCI Brazil Index" EWZ 50 1724 1724 1724 1724
>
> The format of the above file is: {Number, String, Word, Number, Number,
> Number, Number, Number}
>
> If this file on disk is named "EWZ2.TXT" I am not able to use ReadList to
> read it.
> I use two format specifications within ReadList
> and neither of them works:
>
> {Number, String, Word, Number, Number, Number, Number, Number}
> and {Number, Word, Word, Number, Number, Number, Number, Number}.
>
> ReadList["EWZ2.TXT", {Number, String, Word, Number, Number, Number,
> Number, Number}]
> ReadList["EWZ2.TXT", {Number, Word, Word, Number, Number, Number,
> Number, Number}]
>
>
>
> {{20000714,
> " \"iShares MSCI Brazil Index\" EWZ 250 1627 \
> 1637 1627 1637", "20000717", $Failed, EndOfFile,
> EndOfFile, EndOfFile, EndOfFile}}
>
> {{20000714, "\"iShares", "MSCI", $Failed, EndOfFile, EndOfFile,
> EndOfFile, EndOfFile}}
>
>
> Using "String" for the format of the second element seems to have more
> success than "Word" but, when
> read, none of the elements is separated by a comma as happened when using
> ReadList to read
> out1.txt above.
>
> "iShares MSCI Brazil Index" should be the second element of a sublist
> within
> the entire list (Table) and EWZ (with or without quotes) should be the
> third element
> within a sublist.
>
> The definition of a String in the function description for ReadList is
> "string terminated by a newline", which does not describe the above file
> (EWZ2.TXT). If the string is moved in the file so that it is the last
> item in any record, such as
>
> 20000714 EWZ 250 1627 1637 1627 1637 "iShares MSCI Brazil Index"
>
> then a format of {Number, Word, Number, Number, Number, Number, Number,
> String}
> in ReadList DOES work to read the file correctly.
>
> But is there any way to get ReadList to read the above file (EWZ2.TXT)
> with the string
> as the second item of a record so that the speed advantage of ReadList
> over Import
> can be retained?
>
> Or is there some other function I should be using other than Import and/or
> ReadList?
>
> Thank you in advance for any help you can give me.
> Don
>
>