Re: Using ReadList to read a string
- To: mathgroup at smc.vnet.net
- Subject: [mg83818] Re: [mg83781] Using ReadList to read a string
- From: "Thomas Dowling" <thomasgdowling at gmail.com>
- Date: Sat, 1 Dec 2007 05:46:35 -0500 (EST)
- References: <200711301023.FAA06237@smc.vnet.net>
Hello,

I need to update my previous post.

1. It is, in fact, quite easy to import a comma-delimited file in the form {Number, Number}. The following command, for example, solves the problem outlined in the previous post (how to input a file containing {x, time} data where the data are comma-delimited):

Partition[ReadList["/datacom.txt", Number, RecordSeparators -> {","}], 2]

and gives the following output:

{{1.24, 0.00161925}, {1.25, 0.00162431}, {1.26, 0.00161994},
 {1.27, 0.00161719}, {1.28, 0.00161219}, {1.29, 0.00160894},
 {1.3, 0.00161663}, {1.31, 0.00161956}, {1.32, 0.00162194},
 {1.33, 0.00161781}, {1.34, 0.001615}, {1.35, 0.00160962},
 {1.36, 0.00161806}, {1.37, 0.00162575}, {1.38, 0.00162256},
 {1.39, 0.00161581}, {1.4, 0.00161575}, {1.41, 0.00160694},
 {1.42, 0.00161869}, {1.43, 0.00161644}, {1.44, 0.00162231},
 {1.45, 0.00161681}, {1.46, 0.00161812}, {1.47, 0.00160969},
 {1.48, 0.00161875}, {1.49, 0.00162512}, {1.5, 0.00162319},
 {1.51, 0.0016135}, {1.52, 0.00161856}, {1.53, 0.00161231},
 {1.54, 0.00161887}}

2. The above will not solve Don's problem, but if the hitch is due to comma-delimited data, the following might work:

list4 = Partition[
  ToExpression[
   StringReplace[
    ReadList["/EWZ2.txt", Record, RecordSeparators -> {","}],
    Whitespace -> ""]], 7]

which gives the following output:

{{20000714, "iSharesMSCIBrazilIndex", EWZ, 250, 1627, 1637, 1627},
 {163720000717, "iSharesMSCIBrazilIndex", EWZ, 100, 1730, 1735, 1730},
 {173520000718, "iSharesMSCIBrazilIndex", EWZ, 100, 1730, 1730, 1730},
 {173020000719, "iSharesMSCIBrazilIndex", EWZ, 100, 1686, 1686, 1686},
 {168620000720, "iSharesMSCIBrazilIndex", EWZ, 50, 1724, 1724, 1724}}

and

Map[Head, list4, {2}]

gives the following output:

{{Integer, String, Symbol, Integer, Integer, Integer, Integer},
 {Integer, String, Symbol, Integer, Integer, Integer, Integer},
 {Integer, String, Symbol, Integer, Integer, Integer, Integer},
 {Integer, String, Symbol, Integer, Integer, Integer, Integer},
 {Integer, String, Symbol, Integer, Integer, Integer, Integer}}

I am assuming that EWZ2.txt is comma-delimited.

3. There is an excellent tutorial, 'How do I read comma-delimited numbers into Mathematica?', available at:

http://support.wolfram.com/mathematica/kernel/files/csv3.html

I suppose there is a lesson for me in there somewhere!

I'd be interested to know if the above does the job, and in any other suggestions.

Tom Dowling

On Nov 30, 2007 10:23 AM, Donald DuBois <donabc at comcast.net> wrote:

> Hello,
>
> I am trying to get ReadList to read a string in a text file (filename.txt).
>
> I would like NOT to have to use Import because it is MUCH slower in reading
> a text file than ReadList is. For example:
>
> (1) a file with 50,000 records can be created
> (2) Exported to disk
> (3) read by ReadList[...] and
> (4) read by Import[...]
>
> dataFile1 = Table[{2001, "nameA", "symbolA", 15.5}, {50000}];
> Export["out1.txt", dataFile1, "Table"];
>
> AbsoluteTiming[
>   out1ReadList = ReadList["out1.txt", {Number, Word, Word, Number}];]
>
> AbsoluteTiming[out1Import = Import["out1.txt", "Table"];]
>
> {0.1718750, Null}
>
> {2.4375000, Null}
>
> Import takes 14 times longer to read in the same file as compared to
> ReadList. So, naturally, I would like to use ReadList whenever I have a
> .txt file to be read in from disk.
>
> However, the file to be read is slightly more complicated than the one
> above (out1.txt). There is a string that is added to the file as the
> second element of a record. The first few records of the file (EWZ2.TXT
> below) look like the following with each record consisting of eight
> elements: a number, string, word followed by five integer numbers for
> each record. Each record is on a separate line.
>
> EWZ2.TXT:
>
> 20000714 "iShares MSCI Brazil Index" EWZ 250 1627 1637 1627 1637
> 20000717 "iShares MSCI Brazil Index" EWZ 100 1730 1735 1730 1735
> 20000718 "iShares MSCI Brazil Index" EWZ 100 1730 1730 1730 1730
> 20000719 "iShares MSCI Brazil Index" EWZ 100 1686 1686 1686 1686
> 20000720 "iShares MSCI Brazil Index" EWZ 50 1724 1724 1724 1724
>
> The format of the above file is:
> {Number, String, Word, Number, Number, Number, Number, Number}
>
> If this file on disk is named "EWZ2.TXT" I am not able to use ReadList to
> read it. I use two format specifications within ReadList and neither of
> them works:
>
> {Number, String, Word, Number, Number, Number, Number, Number}
> and {Number, Word, Word, Number, Number, Number, Number, Number}.
>
> ReadList["EWZ2.TXT", {Number, String, Word, Number, Number, Number,
>   Number, Number}]
> ReadList["EWZ2.TXT", {Number, Word, Word, Number, Number, Number,
>   Number, Number}]
>
> {{20000714,
>   " \"iShares MSCI Brazil Index\" EWZ 250 1627 \
> 1637 1627 1637", "20000717", $Failed, EndOfFile,
>   EndOfFile, EndOfFile, EndOfFile}}
>
> {{20000714, "\"iShares", "MSCI", $Failed, EndOfFile, EndOfFile,
>   EndOfFile, EndOfFile}}
>
> Using "String" for the format of the second element seems to have more
> success than "Word" but, when read, none of the elements is separated by
> a comma as happened when using ReadList to read out1.txt above.
>
> "iShares MSCI Brazil Index" should be the second element of a sublist
> within the entire list (Table) and EWZ (with or without quotes) should be
> the third element within a sublist.
>
> The definition of a String in the function description for ReadList is
> "string terminated by a newline", which does not describe the above file
> (EWZ2.TXT). If the string is moved in the file so that it is the last
> item in any record, such as
>
> 20000714 EWZ 250 1627 1637 1627 1637 "iShares MSCI Brazil Index"
>
> then a format of
> {Number, Word, Number, Number, Number, Number, Number, String}
> in ReadList DOES work to read the file correctly.
>
> But is there any way to get ReadList to read the above file (EWZ2.TXT)
> with the string as the second item of a record, so that the speed
> advantage of ReadList over Import can be retained?
>
> Or is there some other function I should be using other than Import
> and/or ReadList?
>
> Thank you in advance for any help you can give me.
> Don
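
A possible follow-up to item 2 above, offered as a sketch under stated assumptions rather than a tested solution. In the list4 output, values such as 163720000717 look like the last field of one record (1637) run together with the date of the next (20000717). That is what one would expect when "," is the only record separator, because the newline between records is then stripped along with the other whitespace. Part 1 below adds line ends to RecordSeparators and drops the StringReplace, which also keeps the spaces inside the quoted name. Part 2 addresses the space-delimited layout Don actually shows, by treating each line as a String and splitting it on the double quote. The sample strings and the names csvSample, csvStream, csvFields, csvData, parseLine and sampleLine are illustrative assumptions and do not come from the original files.

```mathematica
(* Part 1: comma-delimited text. Two records are held in a string
   (an assumed stand-in for a comma-delimited EWZ2.txt) so that the
   example does not depend on a file on disk. *)
csvSample = "20000714, \"iShares MSCI Brazil Index\", EWZ, 250, 1627, 1637, 1627, 1637\n20000717, \"iShares MSCI Brazil Index\", EWZ, 100, 1730, 1735, 1730, 1735\n";

csvStream = StringToStream[csvSample];

(* Break records at commas AND at line ends, so the last field of one
   line is not merged with the first field of the next line. *)
csvFields = ReadList[csvStream, Record,
   RecordSeparators -> {",", "\r\n", "\n", "\r"}];
Close[csvStream];

(* ToExpression ignores leading and trailing blanks, so no whitespace
   stripping is needed. Each record has eight fields. *)
csvData = Partition[ToExpression[csvFields], 8]

(* Part 2: a space-delimited line with a quoted name as second field. *)
parseLine[line_String] := Module[{parts, rest},
  (* Splitting on the double quote gives three pieces: the text before
     the name, the name itself, and the text after it. *)
  parts = StringSplit[line, "\""];
  (* Split the tail on whitespace: ticker first, then the numbers. *)
  rest = StringSplit[parts[[3]]];
  Join[{ToExpression[parts[[1]]], parts[[2]], First[rest]},
   ToExpression[Rest[rest]]]
  ]

sampleLine = "20000714 \"iShares MSCI Brazil Index\" EWZ 250 1627 1637 1627 1637";
parseLine[sampleLine]

(* For the file itself, assuming one record per line as Don describes:
   data = parseLine /@ ReadList["EWZ2.TXT", String]; *)
```

In Part 2 the ticker is kept as a string rather than converted to a symbol, and the quoted name keeps its internal spaces. Whether the per-line string handling preserves the speed advantage of ReadList over Import would need to be timed on the real file.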