Re: help! to input data...
*To*: mathgroup at smc.vnet.net
*Subject*: [mg4211] Re: help! to input data...
*From*: ianc (Ian Collier)
*Date*: Tue, 18 Jun 1996 03:25:06 -0400
*Organization*: Wolfram Research, Inc.
*Sender*: owner-wri-mathgroup at wolfram.com
In article <4pit53$jk5 at dragonfly.wolfram.com>, tcdoe+ at pitt.edu (todd c.
doehring) wrote:
> well, silly me.
> I thought i was pretty good at the basics of mathematica, but today i
> tried to input some numerical data that was comma delimited. i.e.
>
> 456,-45,21,0,5
> 43,25,3,66,65
> ...
> ...
> ...
>
> it is a large data file (10Mb+) with 5 numbers per line. i can't
> convert the commas to tabs easily.
> i tried simply using:
>
> data=ReadList["c:\micro.dat", Number, RecordLists->True];
>
> then:
>
> data=ReadList["c:\micro.dat", RecordLists->True,
> WordSeparators->{","}];
>
> and
>
> data=ReadList["c:\micro.dat", Number, RecordLists->True,
> TokenWords->","];
>
> but all i ever get is the first number and an error such as:
>
> Read::readn: Syntax error reading a real number from c:\micro.dat.
>
> would someone please tell me the best way to input this data file??
> it's driving me crazy that i can't figure out something so basic.
>
> thanks,
> todd
The following is taken from the Technical Support FAQ area of
Wolfram Research's web pages (http://www.wolfram.com/support/).
How do I read comma-delimited data into Mathematica?
Mathematica's ReadList command does not have any built-in options to
read comma-separated data. However, we can still use ReadList in
combination with other functions to do this. Here is an example data file.
In[1]:= !!data.txt
1,10,1000
2,20,2000
3,30,3000
4,40,4000
5,50,5000
6,60,6000
7,70,7000
8,80,8000
9,90,9000
10,100,10000
There are two methods for reading comma-separated data. The first
method involves reading each value as a number and each comma (or
newline character) as a string. You then discard the strings, and are left
with only the values.
We start with the ReadList command.
In[2]:= data = ReadList["data.txt",{Number,Character}]
Out[2]= {{1, ,}, {10, ,}, {1000, }, {2, ,}, {20, ,}, {2000, }, {3, ,},
> {30, ,}, {3000, }, {4, ,}, {40, ,}, {4000, }, {5, ,}, {50, ,}, {5000, },
> {6, ,}, {60, ,}, {6000, }, {7, ,}, {70, ,}, {7000, }, {8, ,}, {80, ,},
> {8000, }, {9, ,}, {90, ,}, {9000, }, {10, ,}, {100, ,}, {10000, }}
Notice that each number is paired up with a character following it (either a
comma or a newline character). We can use the InputForm command to
verify this.
In[3]:= InputForm[data]
Out[3]//InputForm=
{{1, ","}, {10, ","}, {1000, "\n"}, {2, ","}, {20, ","}, {2000, "\n"},
{3, ","}, {30, ","}, {3000, "\n"}, {4, ","}, {40, ","}, {4000, "\n"},
{5, ","}, {50, ","}, {5000, "\n"}, {6, ","}, {60, ","}, {6000, "\n"},
{7, ","}, {70, ","}, {7000, "\n"}, {8, ","}, {80, ","}, {8000, "\n"},
{9, ","}, {90, ","}, {9000, "\n"}, {10, ","}, {100, ","}, {10000, "\n"}}
To remove these extra characters, we need to take the first element of
every sublist. We can Map the First command to every sublist to extract the
values.
In[4]:= data = Map[First,data]
Out[4]= {1, 10, 1000, 2, 20, 2000, 3, 30, 3000, 4, 40, 4000, 5, 50, 5000, 6,
> 60, 6000, 7, 70, 7000, 8, 80, 8000, 9, 90, 9000, 10, 100, 10000}
We now have our data, but we need it in a matrix form. Since we know that
this data is in three columns, we can use the Partition command to partition
the data into equal sublists of length 3.
In[5]:= data = Partition[data,3]
Out[5]= {{1, 10, 1000}, {2, 20, 2000}, {3, 30, 3000}, {4, 40, 4000},
> {5, 50, 5000}, {6, 60, 6000}, {7, 70, 7000}, {8, 80, 8000},
> {9, 90, 9000}, {10, 100, 10000}}
The data is more clearly displayed using MatrixForm.
In[6]:= MatrixForm[data,TableSpacing->{0}]
Out[6]//MatrixForm= 1    10    1000
                    2    20    2000
                    3    30    3000
                    4    40    4000
                    5    50    5000
                    6    60    6000
                    7    70    7000
                    8    80    8000
                    9    90    9000
                    10   100   10000
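As a rough sketch, the steps of this first method can be combined and applied to five-column data like that in the original question. To keep the example self-contained, it reads from a string through StringToStream rather than from a file. The sample values are the two lines quoted in the question, and the names stream, pairs and rows are chosen only for illustration.

```mathematica
(* Assumption: two sample lines copied from the question stand in
   for the real file; StringToStream lets ReadList read from a string. *)
stream = StringToStream["456,-45,21,0,5\n43,25,3,66,65\n"];

(* Read each number together with the single character that follows it
   (a comma or a newline). *)
pairs = ReadList[stream, {Number, Character}];
Close[stream];

(* Keep only the numbers, then regroup them into rows of 5 columns. *)
rows = Partition[Map[First, pairs], 5]
```

For a real file, the stream would be replaced by the file name in the ReadList call, and the 5 by the actual number of columns.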
The second method involves reading each line as a string and then converting
the string into an expression.
We simply use the ReadList command with the String specification.
In[7]:= data = ReadList["data.txt",String]
Out[7]= {1,10,1000, 2,20,2000, 3,30,3000, 4,40,4000, 5,50,5000, 6,60,6000,
> 7,70,7000, 8,80,8000, 9,90,9000, 10,100,10000}
Note that the data looks like it has been read in as numbers. However, we
can see that they are strings by using the InputForm command.
In[8]:= InputForm[data]
Out[8]//InputForm=
{"1,10,1000", "2,20,2000", "3,30,3000", "4,40,4000", "5,50,5000", "6,60,\
6000", "7,70,7000", "8,80,8000", "9,90,9000", "10,100,10000"}
This function converts a string (containing a comma-separated sequence of
numbers) into a list of numbers.
In[9]:= f[x_String] := ToExpression[StringJoin["{", x ,"}"]]
Now we Map this function to our data
In[10]:= data = Map[f,data]
Out[10]= {{1, 10, 1000}, {2, 20, 2000}, {3, 30, 3000}, {4, 40, 4000},
> {5, 50, 5000}, {6, 60, 6000}, {7, 70, 7000}, {8, 80, 8000},
> {9, 90, 9000}, {10, 100, 10000}}
to get our matrix.
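The same idea can be sketched for the five-column data in the original question without specifying the number of columns anywhere. As an assumption for illustration, the input is the two sample lines from the question, read from a string through StringToStream; the names stream, lines and rows are hypothetical.

```mathematica
(* Wrap a comma-separated line in braces and evaluate it as a list. *)
f[x_String] := ToExpression[StringJoin["{", x, "}"]]

(* Assumption: sample lines from the question stand in for the file. *)
stream = StringToStream["456,-45,21,0,5\n43,25,3,66,65\n"];

(* Read each line as one string. *)
lines = ReadList[stream, String];
Close[stream];

(* Convert every line into a list of numbers. *)
rows = Map[f, lines]
```

Because ToExpression evaluates whatever text each line contains, this approach is best kept to data files whose contents you trust.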
This method does not require you to know how many columns are in
your data. However, the first method has a slight speed advantage for
large data files. Here are both methods used on a 200k file (from an IBM
RS/6000 workstation).
In[11]:= Timing[ long1 = Partition[Map[First,
ReadList["long.txt",{Number,Character}]],3]; ]
Out[11]= {9.02 Second, Null}
In[12]:= Timing[ long2 = Map[ToExpression[StringJoin["{",#,"}"]]& ,
ReadList["long.txt",String]]; ]
Out[12]= {14.7 Second, Null}
In[13]:= long1 == long2
Out[13]= True
The specific URL for this question and answer is:
http://www.wolfram.com/support/InputOutput/ExternalFiles/CommaSeparatedData.html
I hope this helps.
--Ian
-----------------------------------------------------------
Ian Collier
Wolfram Research, Inc.
-----------------------------------------------------------
tel:(217) 398-0700 fax:(217) 398-0747 ianc at wolfram.com
Wolfram Research Home Page: http://www.wolfram.com/
-----------------------------------------------------------