Re: Mathematica is destroying my sanity....
- To: mathgroup at smc.vnet.net
- Subject: [mg53176] Re: Mathematica is destroying my sanity....
- From: Antti Penttilä@smc.vnet.net
- Date: Tue, 28 Dec 2004 06:30:15 -0500 (EST)
- Organization: University of Helsinki
- References: <cqbhb8$4kv$1@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Todd Allen wrote:
> Hi everyone (and Happy Holidays!)
>
> I am trying to parse a 450 Mb text file for
> certain important information and then would like to
> save the smaller, parsed information in a suitable
> file for later importation and further analysis. I
> have made some progress, but have stumbled yet again.
>
> Below is the situation:
>
> In[1]:=
> SetDirectory["D:\\"]
> In[2]:=
> blastn=OpenRead["Cpara-SelfBlast120804(2).txt"]
> In[3]:=
> subblastn=ReadList[blastn,Record,RecordSeparators\[Rule]{{"Value"},{">"}}];
> In[7]:=
> temp1=Table[StringSplit[subblastn[[i]]],{i,1,Length[subblastn]}];
> In[9]:=
> temp2=Table[Partition[temp1[[i]],3],{i,1,Length[temp1]}];
>
>
> ***The output for temp2 is a nicely organized set of
> sublists:
>
> Out[10]=
> {{CEST-01-A-01,1120,0.0},{CEST-17-A-06,985,0.0},{CEST-04-E-04,52,
>
> 3e-07},{CEST-28-H-03,50,1e-06},{CEST-46-G-10,48,5e-06},{CEST-37-B-11,48,
>
> 5e-06},{CEST-21-D-05,48,5e-06},{CEST-21-A-11,48,5e-06},{CEST-21-A-10,48,
>
> 5e-06},{CEST-19-G-12,48,5e-06},{CEST-13-F-07,48,5e-06},{CEST-10-E-05,48,
>
> 5e-06},{CEST-04-E-08,48,5e-06},{CEST-60-C-12,46,2e-05},{CEST-59-D-02,46,
>
> 2e-05},{CEST-29-A-06,46,2e-05},{CEST-26-H-10,46,2e-05},{CEST-25-F-11,46,
>
> 2e-05},{CEST-20-C-11,46,2e-05},{CEST-19-F-10,46,2e-05},{CEST-11-E-06,46,
> 2e-05}}
>

At this point, all the elements in the list are strings. Before
exporting, you should convert the strings that represent numbers into
actual numbers. You can do this by applying ToExpression[] to plain
integers or decimal numbers like "46" or "1.3". For notations like
"5e-06", which ToExpression[] does not read as a single number, you
could instead use something like
Read[StringToStream["5e-06"], Number].

Another way is to explicitly tell the ReadList[] command what it should
read from each line, e.g. ReadList[blastn,{Word,Word,Number}].

--
Antti Penttilä
Antti.I.Penttila at helsinki.fi.removethis
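To make the two suggestions above concrete, here is a minimal sketch. The sample rows are copied from the quoted output; the names rows, toNumber, converted, sample and direct are hypothetical and only for illustration, and the second part assumes each record is laid out as identifier, score, E-value separated by whitespace, which may not hold for the real file.

```mathematica
(* Sample rows taken from the quoted output; every field is still a string. *)
rows = {{"CEST-01-A-01", "1120", "0.0"}, {"CEST-04-E-04", "52", "3e-07"}};

(* Hypothetical helper: read one numeric string through a string stream.
   The Number type accepts integers, decimals and E notation such as "3e-07". *)
toNumber[s_String] := Module[{stream = StringToStream[s], value},
  value = Read[stream, Number];
  Close[stream];
  value]

(* Keep the identifier as a string and convert the two numeric fields. *)
converted = {#[[1]], toNumber[#[[2]]], toNumber[#[[3]]]} & /@ rows;

(* Alternative: let ReadList do the typing while reading.
   A string stream stands in for the real file here (an assumption). *)
sample = StringToStream["CEST-01-A-01 1120 0.0\nCEST-04-E-04 52 3e-07"];
direct = ReadList[sample, {Word, Word, Number}];
Close[sample];
```

With {Word, Word, Number} the middle field stays a string; if the score should also be numeric, {Word, Number, Number} would be the variation to try.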