Re: how to quickly read a >10MB big file
- To: mathgroup at smc.vnet.net
- Subject: [mg72925] Re: how to quickly read a >10MB big file
- From: Bill Rowe <readnewsciv at sbcglobal.net>
- Date: Thu, 25 Jan 2007 07:15:31 -0500 (EST)
>The format of the file is that five note lines followed by a block
>of data (6 columns * 100000 lines). It looks like as below:
>--------------------------------------------------------------
>The file was generated on Jan-01-2007
>ParameterA=0.20998977 ParameterB=-2323.898780 ParameterC=1223
>the full output is:
>-7.9777019460E-03 5.8979296313E-03 -5.8992690654E-02
>-1.9555038170E-03 -0.2143438800 0.9835566699 9.5788225640E-02
>-1.6666155312E-02 -2.3570413269E-02 8.4937134986E-04
>-0.1289696421 0.9813171342 6.7266728621E-02 -2.7685295289E-02
>4.8717250310E-02 1.5101454940E-02 -0.1758737132 0.9917945596
>... ... ...
>--------------------------------------------------------------
>My PC has a Pentium4 CPU and 512MB memory. I have used "Import"
>(using type Table), "ReadList" and "FindList", but all of them were
>very slow.

Import will clearly read the data but is slow because it does a lot of checking of data types for you. This is required to allow Import to deal with mixed data types.

FindList is not intended to read in large data files unless you want to work with strings. And even then it is more efficient to use ReadList or Read if you are going to read the entire data file.

The simplest way to efficiently read such a large file is to use an editor to put the five lines of notes in a separate file leaving just numbers to be read in by ReadList. Then, the time to read the data should be as quick as your machine can manage.

But you indicate your machine doesn't have much memory for a modern operating system. I suspect with large files you will find Mathematica has to use virtual memory which will definitely slow things down.
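If editing the file by hand is inconvenient, the same idea (hand ReadList nothing but numbers) can probably be done from within Mathematica by opening a stream and skipping the note lines first. The following is only a sketch under assumptions: the file name sample_data.txt and its contents are made up here so the example runs on its own, and I have not timed this against a file of the size you describe.

```mathematica
(* Write a small hypothetical sample file: 5 note lines, then 2 rows of 6 numbers.
   The name "sample_data.txt" is an assumption; it is created in Directory[]. *)
file = "sample_data.txt";
out = OpenWrite[file];
WriteString[out,
  "note 1\nnote 2\nnote 3\nnote 4\nnote 5\n",
  "-7.9777019460E-03 5.8979296313E-03 -5.8992690654E-02 -1.9555038170E-03 -0.2143438800 0.9835566699\n",
  "9.5788225640E-02 -1.6666155312E-02 -2.3570413269E-02 8.4937134986E-04 -0.1289696421 0.9813171342\n"];
Close[out];

(* Open the file and skip the 5 note lines (each read as a String, i.e. one line) *)
in = OpenRead[file];
Skip[in, String, 5];

(* Read everything that remains as numbers; Number accepts the E-03 style exponents *)
flat = ReadList[in, Number];
Close[in];

(* Regroup the flat list into rows of 6 columns *)
data = Partition[flat, 6];
Dimensions[data]
```

For a real file, replace the sample-writing part with the actual path. Reading a flat list and using Partition afterwards keeps ReadList itself as simple as possible.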
Here is the output of a fresh session where I used Mathematica on my machine to read in a large data file:

    In[1]:= a=MemoryInUse[];

    In[2]:= Timing[data=ReadList["/Users/browe/Desktop/IOC Analysis/Vacuum Test II Data/Edge Data III.txt",Number];]

    Out[2]= {13.0222 Second,Null}

    In[3]:= MemoryInUse[]-a

    Out[3]= 129422112

    In[4]:= Length@data

    Out[4]= 7636380

As you can see, I read in about 7.6 million numbers in about 13 seconds. But also note, this resulted in an increase in the amount of RAM Mathematica used by ~130 MB. For me, this is not an issue since I have 2 GB of RAM installed on my machine. I am quite certain that, if I could even run Mathematica alongside everything else in 512 MB, reading this much data on the same machine would take much longer than 13 seconds.
--
To reply via email subtract one hundred and four