Re: Slow Import of CSV files
- To: mathgroup at smc.vnet.net
- Subject: [mg79403] Re: Slow Import of CSV files
- From: Jean-Marc Gulliet <jeanmarc.gulliet at gmail.com>
- Date: Thu, 26 Jul 2007 05:26:34 -0400 (EDT)
- Organization: The Open University, Milton Keynes, UK
- References: <f86ren$ph5$1@smc.vnet.net>
j.f.b.payne at tesco.net wrote:
> Hello,
>
> Has anyone else looked at the speed of Import of CSV files?
>
> With an 8MB file (about 300,000 lines with three numbers per line) I
> was disappointed to find that Mathematica version 6.0 Import is about
> 3 times slower than version 5.2.
>
> Then I got an e-mail from info at wolfram.com saying
>
> Mathematica 6.0.1 contains:
> * Improved performance of various Import and Export converters
> * Enhancements to Table, CSV, TSV and MathML import
>
> So I tried version 6.0.1 and was more disappointed to find it is
> slower still.
>
> Maybe other files import quickly (but I can't guess why, I didn't
> think there was anything special about mine).
>
> Technical support suggest that Import does more work in version 6 than
> version 5.2, which may well be true in general. However, for the
> specific case of CSV files, perusal of version 6 Help (ref/format/CSV
> under Options|Import & Export Functions) and version 5.2 Help (Import)
> suggests that they handle the same things (numbers, strings, dates,
> currency symbols).
>
> Of course you can work round CSV Import with ReadList, but Import is a
> bit more convenient (if you don't have to wait 10 seconds for it to
> execute). Or is there an option to set for CSV Import to speed it up?
> I am just using
>
> Timing[rawXYZ5 = Import["Sample 5.prn", "CSV"];]
>
> John Payne

Hi John,

If possible, you should check the structure and content of your CSV
file. I ran a quick test on my machine (Wintel, with both 5.2 and 6.0.1
installed) and got timings that agree with WRI's claim: loading a 20 MB
CSV file is faster with 6.0.1 than with 5.2.
(* First, I built some dummy data *)

In[24]:= data = RandomReal[{$MinMachineNumber, $MaxMachineNumber}, {3 10^5, 3}];

In[25]:= Export["myfile.csv", data]

Out[25]= "myfile.csv"

(* Then I loaded the file in a fresh kernel *)

(* Version 6.0.1 *)

In[1]:= Timing[data = Import["C:\\myfile.csv"]; ]

Out[1]= {17.969, Null}

(* Version 5.2 *)

In[1]:= Timing[data = Import["C:\\myfile.csv"]; ]

Out[1]= {22.11 Second,Null}

Regards,
Jean-Marc
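
P.S. For reference, the ReadList workaround mentioned in the original question could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a tested replacement for Import: the file name "sample.csv" is hypothetical, and the file is assumed to hold only comma-separated numbers in plain decimal notation. ToExpression does not understand exponent notation such as 1.5e-10, so a file containing such values would need a different conversion step.

```mathematica
(* Hypothetical file name; assumes comma-separated plain decimal numbers *)
file = "sample.csv";

(* Read each line as a list of comma-delimited words (strings),
   one sublist per line of the file *)
Timing[
  raw = ReadList[file, Word, WordSeparators -> {","}, RecordLists -> True];
]

(* Convert every string at level 2 to a number.
   Assumption: no exponent notation and no non-numeric fields *)
data = Map[ToExpression, raw, {2}];
```

Wrapping the ReadList call in Timing, as above, allows a direct comparison with the Import timings on the same file.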