MathGroup Archive 2007

Re: Slow Import of CSV files

  • To: mathgroup at smc.vnet.net
  • Subject: [mg79403] Re: Slow Import of CSV files
  • From: Jean-Marc Gulliet <jeanmarc.gulliet at gmail.com>
  • Date: Thu, 26 Jul 2007 05:26:34 -0400 (EDT)
  • Organization: The Open University, Milton Keynes, UK
  • References: <f86ren$ph5$1@smc.vnet.net>

j.f.b.payne at tesco.net wrote:
> Hello,
> 
> Has anyone else looked at the speed of Import of CSV files?
> 
> With an 8MB file (about 300,000 lines with three numbers per line) I
> was disappointed to find that Mathematica version 6.0 Import is about
> 3 times slower than version 5.2.
> 
> Then I got an e-mail from info at wolfram.com saying
> 
> Mathematica 6.0.1 contains:
> * Improved performance of various Import and Export converters
> * Enhancements to Table, CSV, TSV and MathML import
> 
> So I tried version 6.0.1 and was more disappointed to find it is
> slower still.
> 
> Maybe other files import quickly (but I can't guess why, I didn't
> think there was anything special about mine).
> 
> Technical support suggests that Import does more work in version 6 than
> version 5.2, which may well be true in general.  However, for the
> specific case of CSV files, perusal of version 6 Help (ref/format/CSV
> under Options|Import & Export Functions) and version 5.2 Help (Import)
> suggests that they handle the same things (numbers, strings, dates,
> currency symbols).
> 
> Of course you can work round CSV Import with ReadList, but Import is a
> bit more convenient (if you don't have to wait 10 seconds for it to
> execute). Or is there an option to set for CSV Import to speed it up?
> I am just using
> 
> Timing[rawXYZ5 = Import["Sample 5.prn", "CSV"];]
> 
> John Payne

Hi John,

If possible, check the structure and content of your CSV file. I ran a 
quick test on my machine (Wintel, with both 5.2 and 6.0.1 installed) and 
got timings that agree with WRI's claim: loading a 20MB CSV file is 
faster with 6.0.1 than with 5.2.
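
On checking the file's structure: one rough way to do it is to count the 
comma-separated fields on every line and tally the results. The helper 
name fieldCounts below is hypothetical (it is not a built-in), and the 
sketch assumes that fields never contain quoted commas.

```mathematica
(* Hypothetical helper: read each line as a string, count the commas, *)
(* and tally how many lines have each number of fields.               *)
fieldCounts[file_String] :=
  Tally[StringCount[ReadList[file, String], ","] + 1]
```

For a clean file with three numbers per line, fieldCounts["Sample 5.prn"] 
should return a single pair of the form {3, n}, where n is the number of 
non-empty lines. Any other field count points to irregular rows that may 
be worth inspecting.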

(* First, I built some dummy data *)

In[24]:= data =
   RandomReal[{$MinMachineNumber, $MaxMachineNumber}, {3 10^5, 3}];

In[25]:= Export["myfile.csv", data]

Out[25]= "myfile.csv"

(* Then I loaded the file in a fresh kernel *)
(* Version 6.0.1  *)

In[1]:= Timing[data = Import["C:\\myfile.csv"]; ]

Out[1]= {17.969, Null}

(* Version 5.2 *)

In[1]:=
Timing[data = Import["C:\\myfile.csv"]; ]

Out[1]=
{22.11 Second,Null}
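
For comparison, the ReadList workaround you mention could be sketched as 
follows. The helper name readNumericCSV is hypothetical, and the sketch 
assumes a purely numeric file in plain decimal notation (no quoted 
strings, dates, or exponent notation such as 1.5e+10, which ToExpression 
would not read as a single number).

```mathematica
(* Hypothetical helper: split each line into words at commas, *)
(* then convert every word to a number.                       *)
readNumericCSV[file_String] :=
  Map[ToExpression,
    ReadList[file, Word, WordSeparators -> {","}, RecordLists -> True],
    {2}]
```

It can be timed in the same way as Import, for example with 
Timing[rawXYZ5 = readNumericCSV["Sample 5.prn"];]. I have not measured 
it on your file, so I cannot say how it compares.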

Regards,
Jean-Marc



