MathGroup Archive 2008

Re: Importing text files

  • To: mathgroup at
  • Subject: [mg86913] Re: Importing text files
  • From: michael.p.croucher at
  • Date: Wed, 26 Mar 2008 04:52:22 -0500 (EST)
  • References: <fsa5h5$aef$>

On 25 Mar, 06:19, "Coleman, Mark" <Mark.Cole... at>
> Greetings,
> I'm using Mathematica v6.02 to import a large comma delimited text file (CSV
> format). The file is about 700,000 records and takes 241 Mb of space
> according to ByteCount. The file is a mix of real and string characters.
> I've imported much smaller versions of this using the built-in support
> for Excel XLS format without any difficulties.
> For the larger file, the Import appears to work fine except that all of
> the string elements import with quotation marks, i.e., if you look at
> the full form, all of the string elements are expressed as
>  {"\"YES\"",.....}
> With the \ character intact. This is the first I've bumped into this
> particular issue using Import.
> Two questions: First, is there a way that I can remove the "\"
> characters as part of the Import command? And second, if these
> characters cannot be removed during the Import process, can someone
> offer an efficient way to remove them from the imported list?
> Thanks,
> -Mark


The behavior of Import for CSV files changed between 6.0.1 and 6.0.2.
In older versions, a string surrounded by quotes, such as "hello", was
imported as "hello", but now it is imported as "\"hello\"".

Say I have a CSV file called Book1.csv in which some of the strings are
quoted and some are not:


In[2]:= book = Import["Book1.csv"]

Out[2]= {{"\"hello\"", 1}, {"test", 2}, {"bode", 3}, {"hehehe",
  4}, {"\"oidjaojf\"", 5}, {"cfoiuhsrfipvuh", 6}, {"fojewhnfrvo",
  7}, {"\"werijfbwpiufv\"", 8}, {"nhjawhb", 9}, {"\"cvijweqbv\"",
  10}, {"vwjiebv", 11}}

If you don't want the escaped quotes (\"), you can strip them out as
follows:

In[3]:= f = If[StringQ[#], StringReplace[#, "\"" -> ""], #] &;

In[4]:= Map[f, book, {2}]

Out[4]= {{"hello", 1}, {"test", 2}, {"bode", 3}, {"hehehe",
  4}, {"oidjaojf", 5}, {"cfoiuhsrfipvuh", 6}, {"fojewhnfrvo",
  7}, {"werijfbwpiufv", 8}, {"nhjawhb", 9}, {"cvijweqbv",
  10}, {"vwjiebv", 11}}

I don't know if this is the most efficient way but it does the job.
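
One caveat: the replacement above deletes every double quote in a
string, including any that sit in the middle of a field. If only the
enclosing pair should go, a string pattern anchored at both ends might
be a safer choice. The following is just a sketch under that assumption;
stripOuterQuotes and sample are names made up for illustration, and the
sample data is invented, not taken from Book1.csv.

```mathematica
(* Hypothetical helper: remove one pair of enclosing double quotes. *)
(* Strings without enclosing quotes are returned unchanged.         *)
stripOuterQuotes[s_String] :=
  StringReplace[s,
    StartOfString ~~ "\"" ~~ inner___ ~~ "\"" ~~ EndOfString :> inner];

(* Non-string entries (numbers etc.) pass through untouched. *)
stripOuterQuotes[x_] := x;

(* Invented sample in the same shape as the imported list. *)
sample = {{"\"hello\"", 1}, {"test", 2}, {"\"say \"hi\"\"", 3}};

(* Apply at level 2, i.e. to each field of each record. *)
Map[stripOuterQuotes, sample, {2}]
```

Here the quotes around hi in the third record should survive, while the
enclosing pair is dropped. Whether this is faster or slower than the
plain replacement on 700,000 records is something I have not measured.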
Hope it helps,
