MathGroup Archive 2008

Re: Importing text files

  • To: mathgroup at smc.vnet.net
  • Subject: [mg86913] Re: Importing text files
  • From: michael.p.croucher at googlemail.com
  • Date: Wed, 26 Mar 2008 04:52:22 -0500 (EST)
  • References: <fsa5h5$aef$1@smc.vnet.net>

On 25 Mar, 06:19, "Coleman, Mark" <Mark.Cole... at LibertyMutual.com>
wrote:
> Greetings,
>
> I'm using Mathematica v6.02 to import a large comma delimited text file (CSV
> format). The file is about 700,000 records and takes 241 Mb of space
> according to ByteCount. The file is a mix of real and string characters.
> I've imported much smaller versions of this using the built-in support
> for Excel XLS format without any difficulties.
>
> For the larger file, the Import appears to work fine except that all of
> the string elements import with quotation marks, i.e., if you look at
> the full form, all of the string elements are expressed as
>
>  {"\"YES\"",.....}
>
> With the \ character intact. This is the first I've bumped into this
> particular issue using Import.
>
> Two questions: First, is there a way that I can remove the "\"
> characters as part of the Import command? And second, if these
> characters cannot be removed during the Import process, can someone
> offer an efficient way to remove them from the imported list?
>
> Thanks,
>
> -Mark

Hi

The behavior of Import for CSV files changed between 6.0.1 and 6.0.2.
In the older version, a field surrounded by quotes, such as "hello", was
imported as "hello"; now it is imported as "\"hello\"".

Say I have a CSV file called Book1.csv with the following data:

"hello",1
test,2
bode,3
hehehe,4
"oidjaojf",5
cfoiuhsrfipvuh,6
fojewhnfrvo,7
"werijfbwpiufv",8
nhjawhb,9
"cvijweqbv",10
vwjiebv,11

In[2]:= book = Import["Book1.csv"]

Out[2]= {{"\"hello\"", 1}, {"test", 2}, {"bode", 3}, {"hehehe",
  4}, {"\"oidjaojf\"", 5}, {"cfoiuhsrfipvuh", 6}, {"fojewhnfrvo",
  7}, {"\"werijfbwpiufv\"", 8}, {"nhjawhb", 9}, {"\"cvijweqbv\"",
  10}, {"vwjiebv", 11}}

If you don't want the escaped quotes (\") you can strip them out as
follows:

In[3]:= f = If[StringQ[#], StringReplace[#, "\"" -> ""], #] &;

In[4]:= Map[f, book, {2}]

Out[4]= {{"hello", 1}, {"test", 2}, {"bode", 3}, {"hehehe",
  4}, {"oidjaojf", 5}, {"cfoiuhsrfipvuh", 6}, {"fojewhnfrvo",
  7}, {"werijfbwpiufv", 8}, {"nhjawhb", 9}, {"cvijweqbv",
  10}, {"vwjiebv", 11}}
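
Note that f removes every double-quote character in a string, including
any in the middle of a field. If interior quotes need to be kept, one
possible variant strips only a leading and a trailing quote. This is a
sketch rather than a tested recipe; the name g is just an illustrative
choice, and it assumes the same book list as above:

g = If[StringQ[#],
    StringReplace[#, {StartOfString ~~ "\"" -> "",
      "\"" ~~ EndOfString -> ""}], #] &;

Map[g, book, {2}]

For the sample data, where no field has quotes in the middle, this
should give the same result as Out[4].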

I don't know if this is the most efficient way but it does the job.
Hope it helps,
Mike

