MathGroup Archive 2008

Re: Importing text files

  • To: mathgroup at smc.vnet.net
  • Subject: [mg86928] Re: Importing text files
  • From: David Bailey <dave at Remove_Thisdbailey.co.uk>
  • Date: Wed, 26 Mar 2008 04:55:19 -0500 (EST)
  • References: <fsa5h5$aef$1@smc.vnet.net>

Coleman, Mark wrote:
> Greetings,
> 
> I'm using Mathematica v6.0.2 to import a large comma-delimited text file (CSV
> format). The file has about 700,000 records and takes 241 MB of space
> according to ByteCount. The data is a mix of real numbers and strings.
> I've imported much smaller versions of this using the built-in support
> for Excel XLS format without any difficulties.
> 
> For the larger file, the Import appears to work fine except that all of
> the string elements import with quotation marks, i.e., if you look at
> the full form, all of the string elements are expressed as
> 
>  {"\"YES\"",.....}
> 
> with the \ character intact. This is the first time I've run into this
> particular issue using Import.
> 
> Two questions: First, is there a way that I can remove the "\"
> characters as part of the Import command? And second, if these
> characters cannot be removed during the Import process, can someone
> offer an efficient way to remove them from the imported list?
> 
> Thanks,
> 
> -Mark
Something changed between 6.0.1 and 6.0.2 in how Import handles a CSV file
containing character strings: the earlier version stripped off the
quotation marks, while the later one leaves them in. Since a CSV file can
also contain strings without quotes, the new behaviour is more
consistent, and probably counts as a bug fix.

Have you tried just stripping the quotes afterwards?

Import["something.csv"] /. x_String :> StringReplace[x, "\"" -> ""]
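
As a minimal sketch of what that rule does, here it is applied to a small made-up list standing in for the imported data. The sample values and the helper name stripOuterQuotes are hypothetical, introduced only for illustration. The second form is an assumption about what might be wanted if a field can contain quotation marks of its own: it removes only one surrounding pair of quotes and leaves everything else alone.

```mathematica
(* Hypothetical sample standing in for the result of Import *)
data = {{"\"YES\"", 1.5}, {"\"NO\"", 2.25}, {"MAYBE", 3.}};

(* Remove every quotation mark from every string, as in the rule above *)
data /. x_String :> StringReplace[x, "\"" -> ""]

(* Variant: strip only a surrounding pair of quotes; strings that are
   not wrapped in quotes do not match and are returned unchanged *)
stripOuterQuotes[s_String] :=
  StringReplace[s,
    StartOfString ~~ "\"" ~~ body___ ~~ "\"" ~~ EndOfString :> body]

data /. x_String :> stripOuterQuotes[x]
```

For this sample both forms should give {{"YES", 1.5}, {"NO", 2.25}, {"MAYBE", 3.}} when viewed in InputForm. They would differ only for fields that have quotation marks somewhere other than at the two ends.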

If you anticipate going to even larger files, you should be aware that
because your data contains both reals and strings, it cannot be stored as
a packed array, and therefore occupies much more memory and is slower to
process than purely numeric data would be.
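
The effect of packing can be checked directly. The following is only a rough sketch using made-up data (the names reals and mixed are hypothetical); it compares a purely real list with the same list after a single string has been appended.

```mathematica
(* Purely real data: N[Range[...]] produces a packed array *)
reals = N[Range[100000]];

(* Appending one string forces the whole list to be stored unpacked *)
mixed = Append[reals, "YES"];

Developer`PackedArrayQ[reals]   (* should be True *)
Developer`PackedArrayQ[mixed]   (* should be False *)

(* Compare memory use; the unpacked list should be noticeably larger *)
{ByteCount[reals], ByteCount[mixed]}
```

If memory does become a problem, one possible approach (an assumption, not something tested on the file in question) is to keep the numeric columns in a separate list from the string columns, so that the numeric part can remain packed.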

David Bailey
http://www.dbaileyconsultancy.co.uk

