MathGroup Archive 2011

Re: unable to import csv-Data

  • To: mathgroup at smc.vnet.net
  • Subject: [mg119000] Re: unable to import csv-Data
  • From: David Bailey <dave at removedbailey.co.uk>
  • Date: Thu, 19 May 2011 07:42:57 -0400 (EDT)
  • References: <ir09t3$244$1@smc.vnet.net>

On 18/05/2011 12:17, Andre Koppel wrote:
> Hello to all,
>
> I am trying to import some data from a CSV file, but I am unable to
> get any useful result.
> I have tried several formatting options during import, but in every
> case Mathematica 8 puts several CSV columns
> into one result column.
> Because the CSV data uses a German encoding, I have tried several
> conversion options, but nothing helps.
> Here is a snapshot of the CSV data (one header line, two data lines):
> --------------------- cut here ------------------------
> ID;KONTO_NR;KONTO_BEZ;BELEG_DAT;BELEG_NR;GKTO_NR;GKTO_BEZ;BU_TEXT;SOLL;HABEN;Buchsaldo;WAEHRUNG;Faelligkeit;Anfangsbestand;Ausgeblendet;Changed;InsoBaseUser;BuJahr
> 1;;;;;;;;7807477,41;6986382,79;,00;;;False;False;2010-11-01 10:24:09.997;KDLB\Conrad;
> 2;D_60004;Jeske, Norbert 23966 Hof Triwalk;2008-01-01;;S_09008;Vortrag;EB-Werte durch AIS TaxAudit berechnet und erstellt;387,37;,00;387,37;EUR;;True;True;2010-11-01 10:24:09.997;KDLB\Conrad;
> --------------------- cut here ------------------------
> I have tried the following Import command (and several variations of
> it), but did not get a useful result:
> imp = Import["test.csv", "Table", "FieldSeparators" -> ";",
>      "DateStringFormat" -> {"Year", "-", "Month", "-", "Day"},
>      "CharacterEncoding" -> "ASCII", "HeaderLines" -> 1];
>
> To me it looks as if the CSV importer is unable to detect null values (;;).
>
> By the way, reading the data into Excel, writing the result to an xls
> file and importing that xls file into Mathematica works out of the box,
> but I can't go this way because there are more than 200000 data lines.
> Excel does not support such a large amount, and Mathematica
> was unable to import Excel 2010 xlsx data (ods did not work
> either because of the large amount of data).
>
> Any help would be highly appreciated
> Kind regards
> Andre
>
I think one approach may be to:

1) Read the file into Mathematica as a list of strings with 
ReadList[file,String]

2) Rejoin the lines into a single string, using Riffle to insert "\n" 
newlines between them.

3) Use StringReplace to "Anglicise" the text, i.e. turn the decimal 
commas into points and the semicolon separators into commas. The order 
matters: the original commas must be replaced before the semicolons 
become commas, so you may need several nested calls to this function.

4)  Use ImportString to import the end result.

String operations are remarkably fast, so this may work more efficiently 
than you expect!
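
As a rough, untested-on-the-real-file sketch of the four steps above: the 
sample text is the snippet from the original post, read through 
StringToStream so that the example is self-contained (for the real file one 
would presumably call ReadList["test.csv", String] instead). The variable 
names are only illustrative, and the choice of the "CSV" format for 
ImportString is my assumption.

```mathematica
(* Sample data from the post: one header line and two data lines. *)
sample = StringJoin[Riffle[{
    "ID;KONTO_NR;KONTO_BEZ;BELEG_DAT;BELEG_NR;GKTO_NR;GKTO_BEZ;BU_TEXT;SOLL;HABEN;Buchsaldo;WAEHRUNG;Faelligkeit;Anfangsbestand;Ausgeblendet;Changed;InsoBaseUser;BuJahr",
    "1;;;;;;;;7807477,41;6986382,79;,00;;;False;False;2010-11-01 10:24:09.997;KDLB\\Conrad;",
    "2;D_60004;Jeske, Norbert 23966 Hof Triwalk;2008-01-01;;S_09008;Vortrag;EB-Werte durch AIS TaxAudit berechnet und erstellt;387,37;,00;387,37;EUR;;True;True;2010-11-01 10:24:09.997;KDLB\\Conrad;"
    }, "\n"]];

(* 1) Read the text as a list of strings, one per line. *)
stream = StringToStream[sample];
lines = ReadList[stream, String];
Close[stream];

(* 2) Rejoin the lines into one string with "\n" between them. *)
text = StringJoin[Riffle[lines, "\n"]];

(* 3) Anglicise: decimal commas become points, semicolons become commas.
   Both rules are applied in one pass over the original text. *)
anglicised = StringReplace[text, {"," -> ".", ";" -> ","}];

(* 4) Import the converted text (assuming the "CSV" format is wanted). *)
data = ImportString[anglicised, "CSV"];

(* Variant without any replacement: split each line at the semicolons.
   The third argument All keeps empty fields at the ends of a line. *)
fields = StringSplit[#, ";", All] & /@ lines;
```

As far as I know, a single StringReplace call with a list of rules scans 
the string once and does not rescan text it has just inserted, so nested 
calls may not be needed here. Note that step 3 also changes commas inside 
text fields ("Jeske, Norbert" becomes "Jeske. Norbert"); the StringSplit 
variant leaves the text untouched and keeps the empty (;;) fields as empty 
strings, but the numbers then still have to be converted separately.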

David Bailey
http://www.dbaileyconsultancy.co.uk


