MathGroup Archive 2011

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: unable to import csv-Data

  • To: mathgroup at smc.vnet.net
  • Subject: [mg119102] Re: unable to import csv-Data
  • From: "Hans Michel" <hmichel at cox.net>
  • Date: Sun, 22 May 2011 06:52:36 -0400 (EDT)

Martin-Serrano:

It would have been nicer if you supplied the actual URI to the historical
file instead of having me hunt for it
http://www.ecb.int/stats/eurofxref/eurofxref-hist.zip?7864d71f16dd2ab9c90332
18b84780a7

More generally:
http://www.ecb.int/stats/eurofxref/eurofxref-hist.zip

Seems the file will change every day. I would think the GUID may be a way to
get back that date range, but without it you get most current available
date. This is of course a guess. I have used similar methods for clients but
used simple number to delineate past dates [?10 means ten days from current
date server time not UTC].

This will work in Mathematica.

Import["http://www.ecb.int/stats/eurofxref/eurofxref-hist.zip";,
"eurofxref-hist.csv"]
(Would not use it for very large zip files)

It's all good when the data provider has provided a format in a comma
separated value, and the values are clean and all the data is well formatted
and aligned so null value are "N/A" etc.

If you have time go into your csv file and replace all the commas with
semi-colons, change the "." to "," and then see what happens when you try
Import with the method you demonstrate. How would you import your data
without the header and you can't drop First after import?

CSV files are not Excel files, they did not invent this format (I don't
think so, can't find support for this statement). Microsoft may have
hijacked this mime type to show an Excel icon. See

http://tools.ietf.org/html/rfc4180

http://en.wikipedia.org/wiki/Comma-separated_values

I prefer a quoted csv file with escapes. It is more portable.

Import["D:\\akk\\test.csv", "Table", {"FieldSeparators" -> ";",
  "CharacterEncoding" -> "ASCII", "HeaderLines" -> 1,
  "EmptyField" -> "", "RepeatedSeparators" -> False,
  "DateStringFormat" -> {"Year", "-", "Month", "-", "Day"},
  "NumberPoint" -> ","}]

The above may look very long and foreboding but Mathematica is giving us an
opportunity to define the (E)BNF for the parser (DateStringFormat excepted).
 
Hans 

-----Original Message-----
From: E. Martin-Serrano [mailto:eMartinSerrano at telefonica.net] 
Sent: Saturday, May 21, 2011 5:46 AM
To: mathgroup at smc.vnet.net
Subject: [mg119102] [mg119069] Re: unable to import csv-Data

Right now I am importing  CSV (Excel) files as it follows in  Mathematica
V8.01. Windows7.

(* import data form ECB European Central  Bank currency files *)

NotebookDirectory[];
   ratesfile =  ToFileName[NotebookDirectory[], "eurofxref-hist.csv"];
Print["rates file name - > ",   ratesfile];
cdata = Import[ratesfile, "Data"];

The file  "eurofxref-hist.csv" can be found at

http://www.ecb.int/stats/exchange/eurofxref/html/index.en.html

Where you can download a zipped CSV file. After unzipped the file
"eurofxref-hist.csv" can be imported easily  as above. 

Regards.

-----Mensaje original-----
De: Andre Koppel [mailto:akoppel at akso.de]
Enviado el: viernes, 20 de mayo de 2011 13:19
Para: mathgroup at smc.vnet.net
Asunto: [mg119056] Re: unable to import csv-Data

Hi Hans,
I have tried your Import-Parameters. With my complete dataset it didn't
work.
The result was a strange mix of cells.
Currently I am reading the file completely doing the convertion manually by
using resgular expressions.
Even if the csv-import would work, it's horribly slow (more than 1000
Seconds with you Import-Parameters).
It would be nice, if there would exist a csv-Import-Function where there
could be given a list that describes the contens of the columns. In such
case Import should run much more faster producing better results.
Thank you for your suggestion.
Andre

Am 19.05.2011 15:46, schrieb Hans Michel:
> Andre:
> The following does work
>
> Import["D:\\akk\\test.csv", "Table", {"FieldSeparators" ->  ";",
>    "CharacterEncoding" ->  "ASCII", "HeaderLines" ->  1,
>    "EmptyField" ->  "", "RepeatedSeparators" ->  False,
>    "DateStringFormat" ->  {"Year", "-", "Month", "-", "Day"}}]
>
> See the Options for "Table" import
>
> ref/format/Table
>
> Hans
>
> -----Original Message-----
> From: Andre Koppel [mailto:akoppel at akso.de]
> Sent: Wednesday, May 18, 2011 6:18 AM
> To: mathgroup at smc.vnet.net
> Subject: [mg118980] unable to import csv-Data
>
> Hello to all,
>
> I am trying to import some data from a csv-file. But I am absolutely
> unable to get any usefull result.
> I have tried several options to do formating during input, but in
> every case Mathematica 8 puts several csv-columns into one
> result-column.
> Because the csv-data contains germany encoding, I have tried several
> conversion options, but nothing helps.
> Here is a snapshot of the csv-data (one headline two datalines):
> --------------------- cut here ------------------------
> ID;KONTO_NR;KONTO_BEZ;BELEG_DAT;BELEG_NR;GKTO_NR;GKTO_BEZ;BU_TEXT;SOLL
> ;HABEN
> ;Buchsaldo;WAEHRUNG;Faelligkeit;Anfangsbestand;Ausgeblendet;Changed;In
> soBase
> User;BuJahr
> 1;;;;;;;;7807477,41;6986382,79;,00;;;False;False;2010-11-01
> 10:24:09.997;KDLB\Conrad;
> 2;D_60004;Jeske, Norbert 23966 Hof
> Triwalk;2008-01-01;;S_09008;Vortrag;EB-Werte durch AIS TaxAudit
> berechnet und erstellt;387,37;,00;387,37;EUR;;True;True;2010-11-01
> 10:24:09.997;KDLB\Conrad;
> --------------------- cut here ------------------------ I have tried
> the following import-command (and several versions of it), but did not
> get useful import-result:
> imp = Import["test.csv", "Table", "FieldSeparators" ->   ";",
>      "DateStringFormat" ->   { "Year", "-", "Month", "-", "Day"},
>      "CharacterEncoding" ->   "ASCII", "HeaderLines" ->   1] ;
>
> For me it looks like the CSV-Importer is unable to detect NULL-Values
> (;;)?!?
>
> By the way reading the data into excel, writing a resulting xls-file
> and importing the xls-file into mathematica works out of the box, but
> I can't go this way because there are more than 200000 datalines, and
> Excel did not support such a great amount and Mathematica was unable
> to import Excel-2010-Formated xlsx-Data (ods didn't works also because
> of the great amount of data).
>
> Any help would be highly appreciated
> Kind regards
> Andre
>


--
Andre Koppel Software GmbH
Prinz-Handjery-Str. 38
14167 Berlin
Tel.: (+4930) 810 09 190
Fax: (+4930) 326 01 046
www.invep.de
www.akso.de

Eingetragen beim Amtsgericht
Berlin Charlottenburg HRB92600
GeschE4ftsf=FChrer Andre Koppel




  • Prev by Date: Defining Formatting and Notation
  • Next by Date: Re: question about plotting
  • Previous by thread: Re: unable to import csv-Data
  • Next by thread: Re: unable to import csv-Data