MathGroup Archive 2011

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: unable to import csv-Data

  • To: mathgroup at smc.vnet.net
  • Subject: [mg119108] Re: unable to import csv-Data
  • From: "E. Martin-Serrano" <eMartinSerrano at telefonica.net>
  • Date: Sun, 22 May 2011 06:53:42 -0400 (EDT)

Well Hans,

I am not sure whether I understand your post.

I supplied  you the link to the ECB site, which contains itself several
links to zipped "CSV", "PDF", "XML" and other file formats  for  EUR/USD and
many other  currency exchange rates within the same files. The only thing to
do is just select programmatically  the currency pair you need and the
period of time you could be interested in. Of course by using my approach
the file must be unzipped before treated, as you say.

It is true that  the ECB updates the file every day (at 2 PM GMT) since
every day the market generates a new rate for each pair of currencies, and
the market in Europe closes at that time. Note that in the link (the site) I
posted we have the choice of downloading the whole file (since 1999) or just
the new  rates (for all the currencies for the current day; this one is  in
the file 'eurorefxref.zip'. Obviously, either, one has to insert the new
rate in the general file, or download the complete one with the new pairs
included.

I wrote myself the charting program I am using to process currency data
based on the technic   'Ichimoku Kinko Hyo', and  I wrote the program before
Wolfram released its charting set of  tools in v8 (I am reviewing them by
now, and as far as I know it is not included by Wolfram, am I wrong?). Back
to the ECB files, I had preferred download the file the way I do instead of
accessing the internet each time I run the system. So, I tried to make
independent the daily sessions with the available past data,  from the
actual availability of Internet connectivity (and traffic), though, it is
true that I could save the unloaded file for the rest of the day. But I
never try to write perfect programs and I am not good at importing (and
exporting) files  with  Mathematica.

On the other hand, I do not have any problem in dropping the superfluous
items in the files, (either headers or others) . It is worse to cope with
other  inconveniences related to the way the ECB provides data  as it lacks
the "Open", "High" and "Low" properties, since only provides the "Close"
value; still, so does the FED and never in CSV format as far as I know.
Bloomberg and Forex data poses another kind of problems.

Anyway, I will have a look at your proposal.

Many thanks

Emilio.

P.S.  I would be interested in modifying some aspects of the Candlestick
tools to include some of my stuff. But before I might go into the details of
these new tools already available.

-----Mensaje original-----
De: Hans Michel [mailto:hmichel at cox.net]
Enviado el: s=E1bado, 21 de mayo de 2011 17:39
Para: 'E. Martin-Serrano'; mathgroup at smc.vnet.net
Asunto: RE: [mg119069] Re: unable to import csv-Data

Martin-Serrano:

It would have been nicer if you supplied the actual URI to the historical
file instead of having me hunt for it
http://www.ecb.int/stats/eurofxref/eurofxref-hist.zip?7864d71f16dd2ab9c90332
18b84780a7

More generally:
http://www.ecb.int/stats/eurofxref/eurofxref-hist.zip

Seems the file will change every day. I would think the GUID may be a way to
get back that date range, but without it you get most current available
date. This is of course a guess. I have used similar methods for clients but
used simple number to delineate past dates [?10 means ten days from current
date server time not UTC].

This will work in Mathematica.

Import["http://www.ecb.int/stats/eurofxref/eurofxref-hist.zip";,
"eurofxref-hist.csv"]
(Would not use it for very large zip files)

It's all good when the data provider has provided a format in a comma
separated value, and the values are clean and all the data is well formatted
and aligned so null value are "N/A" etc.

If you have time go into your csv file and replace all the commas with
semi-colons, change the "." to "," and then see what happens when you try
Import with the method you demonstrate. How would you import your data
without the header and you can't drop First after import?

CSV files are not Excel files, they did not invent this format (I don't
think so, can't find support for this statement). Microsoft may have
hijacked this mime type to show an Excel icon. See

http://tools.ietf.org/html/rfc4180

http://en.wikipedia.org/wiki/Comma-separated_values

I prefer a quoted csv file with escapes. It is more portable.

Import["D:\\akk\\test.csv", "Table", {"FieldSeparators" -> ";",
  "CharacterEncoding" -> "ASCII", "HeaderLines" -> 1,
  "EmptyField" -> "", "RepeatedSeparators" -> False,
  "DateStringFormat" -> {"Year", "-", "Month", "-", "Day"},
  "NumberPoint" -> ","}]

The above may look very long and foreboding but Mathematica is giving us an
opportunity to define the (E)BNF for the parser (DateStringFormat excepted).

Hans

-----Original Message-----
From: E. Martin-Serrano [mailto:eMartinSerrano at telefonica.net]
Sent: Saturday, May 21, 2011 5:46 AM
To: mathgroup at smc.vnet.net
Subject: [mg119108] [mg119069] Re: unable to import csv-Data

Right now I am importing  CSV (Excel) files as it follows in  Mathematica
V8.01. Windows7.

(* import data form ECB European Central  Bank currency files *)

NotebookDirectory[];
   ratesfile =  ToFileName[NotebookDirectory[], "eurofxref-hist.csv"];
Print["rates file name - > ",   ratesfile];
cdata = Import[ratesfile, "Data"];

The file  "eurofxref-hist.csv" can be found at

http://www.ecb.int/stats/exchange/eurofxref/html/index.en.html

Where you can download a zipped CSV file. After unzipped the file
"eurofxref-hist.csv" can be imported easily  as above.

Regards.

-----Mensaje original-----
De: Andre Koppel [mailto:akoppel at akso.de] Enviado el: viernes, 20 de mayo de
2011 13:19
Para: mathgroup at smc.vnet.net
Asunto: [mg119056] Re: unable to import csv-Data

Hi Hans,
I have tried your Import-Parameters. With my complete dataset it didn't
work.
The result was a strange mix of cells.
Currently I am reading the file completely doing the convertion manually by
using resgular expressions.
Even if the csv-import would work, it's horribly slow (more than 1000
Seconds with you Import-Parameters).
It would be nice, if there would exist a csv-Import-Function where there
could be given a list that describes the contens of the columns. In such
case Import should run much more faster producing better results.
Thank you for your suggestion.
Andre

Am 19.05.2011 15:46, schrieb Hans Michel:
> Andre:
> The following does work
>
> Import["D:\\akk\\test.csv", "Table", {"FieldSeparators" ->  ";",
>    "CharacterEncoding" ->  "ASCII", "HeaderLines" ->  1,
>    "EmptyField" ->  "", "RepeatedSeparators" ->  False,
>    "DateStringFormat" ->  {"Year", "-", "Month", "-", "Day"}}]
>
> See the Options for "Table" import
>
> ref/format/Table
>
> Hans
>
> -----Original Message-----
> From: Andre Koppel [mailto:akoppel at akso.de]
> Sent: Wednesday, May 18, 2011 6:18 AM
> To: mathgroup at smc.vnet.net
> Subject: [mg118980] unable to import csv-Data
>
> Hello to all,
>
> I am trying to import some data from a csv-file. But I am absolutely
> unable to get any usefull result.
> I have tried several options to do formating during input, but in
> every case Mathematica 8 puts several csv-columns into one
> result-column.
> Because the csv-data contains germany encoding, I have tried several
> conversion options, but nothing helps.
> Here is a snapshot of the csv-data (one headline two datalines):
> --------------------- cut here ------------------------
> ID;KONTO_NR;KONTO_BEZ;BELEG_DAT;BELEG_NR;GKTO_NR;GKTO_BEZ;BU_TEXT;SOLL
> ;HABEN
> ;Buchsaldo;WAEHRUNG;Faelligkeit;Anfangsbestand;Ausgeblendet;Changed;In
> soBase
> User;BuJahr
> 1;;;;;;;;7807477,41;6986382,79;,00;;;False;False;2010-11-01
> 10:24:09.997;KDLB\Conrad;
> 2;D_60004;Jeske, Norbert 23966 Hof
> Triwalk;2008-01-01;;S_09008;Vortrag;EB-Werte durch AIS TaxAudit
> berechnet und erstellt;387,37;,00;387,37;EUR;;True;True;2010-11-01
> 10:24:09.997;KDLB\Conrad;
> --------------------- cut here ------------------------ I have tried
> the following import-command (and several versions of it), but did not
> get useful import-result:
> imp = Import["test.csv", "Table", "FieldSeparators" ->   ";",
>      "DateStringFormat" ->   { "Year", "-", "Month", "-", "Day"},
>      "CharacterEncoding" ->   "ASCII", "HeaderLines" ->   1] ;
>
> For me it looks like the CSV-Importer is unable to detect NULL-Values
> (;;)?!?
>
> By the way reading the data into excel, writing a resulting xls-file
> and importing the xls-file into mathematica works out of the box, but
> I can't go this way because there are more than 200000 datalines, and
> Excel did not support such a great amount and Mathematica was unable
> to import Excel-2010-Formated xlsx-Data (ods didn't works also because
> of the great amount of data).
>
> Any help would be highly appreciated
> Kind regards
> Andre
>


--
Andre Koppel Software GmbH
Prinz-Handjery-Str. 38
14167 Berlin
Tel.: (+4930) 810 09 190
Fax: (+4930) 326 01 046
www.invep.de
www.akso.de

Eingetragen beim Amtsgericht
Berlin Charlottenburg HRB92600
GeschE4ftsf==FChrer Andre Koppel



  • Prev by Date: Re: Overloading functions
  • Next by Date: there's shorter calculation time for motre datasets
  • Previous by thread: Re: unable to import csv-Data
  • Next by thread: Re: unable to import csv-Data