Re: unable to import csv-Data
- To: mathgroup at smc.vnet.net
- Subject: [mg119108] Re: unable to import csv-Data
- From: "E. Martin-Serrano" <eMartinSerrano at telefonica.net>
- Date: Sun, 22 May 2011 06:53:42 -0400 (EDT)
Well Hans, I am not sure whether I understand your post. I supplied you the link to the ECB site, which contains itself several links to zipped "CSV", "PDF", "XML" and other file formats for EUR/USD and many other currency exchange rates within the same files. The only thing to do is just select programmatically the currency pair you need and the period of time you could be interested in. Of course by using my approach the file must be unzipped before treated, as you say. It is true that the ECB updates the file every day (at 2 PM GMT) since every day the market generates a new rate for each pair of currencies, and the market in Europe closes at that time. Note that in the link (the site) I posted we have the choice of downloading the whole file (since 1999) or just the new rates (for all the currencies for the current day; this one is in the file 'eurorefxref.zip'. Obviously, either, one has to insert the new rate in the general file, or download the complete one with the new pairs included. I wrote myself the charting program I am using to process currency data based on the technic 'Ichimoku Kinko Hyo', and I wrote the program before Wolfram released its charting set of tools in v8 (I am reviewing them by now, and as far as I know it is not included by Wolfram, am I wrong?). Back to the ECB files, I had preferred download the file the way I do instead of accessing the internet each time I run the system. So, I tried to make independent the daily sessions with the available past data, from the actual availability of Internet connectivity (and traffic), though, it is true that I could save the unloaded file for the rest of the day. But I never try to write perfect programs and I am not good at importing (and exporting) files with Mathematica. On the other hand, I do not have any problem in dropping the superfluous items in the files, (either headers or others) . It is worse to cope with other inconveniences related to the way the ECB provides data as it lacks the "Open", "High" and "Low" properties, since only provides the "Close" value; still, so does the FED and never in CSV format as far as I know. Bloomberg and Forex data poses another kind of problems. Anyway, I will have a look at your proposal. Many thanks Emilio. P.S. I would be interested in modifying some aspects of the Candlestick tools to include some of my stuff. But before I might go into the details of these new tools already available. -----Mensaje original----- De: Hans Michel [mailto:hmichel at cox.net] Enviado el: s=E1bado, 21 de mayo de 2011 17:39 Para: 'E. Martin-Serrano'; mathgroup at smc.vnet.net Asunto: RE: [mg119069] Re: unable to import csv-Data Martin-Serrano: It would have been nicer if you supplied the actual URI to the historical file instead of having me hunt for it http://www.ecb.int/stats/eurofxref/eurofxref-hist.zip?7864d71f16dd2ab9c90332 18b84780a7 More generally: http://www.ecb.int/stats/eurofxref/eurofxref-hist.zip Seems the file will change every day. I would think the GUID may be a way to get back that date range, but without it you get most current available date. This is of course a guess. I have used similar methods for clients but used simple number to delineate past dates [?10 means ten days from current date server time not UTC]. This will work in Mathematica. Import["http://www.ecb.int/stats/eurofxref/eurofxref-hist.zip", "eurofxref-hist.csv"] (Would not use it for very large zip files) It's all good when the data provider has provided a format in a comma separated value, and the values are clean and all the data is well formatted and aligned so null value are "N/A" etc. If you have time go into your csv file and replace all the commas with semi-colons, change the "." to "," and then see what happens when you try Import with the method you demonstrate. How would you import your data without the header and you can't drop First after import? CSV files are not Excel files, they did not invent this format (I don't think so, can't find support for this statement). Microsoft may have hijacked this mime type to show an Excel icon. See http://tools.ietf.org/html/rfc4180 http://en.wikipedia.org/wiki/Comma-separated_values I prefer a quoted csv file with escapes. It is more portable. Import["D:\\akk\\test.csv", "Table", {"FieldSeparators" -> ";", "CharacterEncoding" -> "ASCII", "HeaderLines" -> 1, "EmptyField" -> "", "RepeatedSeparators" -> False, "DateStringFormat" -> {"Year", "-", "Month", "-", "Day"}, "NumberPoint" -> ","}] The above may look very long and foreboding but Mathematica is giving us an opportunity to define the (E)BNF for the parser (DateStringFormat excepted). Hans -----Original Message----- From: E. Martin-Serrano [mailto:eMartinSerrano at telefonica.net] Sent: Saturday, May 21, 2011 5:46 AM To: mathgroup at smc.vnet.net Subject: [mg119108] [mg119069] Re: unable to import csv-Data Right now I am importing CSV (Excel) files as it follows in Mathematica V8.01. Windows7. (* import data form ECB European Central Bank currency files *) NotebookDirectory[]; ratesfile = ToFileName[NotebookDirectory[], "eurofxref-hist.csv"]; Print["rates file name - > ", ratesfile]; cdata = Import[ratesfile, "Data"]; The file "eurofxref-hist.csv" can be found at http://www.ecb.int/stats/exchange/eurofxref/html/index.en.html Where you can download a zipped CSV file. After unzipped the file "eurofxref-hist.csv" can be imported easily as above. Regards. -----Mensaje original----- De: Andre Koppel [mailto:akoppel at akso.de] Enviado el: viernes, 20 de mayo de 2011 13:19 Para: mathgroup at smc.vnet.net Asunto: [mg119056] Re: unable to import csv-Data Hi Hans, I have tried your Import-Parameters. With my complete dataset it didn't work. The result was a strange mix of cells. Currently I am reading the file completely doing the convertion manually by using resgular expressions. Even if the csv-import would work, it's horribly slow (more than 1000 Seconds with you Import-Parameters). It would be nice, if there would exist a csv-Import-Function where there could be given a list that describes the contens of the columns. In such case Import should run much more faster producing better results. Thank you for your suggestion. Andre Am 19.05.2011 15:46, schrieb Hans Michel: > Andre: > The following does work > > Import["D:\\akk\\test.csv", "Table", {"FieldSeparators" -> ";", > "CharacterEncoding" -> "ASCII", "HeaderLines" -> 1, > "EmptyField" -> "", "RepeatedSeparators" -> False, > "DateStringFormat" -> {"Year", "-", "Month", "-", "Day"}}] > > See the Options for "Table" import > > ref/format/Table > > Hans > > -----Original Message----- > From: Andre Koppel [mailto:akoppel at akso.de] > Sent: Wednesday, May 18, 2011 6:18 AM > To: mathgroup at smc.vnet.net > Subject: [mg118980] unable to import csv-Data > > Hello to all, > > I am trying to import some data from a csv-file. But I am absolutely > unable to get any usefull result. > I have tried several options to do formating during input, but in > every case Mathematica 8 puts several csv-columns into one > result-column. > Because the csv-data contains germany encoding, I have tried several > conversion options, but nothing helps. > Here is a snapshot of the csv-data (one headline two datalines): > --------------------- cut here ------------------------ > ID;KONTO_NR;KONTO_BEZ;BELEG_DAT;BELEG_NR;GKTO_NR;GKTO_BEZ;BU_TEXT;SOLL > ;HABEN > ;Buchsaldo;WAEHRUNG;Faelligkeit;Anfangsbestand;Ausgeblendet;Changed;In > soBase > User;BuJahr > 1;;;;;;;;7807477,41;6986382,79;,00;;;False;False;2010-11-01 > 10:24:09.997;KDLB\Conrad; > 2;D_60004;Jeske, Norbert 23966 Hof > Triwalk;2008-01-01;;S_09008;Vortrag;EB-Werte durch AIS TaxAudit > berechnet und erstellt;387,37;,00;387,37;EUR;;True;True;2010-11-01 > 10:24:09.997;KDLB\Conrad; > --------------------- cut here ------------------------ I have tried > the following import-command (and several versions of it), but did not > get useful import-result: > imp = Import["test.csv", "Table", "FieldSeparators" -> ";", > "DateStringFormat" -> { "Year", "-", "Month", "-", "Day"}, > "CharacterEncoding" -> "ASCII", "HeaderLines" -> 1] ; > > For me it looks like the CSV-Importer is unable to detect NULL-Values > (;;)?!? > > By the way reading the data into excel, writing a resulting xls-file > and importing the xls-file into mathematica works out of the box, but > I can't go this way because there are more than 200000 datalines, and > Excel did not support such a great amount and Mathematica was unable > to import Excel-2010-Formated xlsx-Data (ods didn't works also because > of the great amount of data). > > Any help would be highly appreciated > Kind regards > Andre > -- Andre Koppel Software GmbH Prinz-Handjery-Str. 38 14167 Berlin Tel.: (+4930) 810 09 190 Fax: (+4930) 326 01 046 www.invep.de www.akso.de Eingetragen beim Amtsgericht Berlin Charlottenburg HRB92600 GeschE4ftsf==FChrer Andre Koppel