Re: Bad imports of data files -- extra empty lists showing up?
- To: mathgroup at smc.vnet.net
- Subject: [mg80700] Re: Bad imports of data files -- extra empty lists showing up?
- From: dh <dh at metrohm.ch>
- Date: Wed, 29 Aug 2007 04:18:27 -0400 (EDT)
- References: <far9ul$4dl$1@smc.vnet.net>
Hi Curtis,

have you checked that your files do not contain any invisible (control) characters?

Daniel

Curtis Osterhoudt wrote:
> Hi, all,
>
> I noticed this problem the other day, on different data sets, not
> thinking much of it. Then, when it cropped up again, I started to get
> worried. I'm really not sure how to think about it, and so am requesting some
> advice from the experts. I have tried rewriting this message a few times, and
> can't figure out how to state the problem very clearly, so please bear with
> me.
>
> I know that attachments aren't allowed, but my problem is that if I
> copy-and-paste the troublesome dataset into this message (I've tried),
> whatever formatting is causing the problem is lost. For example, I'll paste
> the data into this message, then copy it from the message to a text file,
> then save that and import it into Mathematica. The problem disappears. So if
> anyone is curious, perhaps they can email me directly and I can send some
> sample "bad" datasets.
>
> The data was taken using a VB program on a windows machine, and this
> version of Mathematica is running on a linux machine. However, 1) the problem
> crops up in perhaps 10 - 25% of the files so far, ALL of which were
> originally produced on a windows machine; 2) the problem does not occur in
> the same place in each file, IF it occurs at all; 3) if I re-do the import,
> and the file imports incorrectly, the problems occur at the same places in
> the file; 4) if I remove portions of the file (using a text editor, perhaps),
> the problems may occur in different spots, or the problems may disappear.
>
> What I've tried:
> Import the data sets using Import["file name", "Table"]. Typically
> the datasets have ".txt" or ".dat" extensions. Some files consist of number
> triplets; some of doublets; they're all TAB-separated.
> Expected behavior: the data is imported correctly; files with n lines of m
> numbers per line should show up as tables consisting of n length-m lists.
> This is what happens most of the time.
> Actual behavior: A given file will import correctly, but with occasional
> empty lists interspersed in among the data points. For example, a 2*10^5
> length dataset has empty lists ( {} ) at seven different places in it. A 10^4
> length dataset has only one empty list.
>
> So far, I've just been importing the datasets, searching for lines which do
> not contain the expected doublets or triplets, and just deleting those lines.
> But that's obviously extra work (even if Mathematica does it for me). I've
> been able to cut some of these example data files down a bit, and still
> retain the "bad" behavior. If anyone can shed some light on this for me, I'd
> much appreciate it!
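
One way to follow up on the control-character suggestion is to look at the raw bytes of a "bad" file outside Mathematica, because copying and pasting the data apparently loses whatever is causing the trouble. The sketch below is only an illustration in Python. The function name `find_suspicious_bytes` and the sample data are made up for this example, and it assumes the files are plain single-byte text with TAB separators, as described above. It reports every control byte that is not a TAB, a line feed, or the carriage return of a normal Windows CR+LF line ending. Whether such bytes are actually behind the empty lists is not confirmed anywhere in this thread.

```python
def find_suspicious_bytes(data: bytes):
    """Return (line_number, offset_in_line, byte_value) for unexpected control bytes.

    TAB and LF are treated as normal, and so is a CR that is immediately
    followed by LF (a Windows line ending). Any other byte below 0x20,
    or 0x7F (DEL), is reported.
    """
    allowed = {0x09, 0x0A}  # TAB, LF
    hits = []
    line_no, col = 1, 0
    for i, b in enumerate(data):
        if b == 0x0A:
            # Line feed: start counting a new line.
            line_no += 1
            col = 0
            continue
        is_control = b < 0x20 or b == 0x7F
        is_crlf = b == 0x0D and data[i + 1:i + 2] == b"\n"
        if is_control and b not in allowed and not is_crlf:
            hits.append((line_no, col, b))
        col += 1
    return hits


# Made-up sample: line 2 ends in CR CR LF, line 3 contains a NUL byte.
sample = b"1.0\t2.0\r\n3.0\t4.0\r\r\n5.0\t6.0\x00\r\n7.0\t8.0\n"
print(find_suspicious_bytes(sample))  # [(2, 7, 13), (3, 7, 0)]
```

For a real file, read it in binary mode (for example `open(path, "rb").read()`, with `path` pointing at one of the problem datasets) so that nothing is translated before the check. If the reported line numbers coincide with the positions of the empty lists in the imported table, that would point at stray control characters; if the function finds nothing, the cause lies elsewhere.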
- Follow-Ups:
- Re: Re: Bad imports of data files -- extra empty lists showing up?
- From: Curtis Osterhoudt <cfo@lanl.gov>
- Re: Re: Bad imports of data files -- extra empty lists showing up?