MathGroup Archive 2007

Bad imports of data files -- extra empty lists showing up?

  • To: mathgroup at smc.vnet.net
  • Subject: [mg80571] Bad imports of data files -- extra empty lists showing up?
  • From: Curtis Osterhoudt <cfo at lanl.gov>
  • Date: Sun, 26 Aug 2007 03:08:27 -0400 (EDT)
  • Organization: LANL
  • Reply-to: cfo at lanl.gov

Hi, all, 

   I noticed this problem the other day, on different data sets, not 
thinking much of it. Then, when it cropped up again, I started to get 
worried. I'm really not sure how to think about it, and so am requesting some 
advice from the experts. I have tried rewriting this message a few times, and 
can't figure out how to state the problem very clearly, so please bear with 
me.

    I know that attachments aren't allowed, but my problem is that if I 
copy-and-paste the troublesome dataset into this message (I've tried), 
whatever formatting is causing the problem is lost. For example, I'll paste 
the data into this message, then copy it from the message to a text file, 
then save that and import it into Mathematica. The problem disappears. So if 
anyone is curious, perhaps they can email me directly and I can send some 
sample "bad" datasets.

   The data was taken using a VB program on a Windows machine, and this 
version of Mathematica is running on a Linux machine. However, 1) the problem 
crops up in perhaps 10-25% of the files so far, ALL of which were 
originally produced on a Windows machine; 2) the problem does not occur in 
the same place in each file, IF it occurs at all; 3) if I redo the import of 
a file that imports incorrectly, the problems occur at the same places in 
that file every time; 4) if I remove portions of the file (using a text 
editor, perhaps), the problems may occur in different spots, or may disappear. 

   What I've tried:
         Import the data sets using Import["file name", "Table"]. Typically 
the datasets have ".txt" or ".dat" extensions. Some files consist of number 
triplets; some of doublets; they're all TAB-separated.
    Expected behavior: the data is imported correctly; files with n lines of m 
numbers per line should show up as tables consisting of n length-m lists. 
This is what happens most of the time.
    Actual behavior: A given file will import mostly correctly, but with 
occasional empty lists interspersed among the data points. For example, a 
dataset of length 2*10^5 has empty lists ( {} ) at seven different places in 
it. A dataset of length 10^4 has only one empty list. 
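
    To see exactly where the stray empty lists fall in an imported table, 
something like the following could be used. This is only a sketch: `data` is 
a small made-up stand-in for the result of Import["file name", "Table"], and 
the variable names are hypothetical.

```mathematica
(* Made-up stand-in for the output of Import["file name", "Table"];
   the third row mimics one of the unexpected empty lists. *)
data = {{1., 2.}, {3., 4.}, {}, {5., 6.}};

(* Positions (row numbers) of empty lists at the top level of the table *)
emptyRows = Flatten[Position[data, {}, {1}]]
(* → {3} *)

(* How many rows of each length the table contains *)
Tally[Length /@ data]
```

    Comparing the reported row numbers against the same lines of the raw 
file in a text editor may help show whether anything unusual sits at those 
places.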

   So far, I've been importing the datasets, searching for lines which do 
not contain the expected doublets or triplets, and deleting those lines. 
But that's obviously extra work (even if Mathematica does it for me). I've 
been able to cut some of these example data files down a bit, and still 
retain the "bad" behavior. If anyone can shed some light on this for me, I'd 
much appreciate it!
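
    For reference, the kind of cleanup described above could be sketched 
roughly as follows. This is a minimal illustration, not necessarily the exact 
code in use: `data` is a made-up stand-in for an imported table, and `m` (the 
expected count of numbers per line) and the other names are hypothetical.

```mathematica
(* Made-up stand-in for the output of Import["file name", "Table"] *)
data = {{1., 2.}, {3., 4.}, {}, {5., 6.}};

(* Expected number of values per line: 2 for doublets, 3 for triplets *)
m = 2;

(* Keep only the rows that have exactly m entries *)
clean = Select[data, Length[#] == m &]
(* → {{1., 2.}, {3., 4.}, {5., 6.}} *)

(* Keep the rejected rows too, so nothing is discarded silently *)
rejected = Select[data, Length[#] != m &]
(* → {{}} *)
```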


-- 
==========================================================
Curtis Osterhoudt
cfo at remove_this.lanl.and_this.gov
PGP Key ID: 0x4DCA2A10
Please avoid sending me Word or PowerPoint attachments
See http://www.gnu.org/philosophy/no-word-attachments.html
==========================================================

