Re: Importing tab-delimited data files?
- To: mathgroup at smc.vnet.net
- Subject: [mg62688] Re: [mg62604] Importing tab-delimited data files?
- From: "Dale R. Horton" <daleh at wolfram.com>
- Date: Wed, 30 Nov 2005 22:09:02 -0500 (EST)
- References: <200511290945.EAA08728@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
On Nov 29, 2005, at 3:45 AM, AES wrote:

> I create a text file "filedata" with the following content using a text
> editor, with tabs between each number or string (5 tabs per line), and
> no content -- not even a space, just successive adjacent tabs -- in the
> empty slots.
>
> (The columns should line up if your reader uses monospaced type.)
>
> 11 aaa 22 bbb 33 ccc
> 22 bbb 33 ccc
> 33 ccc
>
> Opening Mathematica and using !!filedata reproduces exactly the same
> thing:
>
> 11 aaa 22 bbb 33 ccc
> 22 bbb 33 ccc
> 33 ccc
>
> Trying to follow this with
>
>     fileDataAsViewed = !!filedata
>
> or
>
>     fileDataAsViewed = %
>
> doesn't work, however.

!!file doesn't return a value; it prints the file contents as a side
effect. This is like doing

    y = Print[x^2]

> Using the Mathematica expression
>
>     Import["datafile", "Table",
>       ConversionOptions->{"TableSeparators"->{{"\r","\n"},{"\t"}}}]
>
> gives:
>
> 11 aaa 22 bbb 33 ccc
> 22 bbb 33 ccc
> 33 ccc

The Table format treats multiple consecutive separators as a single
separator. That way, if you use spaces to line up columns, you don't end
up with a bunch of empty fields.

> Recreating the text file with a space between the tabs in the empty
> slots and applying the same Import[ ] expression, however, gives the
> "right" answer:
>
> 11 aaa 22 bbb 33 ccc
> 22 bbb 33 ccc
> 33 ccc

This is because you now have a non-separator (the space) between each
pair of separators (the tabs).

> I suppose this is not exactly unexpected. The problem is, the app that
> creates the (much larger) tab-delimited filedata text file I really want
> to load into a Mathematica Table creates numerous blank cells, i.e.
> adjacent and unspaced tabs. I guess I'll just have to go at it with a
> smart text editor and separate adjacent tabs before trying to load it.

The solution is to use the TSV (tab-separated values) format, not the
Table format. TSV treats each tab as the start of a new column, so
adjacent tabs produce empty fields.

    Import["datafile", "TSV"]

-Dale
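
For readers who want to see the two splitting rules side by side, here is a minimal sketch in Python. It only illustrates the behavior described above; it is not how Mathematica implements either format. The sample rows and the helper names split_collapsing and split_per_tab are made up for illustration, and the placement of the empty cells is an assumption.

```python
import re

# Hypothetical sample rows: six tab-separated columns, with empty cells
# written as adjacent tabs (no spaces in between).
rows = [
    "11\taaa\t22\tbbb\t33\tccc",
    "\t\t22\tbbb\t33\tccc",
    "\t\t\t\t33\tccc",
]


def split_collapsing(line):
    """Treat any run of tabs as one separator (the Table-style rule)."""
    return [field for field in re.split(r"\t+", line) if field != ""]


def split_per_tab(line):
    """Treat every tab as a column boundary (the TSV-style rule)."""
    return line.split("\t")


collapsed = [split_collapsing(r) for r in rows]
per_tab = [split_per_tab(r) for r in rows]

for before, after in zip(collapsed, per_tab):
    print(len(before), "fields when runs collapse;", len(after), "fields per tab")
```

Under the collapsing rule the rows come out with different lengths and the empty cells are lost; under the per-tab rule every row keeps six fields, with empty strings standing in for the blank cells.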