Re: Excel Headers --> Mathematica
- To: mathgroup at smc.vnet.net
- Subject: [mg127491] Re: Excel Headers --> Mathematica
- From: Murta <rodrigomurtax at gmail.com>
- Date: Sun, 29 Jul 2012 03:05:01 -0400 (EDT)
- Delivered-to: l-mathgroup@mail-archive0.wolfram.com
- Delivered-to: l-mathgroup@wolfram.com
- Delivered-to: mathgroup-newout@smc.vnet.net
- Delivered-to: mathgroup-newsend@smc.vnet.net
- References: <juqs07$66i$1@smc.vnet.net>
On Thursday, July 26, 2012 4:35:35 AM UTC-3, Bill Rowe wrote:
> On 7/25/12 at 2:31 AM, nkormanik at gmail.com (Nicholas Kormanik) wrote:
>
>
>
> >I have a large Excel .xls file. Miniature version shown below:
>
>
>
> >Test1 Test2
>
> >75 64
>
> >58 85
>
> >81 55
>
> >64 63
>
> >70 94
>
> >88 90
>
> >78 96
>
> >96 98
>
> >54 97
>
> >75 61
>
>
>
> >I would like to do some statistical analysis on the file in
>
> >Mathematica 8.0.4.
>
>
>
> >Additionally I am using Mathematica Link for Excel 3.5.
>
>
>
> I thought this software was designed to allow usage of
>
> Mathematica functions within Excel. If so, it really isn't
>
> relevant to the problem you describe below
>
>
>
> >The issue I'm now attempting to address concerns the "headers," or
>
> >"column labels."
>
>
>
> Given your file apparently has only one row of header
>
> information, probably the easiest thing to do is simple drop the
>
> first row after reading the data into Mathematica. I would do
>
> this as follows:
>
>
>
> data = Import[filename.xls,"XLS"][[1,2;;]];
>
>
>
> which assumes the data of interest is in the first worksheet of
>
> the Excel file. Alternatively, if there is an indeterminate
>
> number of header rows or I simply didn't want to count them for
>
> whatever reason and the data of interest is all numeric, I would
>
> do something like:
>
>
>
> data = Cases[Import[filename.xls,"XLS"][[1]],{_?NumericQ,_?NumericQ}];
>
>
>
> This will read the data in the first worksheet on get rid of any
>
> rows that don't import as a pair of numbers. Note, this pattern
>
> is very restrictive. Comments in any column would cause that row
>
> to be dropped. If the data was in say columns 3 and 4 of your
>
> worksheet, all data would be dropped and the result would be an
>
> empty list assigned to the Mathematica variable data.
>
>
>
> The first method is not so restrictive. But this also means if
>
> there are additional rows below the data, information in other
>
> columns etc, those rows will also get assigned to the variable
>
> data and likely will cause issues with further Mathematica computations.
Isn't simple to use Rest function?
best regards
Murta