MathGroup Archive: May 2006 [00022]

Re: Conditions with Statistical Functions

To: mathgroup at smc.vnet.net
Subject: [mg66162] Re: Conditions with Statistical Functions
From: "Jean-Marc Gulliet" <jeanmarc.gulliet at gmail.com>
Date: Tue, 2 May 2006 02:43:16 -0400 (EDT)
References: <16568608.1146308002406.JavaMail.root@eastrmwml01.mgt.cox.net> <e31vb8$hgd$1@smc.vnet.net> <4454EC45.8030009@gmail.com> <B242D500-B1BF-4293-BB70-3C9C562AFF4D@videotron.ca>
Sender: owner-wri-mathgroup at wolfram.com

On 5/1/06, Gregory Lypny <gregory.lypny at videotron.ca> wrote:
> Thanks again, Jean-Marc.
>
> This is elegant.  As I mentioned before, I'm facing an uphill battle with
> Mathematica's syntax, and in particular, the meaning and placement of things
> like #, #1, or #[[1]], &, @, @@, @@@, /@, /. .  It's all a bit overwhelming!
>
> I think I understand your second version better, so I'll work with it first.
>  If I'm not mistaken, /@ tells the Mean function to map over its argument
> and #1 directs that mapping to the first argument, which for Mean applied to
> a matrix is a vector.  What I never would have gotten on my own is the use
> of the second ampersand in parenthesis.  Is that meant to connect Select
> with Transpose?  I'm also not quite clear on why we need Transpose because
> Mean operates on columns anyway.
>
> Regards,
>
>  Greg
>
>
>
> On Sun, Apr 30, 2006, at 12:56 PM, Jean-Marc Gulliet wrote:
>
>
> In[3]:=
>
> Mean /@ (Transpose[lst /. x_ /; x <= 100 -> 0] /.
>
>    0 -> Sequence[])
>
>
>
>
> Out[3]=
>
>       1311  1450  527
>
> {125, ----, ----, ---}
>
>        8     9     3
>
>
>
>
> In[4]:=
>
> Mean /@ (Select[#1, #1 > 100 & ] & ) /@ Transpose[lst]
>
>
>
>
> Out[4]=
>
>       1311  1450  527
>
> {125, ----, ----, ---}
>
>        8     9     3
>
>
>
>
> Best regards,
>
> Jean-Marc
>
Hi Gregory,

Let us try to follow gradually what is going on. Say that our data set
is a 12 by 4 matrix of integer entries:

lst = {{109, 168, 173, 109}, {4, 143, 200, 90}, {181, 162, 85, 196},
   {30, 108, 86, 34}, {94, 127, 144, 34}, {199, 109, 195, 188},
   {176, 34, 46, 110}, {95, 27, 160, 109}, {43, 71, 130, 66},
   {56, 148, 109, 163}, {110, 43, 50, 53}, {32, 34, 16, 95}}

First, we will focus on the second expression, which uses the *Select* function.

Mean/@(Select[#1,#1>100&]&)/@Transpose[lst]

Out[3]=
      965  1111  875
{155, ---, ----, ---}
       7    7     6

If we look at the full form (Mathematica's internal representation) of
the above expression, we see that (Select[#1, #1 > 100 & ] & ) is an
expression made of nested pure functions. Pure functions are
equivalent to anonymous functions in, say, LISP. The #n's (or
Slot[n]'s) are placeholders for the arguments, and a pure function
definition ends with an ampersand character '&'. In our function, it is
really important to realize that the first #1 has nothing to do with
the second #1 because they belong to different function definitions
(this is clearer in the full form of the expression).
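
As a minimal sketch of the notation, here are a few stand-alone pure
functions. The sample list {50, 150, 200} is made up for illustration
and is not part of our data set.

(#1^2 & )[3]                      (* squares its argument: 9 *)
Function[x, x^2][3]               (* same function, named variable: 9 *)
Select[{50, 150, 200}, #1 > 100 & ]   (* inner function alone: {150, 200} *)
(Select[#1, #1 > 100 & ] & )[{50, 150, 200}]   (* nested form: {150, 200} *)

In the last line, the outer #1 stands for the whole list, whereas the
inner #1 stands for one element of that list at a time.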

In[6]:=
FullForm[HoldForm[Mean /@ (Select[#1, #1 > 100 & ] & ) /@ Transpose[lst]]]

Out[6]//FullForm=
HoldForm[Map[Mean,
  Map[Function[Select[Slot[1], Function[Greater[Slot[1], 100]]]],
   Transpose[lst]]]]
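
The same trick helps with the other shorthand operators. Below is a
small sketch using the undefined placeholder symbols f, a, b, c, and z
(they are only for illustration and are assumed to have no values
assigned):

f @ a                  (* prefix application: f[a] *)
f /@ {a, b, c}         (* Map: {f[a], f[b], f[c]} *)
f @@ {a, b, c}         (* Apply, replaces the head List by f: f[a, b, c] *)
{a, b, c} /. a -> z    (* ReplaceAll with a rule: {z, b, c} *)
FullForm[HoldForm[f @@ {a, b}]]   (* HoldForm[Apply[f, List[a, b]]] *)

Wrapping any such expression in FullForm[HoldForm[...]] shows which
function a given operator stands for.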

Note that we could have written the pure functions with explicit
"local" variable names, as in the line below. *Select* works on each
row and applies the test to each element.

In[10]:=
Map[Mean,Map[Function[rowvec,Select[rowvec,Function[elem,
      Greater[elem,100]]]],Transpose[lst]]]

Out[10]=
      965  1111  875
{155, ---, ----, ---}
       7    7     6

Now, let us see why we need to transpose the matrix first. Below, we
can see the original matrix. By visual inspection, we notice that the
first column has five values that are greater than 100, and the
second, third, and fourth columns have seven, seven, and six values
greater than 100, respectively.
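
The visual count can be double-checked with a short sketch, assuming
lst is defined as above. It uses *Count* with a conditional pattern,
which is my choice for this check and not part of the original
expressions.

(* number of entries greater than 100 in each column of lst *)
(Count[#1, x_ /; x > 100] & ) /@ Transpose[lst]
(* {5, 7, 7, 6} *)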

In[2]:=
TableForm[lst]

Out[2]//TableForm=
109   168   173   109

4     143   200   90

181   162   85    196

30    108   86    34

94    127   144   34

199   109   195   188

176   34    46    110

95    27    160   109

43    71    130   66

56    148   109   163

110   43    50    53

32    34    16    95

Applying the select clause to the original matrix without
transposition yields the following result: any values less than or
equal to 100 have been discarded; however, the structure of the
original matrix has been changed too. Not only do we have a collection
of row vectors of unequal lengths, but also many elements have shifted
to the left!

In[13]:=
(Select[#1,#1>100&]&)/@lst//TableForm

Out[13]//TableForm=
109   168   173   109

143   200

181   162   196

108

127   144

199   109   195   188

176   110

160   109

130

148   109   163

110
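
To make the change of structure explicit, here is a small sketch,
again assuming lst as defined above, that measures the length of each
filtered row:

(* how many entries survive in each of the 12 original rows *)
Length /@ (Select[#1, #1 > 100 & ] & ) /@ lst
(* {4, 2, 3, 1, 2, 4, 2, 2, 1, 3, 1, 0} *)

The last row is left empty, since none of its entries exceeds 100,
which is why the table above appears to stop after eleven rows.
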

On the other hand, transposing first allows getting the correct values
in each row.

In[12]:=
(Select[#1,#1>100&]&)/@Transpose[lst]//TableForm

Out[12]//TableForm=
109   181   199   176   110

168   143   162   108   127   109   148

173   200   144   195   160   130   109

109   196   188   110   109   163

Finally, mapping *Mean* over the resulting list of lists computes the
mean of each row vector, and each row corresponds to one of our
original columns without the unwanted values. Using *Map* in this way
overrides the default behavior of Mean on a matrix, which is to
compute the means column by column.
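
As a quick sanity check (a sketch only, using the filtered first
column shown above and assuming lst is defined as before):

(* mean of the first filtered row, i.e. the first column of lst *)
Total[{109, 181, 199, 176, 110}]/Length[{109, 181, 199, 176, 110}]
(* 155 *)

(* for contrast, Mean on the whole matrix averages all 12 entries
   of each column, including those less than or equal to 100 *)
Mean[lst]
(* {1129/12, 587/6, 697/6, 1247/12} *)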

In[15]:=
Mean/@(Select[#1,#1>100&]&)/@Transpose[lst]//Trace

Well, I hope that I have not been too long and too obscure in my
explanation and that I have indeed shed some light on Mathematica
programming.

Best regards,
Jean-Marc
