Re: Conditions with Statistical Functions
- To: mathgroup at smc.vnet.net
- Subject: [mg66162] Re: Conditions with Statistical Functions
- From: "Jean-Marc Gulliet" <jeanmarc.gulliet at gmail.com>
- Date: Tue, 2 May 2006 02:43:16 -0400 (EDT)
- References: <16568608.1146308002406.JavaMail.root@eastrmwml01.mgt.cox.net> <e31vb8$hgd$1@smc.vnet.net> <4454EC45.8030009@gmail.com> <B242D500-B1BF-4293-BB70-3C9C562AFF4D@videotron.ca>
- Sender: owner-wri-mathgroup at wolfram.com
On 5/1/06, Gregory Lypny <gregory.lypny at videotron.ca> wrote:
> Thanks again, Jean-Marc.
>
> This is elegant. As I mentioned before, I'm facing an uphill battle with
> Mathematica's syntax, and in particular, the meaning and placement of things
> like #, #1, or #[[1]], &, @, @@, @@@, /@, /. . It's all a bit overwhelming!
>
> I think I understand your second version better, so I'll work with it first.
> If I'm not mistaken, /@ tells the Mean function to map over its argument
> and #1 directs that mapping to the first argument, which for Mean applied to
> a matrix is a vector. What I never would have gotten on my own is the use
> of the second ampersand in parenthesis. Is that meant to connect Select
> with Transpose? I'm also not quite clear on why we need Transpose because
> Mean operates on columns anyway.
>
> Regards,
>
> Greg
>
>
>
> On Sun, Apr 30, 2006, at 12:56 PM, Jean-Marc Gulliet wrote:
>
>
> In[3]:=
> Mean /@ (Transpose[lst /. x_ /; x <= 100 -> 0] /. 0 -> Sequence[])
>
> Out[3]=
> {125, 1311/8, 1450/9, 527/3}
>
> In[4]:=
> Mean /@ (Select[#1, #1 > 100 & ] & ) /@ Transpose[lst]
>
> Out[4]=
> {125, 1311/8, 1450/9, 527/3}
>
> Best regards,
>
> Jean-Marc
>
Hi Gregory,
Let us try to follow gradually what is going on. Say that our data set
is a 12 by 4 matrix of integer entries:
lst={{109,168,173,109},{4,143,200,90},{181,162,85,196},
{30,108,86,34},{94,127,144,34},{199,109,195,188},
{176,34,46,110},{95,27,160,109},{43,71,130,66},
{56,148,109,163},{110,43,50,53},{32,34,16,95}}
First, we will focus on the second expression, which uses the *Select* function.
Mean/@(Select[#1,#1>100&]&)/@Transpose[lst]
Out[3]=
{155, 965/7, 1111/7, 875/6}
If we look at the full form (Mathematica's internal representation) of
the above expression, we see that (Select[#1, #1 > 100 & ] & ) is an
expression made of nested pure functions. Pure functions are
equivalent to anonymous functions in, say, LISP. The #n's (or
Slot[n]'s) are placeholders for the arguments, and a pure function
definition ends with an ampersand character '&'. In our function, it is
really important to realize that the first #1 has nothing to do with
the second #1 because they belong to different function
definitions (this is clearer in the full form of the expression).
In[6]:=
FullForm[HoldForm[Mean /@ (Select[#1, #1 > 100 & ] & ) /@ Transpose[lst]]]
Out[6]//FullForm=
HoldForm[Map[Mean,
Map[Function[Select[Slot[1],Function[Greater[Slot[1],100]]]],
Transpose[lst]]]]
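The equivalence between the shorthand and the explicit forms can be checked on toy inputs. The lines below are a minimal sketch; the numbers and the names x, row and elem are made up for illustration and are not part of the data set above.

```mathematica
(* Three equivalent ways to write "square the argument" *)
Function[x, x^2][3]      (* named variable: returns 9 *)
Function[#1^2][3]        (* slot notation:  returns 9 *)
(#1^2 &)[3]              (* shorthand:      returns 9 *)

(* Nested pure functions: the outer #1 stands for the whole list,
   the inner #1 stands for one element of that list at a time *)
(Select[#1, #1 > 100 &] &)[{109, 4, 181}]    (* returns {109, 181} *)

(* The same nested function with explicit (hypothetical) names *)
Function[row, Select[row, Function[elem, elem > 100]]][{109, 4, 181}]
(* returns {109, 181} *)
```

Each ampersand closes exactly one pure function, which is why the expression needs two of them: one for the test handed to Select and one for the function that is mapped over the rows.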
Note that we could have written the pure functions with explicit "local"
variable names, as in the line below. *Select* works on each row
of the transposed matrix and applies the test to each element of that row.
In[10]:=
Map[Mean,Map[Function[rowvec,Select[rowvec,Function[elem,
Greater[elem,100]]]],Transpose[lst]]]
Out[10]=
{155, 965/7, 1111/7, 875/6}
Now, let us see why we need to transpose the matrix first. Below, we
can see the original matrix. By visual inspection, we notice that the
first column has five values that are greater than 100, and the
second, third and fourth columns have seven, seven, and six values
greater than 100, respectively.
In[2]:=
TableForm[lst]
Out[2]//TableForm=
109 168 173 109
4 143 200 90
181 162 85 196
30 108 86 34
94 127 144 34
199 109 195 188
176 34 46 110
95 27 160 109
43 71 130 66
56 148 109 163
110 43 50 53
32 34 16 95
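The visual count can also be checked in code. The following is a small sketch rather than part of the original session: it repeats the definition of lst so that it runs on its own, and it uses the built-in Count with a conditional pattern.

```mathematica
lst = {{109,168,173,109},{4,143,200,90},{181,162,85,196},
       {30,108,86,34},{94,127,144,34},{199,109,195,188},
       {176,34,46,110},{95,27,160,109},{43,71,130,66},
       {56,148,109,163},{110,43,50,53},{32,34,16,95}};

(* Transpose turns each column into a row; Count then tallies the
   entries of that row that are greater than 100 *)
(Count[#1, x_ /; x > 100] &) /@ Transpose[lst]
(* returns {5, 7, 7, 6} *)
```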
Applying the select clause to the original matrix without
transposition yields the following result: any values less than or
equal to 100 have been discarded; however, the structure of the
original matrix has been changed too. Not only do we have a collection
of row vectors of unequal lengths, but also many elements have shifted
to the left!
In[13]:=
(Select[#1,#1>100&]&)/@lst//TableForm
Out[13]//TableForm=
109 168 173 109
143 200
181 162 196
108
127 144
199 109 195 188
176 110
160 109
130
148 109 163
110
On the other hand, transposing first allows getting the correct values
in each row.
In[12]:=
(Select[#1,#1>100&]&)/@Transpose[lst]//TableForm
Out[12]//TableForm=
109 181 199 176 110
168 143 162 108 127 109 148
173 200 144 195 160 130 109
109 196 188 110 109 163
Finally, mapping *Mean* over the resulting list of lists computes the
mean of each row vector; each row corresponds to one of our original
columns without the unwanted values. (Applied directly to a matrix,
Mean works column by column; using *Map* makes it work row by row.)
In[15]:=
Mean/@(Select[#1,#1>100&]&)/@Transpose[lst]//Trace
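The difference between applying Mean directly and mapping it can be seen on a small rectangular matrix. This is only an illustrative sketch; the matrix m is made up and is not part of the data above.

```mathematica
m = {{1, 2}, {3, 4}, {5, 6}};

Mean[m]                  (* column means: {3, 4} *)
Mean /@ m                (* row means:    {3/2, 7/2, 11/2} *)
Mean /@ Transpose[m]     (* column means again, one row at a time: {3, 4} *)
```

For a full matrix the first and third lines agree. Once Select has removed some entries, the rows no longer have equal lengths, so only the mapped form, which treats each row independently, still gives one mean per original column.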
Well, I hope that I have not been too long or too obscure in my
explanation and that I have indeed shed some light on
Mathematica programming.
Best regards,
Jean-Marc