Re: Basic Stat Question
- To: mathgroup at smc.vnet.net
- Subject: [mg42623] Re: Basic Stat Question
- From: google at scholar.freesurf.fr (Irasban)
- Date: Fri, 18 Jul 2003 05:25:19 -0400 (EDT)
- References: <bf5koe$mig$1@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
If you have your data in the form
heightTable = {{height1, age1}, {height2, age2}, {..., ...}, ...}
(no underscore in the name: in Mathematica the underscore is pattern
syntax, so height_table is not a plain symbol), you can use something like
First[ Transpose[ Select[ heightTable, s < Last[#] < t &] ]]
to get a list of all heights corresponding to an age strictly
between s and t.
(Select restricts the list to the pairs whose last, i.e. second,
element satisfies the condition; Transpose and First are then
used to get rid of the ages.)
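As a quick check of the Select / Transpose / First idiom, here is a small sketch on made-up numbers. The name sampleData and its values are hypothetical and only there for illustration.

```mathematica
(* hypothetical sample: {height in cm, age in years} *)
sampleData = {{75, 1}, {86, 2}, {95, 3}, {88, 2}, {102, 4}};

(* keep the pairs whose age lies strictly between 1 and 4 *)
kept = Select[sampleData, 1 < Last[#] < 4 &]
(* {{86, 2}, {95, 3}, {88, 2}} *)

(* drop the ages, keep the heights *)
First[Transpose[kept]]
(* {86, 95, 88} *)

(* alternative that also copes with an empty selection *)
Select[sampleData, 10 < Last[#] < 12 &][[All, 1]]
(* {} *)
```

Note that First[Transpose[...]] complains when nothing matches the condition, because there is no first element to take. The [[All, 1]] form simply returns an empty list in that case.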
To compute its mean, if your version of Mathematica is older than 5.0,
here is a crude implementation:
myMean[{}] = Null;
myMean[ k_List] := (Plus@@k / Length[k])
If you have version 5, you can use the kernel Mean function.
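A short sketch of how this definition behaves, using made-up heights. With integer input the result is an exact fraction, so N is applied to get a decimal value.

```mathematica
(* same crude mean as above, repeated so the example is self-contained *)
myMean[{}] = Null;
myMean[k_List] := (Plus @@ k / Length[k])

myMean[{86, 95, 88}]
(* 269/3 *)

N[myMean[{86, 95, 88}]]
(* approximately 89.67 *)

(* the special case for an empty list returns Null, so nothing is printed *)
myMean[{}]
```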
For the gender, if you can put your list in a format like
heightTable = {{height1, age1, gender1}, {height2, age2, gender2},
{..., ..., ...}, ...}
you can add further constraints to the test part of
the Select command and do something like this
First[ Transpose[ Select[ heightTable,
s < #[[2]] < t && #[[3]] == Boy &] ]]
to get the list of heights of the boys.
( && is the logical AND between the two conditions )
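Here is a minimal sketch of the combined age and gender test on made-up data. The name sampleData3 and its values are hypothetical, and Boy and Girl are left as undefined symbols.

```mathematica
(* hypothetical sample: {height in cm, age in years, gender} *)
sampleData3 = {{75, 1, Boy}, {86, 2, Girl}, {95, 3, Boy},
   {88, 2, Boy}, {102, 4, Girl}};

(* boys whose age lies strictly between 1 and 4 *)
boys = Select[sampleData3, 1 < #[[2]] < 4 && #[[3]] == Boy &]
(* {{95, 3, Boy}, {88, 2, Boy}} *)

First[Transpose[boys]]
(* {95, 88} *)
```

With undefined symbols, Girl == Boy stays unevaluated instead of giving False. Select keeps only the elements for which the test gives True, so the result is still what you want. If you prefer a test that always gives True or False, === (SameQ) can be used in place of ==.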
You can either use undefined symbols to code for gender (as I assume
here) or directly choose a numerical convention
such as {1,2} for {Boy,Girl} (keeping 3 and other values for
possibilities such as Klingon, Kwisatz Haderach, Tyloon, Seraphim,
Platypus, ...)
Another caveat: if your ages are strictly integers, you had better
change one of the < inequalities into a <= in the Select clause.
Otherwise both bounds are excluded, and a range such as 1 < age < 2
selects nothing at all.
To loop over all ages between two bounds you can use a Table construct:
Table[ myMean @ First[Transpose[
Select[ heightTable, age <= #[[2]] < age+1 &] ]], {age, 1, 18}]
To keep each age next to its mean, you can even do this:
Table[ {age, myMean @ First[Transpose[
Select[ heightTable, age <= #[[2]] < age+1 &] ]]}, {age, 1, 18}]
Both versions can be fed directly to ListPlot.
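To see the whole loop at work, here is a sketch on a tiny made-up data set covering ages 1 to 4 only. The names sampleData and meansByAge are hypothetical, and every age in the range has at least one entry so that First never meets an empty selection.

```mathematica
(* crude mean, repeated so the example is self-contained *)
myMean[{}] = Null;
myMean[k_List] := (Plus @@ k / Length[k])

(* hypothetical sample: {height in cm, age in years} *)
sampleData = {{75, 1}, {86, 2}, {95, 3}, {88, 2}, {102, 4}};

(* one {age, mean height} pair per age *)
meansByAge = Table[{age, myMean @ First[Transpose[
      Select[sampleData, age <= #[[2]] < age + 1 &]]]}, {age, 1, 4}]
(* {{1, 75}, {2, 87}, {3, 95}, {4, 102}} *)

(* plot mean height against age *)
ListPlot[meansByAge]
```

On real data an age with no entries would make First fail, so it is worth checking the range of ages before running the Table.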
Be sure to check the following Standard Packages:
Graphics`Graphics` with the Histogram plotting function
Statistics`DataManipulation` for utilities to clean data, cut and add
rows and columns
Statistics`DescriptiveStatistics` for classical statistics functions
Statistics`NormalDistribution` for classical Gaussian distributions
hope it helps,
Irasban
Moranresearch at aol.com wrote in message news:<bf5koe$mig$1 at smc.vnet.net>...
> I have a list l1= {{xi,yi}}
> I want to find the Mean xi where s <y < t
> for a series of s and t.
>
> So say I have the height of a million children and I want to know the average
> height at age 1, 2, 3...18. How do I do this?
>
> Also the data a field "Gender" how do I select the subset "Girls" or "Boys"
> to analyze.
> Thank you.
>
> John