Re: quartiles

*To*: mathgroup at smc.vnet.net
*Subject*: [mg18254] Re: [mg18214] quartiles
*From*: "Tomas Garza" <tgarza at mail.internet.com.mx>
*Date*: Thu, 24 Jun 1999 14:24:36 -0400
*Sender*: owner-wri-mathgroup at wolfram.com

Tom De Vries [tdevries at shop.westworld.ca] wrote:

<snip>...<snip>

> salaries =
> {250000,100000,60000,60000,40000,40000,40000,40000,25000,20000,20000,20000,
> 18000,16000,16000}

<snip>...<snip>

> At this point I am probably revealing my ignorance of statistics....
>
> 250000,100000,60000,60000,40000,40000,40000
> 40000,     Median
> 25000,20000,20000,20000,18000,16000,16000
>
> The lower quartile is the median of the values below the median,
> which I get with Mathematica
> 25000,20000,20000,
> 20000,
> 18000,16000,16000
>
> The upper quartile should be the median of the numbers above the
> median, so why is it 55000?
> 250000,100000,60000,
> 60000,
> 40000,40000,40000
>
> Does Mathematica use some algorithm to get rid of outliers before finding
> quartiles, or does it eliminate the median from the data set before finding
> the quartiles, .....?

Tom,

I'm afraid the problem is that you have to sort your data in increasing order before manually obtaining the quartiles -- or any other quantile, for that matter. The ordered data is

{16000, 16000, 18000, 20000, 20000, 20000, 25000, 40000, 40000, 40000, 40000, 60000, 60000, 100000, 250000}

and then the sample median is any value greater than 25000 and less than or equal to 40000 (so as to leave 50% of the ordered values to its left). The third sample quartile is, again, any value between 40000 and 60000. Why Mathematica decides they have to be precisely 40000 and 55000, respectively, is a mystery to me. All this belongs to the theory of Order Statistics, a rather tricky subject.

Tomas Garza
Mexico City
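
For readers who want to check the numbers, here is a minimal sketch in Python (not Mathematica) that works from the sorted data. It first applies the median-of-halves rule from the quoted question, then tries one linear-interpolation convention, with the quantile placed at position n*p + 1/2 in the ordered sample. That convention is only an assumption chosen for illustration; nothing in this thread confirms it is what Mathematica does, although it happens to give 40000 and 55000 for this data. The helper name `interpolated_quantile` is hypothetical.

```python
import math
from statistics import median

salaries = [250000, 100000, 60000, 60000, 40000, 40000, 40000, 40000,
            25000, 20000, 20000, 20000, 18000, 16000, 16000]

# Quantiles are defined on the ordered sample, so sort first.
ordered = sorted(salaries)
n = len(ordered)

# Median-of-halves rule from the quoted question.
# With an odd count, the median itself is left out of both halves.
mid = n // 2
lower_half = ordered[:mid]
upper_half = ordered[mid + 1:] if n % 2 else ordered[mid:]

q1_halves = median(lower_half)
q2 = median(ordered)
q3_halves = median(upper_half)


def interpolated_quantile(sorted_values, p):
    """Linear interpolation at 1-based position n*p + 1/2.

    This convention is an assumption made for illustration only.
    """
    count = len(sorted_values)
    pos = min(max(count * p + 0.5, 1), count)  # clamp to [1, count]
    lo = math.floor(pos)
    hi = min(lo + 1, count)
    frac = pos - lo
    below = sorted_values[lo - 1]
    above = sorted_values[hi - 1]
    return below + frac * (above - below)


q1_interp = interpolated_quantile(ordered, 0.25)
q3_interp = interpolated_quantile(ordered, 0.75)

print("median-of-halves:", q1_halves, q2, q3_halves)
print("interpolated:    ", q1_interp, q2, q3_interp)
```

Under the median-of-halves rule the third quartile is 60000, while the assumed interpolation rule lands at position 11.75, three quarters of the way from 40000 to 60000. Both values lie in the interval between 40000 and 60000 mentioned above, which is why different conventions can disagree without either being wrong.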