MathGroup Archive 2002

Re: grouping and averaging {x,y} pairs of data

  • To: mathgroup at smc.vnet.net
  • Subject: [mg37215] Re: grouping and averaging {x,y} pairs of data
  • From: Daniel Lichtblau <danl at wolfram.com>
  • Date: Thu, 17 Oct 2002 00:08:36 -0400 (EDT)
  • References: <200210161826.OAA10621@smc.vnet.net>
  • Sender: owner-wri-mathgroup at wolfram.com

"David E. Burmaster" wrote:
> 
> Dear Fellows in MathGroup,
> 
> I have a list of 17,000+ {x,y} pairs of data
> 
>         each x value is a positive integer from 1 to 100+
> 
>         each y value is a positive real number
> 
> As a *short* example, let's consider:
> 
>  data = {{3,1},{4,3},{3,2},{1,10},{4,2},{1,6},{5,2},{2,5},{7,1}}
> 
> I want to group the data by the x value and report the arithmetic average
> of the y values in each group.
> 
> For the example, I want to report:
> 
>  output = {{1,8},{2,5},{3,1.5},{4,2.5},{5,2},{6,0},{7,1}}
> 
> In this example, x=6 does not occur so I report the average y[6] = 0.
> 
> Can anyone suggest a way to do this efficiently?
> 
> many thanks
> dave
> 
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> David E. Burmaster, Ph.D.
> Alceon Corporation
> POBox 382069                 (new Box number effective 1 Sep 2001)
> Harvard Square Station
> Cambridge, MA 02238-2069     (new ZIP code effective 1 Sep 2001)
> 
> Voice   617-864-4300
> 
> Web     http://www.Alceon.com
> Email   deb at Alceon.com
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Probably the most efficient approach is to make one pass over the list,
accumulating a count and a running sum of the y values for each x value,
and then divide each sum by its count.

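averageByBin[data:{{_,_}..}] := Module[
  {len, binsizes, sums},
  (* the largest x value determines the number of bins *)
  len = Max[Map[First, data]];
  binsizes = Table[0, {len}];
  sums = Table[0, {len}];
  (* one pass: count the entries and total the y values for each x *)
  Scan[
    (binsizes[[#[[1]]]]++;
      sums[[#[[1]]]] += #[[2]]) &, data];
  (* give empty bins size 1 so their average is 0 rather than 0/0 *)
  Do[If[binsizes[[j]] == 0, binsizes[[j]] = 1], {j, len}];
  Transpose[{Range[len], N[sums]/binsizes}]
  ]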

In[30]:= averageByBin[data]
Out[30]= {{1, 8.}, {2, 5.}, {3, 1.5}, {4, 2.5}, {5, 2.}, {6, 0.}, {7, 1.}}

If you split the pairs into a list of the integer x values and a list of
the real y values, you can pass the two lists to a function built with
Compile, which should make it faster still.
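
A minimal sketch of that Compile idea, assuming the x values arrive as a
list of positive integers and the y values as a list of reals of the same
length. The names averageByBinCompiled, xvals and yvals are made up for
illustration, and the speedup has not been measured here.

```mathematica
averageByBinCompiled = Compile[{{xs, _Integer, 1}, {ys, _Real, 1}},
  Module[{len = Max[xs], counts, sums, k = 0},
    counts = Table[0, {len}];
    sums = Table[0., {len}];
    (* one pass: count the entries and total the y values for each x *)
    Do[
      k = xs[[i]];
      counts[[k]] = counts[[k]] + 1;
      sums[[k]] = sums[[k]] + ys[[i]],
      {i, Length[xs]}];
    (* empty bins report 0. rather than 0/0 *)
    Table[If[counts[[j]] == 0, 0., sums[[j]]/counts[[j]]], {j, len}]
  ]
];

(* usage on the short example; N turns the y values into machine reals *)
{xvals, yvals} = Transpose[data];
Transpose[{Range[Max[xvals]], averageByBinCompiled[xvals, N[yvals]]}]
```

On the short example this should give the same pairs as averageByBin[data].
Here the empty bins are handled in the final Table rather than by bumping
their counts to 1, so that both branches of the If return a real number.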


Daniel Lichtblau
Wolfram Research

