MathGroup Archive 2000

Re: breaking up lists into intervals

  • To: mathgroup at smc.vnet.net
  • Subject: [mg23092] Re: [mg23074] breaking up lists into intervals
  • From: Daniel Lichtblau <danl at bank.wolfram.com>
  • Date: Sun, 16 Apr 2000 00:37:33 -0400 (EDT)
  • References: <200004150700.DAA18470@smc.vnet.net>
  • Sender: owner-wri-mathgroup at wolfram.com

Matt.Johnson at autolivasp.com wrote:
> 
> Dear Mathgroup:
> 
> I have many large datasets of {x,y,z} data that I wish to break into small data
> sets based on the value of x.  For example, the x value ranges from 0 to 100 and
> I want to break up the data into 20 groups, from 0-5, 5-10, 10-15, etc.  There
> will be an unequal number of data points in each interval.  I have written a
> routine based on several Do loops to do this and it works satisfactorily.
> However, I would think that there is a way to eliminate from the data set the
> points that have already been placed in their appropriate intervals, or a
> routine that would "place" the point in the appropriate group, only having to go
> through the datasets once.  Either of these options would speed up the process.
> Currently the routine goes through each complete dataset as many times as there
> are intervals created.  Here is the current code:
> 
> Do[Do[Do[
>      If[ i-0.5 di<=dataset[j][[k,1]]<=i+0.5 di,
>      AppendTo[group[j,i],dataset[j][[k]] ]],
>      {k, Length[dataset[j]]}],
>      {i, imin, imax, di}], {j,jmax}]
> 
> There are j datasets with k points in each dataset.  i serves as the index for
> the intervals, according to the x value, with an interval size of di.
> It creates (imax-imin)/di intervals in each dataset.
> 
> Thanks for any help
> 
> Matt Johnson

Here is one method: sort the data, then split the sorted list wherever
the interval number Floor[x/5] changes.

data = Table[Random[Real,100], {30}, {3}];
Split[Sort[data], Floor[#1[[1]]/5]==Floor[#2[[1]]/5]&]
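
To make the behavior of the Sort/Split one-liner concrete, here is a small worked example. The list name `small` and its four points are made up for illustration and are not from the original post. Because Split only groups adjacent elements, intervals that contain no points simply do not appear in the result.

```mathematica
(* hypothetical sample data: four {x,y,z} points, deliberately unsorted *)
small = {{24.2, 5, 5}, {3.1, 0, 0}, {22.5, 1, 2}, {7.2, 9, 9}};

(* Sort orders by x first; Split starts a new group whenever Floor[x/5] changes *)
Split[Sort[small], Floor[#1[[1]]/5] == Floor[#2[[1]]/5] &]
(* {{{3.1, 0, 0}}, {{7.2, 9, 9}}, {{22.5, 1, 2}, {24.2, 5, 5}}} *)
```

Only three groups come back (for 0-5, 5-10 and 20-25); the empty intervals 10-15 and 15-20 leave no placeholder, so a group's position in the result does not by itself tell you which interval it belongs to.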

There are asymptotically faster methods (they avoid Sort and make a
single pass over the data) but in practice you will probably not get
better speed from them. For example, assuming you know the x values lie
between 0 and 100 (so there are 20 bins of width 5) you can do as below.

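bin[data_] := Module[
  {bins=Table[{},{20}], indx, j, len=Length[data]},
  Do [
    (* bin number 1..20 from the x value; Min keeps x == 100 in bin 20 *)
    indx = Min[Floor[data[[j,1]]/5]+1, 20];
    (* nest instead of appending, so no bin is copied on each insertion *)
    bins[[indx]] = {bins[[indx]], data[[j]]},
    {j, len}
    ];
  (* unravel each nested bin back into a list of {x,y,z} triples *)
  Map[Partition[Flatten[#],3]&, bins]
  ]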
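
As a quick check of what bin returns, the sketch below applies it to a small made-up list (the name `small` and its values are illustrative assumptions, and bin is taken to be defined as above). Unlike the Sort/Split version, the result always has 20 entries, one per interval, with {} for intervals that received no points.

```mathematica
(* hypothetical sample data; assumes bin[] from the post has been evaluated *)
small = {{24.2, 5, 5}, {3.1, 0, 0}, {22.5, 1, 2}, {7.2, 9, 9}};
result = bin[small];

Length[result]   (* 20 *)
result[[5]]      (* {{24.2, 5, 5}, {22.5, 1, 2}}  interval 20-25, input order kept *)
result[[3]]      (* {}  interval 10-15 is empty *)
```

Since position n in the result always corresponds to the interval from 5(n-1) to 5n, the bins can be looked up directly by interval number.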

Note that this does not reorder within bins, unlike the method that uses
Sort. Assuming your data is all real-valued, as above, you could likely
gain speed by using Compile.
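
One way the Compile suggestion might be tried is sketched below. This is an assumption about how to apply it, not code from the original post: the names binIndex and binCompiled are hypothetical, only the index computation is compiled, and whether it is actually faster would have to be measured with Timing on real data.

```mathematica
(* hypothetical helper: compiled bin numbers for a vector of real x values,
   assuming bins of width 5 and at most 20 bins *)
binIndex = Compile[{{xs, _Real, 1}},
  Table[Min[Floor[xs[[k]]/5.] + 1, 20], {k, Length[xs]}]];

(* same nesting trick as bin, but with all indices computed up front *)
binCompiled[data_] := Module[
  {bins = Table[{}, {20}], idx = binIndex[data[[All, 1]]], j},
  Do[
    bins[[idx[[j]]]] = {bins[[idx[[j]]]], data[[j]]},
    {j, Length[data]}
    ];
  Map[Partition[Flatten[#], 3] &, bins]
  ]

(* made-up sample data for a quick comparison against the uncompiled version *)
small = {{24.2, 5, 5}, {3.1, 0, 0}, {22.5, 1, 2}, {7.2, 9, 9}};
binCompiled[small]
```

The sketch relies on the x values being machine reals, as in the random test data; for other input the compiled function may fall back to ordinary evaluation.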

Daniel Lichtblau
Wolfram Research

