Re: breaking up lists into intervals

• To: mathgroup at smc.vnet.net
• Subject: [mg23144] Re: breaking up lists into intervals
• From: "Allan Hayes" <hay at haystack.demon.co.uk>
• Date: Thu, 20 Apr 2000 03:21:03 -0400 (EDT)
• References: <8d964f$ie1@smc.vnet.net>
• Sender: owner-wri-mathgroup at wolfram.com

```
Fred Simons and Daniel Lichtblau both posted the solution

bin1[data_] := Split[Sort[data], Floor[#1[[1]]/5] == Floor[#2[[1]]/5] &]

Daniel remarked that some of the extra sorting might not be wanted.
The following is rather more than twice as fast on my timings and bins
without any extra sorting: within each bin the points keep their original
order. The result of bin1 can be quickly obtained from it by sorting each bin.

bin2[data_] :=
  Part[data, #[[All, 2]]] & /@
    Split[
      Sort[Transpose[{data[[All, 1]]/5 // Floor, Range[Length[data]]}]],
      First[#1] == First[#2] &]
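
(* A step-by-step sketch of what bin2 does, on a small made-up list.
   The names small, keys, tagged and groups are illustrative only and
   are not part of the posted solutions. *)

small = {{7, 1, 1}, {2, 2, 2}, {9, 3, 3}, {1, 4, 4}};

(* 1. bin number of each point, from its x value *)
keys = Floor[small[[All, 1]]/5]
(* {1, 0, 1, 0} *)

(* 2. pair each bin number with the point's position, then sort the pairs;
      only these small integer pairs are sorted, not the points themselves *)
tagged = Sort[Transpose[{keys, Range[Length[small]]}]]
(* {{0, 2}, {0, 4}, {1, 1}, {1, 3}} *)

(* 3. split into runs with the same bin number *)
groups = Split[tagged, First[#1] == First[#2] &]
(* {{{0, 2}, {0, 4}}, {{1, 1}, {1, 3}}} *)

(* 4. use the stored positions to pick the points out of the original list *)
Part[small, #[[All, 2]]] & /@ groups
(* {{{2, 2, 2}, {1, 4, 4}}, {{7, 1, 1}, {9, 3, 3}}} *)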

Tests:

- Correctness:

data = Table[Random[Integer, 14], {12}, {3}]

{{4, 1, 14}, {14, 4, 6}, {14, 3, 2}, {13, 9, 0}, {14, 11, 5}, {2, 8, 5},
 {14, 7, 0}, {1, 0, 7}, {12, 1, 11}, {12, 5, 11}, {2, 11, 10}, {0, 10, 9}}

s1 = bin1[data]

{{{0, 10, 9}, {1, 0, 7}, {2, 8, 5}, {2, 11, 10}, {4, 1, 14}},
 {{12, 1, 11}, {12, 5, 11}, {13, 9, 0}, {14, 3, 2}, {14, 4, 6}, {14, 7, 0},
  {14, 11, 5}}}

s2 = bin2[data]

{{{4, 1, 14}, {2, 8, 5}, {1, 0, 7}, {2, 11, 10}, {0, 10, 9}},
 {{14, 4, 6}, {14, 3, 2}, {13, 9, 0}, {14, 11, 5}, {14, 7, 0}, {12, 1, 11},
  {12, 5, 11}}}

Sort /@ s2 == s1

True

- Timings:

data = Table[Random[Integer, 50], {5000}, {3}];

(s1 = bin1[data]); // Timing // First

3.02 Second

(s2 = bin2[data]); // Timing // First

1.32 Second

We can quickly construct s1 from s2:

Sort /@ s2; // Timing // First

0.05 Second
--
Allan
---------------------
Allan Hayes
Mathematica Training and Consulting
Leicester UK
www.haystack.demon.co.uk
hay at haystack.demon.co.uk
Voice: +44 (0)116 271 4198
Fax: +44 (0)870 164 0565

<Matt.Johnson at autolivasp.com> wrote in message
news:8d964f$ie1 at smc.vnet.net...
>
>
> Dear Mathgroup:
>
> I have many large datasets of {x,y,z} data that I wish to break into small data
> sets based on the value of x.  For example, the x value ranges from 0 to 100 and
> I want to break up the data into 20 groups, from 0-5, 5-10, 10-15, etc.  There
> will be an unequal number of data points in each interval.  I have written a
> routine based on several Do loops to do this and it works satisfactorily.
> However, I would think that there is a way to eliminate from the data set the
> points that have already been placed in their appropriate intervals, or a
> routine that would "place" the point in the appropriate group, only having to go
> through the datasets once.  Either of these options would speed up the process.
> Currently the routine goes through each complete dataset as many times as there
> are intervals created.  Here is the current code:
>
> Do[Do[Do[
>      If[ i-0.5 di<=dataset[j][[k,1]]<=i+0.5 di,
>      AppendTo[group[j,i],dataset[j][[k]] ]],
>      {k, Length[dataset[j]]}],
>      {i, imin, imax, di}], {j,jmax}]
>
> There are j datasets with k points in each dataset.  i serves as the index for
> the intervals, according to the x value, with an interval size of di.
> It creates (imax-imin)/di intervals in each dataset.
>
> Thanks for any help
>
> Matt Johnson
>
>
>

```
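
The original question asked for a routine that "places" each point in its group while going through a dataset only once. A minimal sketch of that idea follows, assuming Mathematica 10 or later, where GroupBy and Lookup are available (neither existed when this thread was written). The helper name binByInterval and its arguments xmin, di and n are hypothetical, and the bins are taken to be half-open intervals rather than the closed, overlapping ones in the quoted Do loop.

```mathematica
(* Hypothetical helper, not from the thread.
   Bin k (k = 1..n) holds the points with
   xmin + (k - 1) di <= x < xmin + k di.
   GroupBy visits each point once; Lookup returns the bins in order
   and supplies {} for any bin that received no points.
   Points whose x falls outside all n bins are dropped. *)
binByInterval[data_, xmin_, di_, n_] :=
  Lookup[
    GroupBy[data, Floor[(First[#] - xmin)/di] + 1 &],
    Range[n], {}]

pts = {{7, 1, 1}, {2, 2, 2}, {12.5, 3, 3}, {1, 4, 4}};
binByInterval[pts, 0, 5, 4]
(* {{{2, 2, 2}, {1, 4, 4}}, {{7, 1, 1}}, {{12.5, 3, 3}}, {}} *)
```

As with bin2, the points in each bin keep their original order. Unlike bin1 and bin2, empty intervals appear explicitly as {}, which may or may not be wanted.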
