Re: breaking up lists into intervals
- To: mathgroup at smc.vnet.net
- Subject: [mg23144] Re: breaking up lists into intervals
- From: "Allan Hayes" <hay at haystack.demon.co.uk>
- Date: Thu, 20 Apr 2000 03:21:03 -0400 (EDT)
- References: <8d964f$ie1@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Fred Simons and Daniel Lichtblau both posted the solution

    bin1[data_] := Split[Sort[data], Floor[#1[[1]]/5] == Floor[#2[[1]]/5] &]

Daniel remarked that some of the extra sorting might not be wanted.

The following is rather more than twice as fast on my timings and bins
without any extra sorting. The result from bin1 can then be quickly
obtained from it.

    bin2[data_] :=
      Part[data, #[[All, 2]]] & /@
        Split[Sort[Transpose[{data[[All, 1]]/5 // Floor,
              Range[Length[data]]}]], First[#1] == First[#2] &]

Tests:

- Correctness:

    data = Table[Random[Integer, 14], {12}, {3}]

    {{4, 1, 14}, {14, 4, 6}, {14, 3, 2}, {13, 9, 0}, {14, 11, 5}, {2, 8, 5},
     {14, 7, 0}, {1, 0, 7}, {12, 1, 11}, {12, 5, 11}, {2, 11, 10}, {0, 10, 9}}

    s1 = bin1[data]

    {{{0, 10, 9}, {1, 0, 7}, {2, 8, 5}, {2, 11, 10}, {4, 1, 14}},
     {{12, 1, 11}, {12, 5, 11}, {13, 9, 0}, {14, 3, 2}, {14, 4, 6},
      {14, 7, 0}, {14, 11, 5}}}

    s2 = bin2[data]

    {{{4, 1, 14}, {2, 8, 5}, {1, 0, 7}, {2, 11, 10}, {0, 10, 9}},
     {{14, 4, 6}, {14, 3, 2}, {13, 9, 0}, {14, 11, 5}, {14, 7, 0},
      {12, 1, 11}, {12, 5, 11}}}

    Sort /@ s2 == s1

    True

- Timings:

    data = Table[Random[Integer, 50], {5000}, {3}];

    (s1 = bin1[data]); // Timing // First

    3.02 Second

    (s2 = bin2[data]); // Timing // First

    1.32 Second

We can quickly construct s1 from s2:

    Sort /@ s2; // Timing // First

    0.05 Second

--
Allan
---------------------
Allan Hayes
Mathematica Training and Consulting
Leicester UK
www.haystack.demon.co.uk
hay at haystack.demon.co.uk
Voice: +44 (0)116 271 4198
Fax: +44 (0)870 164 0565

<Matt.Johnson at autolivasp.com> wrote in message
news:8d964f$ie1 at smc.vnet.net...
>
> Dear Mathgroup:
>
> I have many large datasets of {x,y,z} data that I wish to break into
> small data sets based on the value of x. For example, the x value ranges
> from 0 to 100 and I want to break up the data into 20 groups, from 0-5,
> 5-10, 10-15, etc. There will be an unequal number of data points in each
> interval.
> I have written a routine based on several Do loops to do this and it
> works satisfactorily. However, I would think that there is a way to
> eliminate from the data set the points that have already been placed in
> their appropriate intervals, or a routine that would "place" the point in
> the appropriate group, only having to go through the datasets once.
> Either of these options would speed up the process. Currently the routine
> goes through each complete dataset as many times as there are intervals
> created. Here is the current code:
>
> Do[Do[Do[
>   If[ i-0.5 di<=dataset[j][[k,1]]<=i+0.5 di,
>     AppendTo[group[j,i],dataset[j][[k]] ]],
>   {k, Length[dataset[j]]}],
>   {i, imin, imax, di}], {j,jmax}]
>
> There are j datasets with k points in each dataset. i serves as the
> index for the intervals, according to the x value, with an interval size
> of di. It creates (imax-imin)/di intervals in each dataset.
>
> Thanks for any help
>
> Matt Johnson
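
On the request above for a routine that "places" each point in its group
while going through a dataset only once: the following is a minimal sketch
of that idea, not code from either post. The names binOnce, xmin, di and n
are hypothetical, and it assumes the intervals are [xmin, xmin + di),
[xmin + di, xmin + 2 di), etc., with every x value lying between xmin and
xmin + n di.

```mathematica
(* Hypothetical helper, not from the posts above.
   data: list of {x, y, z} points; xmin: left edge of the first interval;
   di: interval width; n: number of intervals.
   Assumes xmin <= x <= xmin + n di for every point. *)
binOnce[data_, xmin_, di_, n_] :=
  Module[{groups = Table[{}, {n}], i},
    Do[
      (* Interval index of point k; Min puts x == xmin + n di
         into the last interval instead of a nonexistent one. *)
      i = Min[Floor[(data[[k, 1]] - xmin)/di] + 1, n];
      AppendTo[groups[[i]], data[[k]]],
      {k, Length[data]}];
    groups]

binOnce[{{4, 1, 14}, {14, 4, 6}, {2, 8, 5}, {12, 1, 11}}, 0, 5, 3]
(* {{{4, 1, 14}, {2, 8, 5}}, {}, {{14, 4, 6}, {12, 1, 11}}} *)
```

Unlike bin2, this keeps empty intervals as {} and does no sorting at all.
It has not been timed here; AppendTo can be slow on long lists, so the
Sort/Split approach may still be faster on large datasets.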