Re: breaking up lists into intervals

*To*: mathgroup at smc.vnet.net
*Subject*: [mg23092] Re: [mg23074] breaking up lists into intervals
*From*: Daniel Lichtblau <danl at bank.wolfram.com>
*Date*: Sun, 16 Apr 2000 00:37:33 -0400 (EDT)
*References*: <200004150700.DAA18470@smc.vnet.net>
*Sender*: owner-wri-mathgroup at wolfram.com

Matt.Johnson at autolivasp.com wrote:
>
> Dear Mathgroup:
>
> I have many large datasets of {x,y,z} data that I wish to break into small data
> sets based on the value of x. For example, the x value ranges from 0 to 100 and
> I want to break up the data into 20 groups, from 0-5, 5-10, 10-15, etc. There
> will be an unequal number of data points in each interval. I have written a
> routine based on several Do loops to do this and it works satisfactorily.
> However, I would think that there is a way to eliminate from the data set the
> points that have already been placed in their appropriate intervals, or a
> routine that would "place" the point in the appropriate group, only having to go
> through the datasets once. Either of these options would speed up the process.
> Currently the routine goes through each complete dataset as many times as there
> are intervals created. Here is the current code:
>
> Do[Do[Do[
>       If[ i-0.5 di<=dataset[j][[k,1]]<=i+0.5 di,
>         AppendTo[group[j,i],dataset[j][[k]] ]],
>       {k, Length[dataset[j]]}],
>     {i, imin, imax, di}], {j,jmax}]
>
> There are j datasets with k points in each dataset. i serves as the index for
> the intervals, according to the x value, with an interval size of di.
> It creates (imax-imin)/di intervals in each dataset.
>
> Thanks for any help
>
> Matt Johnson

Here is one method.

    data = Table[Random[Real,100], {30}, {3}];
    Split[Sort[data], Floor[#1[[1]]/5]==Floor[#2[[1]]/5]&]

There are asymptotically faster methods (avoiding Sort) but in practice
you will probably not get better speed from them. For example, assuming
you know the maximum size is 100 (so there are 20 bins) you can do as
below.

    bin[data_] := Module[
      {bins=Table[{},{20}], indx, j, len=Length[data]},
      Do [
        indx = Floor[data[[j,1]]/5]+1;
        bins[[indx]] = {bins[[indx]], data[[j]]},
        {j, len}
        ];
      Map[Partition[Flatten[#],3]&, bins]
      ]

Note that this does not reorder within bins, unlike the method that uses
Sort.
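One edge case in the index computation above is worth a sketch: a point whose x value is exactly 100 gives Floor[100/5]+1 = 21, which falls outside the 20 bins. The following is a minimal guard, assuming the top edge should land in the last bin. The names binIndex, width and nbins are hypothetical and not part of the code above.

```mathematica
(* Hypothetical helper, not from the original code: compute the 1-based bin
   index for an x value and clamp it so that x equal to the upper limit
   lands in the last bin. *)
binIndex[x_, width_, nbins_] := Min[Floor[x/width] + 1, nbins]

binIndex[0., 5, 20]      (* 1 *)
binIndex[99.9, 5, 20]    (* 20 *)
binIndex[100., 5, 20]    (* 20; without Min this would be 21 *)
```

Under that assumption, the line that sets indx in the Do loop could call binIndex[data[[j,1]], 5, 20] instead.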
Assuming your data is all real-valued, as above, you could likely gain
speed by using Compile.

Daniel Lichtblau
Wolfram Research

**References**:
**breaking up lists into intervals**
*From:* Matt.Johnson@autolivasp.com