Re: More Efficient Method
- To: mathgroup at smc.vnet.net
- Subject: [mg105104] Re: [mg105076] More Efficient Method
- From: Daniel Lichtblau <danl at wolfram.com>
- Date: Sat, 21 Nov 2009 03:33:13 -0500 (EST)
- References: <200911201138.GAA03384@smc.vnet.net>
blamm64 wrote:
> I have a couple of functions designed to poke a single hole, and to
> poke multiple holes, in a one-level list:
>
> We define a function which, given the imported pressure data, finds
> the subset of that pressure data excluding the pressure data points
> between "targetL" and "targetU".
>
> In[5]:= findsubset[data_?VectorQ,targetL_?NumericQ,targetU_?NumericQ] :=
>   Select[data,(#<=targetL || #>=targetU &)]
>
> This function will pluck out multiple holes in the data list.
>
> In[6]:= subsets[data_?VectorQ,tarList_?ListQ]:=Module[{tmp,tmp1},
> tmp=data;
> Do[tmp1=findsubset[tmp,tarList[[i,1]],tarList[[i,2]]];tmp=tmp1,
> {i,Dimensions[tarList][[1]]}];
> tmp
> ]
>
> The following works fine (big holes chosen not to give large result):
>
> In[7]:= datalist=Range[11,3411,10];
>
> In[12]:= targetlist={{40, 1500},{1600,3300}};
>
> In[13]:= resultdata=subsets[datalist,targetlist]
>
> Out[13]=
> {11,21,31,1501,1511,1521,1531,1541,1551,1561,1571,1581,1591,3301,3311,3321,3331,3341,3351,3361,3371,3381,3391,3401,3411}
>
> But if "datalist" happens to be very large, surely there is a (much)
> more efficient method?
>
> I tried unsuccessfully to use pure functions with Select, but have a
> somewhat nebulous feeling there's a pure function way of doing this
> effectively much more efficiently.
>
> I know, I know: the above have no consistency checking. I also know
> "subsets" could be used in place of "findsubset" just by replacing the
> call of "findsubset" with the code of "findsubset" in "subsets".
>
> From what I've seen on this forum there are some really experienced
> people who might provide an efficient way of implementing the above.
>
> -Brian L.

If you are working with integers then the method below should be fine.
Otherwise you may need to "fuzzify" a bit differently: the half-unit
shift turns each open interval (targetL, targetU) into a closed interval
that contains exactly the same integers, which is no longer true for
non-integer data.

I use IntervalMemberQ to determine which elements in the data list to
omit, and then do the selection using Select (I tried Pick, and it was
perhaps half a hair slower).
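
To see what the half-unit shift does, here is a small sketch using the two holes from the original post. The variable name intv mirrors the one in the definition that follows; the test values are chosen only for illustration and are not part of the original exchange.

```mathematica
(* Shrink each hole by 0.5 at both ends, so that integer endpoints
   (which the original findsubset keeps) fall outside the interval. *)
intv = Apply[Interval, Map[# + {.5, -.5} &, {{40, 1500}, {1600, 3300}}]]
(* Interval[{40.5, 1499.5}, {1600.5, 3299.5}] *)

(* A single Interval object can hold several disjoint pieces, so one
   membership test per element covers every hole at once. *)
Map[IntervalMemberQ[intv, #] &, {40, 41, 1500, 1550, 2000}]
(* {False, True, False, False, True} *)
```

Elements that test True lie inside a hole and are the ones to drop. Because all holes are checked in one pass over the data, the list is not rescanned once per hole as it is in the Do loop of "subsets".
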

subsets2[data_?VectorQ,tarList_?ListQ] := Module[
  {intv=Apply[Interval,Map[#+{.5,-.5}&,tarList]]},
  Select[data, !IntervalMemberQ[intv,#]&]]

Here is a quick but slightly large test.

datalist = RandomInteger[11000,100000];
targetlist = Table[{n,n+20}, {n,100,10000,100}];

In[47]:= Timing[resultdata = subsets[datalist,targetlist];]
Out[47]= {14.4878, Null}

In[48]:= Timing[resultdata2 = subsets2[datalist,targetlist];]
Out[48]= {0.179973, Null}

In[49]:= resultdata === resultdata2
Out[49]= True

In[50]:= Length[resultdata2]
Out[50]= 82596

Daniel Lichtblau
Wolfram Research
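
For reference, one way the Pick variant mentioned above might look. This is only a sketch and not necessarily the code that was actually timed; the name subsetsPick and the local variable inside are hypothetical.

```mathematica
(* Hypothetical Pick-based variant of subsets2, for integer data only. *)
subsetsPick[data_?VectorQ, tarList_?ListQ] := Module[
  {intv = Apply[Interval, Map[# + {.5, -.5} &, tarList]], inside},
  (* inside[[k]] is True when data[[k]] falls in one of the holes *)
  inside = Map[IntervalMemberQ[intv, #] &, data];
  (* keep the elements whose flag is False *)
  Pick[data, inside, False]]
```

Under the same integer assumption, subsetsPick[Range[11,3411,10], {{40,1500},{1600,3300}}] should reproduce the Out[13] list from the original post.
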
- References:
- More Efficient Method
- From: blamm64 <blamm64@charter.net>