Re: Function pure for Select

*To*: mathgroup at smc.vnet.net
*Subject*: [mg93153] Re: Function pure for Select
*From*: MattAd <adereth at gmail.com>
*Date*: Wed, 29 Oct 2008 05:48:40 -0500 (EST)
*References*: <ge5qk4$f0o$1@smc.vnet.net>

On Oct 27, 9:41 pm, Bob Hanlon <hanl... at cox.net> wrote:
> One of the reasons for the difference in timings is that the different
> approaches are not all solving the same problem; that is, we interpreted the
> original poster's request ("extend this function to a list of values")
> differently as to the manner of extension. The first and third methods select
> all rows that start with any of the target values. The second approach
> sequentially selects the rows that start with each of the target values. The
> results select all of the same rows; however, in the second case the result is
> grouped by first value. Whether this grouping is desired or intended by the
> original poster I do not know.
>
> A shorter example is shown to view the differences.
>
> data = Table[{RandomInteger[100], RandomInteger[100]}, {75}];
> targets = Table[i, {i, 10}];
>
> t1 = Timing[s1 = Select[data, MemberQ[targets, First[#]] &];]
>
> {0.00021,Null}
>
> f[x_] := Select[data, First[#] == x &];
>
> t2 = Timing[s2 = f /@ targets;]
>
> {0.000762,Null}
>
> t3 = Timing[s3 = Cases[data, {Alternatives @@ targets, _}];]
>
> {0.000075,Null}
>
> s1 == s3
>
> True
>
> s1
>
> {{2, 13}, {5, 49}, {9, 6}, {1, 49}, {3, 84}, {4, 68},
>  {2, 91}, {1, 92}, {2, 9}, {3, 67}, {4, 91}, {4, 64},
>  {5, 63}}
>
> s2
>
> {{{1, 49}, {1, 92}}, {{2, 13}, {2, 91}, {2, 9}},
>  {{3, 84}, {3, 67}}, {{4, 68}, {4, 91}, {4, 64}},
>  {{5, 49}, {5, 63}}, {}, {}, {}, {{9, 6}}, {}}
>
> Bob Hanlon
>
> ---- MattAd <ader... at gmail.com> wrote:
>
> =============
> On Oct 25, 3:02 am, Bob Hanlon <hanl... at cox.net> wrote:
>
> > Split[Sort[Select[data,
> >     MemberQ[{30, 45, 50, 66}, #[[1]]] &]],
> >   #1[[1]] == #2[[1]] &]
>
> > However, it is much easier to use a helper function:
>
> > f[x_] := Select[data, #[[1]] == x &]
>
> > f /@ {30, 45, 50, 66}
>
> > Bob Hanlon
>
> > ---- Miguel <misv... at gmail.com> wrote:
>
> > =============
> > Hi all,
>
> > How can I write a pure function to extract all the first rows of a
> > collection of data, applied to a list?
>
> > For example,
>
> > Select[data,First[#]==30&]
>
> > This function extracts all rows whose first element is equal to 30.
> > Well, I want to extend this function to a list of values
> > {30,45,50,66}.
>
> > Thanks
>
> > --
>
> > Bob Hanlon
>
> Here's a look at a few approaches, with timings on my machine:
>
> data = Table[{RandomInteger[1000], RandomInteger[1000]}, {100000}];
> targets = Table[i, {i, 250}];
>
> Select[data, MemberQ[targets, First[#]] &]
> ...takes 2.886 seconds
>
> f[x_] := Select[data, First[#] == x &];
> f /@ targets
> ...takes 93.616 seconds
>
> Cases[data, {Alternatives@@targets, _}]
> ...takes 1.061 seconds
>
> IsTarget[_] := False;
> Scan[(IsTarget[#] = True) &, targets];
> Select[data, IsTarget[First[#]] &]
> ...takes 0.452 seconds

That's a fair point. If grouping by the first value is desired, it'll still be much faster to use the look-up function approach and then do the grouping:

Split[
  SortBy[
    Select[data, IsTarget[First[#]] &],
    First],
  First[#1] == First[#2] &
]
...averages 0.739 seconds over 20 tries

Depending on the size of the data set and number of target values, Reap can be even faster for the grouping part:

Last@
  Reap[
    Scan[
      If[IsTarget[First[#]], Sow[#, First[#]]] &,
      data]
  ]
...averages 0.678 seconds over 20 tries
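
For anyone who wants to try the look-up function approach and the two grouping variants on something small enough to inspect by eye, here is a minimal sketch. The data and targets lists below are made up purely for illustration (they are not the random data used for the timings above), and ClearAll is added only so the snippet can be re-run in the same session.

```mathematica
(* Hypothetical small inputs, chosen only so the results are easy to read *)
data = {{2, 13}, {5, 49}, {7, 1}, {1, 49}, {2, 91}, {8, 3}, {1, 92}};
targets = {1, 2, 5};

(* Look-up function: False by default, True for each target value.
   The specific definitions IsTarget[1], IsTarget[2], ... take
   precedence over the general IsTarget[_] rule. *)
ClearAll[IsTarget];
IsTarget[_] := False;
Scan[(IsTarget[#] = True) &, targets];

(* Flat selection: matching rows in their original order *)
selected = Select[data, IsTarget[First[#]] &]

(* Grouping, variant 1: sort by first element, then split into
   runs of rows that share the same first element *)
Split[SortBy[selected, First], First[#1] == First[#2] &]

(* Grouping, variant 2: tag each matching row with its first
   element while scanning; Reap collects one list per tag *)
Last@Reap[Scan[If[IsTarget[First[#]], Sow[#, First[#]]] &, data]]
```

The two grouping variants should return the same groups but not necessarily in the same order: the Split version is ordered by first value because of the sort, while the Reap version lists groups in the order in which each first value is first encountered in data. Unlike f /@ targets, neither variant produces an empty list for a target that matches no rows.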