MathGroup Archive 2008

Re: Function pure for Select

  • To: mathgroup at smc.vnet.net
  • Subject: [mg93153] Re: Function pure for Select
  • From: MattAd <adereth at gmail.com>
  • Date: Wed, 29 Oct 2008 05:48:40 -0500 (EST)
  • References: <ge5qk4$f0o$1@smc.vnet.net>

On Oct 27, 9:41 pm, Bob Hanlon <hanl... at cox.net> wrote:
> One of the reasons for the difference in timings is that the different
> approaches are not all solving the same problem; that is, we interpreted
> the original poster's request ("extend this function to a list of
> values") differently as to the manner of extension. The first and third
> methods select all rows that start with any of the target values. The
> second approach sequentially selects the rows that start with each of
> the target values. The results select all of the same rows; however, in
> the second case the result is grouped by first value. Whether this
> grouping is desired or intended by the original poster I do not know.
>
> A shorter example is shown to view the differences.
>
> data = Table[{RandomInteger[100], RandomInteger[100]}, {75}];
> targets = Table[i, {i, 10}];
>
> t1 = Timing[s1 = Select[data, MemberQ[targets, First[#]] &];]
>
> {0.00021,Null}
>
> f[x_] := Select[data, First[#] == x &];
>
> t2 = Timing[s2 = f /@ targets;]
>
> {0.000762,Null}
>
> t3 = Timing[s3 = Cases[data, {Alternatives @@ targets, _}];]
>
> {0.000075,Null}
>
> s1 == s3
>
> True
>
> s1
>
> {{2, 13}, {5, 49}, {9, 6}, {1, 49}, {3, 84}, {4, 68},
>    {2, 91}, {1, 92}, {2, 9}, {3, 67}, {4, 91}, {4, 64},
>    {5, 63}}
>
> s2
>
> {{{1, 49}, {1, 92}}, {{2, 13}, {2, 91}, {2, 9}},
>    {{3, 84}, {3, 67}}, {{4, 68}, {4, 91}, {4, 64}},
>    {{5, 49}, {5, 63}}, {}, {}, {}, {{9, 6}}, {}}
>
> Bob Hanlon
>
> ---- MattAd <ader... at gmail.com> wrote:
>
> =============
> On Oct 25, 3:02 am, Bob Hanlon <hanl... at cox.net> wrote:
>
>
>
> > Split[Sort[Select[data,
> >    MemberQ[{30, 45, 50, 66}, #[[1]]] &]],
> >  #1[[1]] == #2[[1]] &]
>
> > However, it is much easier to use a helper function:
>
> > f[x_] := Select[data, #[[1]] == x &]
>
> > f /@ {30, 45, 50, 66}
>
> > Bob Hanlon
>
> > ---- Miguel <misv... at gmail.com> wrote:
>
> > =============
> > Hi all,
>
> > How can I write a pure function that extracts rows from a collection
> > of data by their first element, applied to a list of values?
>
> > For example,
>
> > Select[data,First[#]==30&]
>
> > This function extracts all rows whose first element is equal to 30.
> > Well, I want to extend this function to a list of values
> > {30,45,50,66}.
>
> > Thanks
>
> > --
>
> > Bob Hanlon
>
> Here's a look at a few approaches, with timings on my machine:
>
> data = Table[{RandomInteger[1000], RandomInteger[1000]}, {100000}];
> targets = Table[i, {i, 250}];
>
> Select[data, MemberQ[targets, First[#]] &]
> ...takes 2.886 seconds
>
> f[x_] := Select[data, First[#] == x &];
> f /@ targets
> ...takes 93.616 seconds
>
> Cases[data, {Alternatives@@targets, _}]
> ...takes 1.061 seconds
>
> IsTarget[_] := False;
> Scan[(IsTarget[#] = True) &, targets];
> Select[data, IsTarget[First[#]] &]
> ...takes 0.452 seconds

That's a fair point.  If grouping by the first value is desired, it'll
still be much faster to use the look-up function approach and then do
the grouping:

Split[
 SortBy[
  Select[data, IsTarget[First[#]] &],
  First],
 First[#1] == First[#2] &
]
...averages 0.739 seconds over 20 tries
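
For anyone who wants to reproduce an average like this, here is a minimal
sketch of one way to measure it. It is an assumption about the method, not
necessarily how the figure above was obtained: it uses AbsoluteTiming
(Timing would work the same way) and a hypothetical variable name, timings.
The data, targets and IsTarget definitions repeat those given earlier in
the thread.

```mathematica
(* Same setup as earlier in the thread *)
data = Table[{RandomInteger[1000], RandomInteger[1000]}, {100000}];
targets = Table[i, {i, 250}];

ClearAll[IsTarget];
IsTarget[_] := False;
Scan[(IsTarget[#] = True) &, targets];

(* Run the select-sort-split expression 20 times, keeping only the
   elapsed time of each run; the trailing semicolon discards the result *)
timings = Table[
   First@AbsoluteTiming[
     Split[
       SortBy[Select[data, IsTarget[First[#]] &], First],
       First[#1] == First[#2] &];],
   {20}];

(* Average elapsed time in seconds *)
Mean[timings]
```

The numbers will vary with the machine and with the random data, so only
the relative ordering of the approaches is likely to carry over.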

Depending on the size of the data set and number of target values,
Reap can be even faster for the grouping part:

Last@
 Reap[
  Scan[
   If[
     IsTarget[First[#]],
     Sow[#, First[#]]] &,
   data]
  ]
...averages 0.678 seconds over 20 tries
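
One difference between the two grouping approaches is worth keeping in
mind: Split on sorted data returns the groups ordered by first value,
while Reap returns them in the order in which each first value was first
sown. The sketch below uses a small made-up data set (the values and the
names bySort, byReap and normalised are only for illustration) to compare
the two results after putting the Reap output into the same order.

```mathematica
(* Hypothetical toy data, small enough to inspect by eye *)
data = {{2, 13}, {5, 49}, {9, 6}, {1, 49}, {7, 1}, {2, 91}};
targets = {1, 2, 5};

(* Look-up function as defined earlier in the thread *)
ClearAll[IsTarget];
IsTarget[_] := False;
Scan[(IsTarget[#] = True) &, targets];

(* Grouping via select, sort, split *)
bySort = Split[
   SortBy[Select[data, IsTarget[First[#]] &], First],
   First[#1] == First[#2] &];

(* Grouping via Sow/Reap, tagging each row with its first value *)
byReap = Last@Reap[
    Scan[If[IsTarget[First[#]], Sow[#, First[#]]] &, data]];

(* Normalise the Reap result before comparing: order the groups by
   their shared first value and sort the rows inside each group *)
normalised = Sort /@ SortBy[byReap, #[[1, 1]] &];
normalised === bySort
```

If the order of the groups matters downstream, the extra SortBy on the
Reap output has to be counted in its timing as well.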

