Re: Fast way to select those elements from a list that are in another
- To: mathgroup at smc.vnet.net
- Subject: [mg86783] Re: Fast way to select those elements from a list that are in another
- From: Szabolcs Horvát <szhorvat at gmail.com>
- Date: Thu, 20 Mar 2008 05:50:24 -0500 (EST)
- Organization: University of Bergen
- References: <frt574$sj8$1@smc.vnet.net> <47E23A75.7030909@gmail.com>
Szabolcs Horvát wrote:
> Andrew Moylan wrote:
>> Two lists of integers:
>>
>> {a, b} = Table[RandomInteger[10000, 1000], {2}];
>>
>> Which elements from a are in b?
>>
>> Select[a, MemberQ[b, #] &]; // Timing
>>>> {0.351, Null}
>>
>> It takes about 0.351 seconds (on my slow laptop). Specialised
>> functions for related tasks, such as Intersection[a, b], are much
>> faster:
>>
>> Intersection[a, b]; // Timing
>>>> {0., Null}
>>
>> Using Intersection, here's a somewhat faster way to select those
>> elements from a that are in b:
>>
>> With[
>> {c = Intersection[a, b]},
>> Select[a, MemberQ[c, #] &]
>> ]; // Timing
>>>> {0.09, Null}
>>
>> Is there a better, faster way to do this?
>>
>
> Hi Andrew,
>
> You did not formulate the problem precisely. There are two differences
> between c and d = Select[a, MemberQ[c, #] &. One is that c is sorted,
> while d is not, and the other is that duplicates are removed from c.
> Which of these two are important?
>
> If only one of the two is important then I can come up with much faster
> solutions, but if *both* are important then the problem becomes more
> difficult.
>
Actually I posted too soon. There is a fast way to achieve this, though
whether it is really fast or not depends on the p and q parameters below:
In[1]:= {p, q} = {10000, 6000};
In[2]:= {a, b} = Table[RandomInteger[p, q], {2}];
In[3]:= Timing[unsortedIntersection1 = Select[a, MemberQ[b, #] &];]
Out[3]= {6.391, Null}
In[4]:= Timing[
unsortedIntersection2 = With[{c = Intersection[a, b]},
Select[a, MemberQ[c, #] &]];
]
Out[4]= {2.109, Null}
In[5]:= Timing[
unsortedIntersection3 =
a /. Dispatch[# -> Sequence[] & /@ Complement[a, b]];]
Out[5]= {0.015, Null}
In[6]:= unsortedIntersection1 === unsortedIntersection2 === \
unsortedIntersection3
Out[6]= True
Szabolcs