Re: Fast way to select those elements from a list that are in another

• To: mathgroup at smc.vnet.net
• Subject: [mg86783] Re: Fast way to select those elements from a list that are in another
• From: Szabolcs Horvát <szhorvat at gmail.com>
• Date: Thu, 20 Mar 2008 05:50:24 -0500 (EST)
• Organization: University of Bergen
• References: <frt574\$sj8\$1@smc.vnet.net> <47E23A75.7030909@gmail.com>

```Szabolcs Horvát wrote:
> Andrew Moylan wrote:
>> Two lists of integers:
>>
>> {a, b} = Table[RandomInteger[10000, 1000], {2}];
>>
>> Which elements from a are in b?
>>
>> Select[a, MemberQ[b, #] &]; // Timing
>>>> {0.351, Null}
>>
>> It takes about 0.351 seconds (on my slow laptop). Specialised
>> functions for related tasks, such as Intersection[a, b], are much
>> faster:
>>
>> Intersection[a, b]; // Timing
>>>> {0., Null}
>>
>> Using Intersection, here's a somewhat faster way to select those
>> elements from a that are in b:
>>
>> With[
>>    {c = Intersection[a, b]},
>>    Select[a, MemberQ[c, #] &]
>>    ]; // Timing
>>>> {0.09, Null}
>>
>> Is there a better, faster way to do this?
>>
>
> Hi Andrew,
>
> You did not formulate the problem precisely.  There are two differences
> between c and d = Select[a, MemberQ[c, #] &.  One is that c is sorted,
> while d is not, and the other is that duplicates are removed from c.
> Which of these two are important?
>
> If only one of the two is important then I can come up with much faster
> solutions, but if *both* are important then the problem becomes more
> difficult.
>

Actually I posted too soon.  There is a fast way to achieve this, though
whether it is really fast or not depends on the p and q parameters below:

In[1]:= {p, q} = {10000, 6000};

In[2]:= {a, b} = Table[RandomInteger[p, q], {2}];

In[3]:= Timing[unsortedIntersection1 = Select[a, MemberQ[b, #] &];]

Out[3]= {6.391, Null}

In[4]:= Timing[
unsortedIntersection2 = With[{c = Intersection[a, b]},
Select[a, MemberQ[c, #] &]];
]

Out[4]= {2.109, Null}

In[5]:= Timing[
unsortedIntersection3 =
a /. Dispatch[# -> Sequence[] & /@ Complement[a, b]];]

Out[5]= {0.015, Null}

In[6]:= unsortedIntersection1 === unsortedIntersection2 === \
unsortedIntersection3

Out[6]= True

Szabolcs

```

• Prev by Date: Re: floating point issue
• Next by Date: Re: Fast way to select those elements from a list that are in another
• Previous by thread: Fast way to select those elements from a list that are in another
• Next by thread: Re: Fast way to select those elements from a list that are in another