MathGroup Archive 2008

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Fast way to select those elements from a list that are in another

  • To: mathgroup at smc.vnet.net
  • Subject: [mg86783] Re: Fast way to select those elements from a list that are in another
  • From: Szabolcs Horvát <szhorvat at gmail.com>
  • Date: Thu, 20 Mar 2008 05:50:24 -0500 (EST)
  • Organization: University of Bergen
  • References: <frt574$sj8$1@smc.vnet.net> <47E23A75.7030909@gmail.com>

Szabolcs Horvát wrote:
> Andrew Moylan wrote:
>> Two lists of integers:
>>
>> {a, b} = Table[RandomInteger[10000, 1000], {2}];
>>
>> Which elements from a are in b?
>>
>> Select[a, MemberQ[b, #] &]; // Timing
>>>> {0.351, Null}
>>
>> It takes about 0.351 seconds (on my slow laptop). Specialised
>> functions for related tasks, such as Intersection[a, b], are much
>> faster:
>>
>> Intersection[a, b]; // Timing
>>>> {0., Null}
>>
>> Using Intersection, here's a somewhat faster way to select those
>> elements from a that are in b:
>>
>> With[
>>    {c = Intersection[a, b]},
>>    Select[a, MemberQ[c, #] &]
>>    ]; // Timing
>>>> {0.09, Null}
>>
>> Is there a better, faster way to do this?
>>
> 
> Hi Andrew,
> 
> You did not formulate the problem precisely.  There are two differences 
> between c and d = Select[a, MemberQ[c, #] &.  One is that c is sorted, 
> while d is not, and the other is that duplicates are removed from c. 
> Which of these two are important?
> 
> If only one of the two is important then I can come up with much faster 
> solutions, but if *both* are important then the problem becomes more 
> difficult.
> 


Actually I posted too soon.  There is a fast way to achieve this, though 
whether it is really fast or not depends on the p and q parameters below:

In[1]:= {p, q} = {10000, 6000};

In[2]:= {a, b} = Table[RandomInteger[p, q], {2}];

In[3]:= Timing[unsortedIntersection1 = Select[a, MemberQ[b, #] &];]

Out[3]= {6.391, Null}

In[4]:= Timing[
  unsortedIntersection2 = With[{c = Intersection[a, b]},
     Select[a, MemberQ[c, #] &]];
  ]

Out[4]= {2.109, Null}

In[5]:= Timing[
  unsortedIntersection3 =
    a /. Dispatch[# -> Sequence[] & /@ Complement[a, b]];]

Out[5]= {0.015, Null}

In[6]:= unsortedIntersection1 === unsortedIntersection2 === \
unsortedIntersection3

Out[6]= True

Szabolcs


  • Prev by Date: Re: floating point issue
  • Next by Date: Re: Fast way to select those elements from a list that are in another
  • Previous by thread: Fast way to select those elements from a list that are in another
  • Next by thread: Re: Fast way to select those elements from a list that are in another