Re: Fast way to select those elements from a list that are in another
- To: mathgroup at smc.vnet.net
- Subject: [mg86783] Re: Fast way to select those elements from a list that are in another
- From: Szabolcs Horvát <szhorvat at gmail.com>
- Date: Thu, 20 Mar 2008 05:50:24 -0500 (EST)
- Organization: University of Bergen
- References: <frt574$sj8$1@smc.vnet.net> <47E23A75.7030909@gmail.com>
Szabolcs Horvát wrote: > Andrew Moylan wrote: >> Two lists of integers: >> >> {a, b} = Table[RandomInteger[10000, 1000], {2}]; >> >> Which elements from a are in b? >> >> Select[a, MemberQ[b, #] &]; // Timing >>>> {0.351, Null} >> >> It takes about 0.351 seconds (on my slow laptop). Specialised >> functions for related tasks, such as Intersection[a, b], are much >> faster: >> >> Intersection[a, b]; // Timing >>>> {0., Null} >> >> Using Intersection, here's a somewhat faster way to select those >> elements from a that are in b: >> >> With[ >> {c = Intersection[a, b]}, >> Select[a, MemberQ[c, #] &] >> ]; // Timing >>>> {0.09, Null} >> >> Is there a better, faster way to do this? >> > > Hi Andrew, > > You did not formulate the problem precisely. There are two differences > between c and d = Select[a, MemberQ[c, #] &. One is that c is sorted, > while d is not, and the other is that duplicates are removed from c. > Which of these two are important? > > If only one of the two is important then I can come up with much faster > solutions, but if *both* are important then the problem becomes more > difficult. > Actually I posted too soon. There is a fast way to achieve this, though whether it is really fast or not depends on the p and q parameters below: In[1]:= {p, q} = {10000, 6000}; In[2]:= {a, b} = Table[RandomInteger[p, q], {2}]; In[3]:= Timing[unsortedIntersection1 = Select[a, MemberQ[b, #] &];] Out[3]= {6.391, Null} In[4]:= Timing[ unsortedIntersection2 = With[{c = Intersection[a, b]}, Select[a, MemberQ[c, #] &]]; ] Out[4]= {2.109, Null} In[5]:= Timing[ unsortedIntersection3 = a /. Dispatch[# -> Sequence[] & /@ Complement[a, b]];] Out[5]= {0.015, Null} In[6]:= unsortedIntersection1 === unsortedIntersection2 === \ unsortedIntersection3 Out[6]= True Szabolcs