Re: Extracting some elements, members of another list
- To: mathgroup at smc.vnet.net
- Subject: [mg112418] Re: Extracting some elements, members of another list
- From: Daniel Lichtblau <danl at wolfram.com>
- Date: Wed, 15 Sep 2010 04:37:08 -0400 (EDT)
Nacho wrote:
> Hi!
>
> I would like your advice about this problem:
>
> I have a list of elements, consisting in phone numbers with some data:
>
> list1= { { phone1, data1}, {phone2, data2}, {phone3, data3} .... }
>
> And a list of phones, a subset of the phone numbers in list1
>
> list2= {phone2, phone3, phone7... }
>
> I'd like to extract the elements from list1 whose phone numbers are
> present in list 2, that is:
>
> result= { { phone2, data2}, {phone3, data3}, {phone7, data7} .... }
>
> I've used this with small lists and it works fine:
>
> result = Select[list1, MemberQ[list2, #[[1]]] &];
>
> The problem is that now I would like to use it with big lists. list1
> is over 1.000.000 elements long and list2 is about 500.000 elements
> long. Ordering is not a problem, I could resort the lists.
>
> Any hint to extract this list faster? It seems to take a lot of time
> (estimation is about 5 hours and I have to do it repeatedly)
>
>
> Thanks!
Could do as follows.
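overlapsNth[l1_,l2_,n_:1] := Module[
  {h, res},
  (* mark every key found in position n of the rows of l1 *)
  Do[h[l1[[j,n]]] = True, {j,Length[l1]}];
  (* collect the rows of l2 whose key in position n was marked *)
  res = Reap[Do[If[TrueQ[h[l2[[j,n]]]], Sow[l2[[j]]]],
    {j,Length[l2]}]][[2,1]];
  Clear[h];
  res
  ]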
I've not thoroughly checked for correctness so you might want to do
that. As for speed, here is an example.
n = 6;
list1 = RandomInteger[10^9, {10^n,2}];
list2 = RandomInteger[10^9, {10^n,2}];
In[24]:= Timing[intersect = overlapsNth[list1,list2,1];]
Out[24]= {3.86441, Null}
In[25]:= Length[intersect]
Out[25]= 1033
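One way to do the correctness check mentioned above is to compare against the slow Select/MemberQ form on small inputs. This is only a sketch: checkOverlaps, small1 and small2 are names made up for illustration, and it assumes overlapsNth is defined as above and uses only its own arguments l1 and l2 rather than any global lists. The key range is kept small so that overlaps are very likely; if nothing is sown, Reap returns an empty list in its second part and the [[2,1]] in overlapsNth fails.

```mathematica
(* Hypothetical helper: compare a candidate function f against a slow
   Select/MemberQ reference on the same inputs. Both keep the rows of
   l2, in their original order, whose n-th entry occurs as an n-th
   entry somewhere in l1. *)
checkOverlaps[f_, l1_, l2_, n_: 1] :=
  f[l1, l2, n] === Select[l2, MemberQ[l1[[All, n]], #[[n]]] &]

SeedRandom[1];  (* arbitrary seed, only for repeatability *)
small1 = RandomInteger[20, {50, 2}];
small2 = RandomInteger[20, {50, 2}];

(* should give True if overlapsNth agrees with the reference *)
checkOverlaps[overlapsNth, small1, small2]
```

Note that overlapsNth returns rows of its second argument. For the layout in the original question, where list2 is a flat list of phone numbers, something like overlapsNth[Transpose[{list2}], list1] should give the rows of list1 whose phone appears in list2, since Transpose[{list2}] wraps each phone in a one-element row.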
Daniel Lichtblau
Wolfram Research