Re: newbie list question
- To: mathgroup at smc.vnet.net
- Subject: [mg115100] Re: newbie list question
- From: Ray Koopman <koopman at sfu.ca>
- Date: Thu, 30 Dec 2010 19:06:11 -0500 (EST)
As I said in my Dec 29 post, firstPositions2 ignores repeats in the second list. firstPositions3 fixes the problem but is slower. Length@Union[l1 = Table[Random[Integer,{1, 10^6}], {5*10^5}]] Length@Union[l2 = Table[Random[Integer,{1, 10^4}], { 10^3}]] Timing@Length[r1 = firstPositions2[l1, l2]] Timing@Length[r2 = firsts [l1, l2]] Timing@Length[r3 = firstPositions3[l1, l2]] {r1 === r2, r1 === r3, r2 === r3} 393473 952 {3.92 Second,1000} {0.48 Second,1000} {8.53 Second,1000} {False,False,True} Length@Union[l1 = Table[Random[Integer,{1, 10^6}], {5*10^5}]] Length@Union[l2 = Table[Random[Integer,{1, 10^6}], { 10^3}]] Timing@Length[r1 = firstPositions2[l1, l2]] Timing@Length[r2 = firsts [l1, l2]] Timing@Length[r3 = firstPositions3[l1, l2]] {r1 === r2, r1 === r3, r2 === r3} 393371 1000 {3.92 Second,1000} {0.49 Second,1000} {8.1 Second,1000} {True,True,True} ----- DrMajorBob <btreat1 at austin.rr.com> wrote: > firsts[l1_, l2_] := Normal[SparseArray[Automatic, {Max[l1,l2]}, > 0, {1, {{0, Length[l1]}, List /@ l1}, Range[Length[l1]]}][[l2]]] > > l1 = RandomInteger[{1, 10^6}, 5*10^5]; > l2 = RandomInteger[{1, 10^4}, 10^3]; > r1 = firstPositions2[l1, l2]; // Timing > r2 = firsts[l1, l2]; // Timing > r1 === r2 > > {1.33669, Null} > > {0.089459, Null} > > False > > Bobby > > On Thu, 30 Dec 2010 03:12:17 -0600, Ray Koopman <koopman at sfu.ca> wrote: >> At 3:00 am on Dec 29, 2010, Carl Woll <carlw at wolfram.com> wrote: >>> On 12/28/2010 5:53 AM, Daniel Lichtblau wrote: >>>> ----- Original Message ----- >>>> >>>>> From: "Gareth Edwards"<gareth.edwards at cubicmotion.com> >>>>> To: mathgroup at smc.vnet.net >>>>> Sent: Sunday, December 26, 2010 3:02:23 AM >>>>> Subject: newbie list question >>>>> Hi, >>>>> >>>>> Liking Mathematica a lot, but struggling with the early part of >>>>> the learning curve, i.e. don't know what I don't know... >>>>> >>>>> What would be the neatest syntax for finding the first location >>>>> of elements from one list in another? For example: >>>>> >>>>> listA = { 4,5,8,2,6,4 } >>>>> listB = { 8,4,2 } >>>>> >>>>> I would like a function to return {3,1,4} in this case (the first >>>>> location in A of each element in B) >>>>> >>>>> Many thanks! >>>> >>>> Neatest is subject to debate. A straightforward and simple method >>>> is to Map Position (of element in first list) over the second list. >>>> >>>> firstPositions1[l1_, l2_] := Flatten[Map[Position[l1, #, 1, 1]&, l2]] >>>> >>>> This will be fine provided the second list is not too large; it >>>> scales as m1*m2 where mj is the length of list j. A method that is >>>> slower per element of list2, but scales as O(m1+m2), is shown below. >>>> >>>> firstPositions2[l1_, l2_] := Module[ >>>> {stor, elem, tab}, >>>> tab = ConstantArray[0, Length[l2]]; >>>> MapIndexed[If[! IntegerQ[stor[#1]], stor[#1] = #2[[1]]]&, l2]; >>>> Do[ >>>> elem = stor[l1[[j]]]; >>>> If[IntegerQ[elem], >>>> stor[l1[[j]]] = False; >>>> tab[[elem]] = j; >>>> ], {j, Length[l1]}]; >>>> tab] >>>> >>>> Quick test that these work. >>> >>>> listA = {4, 5, 8, 2, 6, 4}; >>>> listB = {8, 4, 2}; >>>> >>>> In[85]:= firstPositions1[listA, listB] == >>>> firstPositions2[listA, listB] == >>>> IntegerDigits[IntegerPart[100*Pi]] >>>> >>>> Out[85]= True >>>> >>>> If the second list has 10 or so elements, the second method is >>>> noticeably slower than the first. But it does in fact scale much >>>> better. Here we compare speed when second list is 1000 elements >>>> (not necessarily unique). >>>> >>>> l1 = RandomInteger[10^6, 5*10^5]; >>>> l2 = RandomInteger[10^6, 10^3]; >>>> >>>> In[73]:= Timing[pos1 = firstPositions1[l1, l2];] >>>> Out[73]= {68.11000000000001, Null} >>>> >>>> In[74]:= Timing[pos2 = firstPositions2[l1, l2];] >>>> Out[74]= {1.933999999999969, Null} >>>> >>>> Notice there is a difference in behavior beyond speed. If the second >>>> list contains values not found in the first, the second result will >>>> have zeros in those corresponding slots. I'm guessing this is >>>> probably desirable behavior, as compared to omitting them altogether >>>> (since you won't otherwise know what element in result corresponds >>>> to what element in second list). >>>> >>>> In[75]:= pos1 == DeleteCases[pos2, 0] >>>> Out[75]= True >>>> >>>> In[76]:= Length[pos1] >>>> Out[76]= 379 >>>> >>>> Daniel Lichtblau >>>> Wolfram Research >>> >>> If we assume that all of the list elements are positive integers, >>> then the following is much faster: >>> >>> firsts[l1_, l2_] := Normal[SparseArray[Automatic, {Max[l1]}, 0, >>> {1, {{0, Length[l1]}, List /@ l1}, Range[Length[l1]]}][[l2]]] >>> >>> Example: >>> >>> l1 = RandomInteger[{1, 10^6}, 5*10^5]; >>> l2 = RandomInteger[{1, 10^6}, 10^3]; >>> >>> In[130]:= r1 = firstPositions2[l1, l2]; // Timing >>> r2 = firsts[l1, l2]; // Timing >>> r1 === r2 >>> >>> Out[130]= {2.387, Null} >>> >>> Out[131]= {0.109, Null} >>> >>> Out[132]= True >>> >>> Carl Woll >>> Wolfram Research >> >> If Max[l2] can be > Max[l1] then we need >> >> firsts[l1_, l2_] := Normal[SparseArray[Automatic, {Max[l1,l2]}, >> 0, {1, {{0, Length[l1]}, List /@ l1}, Range[Length[l1]]}][[l2]]] > > -- > DrMajorBob at yahoo.com