Re: newbie list question
- To: mathgroup at smc.vnet.net
- Subject: [mg115075] Re: newbie list question
- From: DrMajorBob <btreat1 at austin.rr.com>
- Date: Thu, 30 Dec 2010 04:11:44 -0500 (EST)
That fails, and Part throws an error, when l2's range is greater than that of l1: l1 = RandomInteger[{1, 10^4}, 5*10^5]; l2 = RandomInteger[{1, 10^5}, 10^3]; r1 = firstPositions2[l1, l2]; // Timing r2 = firsts[l1, l2]; // Timing r1 === r2 {1.35344, Null} {0.093815, Null} False It fails without a Part error, if l1's range is greater: l1 = RandomInteger[{1, 10^5}, 5*10^5]; l2 = RandomInteger[{1, 10^4}, 10^3]; r1 = firstPositions2[l1, l2]; // Timing r2 = firsts[l1, l2]; // Timing r1 === r2 {1.35791, Null} {0.051834, Null} False When the ranges are equal, all is fine: l1 = RandomInteger[{1, 10^6}, 5*10^5]; l2 = RandomInteger[{1, 10^6}, 10^3]; r1 = firstPositions2[l1, l2]; // Timing r2 = firsts[l1, l2]; // Timing r1 === r2 {1.29281, Null} {0.102205, Null} True Bobby On Wed, 29 Dec 2010 04:58:53 -0600, Carl Woll <carlw at wolfram.com> wrote: > On 12/28/2010 5:53 AM, Daniel Lichtblau wrote: >> ----- Original Message ----- >> >>> From: "Gareth Edwards"<gareth.edwards at cubicmotion.com> >>> To: mathgroup at smc.vnet.net >>> Sent: Sunday, December 26, 2010 3:02:23 AM >>> Subject: [mg114986] newbie list question >>> Hi, >>> >>> Liking Mathematica a lot, but struggling with the early part of the >>> learning curve, i.e. don't know what I don't know... >>> >>> What would be the neatest syntax for finding the first location of >>> elements from one list in another? For example: >>> >>> listA = { 4,5,8,2,6,4 } >>> listB = { 8,4,2 } >>> >>> >>> I would like a function to return {3,1,4} in this case (the first >>> location in A of each element in B) >>> >>> Many thanks! >>> >> Neatest is subject to debate. A straightforward and simple method is to >> Map Position (of element in first list) over the second list. >> >> firstPositions1[l1_, l2_] := Flatten[Map[Position[l1, #, 1, 1]&, l2]] >> >> This will be fine provided the second list is not too large; it scales >> as m1*m2 where mj is the length of list j. A method that is slower per >> element of list2, but scales as O(m1+m2), is shown below. >> >> firstPositions2[l1_, l2_] := Module[ >> {stor, elem, tab}, >> tab = ConstantArray[0, Length[l2]]; >> MapIndexed[If[! IntegerQ[stor[#1]], stor[#1] = #2[[1]]]&, l2]; >> Do[ >> elem = stor[l1[[j]]]; >> If[IntegerQ[elem], >> stor[l1[[j]]] = False; >> tab[[elem]] = j; >> ], {j, Length[l1]}]; >> tab] >> >> Quick test that these work. >> >> listA = {4, 5, 8, 2, 6, 4}; >> listB = {8, 4, 2}; >> >> In[85]:= firstPositions1[listA, listB] == >> firstPositions2[listA, listB] == >> IntegerDigits[IntegerPart[100*Pi]] >> >> Out[85]= True >> >> If the second list has 10 or so elements, the second method is >> noticeably slower than the first. But it does in fact scale much >> better. Here we compare speed when second list is 1000 elements (not >> necessarily unique). >> >> l1 = RandomInteger[10^6, 5*10^5]; >> l2 = RandomInteger[10^6, 10^3]; >> >> In[73]:= Timing[pos1 = firstPositions1[l1, l2];] >> Out[73]= {68.11000000000001, Null} >> >> In[74]:= Timing[pos2 = firstPositions2[l1, l2];] >> Out[74]= {1.933999999999969, Null} >> >> Notice there is a difference in behavior beyond speed. If the second >> list contains values not found in the first, the second result will >> have zeros in those corresponding slots. I'm guessing this is probably >> desirable behavior, as compared to omitting them altogether (since you >> won't otherwise know what element in result corresponds to what element >> in second list). >> >> In[75]:= pos1 == DeleteCases[pos2, 0] >> Out[75]= True >> >> In[76]:= Length[pos1] >> Out[76]= 379 >> >> Daniel Lichtblau >> Wolfram Research >> >> >> > If we assume that all of the list elements are positive integers, then > the following is much faster: > > firsts[l1_, l2_] := Normal[SparseArray[Automatic, {Max[l1]}, 0, {1, > {{0, Length[l1]}, List /@ l1}, Range[Length[l1]]}][[l2]]] > > Example: > > l1 = RandomInteger[{1, 10^6}, 5*10^5]; > l2 = RandomInteger[{1, 10^6}, 10^3]; > > In[130]:= r1 = firstPositions2[l1, l2]; // Timing > r2 = firsts[l1, l2]; // Timing > r1 === r2 > > Out[130]= {2.387, Null} > > Out[131]= {0.109, Null} > > Out[132]= True > > Carl Woll > Wolfram Research > > -- DrMajorBob at yahoo.com