Re: position of sequence of numbers in list

*To*: mathgroup at smc.vnet.net*Subject*: [mg106994] Re: [mg106976] position of sequence of numbers in list*From*: Leonid Shifrin <lshifr at gmail.com>*Date*: Sun, 31 Jan 2010 05:56:00 -0500 (EST)*References*: <201001301212.HAA25132@smc.vnet.net>

Hi again, In my first post one of the solutions (the compiled function) contained a bug: In[1]:= posf[{1, 2, 3, 4, 4, 5, 6, 7}, {4, 5, 6}] Out[1]= -1 which was because it ignores all but the first candidate sequences in the case when they overlap. Here is a modified one which is (hopefully) correct, if not as fast: posf2= Compile[{{lst,_Integer,1},{target,_Integer,1}}, Module[{i=1,len =Length[target],lstlen = Length[lst]}, While[i<=lstlen-len+1&&Take[lst,{i,len+i-1}]!=target,i++]; If[i>lstlen-len+1,-1,i]]] In[3]:= posf2[{1, 2, 3, 4, 4, 5, 6, 7}, {4, 5, 6}] Out[3]= 5 This one will still be much superior to Partition-based implementation for cases when you can expect the sequence of interest to appear rather early in the list. Anyway, sorry for the confusion with the buggy version. By the way, should you wish to stick to Partition-based implementation, I think it is fair to mention that for small sequence sizes and partitioning with a shift 1, *and* when you have a list already in the packed array representation (which is possible when your numbers are say all integers or all reals, but not a mix), one can implement a more efficient version than the built-in Partition: Clear[myPartition]; myPartition[x_List, size_Integer] := With[{len = Length[x]}, Transpose@Table[x[[i ;; len - size + i]], {i, 1, size}]]; In[4]:= largeTestList = RandomInteger[{1, 15}, 3000000]; In[5]:= (pt = Partition[largeTestList, 2, 1]); // Timing Out[5]= {0.521, Null} In[6]:= (mpt = myPartition[largeTestList , 2]); // Timing Out[6]= {0.17, Null} In[7]:= pt == mpt Out[7]= True The built-in Partition will start winning when the partitioning size will be around 30-50, so for long sequences using a built-in is better. If your list is not packed, built-in Partition will be a few times faster even for small partitioning sizes, so this will then be a de-optimization. You can attempt to convert your list to packed array with Developer`ToPackedArray (note that it does not issue any messages in case it is unable to do this), and check that your list is packed with Developer`PackedArrayQ. Likewise, you can implement your own Position-like function aimed at exactly this problem, which, under similar requirements of your list being a packed array of numbers, will be better than the built-in Position in most cases: In[8]:= Position[Partition[largeTest , 4, 1], {3, 4, 5, 6}, 1, 1] // Timing Out[8]= {0.27, {{41940}}} In[9]:= myPosition[myPartition[largeTest , 4], {3, 4, 5, 6}, 1] // Timing Out[9]= {0.09, {41940}} In[10]:= Position[Partition[largeTest , 4, 1], {3, 4, 5, 6}, 1, 2] // Timing Out[10]= {0.301, {{41940}, {228293}}} In[11]:= myPosition[myPartition[largeTest , 4], {3, 4, 5, 6}, 2] // Timing Out[11]= {0.311, {41940, 228293}} In[12]:= Position[Partition[largeTest , 4, 1], {3, 4, 5, 6}] // Timing Out[12]= {0.55, {{41940}, {228293}, {269300}}} In[13]:= myPosition[ myPartition[largeTest , 4], {3, 4, 5, 6}, -1] // Timing Out[13]= {0.411, {41940, 228293, 269300}} where the last argument -1 is a convention to return all results. Regards, Leonid On Sat, Jan 30, 2010 at 4:12 AM, JB <jkerrb at gmail.com> wrote: > Hi, > > What is the most efficient way to find the position of the beginning > of a sequence of numbers from a list? > > I found a couple of ways: > > find 3,4 in list={1,2,3,4,5}; > > 1. pos=Intersection[Position[list,3],(Position[list,4])+1] > > 2. pos=Position[Partition[list,2,1],{3,4}] > > Are there other ways to do this? > What is the best way when dealing with large lists? > > Thanks, > JB > >

**References**:**position of sequence of numbers in list***From:*JB <jkerrb@gmail.com>