Re: Fast List-Selection
- To: mathgroup at smc.vnet.net
- Subject: [mg19932] Re: [mg19880] Fast List-Selection
- From: "Andrzej Kozlowski" <andrzej at tuins.ac.jp>
- Date: Wed, 22 Sep 1999 04:11:15 -0400
- Sender: owner-wri-mathgroup at wolfram.com
I think a few additional remarks are in order concerning this question and my reply. First of all I wrote all my programs assuming that we know three is only one sequence of exactly seven (in the case which I considered 6) successive identical integers which we are looking for. This was not actually what was asked but I guess I was just thinking about m Pi example rather than about what Hans actually asked. Of my 4 programs only the slowest (findsequence[2]) fids all such sequences if there is more than one. In the cases of findsequence[3] and findsequence[4] however this can be easily fixed. Here are two corrected versions, this time written for the case of seven consequtive identical integers: findsequence[3][l_] := Module[{m = Split[l], mark}, (# + Table[6*(i - 1), {i, 1, Length[#]}]) &@ Position[Flatten[m /. Thread[Cases[m, _?(Length[#] == 7 &)] -> mark]], mark]] findsequence[4][l_] := Module[{m = Split[l]}, Map[Length[Flatten[Take[m, # - 1]]] + 1 &, Flatten[Map[Position[m, #] &, Select[m, Length[#] == 7 &]]]]] At this point of course a second problem emerges. Namely, should a sequence of 8 consecutive identical integers count as two sequences of 7 consecutive integers or none? If the former than Split is not really the appropriate function to use. So to make my task easier I shall assume that 7 means exactly seven and not eight but of course Hans's program as well as my findsequence[2] and findsequence[5] make the other assumption. Oh, well. Clearly findsequence[5], which was the fastest in my earlier test is not suitable for the purpose of finding more than one such sequence: it's performance depends on the fact that unlike the other programs it exits the moment it found the sequence it was looking for. Even then it's performance will depend very much on whether the sequence of identical integers is near the begining or near the end of the whole sequence. It may be interesting to look at what hapens in the extreme cases: with l = Join[{1, 1, 1, 1, 1, 1, 1}, Table[Random[Integer, {1, 9}], {10000}]]; I get: In[152]:= Table[findsequence[i][l] // Timing, {i, 1, 5}] 1 Out[152]= {{1.41667 Second, Null}, {0.0166667 Second, {{1}}}, {0.883333 Second, {{1}}}, {0.766667 Second, {1}}, {0.0166667 Second, 1}} with l = Join[Table[Random[Integer, {1, 9}], {10000}], {1, 1, 1, 1, 1, 1, 1}]; the results are of course quite different: In[153]:= Table[findsequence[i][l] // Timing, {i, 1, 5}] 10001 Out[153]= {{1.41667 Second, Null}, {30.7667 Second, {{10001}}}, {0.883333 Second, {{10001}}}, \ {0.816667 Second, {10001}}, {2.13333 Second, 10001}} So really to decide which is the fastest one would have to study the average case. I do not intend to try to do this but here is just a single random run: In[154]:= l = Flatten[ Insert[Table[Random[Integer, {1, 9}], {10000}], {1, 1, 1, 1, 1, 1, 1}, Random[Integer, {1, 10000}]]]; In[155]:= Table[findsequence[i][l] // Timing, {i, 1, 5}] 3734 Out[155]= {{1.18333 Second, Null}, {11.3333 Second, {{3734}}}, {0.866667 Second, {{3734}}}, {0.75 Second, {3734}}, {0.733333 Second, 3734}} Of course I was lucky here I did not get a sequence of 8 identical integers for indsequence[3] and findsequence[4] would have ignored it, findsequence[1] and findsequence[2] counted it twice and findsequence[5] just once. Andrzej Kozlowski Toyama International University JAPAN http://sigma.tuins.ac.jp http://eri2.tuins.ac.jp ---------- >From: "Andrzej Kozlowski" <andrzej at tuins.ac.jp> To: mathgroup at smc.vnet.net >To: Hans Havermann <haver at total.net>, mathgroup at smc.vnet.net >Subject: [mg19932] Re: [mg19880] Fast List-Selection >Date: Tue, Sep 21, 1999, 6:19 AM > > I have managed to produce four additional examples, three of which seem to > be faster than yours. I will change the search to 6 rather than 7 > consequtive identical elements because I want to use for my testing a > famous example: > > In[1]:= > l = RealDigits[N[Pi, 10000]][[1]]; > > Hear are the functions I will test, beginning with yours: > > In[2]:= > findsequence[1][l_] := Do[If[Count[t = Take[l, {i, i + 5}], t[[1]]] == 6, > Print[i]], {i, 1, Length[l] - 5}] > > In[3]:= > findsequence[2][l_] := > Position[l /. {a___, y_, y_, y_, y_, y_, y_, b___} -> {a, mark, b}, mark, 1] > > In[4]:= > findsequence[3][l_] := > Module[{m = Split[l], mark}, > Position[Flatten[m /. Cases[m, _?(Length[#] == 6 &)][[1]] -> mark], mark, > 1]] > > In[5]:= > findsequence[4][l_] := > Module[{m = Split[l]}, > Length[Flatten[ > Take[m, Position[m, Select[m, Length[#] == 6 &][[1]]][[1, 1]] - > 1]]] + 1] > > In[6]:= > f[x_, x_, x_, x_, x_, x_] := 0; > f[y__] := 1; > g[l_, i_] := f[Apply[Sequence, Take[l, {i, i + 5}]]]; > findsequence[5][l_] := Scan[If[g[l, #] == 0, Return[#]] &, Range[Length[l]]]; > > Now the test: > > In[9]:= > Table[findsequence[i][l] // Timing, {i, 1, 5}] > From In[9]:= > 763 > Out[9]= > {{1.38333 Second, > Null}, {2.18333 Second, {{763}}}, {0.783333 Second, {{763}}}, {0.683333 > Second, 763}, {0.183333 Second, 763}} > > The last one wins by a big marigin. The programs using Split may do better > on other machines (I am using Mac PowerBook G3 ,233 mghz) because there > seems tobe something wrong with Split on the Mac where it doesn't scale > linearly with the zise of the input. > > Finally: In[10]:= > Take[l, {763, 763 + 5}] > Out[10]= > {9, 9, 9, 9, 9, 9} > -- > Andrzej Kozlowski > Toyama International University > JAPAN > http://sigma.tuins.ac.jp > http://eri2.tuins.ac.jp > > > ---------- >>From: Hans Havermann <haver at total.net> To: mathgroup at smc.vnet.net >>To: mathgroup at smc.vnet.net >>Subject: [mg19932] [mg19880] Fast List-Selection >>Date: Mon, Sep 20, 1999, 7:47 AM >> > >> I have a list 's' composed of a large number of (small) integers. I wish to >> search this list for instances of 7 consecutive, identical elements. >> >> My approach is: >> >> Do[If[Count[t = Take[s, {i, i + 6}], t[[1]]] == 7, >> Print[i]], {i, 1, Length[s] - 6}] >> >> Can anyone think of a *faster* way of doing this? >> >> >>