       Re: Fast List-Selection

• To: mathgroup at smc.vnet.net
• Subject: [mg19932] Re: [mg19880] Fast List-Selection
• From: "Andrzej Kozlowski" <andrzej at tuins.ac.jp>
• Date: Wed, 22 Sep 1999 04:11:15 -0400
• Sender: owner-wri-mathgroup at wolfram.com

```I think a few additional remarks are in order concerning this question and
my reply. First of all I wrote all my programs assuming that we know three
is only one sequence of exactly seven (in the case which I considered  6)
successive identical integers which we are looking for. This was not
actually what was asked but I guess I was just thinking about m Pi example
rather than about what Hans actually asked. Of my 4 programs only the
slowest (findsequence) fids all such sequences if there is more than one.
In the cases of findsequence and findsequence however this can be
easily fixed. Here are two corrected versions, this time written for the
case of seven consequtive identical integers:

findsequence[l_] :=
Module[{m = Split[l], mark}, (# + Table[6*(i - 1), {i, 1, Length[#]}]) &@
Position[Flatten[m /. Thread[Cases[m, _?(Length[#] == 7 &)] -> mark]],
mark]]

findsequence[l_] :=
Module[{m = Split[l]},
Map[Length[Flatten[Take[m, # - 1]]] + 1 &,
Flatten[Map[Position[m, #] &, Select[m, Length[#] == 7 &]]]]]

At this point of course a second problem emerges. Namely, should a sequence
of 8 consecutive identical integers count as two sequences of 7 consecutive
integers or none? If the former than Split is not really the appropriate
function to use.  So to make my task easier I shall assume that 7 means
exactly seven and not eight but of course Hans's program as well as my
findsequence and findsequence make the other assumption. Oh, well.

Clearly findsequence, which was the fastest in my earlier test is not
suitable for the purpose of finding more than one such sequence: it's
performance depends on the fact that unlike the other programs it exits the
moment it found the sequence  it was looking for. Even then it's performance
will depend very much on whether the sequence of identical integers is near
the begining or near the end of the whole sequence. It may be interesting to
look at what hapens in the extreme cases:
with

l = Join[{1, 1, 1, 1, 1, 1, 1}, Table[Random[Integer, {1, 9}], {10000}]];

I get:

In:=
Table[findsequence[i][l] // Timing, {i, 1, 5}]
1
Out=
{{1.41667 Second,  Null}, {0.0166667 Second, {{1}}}, {0.883333 Second,
{{1}}}, {0.766667 Second, {1}}, {0.0166667 Second, 1}}

with
l = Join[Table[Random[Integer, {1, 9}], {10000}], {1, 1, 1, 1, 1, 1, 1}];

the results are of course quite different:

In:=
Table[findsequence[i][l] // Timing, {i, 1, 5}]
10001
Out=
{{1.41667 Second,
Null}, {30.7667 Second, {{10001}}}, {0.883333 Second, {{10001}}}, \
{0.816667 Second, {10001}}, {2.13333 Second, 10001}}

So really to decide which is the fastest one would have to study the average
case. I do not intend to try to do this but here is just a single random
run:

In:=
l = Flatten[
Insert[Table[Random[Integer, {1, 9}], {10000}], {1, 1, 1, 1, 1, 1, 1},
Random[Integer, {1, 10000}]]];

In:=
Table[findsequence[i][l] // Timing, {i, 1, 5}]
3734
Out=
{{1.18333 Second,
Null}, {11.3333 Second, {{3734}}}, {0.866667 Second, {{3734}}}, {0.75
Second, {3734}}, {0.733333 Second, 3734}}

Of course I was lucky here I did not get a sequence of 8 identical integers
for indsequence and findsequence would have ignored it,
findsequence and findsequence counted it twice and findsequence
just once.

Andrzej Kozlowski
Toyama International University
JAPAN
http://sigma.tuins.ac.jp
http://eri2.tuins.ac.jp

----------
>From: "Andrzej Kozlowski" <andrzej at tuins.ac.jp>
To: mathgroup at smc.vnet.net
>To: Hans Havermann <haver at total.net>, mathgroup at smc.vnet.net
>Subject: [mg19932] Re: [mg19880] Fast List-Selection
>Date: Tue, Sep 21, 1999, 6:19 AM
>

> I have managed to produce four additional examples, three  of which seem to
> be faster than yours. I will change the search to 6 rather than 7
> consequtive identical elements because I want to use for my testing a
> famous example:
>
> In:=
> l = RealDigits[N[Pi, 10000]][];
>
> Hear are the functions I will test, beginning with yours:
>
> In:=
> findsequence[l_] := Do[If[Count[t = Take[l, {i, i + 5}], t[]] == 6,
> Print[i]], {i, 1, Length[l] - 5}]
>
> In:=
> findsequence[l_] :=
>   Position[l /. {a___, y_, y_, y_, y_, y_, y_, b___} -> {a, mark, b}, mark, 1]
>
> In:=
> findsequence[l_] :=
>   Module[{m = Split[l], mark},
>     Position[Flatten[m /. Cases[m, _?(Length[#] == 6 &)][] -> mark], mark,
>       1]]
>
> In:=
> findsequence[l_] :=
>   Module[{m = Split[l]},
>     Length[Flatten[
>           Take[m, Position[m, Select[m, Length[#] == 6 &][]][[1, 1]] -
>               1]]] + 1]
>
> In:=
> f[x_, x_, x_, x_, x_, x_] := 0;
> f[y__] := 1;
> g[l_, i_] := f[Apply[Sequence, Take[l, {i, i + 5}]]];
> findsequence[l_] := Scan[If[g[l, #] == 0, Return[#]] &, Range[Length[l]]];
>
> Now the test:
>
> In:=
> Table[findsequence[i][l] // Timing, {i, 1, 5}]
> From In:=
> 763
> Out=
> {{1.38333 Second,
>     Null}, {2.18333 Second, {{763}}}, {0.783333 Second, {{763}}}, {0.683333
> Second, 763}, {0.183333 Second, 763}}
>
> The last one wins by a big marigin. The programs using Split may do better
> on other machines (I am using Mac PowerBook G3 ,233 mghz) because there
> seems tobe something wrong with Split on the Mac where it doesn't scale
> linearly with the zise of the input.
>
> Finally: In:=
> Take[l, {763, 763 + 5}]
> Out=
> {9, 9, 9, 9, 9, 9}
> --
> Andrzej Kozlowski
> Toyama International University
> JAPAN
> http://sigma.tuins.ac.jp
> http://eri2.tuins.ac.jp
>
>
> ----------
>>From: Hans Havermann <haver at total.net>
To: mathgroup at smc.vnet.net
>>To: mathgroup at smc.vnet.net
>>Subject: [mg19932] [mg19880] Fast List-Selection
>>Date: Mon, Sep 20, 1999, 7:47 AM
>>
>
>> I have a list 's' composed of a large number of (small) integers. I wish to
>> search this list for instances of 7 consecutive, identical elements.
>>
>> My approach is:
>>
>> Do[If[Count[t = Take[s, {i, i + 6}], t[]] == 7,
>>     Print[i]], {i, 1, Length[s] - 6}]
>>
>> Can anyone think of a *faster* way of doing this?
>>
>>
>>

```

• Prev by Date: Re: Re: Limits of multi-var. functions
• Next by Date: Challenge Mathematica vs Excel Addins
• Previous by thread: Re: Fast List-Selection
• Next by thread: Re: Re: Fast List-Selection