Re: Re: Pure Function for String Selection

• To: mathgroup at smc.vnet.net
• Subject: [mg61047] Re: [mg61017] Re: Pure Function for String Selection
• From: "Edson Ferreira" <edsferr at uol.com.br>
• Date: Sat, 8 Oct 2005 02:49:26 -0400 (EDT)
• References: <dht479\$hl1\$1@smc.vnet.net> <200510070738.DAA03287@smc.vnet.net>
• Sender: owner-wri-mathgroup at wolfram.com

Well,

We have a new winner. Just look the code bellow.

It seems that I have developed a even faster function (filter4)...But I
can't explain why it's twice faster than yours...

Don't look at the results....I have a Pentium III :-(
Just compare them ;-)
My results are always the last ones in each output!!

In[1]:=
selectString1Q[str_String] :=
Module[{ch=Characters[str]},ch={First[#],Length[#]}&/@Split[ch];
Max[Last /@ Select[ch, MatchQ[#, {"X", _}] &]] < 6 &&
Max[Last /@ Select[ch, MatchQ[#, {"2", _}] &]] < 6 &&
Max[Last /@ Select[ch, MatchQ[#, {"1", _}] &]] < 8
] ;

In[2]:=
selectString2Q[str_String] :=
Module[{ch},
ch =StringCases[str,y:(x_)..:>{x,StringLength[y]}];
Max[Last /@ Select[ch, MatchQ[#, {"X", _}] &]] < 6 &&
Max[Last /@ Select[ch, MatchQ[#, {"2", _}] &]] < 6 &&
Max[Last /@ Select[ch, MatchQ[#, {"1", _}] &]] < 8
] ;

In[3]:=
maxseq[s_String,z_String]:=Max[StringLength/@StringCases[s,z..]];
selectString3Q[str_String]:=
maxseq[str,"1"]<8&&maxseq[str,"X"]<6&&maxseq[str,"2"]<6;

In[5]:=
selectString4Q[str_String]:=
StringFreeQ[str,"1"~~"1"~~"1"~~"1"~~"1"~~"1"~~"1"~~"1"] &&
StringFreeQ[str,"X"~~"X"~~"X"~~"X"~~"X"~~"X"] &&
StringFreeQ[str,"2"~~"2"~~"2"~~"2"~~"2"~~"2"];

In[6]:=
filter1[origList:{__String}]:=Select[origList,selectString1Q[#]&];
filter2[origList:{__String}]:=Select[origList,selectString2Q[#]&];
filter3[origList:{__String}]:=Select[origList,selectString3Q[#]&];
filter4[origList:{__String}]:=Select[origList,selectString4Q[#]&];

In[10]:=
makeList[strLen_,listLen_]:=
Table[StringJoin[{"1","2","X"}\[LeftDoubleBracket]
Table[Random[
Integer,{1,3}],{strLen}]\[RightDoubleBracket]],{listLen}];

In[11]:=
egList1=makeList[14,1000];
egList2=makeList[30,1000];
egList3=makeList[30,20000];
egList4=makeList[100,20000];

In[15]:=
{Timing[Length@filter3[egList1]],Timing[Length@filter2[egList1]],
Timing[Length@filter1[egList1]],Timing[Length@filter4[egList1]]}

Out[15]=
{{0.41 Second,977},{1.262 Second,977},{1.152 Second,977},{0.23 Second,977}}

In[16]:=
{Timing[Length@filter1[egList2]],Timing[Length@filter3[egList2]],
Timing[Length@filter2[egList2]],Timing[Length@filter4[egList2]]}

Out[16]=
{{2.023 Second,958},{0.441 Second,958},{2.173 Second,958},{0.26 Second,958}}

In[17]:=
{Timing[Length@filter2[egList3]],Timing[Length@filter1[egList3]],
Timing[Length@filter3[egList3]],Timing[Length@filter4[egList3]]}

Out[17]=
{{41.8 Second,19018},{40.158 Second,19018},{8.442 Second,19018},{4.767
Second,19018}}

In[18]:=
{Timing[Length@filter2[egList4]],Timing[Length@filter3[egList4]],
Timing[Length@filter1[egList4]],Timing[Length@filter4[egList4]]}

Out[18]=
{{115.216 Second,16562},{13.349 Second,16562},{112.522 Second,16562},{5.778
Second,16562}}

Thanks for all the help!!!

Edson Ferreira
Mechanical Engineer - Brazil

----- Original Message -----
To: mathgroup at smc.vnet.net
Subject: [mg61047] [mg61017] Re: Pure Function for String Selection

> Edson,
>
> Consider the following 3 predicates:
>
> In[1]:=
> selectString1Q[str_String] :=
> Module[{ch=Characters[str]},ch={First[#],Length[#]}&/@Split[ch];
>      Max[Last /@ Select[ch, MatchQ[#, {"X", _}] &]] < 6 &&
>        Max[Last /@ Select[ch, MatchQ[#, {"2", _}] &]] <  6 &&
>        Max[Last /@ Select[ch, MatchQ[#, {"1", _}] &]] <  8
> ] ;
>
> selectString2Q[str_String] :=
>     Module[{ch},
>      ch =StringCases[str,y:(x_)..:>{x,StringLength[y]}];
>      Max[Last /@ Select[ch, MatchQ[#, {"X", _}] &]] < 6 &&
>        Max[Last /@ Select[ch, MatchQ[#, {"2", _}] &]] <  6 &&
>        Max[Last /@ Select[ch, MatchQ[#, {"1", _}] &]] <  8
>       ] ;
>
> maxseq[s_String,z_String]:=Max[StringLength/@StringCases[s,z..]];
> selectString3Q[str_String]:=
>    maxseq[str,"1"]<8&&maxseq[str,"X"]<6&&maxseq[str,"2"]<6;
>
> selectString1Q was used by Jens-Peer Kuska in his previous response in
> this thread.  selectString2Q appeals to a pure string pattern matching
> technique to define ch rather than using Characters and Split as was
> done in selectString1Q.  selectString3Q is the predicate used in my
> previous response in this thread.  These 3 predicates can then be used
> to define the 3 filter functions:
>
> In[5]:=
> filter1[origList:{__String}]:=Select[origList,selectString1Q[#]&];
> filter2[origList:{__String}]:=Select[origList,selectString2Q[#]&];
> filter3[origList:{__String}]:=Select[origList,selectString3Q[#]&];
>
> Finally define:
>
> In[8]:=
> makeList[strLen_,listLen_]:=
>    Table[StringJoin[{"1","2","X"}\[LeftDoubleBracket]
>          Table[Random[
>
> Integer,{1,3}],{strLen}]\[RightDoubleBracket]],{listLen}];
>
> makeList generates a list of listLen strings, with each string having
> length strLen and consisting of a random sequence of the three
> characters "1", "2", and "X".
>
> Now let's attempt to get a very rough feel for the relative speed of
> the 3 filter functions:
>
> In[9]:=
> egList1=makeList[14,1000];
> egList2=makeList[30,1000];
> egList3=makeList[30,20000];
> egList4=makeList[100,20000];
>
> In[13]:=
> {Timing[Length@filter3[egList1]],Timing[Length@filter2[egList1]],
>  Timing[Length@filter1[egList1]]}
> Out[13]=
> {{0.14 Second,986},{0.42 Second,986},{0.33 Second,986}}
>
> In[14]:=
> {Timing[Length@filter1[egList2]],Timing[Length@filter3[egList2]],
>  Timing[Length@filter2[egList2]]}
> Out[14]=
> {{0.72 Second,958},{0.09 Second,958},{0.37 Second,958}}
>
> In[15]:=
> {Timing[Length@filter2[egList3]],Timing[Length@filter1[egList3]],
>  Timing[Length@filter3[egList3]]}
> Out[15]=
> {{7.85 Second,19057},{7.41 Second,19057},{1.68 Second,19057}}
>
> In[16]:=
> {Timing[Length@filter2[egList4]],Timing[Length@filter3[egList4]],
>  Timing[Length@filter1[egList4]]}
> Out[16]=
> {{21. Second,16645},{3.03 Second,16645},{21.11 Second,16645}}
>
> (Note that the filter functions appear in a different order in each of
> the four inputs.)
>
> In all cases filter3 is the clear winner, with the two other filter
> functions being fairly close.  The main difference between filter3 and
> the other two is that it makes three separate passes on each string
> with StringCases in order to determine the runs, while the other two
> functions essentially generate a single list of runs for all the 3
> characters combined (filter1 by making a single pass with Split and
> filter 2 a single pass with StringCases), and then have to go back and
> identify which runs belong to which character from this generated set.
> In any event, if these timing results survive more extensive testing,
> it may suggest that working with your strings directly rather than
> decomposing your strings via Characters may be the way to go.  This
> would be in accord with what you found in working with the string
> reduction problem which you discussed in a previous thread.
>
>

• Prev by Date: Re:Re: Pure Function for String Selection
• Next by Date: Dashing shown in legend?
• Previous by thread: Re: Pure Function for String Selection
• Next by thread: Re: Pure Function for String Selection