Re: Re: Pure Function for String Selection
- To: mathgroup at smc.vnet.net
- Subject: [mg61047] Re: [mg61017] Re: Pure Function for String Selection
- From: "Edson Ferreira" <edsferr at uol.com.br>
- Date: Sat, 8 Oct 2005 02:49:26 -0400 (EDT)
- References: <dht479$hl1$1@smc.vnet.net> <200510070738.DAA03287@smc.vnet.net>
- Sender: owner-wri-mathgroup at wolfram.com
Well,
We have a new winner. Just look the code bellow.
It seems that I have developed a even faster function (filter4)...But I
can't explain why it's twice faster than yours...
Don't look at the results....I have a Pentium III :-(
Just compare them ;-)
My results are always the last ones in each output!!
In[1]:=
selectString1Q[str_String] :=
Module[{ch=Characters[str]},ch={First[#],Length[#]}&/@Split[ch];
Max[Last /@ Select[ch, MatchQ[#, {"X", _}] &]] < 6 &&
Max[Last /@ Select[ch, MatchQ[#, {"2", _}] &]] < 6 &&
Max[Last /@ Select[ch, MatchQ[#, {"1", _}] &]] < 8
] ;
In[2]:=
selectString2Q[str_String] :=
Module[{ch},
ch =StringCases[str,y:(x_)..:>{x,StringLength[y]}];
Max[Last /@ Select[ch, MatchQ[#, {"X", _}] &]] < 6 &&
Max[Last /@ Select[ch, MatchQ[#, {"2", _}] &]] < 6 &&
Max[Last /@ Select[ch, MatchQ[#, {"1", _}] &]] < 8
] ;
In[3]:=
maxseq[s_String,z_String]:=Max[StringLength/@StringCases[s,z..]];
selectString3Q[str_String]:=
maxseq[str,"1"]<8&&maxseq[str,"X"]<6&&maxseq[str,"2"]<6;
In[5]:=
selectString4Q[str_String]:=
StringFreeQ[str,"1"~~"1"~~"1"~~"1"~~"1"~~"1"~~"1"~~"1"] &&
StringFreeQ[str,"X"~~"X"~~"X"~~"X"~~"X"~~"X"] &&
StringFreeQ[str,"2"~~"2"~~"2"~~"2"~~"2"~~"2"];
In[6]:=
filter1[origList:{__String}]:=Select[origList,selectString1Q[#]&];
filter2[origList:{__String}]:=Select[origList,selectString2Q[#]&];
filter3[origList:{__String}]:=Select[origList,selectString3Q[#]&];
filter4[origList:{__String}]:=Select[origList,selectString4Q[#]&];
In[10]:=
makeList[strLen_,listLen_]:=
Table[StringJoin[{"1","2","X"}\[LeftDoubleBracket]
Table[Random[
Integer,{1,3}],{strLen}]\[RightDoubleBracket]],{listLen}];
In[11]:=
egList1=makeList[14,1000];
egList2=makeList[30,1000];
egList3=makeList[30,20000];
egList4=makeList[100,20000];
In[15]:=
{Timing[Length@filter3[egList1]],Timing[Length@filter2[egList1]],
Timing[Length@filter1[egList1]],Timing[Length@filter4[egList1]]}
Out[15]=
{{0.41 Second,977},{1.262 Second,977},{1.152 Second,977},{0.23 Second,977}}
In[16]:=
{Timing[Length@filter1[egList2]],Timing[Length@filter3[egList2]],
Timing[Length@filter2[egList2]],Timing[Length@filter4[egList2]]}
Out[16]=
{{2.023 Second,958},{0.441 Second,958},{2.173 Second,958},{0.26 Second,958}}
In[17]:=
{Timing[Length@filter2[egList3]],Timing[Length@filter1[egList3]],
Timing[Length@filter3[egList3]],Timing[Length@filter4[egList3]]}
Out[17]=
{{41.8 Second,19018},{40.158 Second,19018},{8.442 Second,19018},{4.767
Second,19018}}
In[18]:=
{Timing[Length@filter2[egList4]],Timing[Length@filter3[egList4]],
Timing[Length@filter1[egList4]],Timing[Length@filter4[egList4]]}
Out[18]=
{{115.216 Second,16562},{13.349 Second,16562},{112.522 Second,16562},{5.778
Second,16562}}
Thanks for all the help!!!
Edson Ferreira
Mechanical Engineer - Brazil
----- Original Message -----
From: "dkr" <dkrjeg at adelphia.net>
To: mathgroup at smc.vnet.net
Subject: [mg61047] [mg61017] Re: Pure Function for String Selection
> Edson,
>
> Consider the following 3 predicates:
>
> In[1]:=
> selectString1Q[str_String] :=
> Module[{ch=Characters[str]},ch={First[#],Length[#]}&/@Split[ch];
> Max[Last /@ Select[ch, MatchQ[#, {"X", _}] &]] < 6 &&
> Max[Last /@ Select[ch, MatchQ[#, {"2", _}] &]] < 6 &&
> Max[Last /@ Select[ch, MatchQ[#, {"1", _}] &]] < 8
> ] ;
>
> selectString2Q[str_String] :=
> Module[{ch},
> ch =StringCases[str,y:(x_)..:>{x,StringLength[y]}];
> Max[Last /@ Select[ch, MatchQ[#, {"X", _}] &]] < 6 &&
> Max[Last /@ Select[ch, MatchQ[#, {"2", _}] &]] < 6 &&
> Max[Last /@ Select[ch, MatchQ[#, {"1", _}] &]] < 8
> ] ;
>
> maxseq[s_String,z_String]:=Max[StringLength/@StringCases[s,z..]];
> selectString3Q[str_String]:=
> maxseq[str,"1"]<8&&maxseq[str,"X"]<6&&maxseq[str,"2"]<6;
>
> selectString1Q was used by Jens-Peer Kuska in his previous response in
> this thread. selectString2Q appeals to a pure string pattern matching
> technique to define ch rather than using Characters and Split as was
> done in selectString1Q. selectString3Q is the predicate used in my
> previous response in this thread. These 3 predicates can then be used
> to define the 3 filter functions:
>
> In[5]:=
> filter1[origList:{__String}]:=Select[origList,selectString1Q[#]&];
> filter2[origList:{__String}]:=Select[origList,selectString2Q[#]&];
> filter3[origList:{__String}]:=Select[origList,selectString3Q[#]&];
>
> Finally define:
>
> In[8]:=
> makeList[strLen_,listLen_]:=
> Table[StringJoin[{"1","2","X"}\[LeftDoubleBracket]
> Table[Random[
>
> Integer,{1,3}],{strLen}]\[RightDoubleBracket]],{listLen}];
>
> makeList generates a list of listLen strings, with each string having
> length strLen and consisting of a random sequence of the three
> characters "1", "2", and "X".
>
> Now let's attempt to get a very rough feel for the relative speed of
> the 3 filter functions:
>
> In[9]:=
> egList1=makeList[14,1000];
> egList2=makeList[30,1000];
> egList3=makeList[30,20000];
> egList4=makeList[100,20000];
>
> In[13]:=
> {Timing[Length@filter3[egList1]],Timing[Length@filter2[egList1]],
> Timing[Length@filter1[egList1]]}
> Out[13]=
> {{0.14 Second,986},{0.42 Second,986},{0.33 Second,986}}
>
> In[14]:=
> {Timing[Length@filter1[egList2]],Timing[Length@filter3[egList2]],
> Timing[Length@filter2[egList2]]}
> Out[14]=
> {{0.72 Second,958},{0.09 Second,958},{0.37 Second,958}}
>
> In[15]:=
> {Timing[Length@filter2[egList3]],Timing[Length@filter1[egList3]],
> Timing[Length@filter3[egList3]]}
> Out[15]=
> {{7.85 Second,19057},{7.41 Second,19057},{1.68 Second,19057}}
>
> In[16]:=
> {Timing[Length@filter2[egList4]],Timing[Length@filter3[egList4]],
> Timing[Length@filter1[egList4]]}
> Out[16]=
> {{21. Second,16645},{3.03 Second,16645},{21.11 Second,16645}}
>
> (Note that the filter functions appear in a different order in each of
> the four inputs.)
>
> In all cases filter3 is the clear winner, with the two other filter
> functions being fairly close. The main difference between filter3 and
> the other two is that it makes three separate passes on each string
> with StringCases in order to determine the runs, while the other two
> functions essentially generate a single list of runs for all the 3
> characters combined (filter1 by making a single pass with Split and
> filter 2 a single pass with StringCases), and then have to go back and
> identify which runs belong to which character from this generated set.
> In any event, if these timing results survive more extensive testing,
> it may suggest that working with your strings directly rather than
> decomposing your strings via Characters may be the way to go. This
> would be in accord with what you found in working with the string
> reduction problem which you discussed in a previous thread.
>
>
- References:
- Re: Pure Function for String Selection
- From: "dkr" <dkrjeg@adelphia.net>
- Re: Pure Function for String Selection