MathGroup Archive 2005

Re: Pure Function for String Selection

  • To: mathgroup at
  • Subject: [mg61915] Re: Pure Function for String Selection
  • From: "dkr" <dkrjeg at>
  • Date: Fri, 4 Nov 2005 05:11:40 -0500 (EST)
  • References: <dht479$hl1$><di7rft$kq3$>
  • Sender: owner-wri-mathgroup at


Here is one last crack at your filtering problem. It is much simpler
than my previous filters and very competitive in terms of speed.



We simply form a master string from your list of strings using
ToString, and then use a Regular Expression to weed out the original
strings with bad runs.
Explanation of the regular expression:
If we wanted to simply pull out the original strings from the master
string, we could do this using
StringCases[ToString[origList], RegularExpression["\\b[^,]+\\b"]];
The regular expression characterizes strings that lie between word
boundaries (in this example the lefthand word boundaries take the form
of either { or whitespace, while the righthand word boundaries take the
form of either a comma or a righthand brace ) and consist of 1 or more
characters that are not commas.  [^,]+ will match as large a string as
possible, and hence your original strings will be generated.  Then to
generate only those that don't have bad runs we insert the "negative
lookahead" condition (?![^,]*(XXXXXX|222222|11111111)).  It essentially
requires that the following text cannot begin with 0 or more characters
that are not commas followed by a bad run.  This suffices to rule out
your bad strings.  Since I am a novice as far as regular expressions
go, it is likely that someone can suggest an alternative regular
expression that will be even faster.
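
To make the explanation above concrete, here is a minimal sketch on a tiny made-up list. It is not necessarily the exact filter11 used for the timings: sampleList and master are hypothetical names, and I am assuming the lookahead sits immediately after the leading \\b.

```mathematica
(* Made-up sample data; the egList lists from the thread are not shown here *)
sampleList = {"ABC123", "AXXXXXXB", "1222222", "Q11111111", "XY12"};

(* ToString on a list of strings gives "{ABC123, AXXXXXXB, ...}" (no quotes) *)
master = ToString[sampleList];

(* Without the lookahead, every original string is recovered *)
StringCases[master, RegularExpression["\\b[^,]+\\b"]]
(* {"ABC123", "AXXXXXXB", "1222222", "Q11111111", "XY12"} *)

(* With the negative lookahead (assumed placement: right after the first \\b),
   any string containing a bad run is rejected at its left word boundary *)
StringCases[master,
 RegularExpression["\\b(?![^,]*(XXXXXX|222222|11111111))[^,]+\\b"]]
(* {"ABC123", "XY12"} *)
```

Note that this relies on the strings containing only word characters (letters, digits, underscore). A string with an internal space or hyphen would introduce extra word boundaries, and the pattern could then match a fragment of it.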

Below I have repeated the tables from my previous message, adding a
line for filter11 to each table.


Filter    origList    Time(secs)*
6Alt2     egList4     0.935
9         egList4     1.33
10        egList4     1.035
11        egList4     0.91

6Alt2     egList3     0.63
9         egList3     0.59
10        egList3     0.585
11        egList3     0.475

6Alt2     egList2     0.06
9         egList2     0.03
10        egList2     0.025
11        egList2     0.02

* Average of two runs.  Mathematica was restarted before each run for
 each filter.


Filter    origList    Time(secs)*
6Alt2     egList4     0.935
9         egList4     1.305
10        egList4     1.04
11        egList4     0.975

6Alt2     egList3     0.645
9         egList3     0.65
10        egList3     0.58
11        egList3     0.465

6Alt2     egList2     0.07
9         egList2     0.03
10        egList2     0.03
11        egList2     0.025

* Average of two runs.  Mathematica was restarted before each run for
 each filter.
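
For anyone who wants to try timings of this kind, something along the following lines should do. This is only a sketch: the filter11 definition is my assumed form of the filter, testList is made-up random data rather than any of the egList lists, and RandomChoice requires Mathematica 6 or later. Unlike the figures in the tables, it does not restart the kernel between runs.

```mathematica
(* Assumed form of filter11; the exact definition is not shown in this message *)
filter11[list_List] := StringCases[ToString[list],
   RegularExpression["\\b(?![^,]*(XXXXXX|222222|11111111))[^,]+\\b"]];

(* Made-up test data: 10^4 random 12-character strings *)
testList = Table[
   StringJoin[RandomChoice[{"X", "1", "2", "A", "B"}, 12]], {10^4}];

(* Timing returns {cpuSeconds, result}; the ; inside discards the result *)
runs = Table[First[Timing[filter11[testList];]], {2}];

(* Average of the two runs, in seconds *)
Mean[runs]
```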

Thus, as with your earlier string reduction problem, using a master
string and exploiting Mathematica's powerful string pattern capabilities
may be a useful approach, especially when coupled with Maxim Rytin's
excellent suggestion of using regular expressions. I don't believe
there is an analogue in Mathematica's StringExpression for the type of
lookahead condition that was used in filter11.
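
For comparison, the same test can be written without a lookahead if one gives up the master string and checks each element separately with ordinary string patterns. The sketch below is not one of the filters timed above, so I make no claim about its speed; sampleList and badRuns are made-up names.

```mathematica
(* Made-up sample data *)
sampleList = {"ABC123", "AXXXXXXB", "1222222", "Q11111111", "XY12"};

(* The bad runs as an ordinary string pattern (alternatives) *)
badRuns = "XXXXXX" | "222222" | "11111111";

(* Keep only the strings that contain none of the bad runs *)
Select[sampleList, StringFreeQ[#, badRuns] &]
(* {"ABC123", "XY12"} *)
```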

