MathGroup Archive 2010

Re: Cases vs. StringCases vs. Select and StringMatchQ vs. StringFreeQ

  • To: mathgroup at smc.vnet.net
  • Subject: [mg111196] Re: Cases vs. StringCases vs. Select and StringMatchQ vs. StringFreeQ
  • From: Bill Rowe <readnews at sbcglobal.net>
  • Date: Fri, 23 Jul 2010 07:12:12 -0400 (EDT)

On 7/22/10 at 5:41 AM, d.latin at gmail.com (David Latin) wrote:

>Hello, I am currently working on manipulating data in "vCard"-like
>format, and have become confused by the actions of the Cases,
>StringCases and Select functions. Consider the small list:

>In[1]:= list = {"DTEND:19260412T175900", "DTEND:20070207T050000",
>"END:VCALENDAR", "MM"} ;

>In[2]:= Cases[list, ___~~"END:"~~___] Out[2]= {}

>So pattern-matching obviously does not work with Cases for a list of
>strings.

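Patterns and string patterns simply aren't the same. Cases tests
each element against an ordinary expression pattern, while
___~~"END:"~~___ is a StringExpression, which none of the strings
in the list matches as an expression. So, do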

In[12]:= Cases[list, _?(StringMatchQ[#, ___ ~~ "END:" ~~ ___] &)]

Out[12]= {DTEND:19260412T175900,DTEND:20070207T050000,END:VCALENDAR}

>The documentation for Cases does not refer to patterns in strings,
>so I tried

>In[3]:= StringCases[list, ___~~"END:"~~___] Out[3]=
>{{"DTEND:19260412T175900"},{"DTEND:20070207T050000"},
>{"END:VCALENDAR"},{}}

>The problem here is that empty elements can be returned.

That is easily fixed by doing either

In[13]:= DeleteCases[StringCases[list, ___ ~~ "END:" ~~ ___], {}]

Out[13]= {{"DTEND:19260412T175900"}, {"DTEND:20070207T050000"},
    {"END:VCALENDAR"}}

or

In[14]:= StringCases[list, ___ ~~ "END:" ~~ ___] /. {} -> Sequence[]

Out[14]= {{"DTEND:19260412T175900"}, {"DTEND:20070207T050000"},
    {"END:VCALENDAR"}}
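
If a flat list of strings is wanted rather than a list of
one-element lists, flattening the StringCases result is one more
option. This is only a sketch, assuming the same list as above and
at most one match per string (Flatten would also merge several
matches coming from a single string):

```mathematica
list = {"DTEND:19260412T175900", "DTEND:20070207T050000",
   "END:VCALENDAR", "MM"};

(* StringCases returns one sublist per input string; Flatten merges
   the sublists, so the empty ones simply disappear *)
flat = Flatten[StringCases[list, ___ ~~ "END:" ~~ ___]]
(* {"DTEND:19260412T175900", "DTEND:20070207T050000", "END:VCALENDAR"} *)
```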

>So next I tried

>In[4]:= Select[list, ___~~"END:"~~___] Out[4]= {}

>Obviously not working.

Here, as with Cases, a pure function using StringMatchQ will do
what you need. That is,

In[15]:= Select[list, StringMatchQ[#, ___ ~~ "END:" ~~ ___] &]

Out[15]= {DTEND:19260412T175900,DTEND:20070207T050000,END:VCALENDAR}

>Next I tried

>In[5]:= Select[ list, StringMatchQ[#, "*END:*"] & ] Out[5]=
>{"DTEND:19260412T175900", "DTEND:20070207T050000", "END:VCALENDAR"}

>This is fine. But what if I only want the "END:" lines and not the
>"DTEND:" lines ?

Change the pattern to be matched. For example,

In[16]:= Select[list, StringMatchQ[#, "END:" ~~ ___] &]

Out[16]= {END:VCALENDAR}

>It may be appropriate to make use of

>In[6]:= Select[ list, StringFreeQ[#, "*DTEND:*"] & ] Out[6]=
>{"DTEND:19260412T175900", "DTEND:20070207T050000", "END:VCALENDAR",
>"MM"}

>Not as expected!

Since StringFreeQ[string, pattern] returns True only when no
substring of string matches pattern, it isn't sensible to supply
a pattern like ___~~pattern~~___. The padding just causes
Mathematica to do more work than needed to achieve the desired
result. So, do

In[17]:= Select[list, StringFreeQ[#, "DTEND:"] &]

Out[17]= {END:VCALENDAR,MM}

Also, note that the documentation for StringMatchQ, under More
Information, states "... ordinary StringExpression string
patterns, as well as abbreviated string patterns containing the
following metacharacters:" and specifically states that a "*" is
interpreted as zero or more characters. The documentation for
StringFreeQ does not have any similar statement. So, I suspect
that for StringFreeQ a "*" is taken to be a literal asterisk.
Since none of the strings in your list contains a literal
asterisk, all of them would be selected if StringFreeQ is
interpreting each "*" in your pattern as a literal asterisk.
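
A quick way to probe that suspicion is to test a string that
really does contain asterisks. This is only a sketch; the test
string "x*DTEND:*y" is made up for the purpose, and I have left
out the outputs I would merely be guessing at:

```mathematica
(* No asterisk in the string. This gives True, consistent with
   Out[6] above, where every element was selected *)
r1 = StringFreeQ["DTEND:19260412T175900", "*DTEND:*"]

(* The string contains the literal characters *DTEND:* ; if "*" is
   taken literally by StringFreeQ this should give False *)
r2 = StringFreeQ["x*DTEND:*y", "*DTEND:*"]

(* An explicit StringExpression avoids the question altogether *)
r3 = StringFreeQ["DTEND:19260412T175900", ___ ~~ "DTEND:" ~~ ___]
```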

>But, in the end, what works is:

>In[7]:= Select[ list, StringMatchQ[#, "*END:*"] && ! StringMatchQ[#,
>"*DTEND:*"] & ] Out[7]= {"END:VCALENDAR"}

>I know I could have used "END*" instead of "*END*", but that's not
>the point here.

>My questions then are: Why doesn't Cases work for a list of strings
>? Why doesn't Select work for patterns with the ~~ operator ?

Neither Cases nor Select is designed to use string patterns. You
can use string patterns with these by wrapping a test built from
any of the functions that do accept string patterns (such as
StringMatchQ) in a pattern test or condition for Cases, or in a
pure function returning True or False for Select.
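
As a sketch of the same idea written with a condition rather than
a pattern test (the symbol s is just a name chosen here, and list
is the list from your post):

```mathematica
list = {"DTEND:19260412T175900", "DTEND:20070207T050000",
   "END:VCALENDAR", "MM"};

(* s_String restricts the match to strings, so StringMatchQ is
   never called on a non-string element *)
ends = Cases[list, s_String /; StringMatchQ[s, "END:" ~~ ___]]
(* {"END:VCALENDAR"} *)
```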

>Why doesn't StringFreeQ act in the same way as !StringMatchQ ?

Why are you expecting these to be the same? StringFreeQ[string,
pattern] returns true whenever no substring of string matches
pattern. !StringMatchQ[string, pattern] returns true whenever
the entire string fails to match pattern. There is a clear
difference between matching a substring of a given string and
the entire string.
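
To make the difference concrete, here is a small sketch using one
of the strings from your list (the variable names are arbitrary):

```mathematica
str = "DTEND:20070207T050000";

(* False: "END:" occurs as a substring of str *)
free = StringFreeQ[str, "END:"]

(* True: the whole of str is not exactly "END:" *)
noMatch = Not[StringMatchQ[str, "END:"]]

(* The two agree once the StringMatchQ pattern is padded so that it
   can span the entire string; this comparison gives True *)
agree = (free === Not[StringMatchQ[str, ___ ~~ "END:" ~~ ___]])
```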


