MathGroup Archive 2009

Re: StringCases and Shortest

  • To: mathgroup at smc.vnet.net
  • Subject: [mg97361] Re: StringCases and Shortest
  • From: "Sjoerd C. de Vries" <sjoerd.c.devries at gmail.com>
  • Date: Thu, 12 Mar 2009 02:14:52 -0500 (EST)
  • References: <gp8047$1k9$1@smc.vnet.net>

Hi grischika,

As I understand it, Mathematica works its way through the string from
left to right. It starts at the first character and normally tries to
find the longest substring that matches the string pattern. It then
moves to the next position to try to find the next substring match.

This movement is controlled by the Overlaps option. If False (the
default), the next try starts at the position following the last
matched substring. If True, the next try starts at the next character.
If All, Mathematica will also try to find smaller matching substrings
starting at the same character.
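
To see the three settings side by side, here is a small sketch on a toy
string (the string "abcd" and the pattern __ are just my choice for
illustration; note that the option is spelled Overlaps):

```mathematica
(* Default: the next try starts after the end of the previous match *)
StringCases["abcd", __, Overlaps -> False]   (* {"abcd"} *)

(* The next try starts at the very next character, so matches may overlap *)
StringCases["abcd", __, Overlaps -> True]    (* {"abcd", "bcd", "cd", "d"} *)

(* Also reports the shorter matches found at each starting position *)
StringCases["abcd", __, Overlaps -> All]
```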

If the pattern is wrapped in Shortest, the match will be the shortest
of the possible matches *starting from the INITIAL character*. So,
with your example "(-(a)--(bb)--(c)-", starting at the first
character, the matching substrings are "(-(a)--(bb)--(c)",
"(-(a)--(bb)", and "(-(a)". The shortest of these is "(-(a)". With the
default Overlaps->False, StringCases now moves on to "--(bb)--(c)-" to
try to find further matches. Hence, the possible "(a)" match is
skipped over. If you set Overlaps->All or True, this match would also
be found.
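
If you want to check this yourself, something like the following should
do. It is only a sketch: your original call with the Overlaps option
added (that is the option's actual spelling), and I have not listed the
output here.

```mathematica
(* Same pattern as in the question, but the scan restarts at the character
   right after the start of each match instead of after its end *)
StringCases["(-(a)--(bb)--(c)-", Shortest["(" ~~ __ ~~ ")"], Overlaps -> True]
```

Expect the unwanted match starting at the first character to still be
in the list, so this shows the mechanism rather than solving your
problem.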

If you don't want these substrings to contain opening parentheses
themselves, you have to say so:

StringCases["(-(a)--(bb)--(c)-",
 Shortest["(" ~~ x__ ~~ ")"] /; StringFreeQ[x, "("] ]

If you want to remove the parentheses around the matched substrings,
you can use:

StringCases["(-(a)--(bb)--(c)-",
 Shortest["(" ~~ x__ ~~ ")"] /; StringFreeQ[x, "("] -> x]
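
A possible alternative, not what the code above uses, is to forbid
parentheses inside the match with a regular expression instead of a
condition. This is a minimal sketch; Func is simply the name from your
question, and the regular expression is my own choice:

```mathematica
(* "\\(" and "\\)" match literal parentheses; ([^()]+) captures one or
   more characters that are not parentheses; "$1" returns that capture *)
Func[s_String] := StringCases[s, RegularExpression["\\(([^()]+)\\)"] -> "$1"]

Func["f(a+b) some text (comments)"]    (* {"a+b", "comments"} *)
Func["(f(a+b) some text (comments)"]   (* {"a+b", "comments"} *)
Func["(-(a)--(bb)--(c)-"]              (* {"a", "bb", "c"} *)
```

Because the character class excludes both kinds of parenthesis, a stray
opening parenthesis earlier in the string can never start a match, so
Shortest is not needed here.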

Cheers -- Sjoerd


On Mar 11, 11:26 am, Grisch... at mail.ru wrote:
> Hello!
> I want to select shortest substring between brackets from the string.
> For example:
>
> Func["f(a+b) some text (comments)" ]
>
>  should give:
>
> {"a+b","comments"},
>
> and
>
> Func["(f(a+b) some text (comments)" ]
>
> should give:
>
> {"(a+b)","(comments)"}  too.
>
> In the help I found  this line:
>
> in[]:  StringCases["-(a)--(bb)--(c)-", Shortest["(" ~~ __ ~~ ")"]]
> out: {"(a)","(bb)","(c)"}
>
> which, at first sight, works as I desire.
>
> But when I add bracket at start of line then answer is incorrect
> in[]:  StringCases["(-(a)--(bb)--(c)-", Shortest["(" ~~ __ ~~ ")"]]
> out: {"(-(a)","(bb)","(c)"}
>
> What is wrong? And how to solve this problem?
