MathGroup Archive 2005

Re: a conflicting StringReplace

  • To: mathgroup at smc.vnet.net
  • Subject: [mg56332] Re: [mg56306] a conflicting StringReplace
  • From: "Wolf, Hartmut" <Hartmut.Wolf at t-systems.com>
  • Date: Fri, 22 Apr 2005 06:23:13 -0400 (EDT)
  • Sender: owner-wri-mathgroup at wolfram.com

>-----Original Message-----
>From: Hui Fang [mailto:fangh73 at xmu.edu.cn] 
>To: mathgroup at smc.vnet.net
>Sent: Thursday, April 21, 2005 11:37 AM
>Subject: [mg56332] [mg56306] a conflicting StringReplace
>
>I was teaching Mathematica in a college. In class I was showing the
>students some built-in functions for strings. Since this is not a very
>important topic, I didn't spend much time on each function. When I
>showed them StringReplace, I gave them the following examples:
>In[1]    StringReplace["abc",{"ab"->"AB"}]
>Out[1]   ABc
>
>In[2]   StringReplace["abc", {"bc"->"BC"}]
>Out[2]   aBC
>
>No problem on those. Now a student tried the following:
>In[3]   StringReplace["abc", {"ab"->"AB", "bc"->"BC"}]
>Out[3]   ABc
>
>Now he asked me why only "ab" is replaced. I said this is because there
>is a conflict: both "ab" and "bc" contain "b", so Mathematica will
>choose the first replacement. I also told him that if he changed the
>order, he would get aBC. Now:
>In[4]    StringReplace["abc", {"bc"->"BC","ab"->"AB"}]
>Out[4]    ABc
>
>This is the part I don't understand. Does Mathematica treat those rules
>in their canonical order (since "ab" comes before "bc" in canonical
>order), or in their written order?
>
>Thanks a lot!
>
>Hui Fang      
>
>

The behavior is explained in Help:

StringReplace goes through a string, testing substrings that start at
each successive character position. On each substring, it tries in turn
each of the transformation rules you have specified. If any of the rules
apply, it replaces the substring, then continues to go through the
string, starting at the character position after the end of the
substring. 


What is not quite clear (to me) from that explanation is the meaning of
"each substring": does it just mean the rest of the string starting at
the current position, or is each substring of a different length
(starting there) considered as different (so that the next rule is tried
first, before we run further down the string)? An experiment shows that
the first assumption applies (and this gives the algorithm that performs
better).


StringReplace goes through a string, testing substrings that start at
each successive character position. 

-- starting at position of "a" in string "abc" 


On each substring, 

-- at starting position it's: "abc" 

it [StringReplace] tries in turn each of the transformation rules you
have specified. 

-- so first (with the rules in the order of In[4]) it tries "b" of "bc"
on "a" of "abc" --> fail, try next rule
-- then it tries "a" of "ab" on "a" of "abc" --> interesting, go on
-- tries "b" of "ab" on "b" of "abc" --> interesting, go on
-- pattern is exhausted, so we have --> success of pattern on
substring "ab" of "abc"

If any of the rules apply, it replaces the substring, 

-- so here "ab" of "abc" becomes "AB", i.e. the string becomes

"ABc"

then continues to go through the string, starting at the character
position after the end of the substring. 

-- Substring considered next is "c" of "abc" (or "ABc", replaced part is
not considered again)

-- so now we compare "b" of "bc" with "c" --> fail, try next rule
-- try "a" of "ab" on "c" --> fail, no more rules, advance position

-- but string is exhausted

-- so the result is "ABc"
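
The walk-through above can be sketched as a small model. This is written
in Python rather than Mathematica, and it is only an illustration of the
procedure the Help text describes, not Mathematica's actual
implementation. The name scan_replace is made up for this sketch, and it
assumes literal, non-empty patterns only.

```python
def scan_replace(text, rules):
    """Scan left to right; at each position try the rules in written order."""
    out = []
    pos = 0
    while pos < len(text):
        for pattern, replacement in rules:
            # Assumption: patterns are literal, non-empty strings.
            if pattern and text.startswith(pattern, pos):
                out.append(replacement)
                pos += len(pattern)  # resume after the end of the match
                break
        else:
            # No rule matched here: keep the character and advance by one.
            out.append(text[pos])
            pos += 1
    return "".join(out)


print(scan_replace("abc", [("ab", "AB"), ("bc", "BC")]))  # ABc
print(scan_replace("abc", [("bc", "BC"), ("ab", "AB")]))  # ABc
```

Both rule orders give "ABc" in this model: position decides first, and
the order of the rules only matters when two rules match at the same
position.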




Here are two more examples:


In[29]:= StringReplace["abc", {"ab" -> "ab", "bc" -> "bc",
            "a" -> "1", "b" -> "2", "c" -> "3"}]
Out[29]= "ab3"

Here rule "a" -> "1" is masked by pattern "ab", which matches first.
"bc" and "b" cannot match, as neither substring "bc" nor "b" is part of
the substring "c" left over from "abc" after the substitution "ab" -> "ab".



In[31]:= StringReplace["abc", {"a" -> "1", "ab" -> "ab",
            "bc" -> "bc", "b" -> "2", "c" -> "3"}]
Out[31]= "1bc"

Here rule "a" -> "1" matches first, and the substitution takes place.
For the rest of the string, "bc" from "abc", the patterns "a" and "ab"
will never match; "bc" -> "bc" matches first,
and so masks both rule "b" -> "2" and rule "c" -> "3".
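
To see the masking in these two examples, a Python model of the scan can
record which rule fires at which position. Again this is only a sketch
of the procedure described in Help, not Mathematica code; the name
scan_trace is hypothetical, and only literal, non-empty patterns are
assumed.

```python
def scan_trace(text, rules):
    """Return (result, fired); fired lists (position, pattern) pairs."""
    out, fired, pos = [], [], 0
    while pos < len(text):
        for pattern, replacement in rules:
            # Assumption: patterns are literal, non-empty strings.
            if pattern and text.startswith(pattern, pos):
                fired.append((pos + 1, pattern))  # positions counted from 1
                out.append(replacement)
                pos += len(pattern)  # skip past the replaced substring
                break
        else:
            # No rule applies at this position: copy the character.
            out.append(text[pos])
            pos += 1
    return "".join(out), fired


rules_29 = [("ab", "ab"), ("bc", "bc"), ("a", "1"), ("b", "2"), ("c", "3")]
rules_31 = [("a", "1"), ("ab", "ab"), ("bc", "bc"), ("b", "2"), ("c", "3")]

print(scan_trace("abc", rules_29))  # ('ab3', [(1, 'ab'), (3, 'c')])
print(scan_trace("abc", rules_31))  # ('1bc', [(1, 'a'), (2, 'bc')])
```

In the first case "ab" fires at position 1 and only "c" is left for the
remaining rules; in the second, "a" fires at position 1 and "bc" then
fires at position 2, so "b" and "c" are never tried on their own.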




--
Hartmut Wolf

