Pattern match question
- To: mathgroup at smc.vnet.net
- Subject: [mg68385] Pattern match question
- From: glassymeow at yahoo.com
- Date: Thu, 3 Aug 2006 06:07:13 -0400 (EDT)
- Sender: owner-wri-mathgroup at wolfram.com
hi
txt = "ZACCZBNRCSAACXBXX";
letters = "ABC";
I want to find every span of txt that contains all the letters of the
set "ABC", in any of their six possible orders, searching globally from
the left and without overlap. The characters between the letters do not
matter.
For the txt string above the result must be:
Out[]:=
ACCZB
CSAACXB
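I would like a solution using Mathematica regular expressions.
The regex pattern (A|B|C).*?(A|B|C).*?(A|B|C) gives instead:
ACC , BNRCSA , ACXB because each group may match any of the three
letters, so it accepts repeated letters (permutations with repetition)
rather than requiring the three distinct letters (combinations).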
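For reference, the attempt above can be reproduced with StringCases, and a
variant with backreferences and negative lookaheads shows one possible way to
rule out repeated letters. This is only a sketch, not a definitive answer: the
names attempt and distinct are hypothetical, and it assumes the PCRE-style
lookahead (?!...) and backreference \\1 syntax accepted by RegularExpression.

```mathematica
txt = "ZACCZBNRCSAACXBXX";

(* the pattern from the question: every group may match any of A, B, C *)
attempt = RegularExpression["(A|B|C).*?(A|B|C).*?(A|B|C)"];
StringCases[txt, attempt]
(* gives {"ACC", "BNRCSA", "ACXB"} *)

(* sketch: the second letter must differ from the first, *)
(* and the third letter must differ from both earlier ones *)
distinct = RegularExpression[
   "([ABC]).*?(?!\\1)([ABC]).*?(?!\\1|\\2)([ABC])"];
StringCases[txt, distinct]
(* gives {"ACCZB", "CSAACXB"} *)
```

StringCases returns non-overlapping matches by default, so no extra option
should be needed for the "without overlap" requirement. The pattern is written
out by hand for three letters; a longer letter set would need one more group
and lookahead per letter.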
The following old-fashioned program, which emulates the human
pencil-and-paper method, solves the problem, but I am sure there are
better solutions.
txt = "ZACCZBNRCSAACXBXX";
letters = "ABC";
ptrnLtrs = "";
(* make a string of 26 zeros, one for each letter of the alphabet *)
(* this assumes txt contains only the capital letters A-Z *)
For[i = 1, i <= 26, ptrnLtrs = StringJoin[ptrnLtrs, "0"]; i++]
(* replace the zero at the position of every pattern letter with a 1 *)
For[i = 1, i <= StringLength[letters],
  num = ToCharacterCode[StringTake[letters, {i, i}]];
  num = num - 64;
  ptrnLtrs = StringReplacePart[ptrnLtrs, "1", Flatten[{num, num}]];
  i++];
(* the procedural pattern match *)
ptrnLtrsBak = ptrnLtrs; y = 0; (* backup of ptrnLtrs *)
beginFlag = 0; result = ""; lst = {};
For[i = 1, i <= StringLength[txt],
  OneLetter = StringTake[txt, {i, i}];
  (* outside a span, skip characters that are not pattern letters *)
  If[beginFlag == 0 && StringCases[letters, OneLetter] == {},
    Goto[jmp]];
  num = ToCharacterCode[OneLetter] - 64;
  (* every character inside a span is kept *)
  result = StringJoin[result, OneLetter];
  (* a pattern letter seen for the first time clears its 1 *)
  If[StringTake[ptrnLtrs, num] == "1",
    ptrnLtrs = StringReplacePart[ptrnLtrs, "0", Flatten[{num, num}]]];
  beginFlag = 1;
  (* all pattern letters seen: print the span and start over *)
  If[ToExpression[ptrnLtrs] == 0,
    ptrnLtrs = ptrnLtrsBak;
    Print[result];
    result = ""; beginFlag = 0];
  Label[jmp];
  i++]
Out[]:=
ACCZB
CSAACXB
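The same left-to-right scan can also be written more compactly by tracking the
list of letters still missing instead of a 26-character string of flags. This
is a minimal sketch of that idea, not a replacement for a regex solution; the
names needed, span and found are hypothetical and not part of the program
above.

```mathematica
txt = "ZACCZBNRCSAACXBXX";
letters = "ABC";
needed = Characters[letters];  (* pattern letters not yet seen in the span *)
span = "";                     (* characters collected for the current span *)
found = {};                    (* completed spans *)
Scan[
  Function[ch,
    (* a span opens at a pattern letter and then takes every character *)
    If[span =!= "" || MemberQ[needed, ch],
      span = span <> ch;
      needed = DeleteCases[needed, ch];
      (* all pattern letters seen: store the span and start over *)
      If[needed === {},
        AppendTo[found, span];
        span = ""; needed = Characters[letters]]]],
  Characters[txt]];
found
(* gives {"ACCZB", "CSAACXB"} *)
```

Unlike the flag string, this version does not depend on character codes, so it
should also work for letter sets outside A-Z.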
regards
peter glassy