MathGroup Archive 1998

String patterns




Programming challenge:

Is there an elegant way to do cryptanalysis in Mathematica, as opposed
to some other language?  I am mainly thinking of pattern-matching
functions.  In this case the patterns would be dynamic, not
predefined, and I am not certain how to create and test patterns on
the fly.

The primary task is to count letter, digraph, trigraph, and higher-order
frequencies.

Output for the trigraph case might look like this:

THE    0.01350000
AND    0.00709421
ION    0.00559429
ING    0.00510783
TIO    0.00466191
ENT    0.00458083
RES    0.00417545
   <...etc....>
BEP    0.00004054


The real number is the fraction of all trigraphs in the sample that
are the given trigraph.  These values were computed by a DOS utility
on a particular sample text.  The trigraph "THE" occurred 333 times
out of 24668 total trigraph sequences, giving an estimated probability
for this trigraph of 333/24668, or approximately 0.0135.
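
To make the arithmetic concrete, here is a minimal sketch in Python
(not Mathematica) of turning raw counts into the fractions shown in
the table.  The function names and the sample counts are made up for
illustration; they are not the DOS utility's data.

```python
def relative_frequencies(counts):
    """Divide each count by the total so the fractions sum to 1."""
    total = sum(counts.values())
    return {gram: n / total for gram, n in counts.items()}


def format_table(freqs):
    """One line per n-gram, most frequent first, 8 decimal places."""
    ordered = sorted(freqs.items(), key=lambda item: item[1], reverse=True)
    return [f"{gram}    {p:.8f}" for gram, p in ordered]


# Hypothetical counts, chosen only to keep the example small.
sample_counts = {"THE": 3, "AND": 1}
table = format_table(relative_frequencies(sample_counts))
```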

Trigraphs overlap.  If I parse the following phrase,

     "I love Mathematica"

then the first trigraph is "I l" (spaces count), the second is " lo",
and the third is "lov".
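
The overlapping parse described above amounts to sliding a window of
width n along the text one character at a time.  A rough sketch in
Python (not Mathematica), with a hypothetical function name:

```python
def ngrams(text, n):
    """Return every overlapping run of n characters, spaces included."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


trigraphs = ngrams("I love Mathematica", 3)
first_three = trigraphs[:3]   # "I l", " lo", "lov"
```

A text of k characters yields k - n + 1 such sequences, so the
18-character phrase gives 16 trigraphs.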

One must define an "alphabet" with a sorting order.  A good way to do
this is with a string variable like this:

     "abcdefghijklmno..."
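
As an illustration of how such an alphabet string can define the
sorting order, here is a short Python sketch (not Mathematica).  The
alphabet below, with the space sorted first, is an assumption made up
for the example, since the string above is elided; characters outside
the alphabet are arbitrarily placed last.

```python
# Hypothetical alphabet: space first, then lowercase letters.
ALPHABET = " abcdefghijklmnopqrstuvwxyz"


def alphabet_key(gram, alphabet=ALPHABET):
    """Map each character to its position in the alphabet string."""
    rank = {ch: i for i, ch in enumerate(alphabet)}
    # Unknown characters get a rank past the end, so they sort last.
    return [rank.get(ch, len(alphabet)) for ch in gram]


ordered = sorted(["lov", " lo", "abc"], key=alphabet_key)
```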

How good is Mathematica at this kind of string manipulation and
searching?

Mark Evans
evans@gte.net
