newbie is looking for a customDistribution function

• To: mathgroup at smc.vnet.net
• Subject: [mg50383] newbie is looking for a customDistribution function
• From: János <janos.lobb at yale.edu>
• Date: Wed, 1 Sep 2004 01:49:22 -0400 (EDT)
• Sender: owner-wri-mathgroup at wolfram.com

```Hi,

I looked for it in the archives, but found none.  I am looking for ways
to create a custom distribution, which I can call as a function.  Here
is an example for illustration.  Let's say I have a list created from a
four-element alphabet {a,b,c,d}:

In[1]:=
lst={a,a,b,c,a,d,a,c,c,a}

Out[1]=
{a,a,b,c,a,d,a,c,c,a}

Distribute gives me (thanks, David Park) all the ordered two-element
combinations of {a,b,c,d}:

In[11]:=
twocombs=Distribute[Table[{a,b,c,d},{2}],List]

Out[11]=
{{a,a},{a,b},{a,c},{a,d},{b,a},{b,b},{b,c},{b,d},{c,a},{c,b},{c,c},{c,d},
{d,a},{d,b},{d,c},{d,d}}

I can count the occurrences of an element of twocombs in lst with the
following function:

occuranceCount[x_List] := Count[Partition[lst, 2, 1], x]

Mapping this function over twocombs gives me the number of occurrences
of each element of twocombs in lst:

In[12]:=
distro=Map[occuranceCount,twocombs]

Out[12]=
{1,1,1,1,0,0,1,0,2,0,1,0,1,0,0,0}

It shows that, for example, {c,a} occurs twice, {d,a} occurs once, and
{d,c} and {d,d} never occur.

Now I would like to create a distribution function called
twocombsLstDistribution which, when called, returns an element of
twocombs with probability proportional to its count in distro. That is,
on average I would get twice as many {c,a}s as {d,a}s and never get
{d,c} or {d,d}.

How can I craft that?
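
(* A minimal sketch of one possible approach, not a definitive answer.
   It assumes Mathematica 6 or later, where RandomChoice accepts
   weights -> items. The weights need not be normalised, and pairs with
   weight 0 are never returned. *)
twocombsLstDistribution[] := RandomChoice[distro -> twocombs]

(* Example use: draw 1000 pairs and tally them. {c,a} should appear
   roughly twice as often as {d,a}. Tally also assumes version 6 or
   later. *)
Tally[Table[twocombsLstDistribution[], {1000}]]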

/Of course I need it for an arbitrary but finite-length list lst over
a fixed-size alphabet {a,b,c,d,....} for k-length elements of kcombs,
and it has to be super fast  :).  My real lst is between 30,000 and
70,000 elements long over a four-element alphabet, and I am looking for
k between 5 and a few hundred. /
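
(* A minimal sketch for general k, assuming Mathematica 6 or later for
   RandomInteger. Enumerating all 4^k combinations is not feasible for k
   in the hundreds, but it is also not needed: choosing a window start
   uniformly at random returns each k-length sublist with probability
   proportional to its count in Partition[lst, k, 1].
   kcombsLstDistribution is a hypothetical name used for illustration. *)
kcombsLstDistribution[k_Integer?Positive] /; k <= Length[lst] :=
  With[{i = RandomInteger[{1, Length[lst] - k + 1}]},
    Take[lst, {i, i + k - 1}]]

(* Each call costs one random integer plus a copy of k elements.
   With k = 2 this samples the same distribution as distro above. *)
kcombsLstDistribution[2]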