MathGroup Archive: September 2004 [00081]


Re: Re: newbie is looking for a customDistribution function

To: mathgroup at smc.vnet.net
Subject: [mg50452] Re: [mg50435] Re: newbie is looking for a customDistribution function
From: DrBob <drbob at bigfoot.com>
Date: Sat, 4 Sep 2004 01:43:33 -0400 (EDT)
References: <ch3o86$t96$1@smc.vnet.net> <ch6nlk$2d5$1@smc.vnet.net> <200409030736.DAA15603@smc.vnet.net>
Reply-to: drbob at bigfoot.com
Sender: owner-wri-mathgroup at wolfram.com

>> However, if you really want to use the distribution instead of the data
>> that gave rise to it then you should look into the "Alias Method" of
>> generating random observations from an arbitrary discrete distribution.

How would we look into that? Is it covered in the Mathematica help files, in math books, or somewhere else?

Bobby

On Fri, 3 Sep 2004 03:36:14 -0400 (EDT), Ray Koopman <koopman at sfu.ca> wrote:

> koopman at sfu.ca (Ray Koopman) wrote in message
> news:<ch6nlk$2d5$1 at smc.vnet.net>...
>> János <janos.lobb at yale.edu> wrote in message news:<ch3o86$t96$1 at smc.vnet.net>...
>> [...]
>>> Now, I would like to create a distribution function called
>>> twocombsLstDistribution which I could call and it would give me back
>>> elements of twocombs with the probabilities with which they occur in
>>> distro; that is, on average I would get twice as many {c,a}s as {d,a}s
>>> and never get {d,c} or {d,d}.
>>>
>>> How can I craft that ?
>>>
>>> /Of course I need it for an arbitrary but finite length string lst over
>>> a fixed length alphabet {a,b,c,d,....} for k-length elements of kcombs,
>>> and it has to be super fast :).  My real lst is between 30,000 and
>>> 70,000 elements long over a four-element alphabet and I am looking for k
>>> between 5 and a few hundred. /
>>
>> For a 4-element alphabet, kcombs will have 4^k terms.
>> If k = "a few hundred", kcombs will be too big.
>> Why not just sort and count the k-sequences in the data?
>>
>> In[1]:= data = Table[Random[Integer,{1,4}],{100}]
>>
>> Out[1]= {2,4,3,3,3,4,3,2,3,3,1,3,2,2,4,1,4,4,4,1,2,3,3,4,1,
>>          2,1,4,1,1,2,2,4,3,3,1,2,4,2,3,4,2,2,2,3,4,3,4,3,2,
>>          2,3,3,3,1,3,3,1,3,1,1,1,1,4,2,2,3,4,2,4,3,4,3,1,4,
>>          4,3,4,4,1,3,2,1,2,4,2,4,1,1,2,3,2,4,3,1,4,3,4,4,1}
>>
>> In[2]:= With[{k = 3}, Reverse /@ Reverse@Sort@Map[{Length[#],#[[1]]}&,
>>                       Split@Sort[FromDigits/@Partition[data,k,1]]]]
>>
>> Out[2]= {{434, 4}, {343, 4}, {331, 4}, {243, 4}, {441, 3}, {313, 3},
>>          {234, 3}, {233, 3}, {223, 3}, {433, 2}, {432, 2}, {431, 2},
>>          {424, 2}, {422, 2}, {412, 2}, {411, 2}, {344, 2}, {342, 2},
>>          {334, 2}, {333, 2}, {322, 2}, {314, 2}, {242, 2}, {241, 2},
>>          {224, 2}, {144, 2}, {132, 2}, {124, 2}, {123, 2}, {112, 2},
>>          {111, 2}, {444, 1}, {443, 1}, {423, 1}, {414, 1}, {413, 1},
>>          {341, 1}, {324, 1}, {323, 1}, {321, 1}, {312, 1}, {311, 1},
>>          {232, 1}, {222, 1}, {214, 1}, {212, 1}, {143, 1}, {142, 1},
>>          {141, 1}, {133, 1}, {131, 1}, {122, 1}, {121, 1}, {114, 1}}
>
> Having read the other replies, I see that I missed your question,
> which is how to generate a random observation from the distribution
> of k-tuples in the observed data. By far the easiest way is to take
> a random k-tuple from the original data:
>
>           Take[data,{1,k}+Random[Integer,Length@data-k]]
>
> However, if you really want to use the distribution instead of the data
> that gave rise to it then you should look into the "Alias Method" of
> generating random observations from an arbitrary discrete distribution.
>
>
>
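For anyone following up on the pointer in the quoted message: the alias method (due to Walker, later simplified by Vose) preprocesses a discrete distribution into a table of n equal-width columns, each holding at most two outcomes. The following is only a minimal sketch, not code from this thread. The names aliasSetup and aliasDraw are made up for illustration, and the weights are assumed to be non-negative with at least one positive entry (the counts in Out[2] could serve as weights). It uses Random to match the code quoted above.

```mathematica
(* Sketch only: aliasSetup and aliasDraw are hypothetical names, not built-ins. *)

(* Build the alias table for a list of non-negative weights (not all zero).
   Returns {q, alias}: column i yields outcome i with probability q[[i]]
   and outcome alias[[i]] otherwise. *)
aliasSetup[w_List] := Module[{n = Length[w], q, alias, small, large, s, l},
  q = N[n w/Total[w]];                       (* scale so the mean column height is 1 *)
  alias = Range[n];
  small = Select[Range[n], q[[#]] < 1 &];    (* columns that need topping up *)
  large = Select[Range[n], q[[#]] >= 1 &];   (* columns with mass to give away *)
  While[small =!= {} && large =!= {},
    s = First[small]; small = Rest[small];
    l = First[large];
    alias[[s]] = l;                          (* the rest of column s goes to outcome l *)
    q[[l]] -= 1 - q[[s]];                    (* outcome l has donated that much mass *)
    If[q[[l]] < 1, large = Rest[large]; AppendTo[small, l]]
  ];
  Scan[(q[[#]] = 1.) &, Join[small, large]]; (* leftovers are full columns, up to rounding *)
  {q, alias}]

(* Draw one outcome index: pick a column uniformly, then choose between
   the column's own outcome and its alias. *)
aliasDraw[{q_List, alias_List}] := Module[{i = Random[Integer, {1, Length[q]}]},
  If[Random[] < q[[i]], i, alias[[i]]]]

(* Usage with weights 2 : 1 : 0, echoing the {c,a} / {d,a} / {d,c} example *)
outcomes = {{c, a}, {d, a}, {d, c}};
table = aliasSetup[{2, 1, 0}];
Table[outcomes[[aliasDraw[table]]], {10}]
```

Once the table is built, each draw needs one random integer, one random real and one comparison, no matter how many outcomes there are. Whether that beats simply taking a random k-tuple from the data, as suggested in the quoted message, depends on how many distinct k-tuples occur.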



-- 
DrBob at bigfoot.com
www.eclecticdreams.net
