Re: RandomReal[] and sampling a distribution

• To: mathgroup at smc.vnet.net
• Subject: [mg110574] Re: RandomReal[] and sampling a distribution
• From: Peter Pein <petsie at dordos.net>
• Date: Sat, 26 Jun 2010 03:10:46 -0400 (EDT)
• References: <i023oo\$r8a\$1@smc.vnet.net>

```Am Fri, 25 Jun 2010 11:26:16 +0000 (UTC)
schrieb John Stone <stone at geology.washington.edu>:

> I hope this is a simple question (perhaps too simple for the
> archives, which I did search).  Thanks in advance for any help.
>
> I am trying to use RandomReal[ ] to sample from bins of different
> widths that span the interval 0 - 1.  The bin widths represent the
> weights I'm assigning to a family of trial solutions in an
> optimization problem.  The aim is to sample the solutions in
> proportion to their weights using a uniform distribution of random
> numbers generated by RandomReal[ ].
>
> For a simple example, however, suppose there are 10 equally weighted
> solutions.  My selection process would use some code that looks like:
>
> weights = Table[0.1, {10}];
> bins = Accumulate[weights];
> Select[bins, (# >= RandomReal[] &)][[1]]
>
> Assuming the result of RandomReal[ ] is uniformly distributed, I
> expected this to return 0.1 as frequently as it returns 0.5 or 1, but
> it seems to return a distribution of values peaked around 0.5 (and
> seldom returns 0.9 or 1).
>
> Graphically, the following I think is equivalent, and gives me a
> peaked, not a flat distribution:
>
> Histogram[
>   Table[
>    Select[{0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1},
>      (# >= RandomReal[ ] &)][[1]],
>    {1000}]
> ]
>
> Shouldn't the procedure in my example generate a flat distribution?
> Or am I missing something really simple?
>
> Thanks again if you can help me solve this.

it is the call to Select[] which has a not too obvious trap:
Select takes the list bins and compares the first element (0.1) with a
random number (say 0.123). So 0.1 will not be in the result of Select[].
Next it compares 0.2 from the bins-list with _another_ random number
(say 0.3141). =.2 should have been in the result if the random value
were the same.

If you want to do it this way (Select[] is slow, compared to the
following suggestion), try:
(tmp = RandomReal[]; Select[bins, (# >= tmp &)][[1]]).

A simple test shows that this method needs ~6 seconds on my laptop for
10^5 samples:

With[{samples=100000},
BinCounts[Table[tmp=RandomReal[];Select[bins,(#>=tmp&)][[1]],{samples}],{bins}]/samples]//N//Timing

{6.59,{0.10073,0.10007,0.09997,0.09944,0.09995,0.10008,0.09964,0.09939,0.10052}}

Using built in RandomChoice[] is about 3 times faster and has less
program text:

With[{samples=100000},
BinCounts[RandomChoice[weights->bins,samples],{bins}]/samples]//N//Timing

{2.1,{0.1009,0.10192,0.09936,0.10089,0.09974,0.09733,0.09881,0.09986,0.10099}}

Hope that helps,
Peter

```

• Prev by Date: Re: Lauching application from an icon
• Next by Date: Re: Display question
• Previous by thread: RandomReal[] and sampling a distribution
• Next by thread: Display question