Re: Re: Random sampling of an arbitrary distribution
- To: mathgroup at smc.vnet.net
- Subject: [mg59950] Re: [mg59925] Re: Random sampling of an arbitrary distribution
- From: "Owen, HL \(Hywel\)" <h.l.owen at dl.ac.uk>
- Date: Fri, 26 Aug 2005 04:53:41 -0400 (EDT)
- Sender: owner-wri-mathgroup at wolfram.com
Hi Bill, Thanks for the reply. First off, I should point out that I am a physicist (i.e. small maths brain!) rather than a mathematician (big maths brain). I guess it shows! ;) As you can see, I'm a beginner in the areas of random sampling, and am trying to get to grips with pseudo and quasi-random sequences, and am reading up on low-discrepancy sequences at the moment. My particular interest is in looking at how the choice of a sequence type affects a simulation - I have a simulation which physically has a modulation as below, although I chose the particular function as being a plausible one rather it 'having' to be that one. But it appears to me at the moment that it integrates ok (see below): > > >ModulatedGaussian[x_, mu_, sigma_, modamp_, modlambda_, > modphase_] := > >(1 + modamp*Sin[modphase + (2*Pi*x)/modlambda])/(E^((-mu + > >x)^2/(2*sigma^2))*Sqrt[2*Pi]*sigma) > > Are you aware the expression above cannot be a density > function? An expression for a denisty function must integrate > to 1 over its domain. To see this expression does not satisfy > this in general set modphase to 0 and modlambda to 1. Now > note, the expression can be written as the sum of the pdf for > a normal distribution and a Sin[2 Pi x] times the pdf for a > normal distribution. The latter term is the expected value of > a Sin[2 Pi x] when x has a normal distribution. > > Using mathStatica, > > f = (1/(\[Sigma]*Sqrt[2*Pi]))*Exp[-((x - \[Mu])^2/ > (2*\[Sigma]^2))]; > domain[f] = {x, -Infinity, Infinity} && > {\[Mu] \[Element] Reals, \[Sigma] > 0}; > > Expect[a*Sin[2*Pi*x], f] > > (a*Sin[2*Pi*\[Mu]])/ E^(2*Pi^2*\[Sigma]^2) > > That is the integral over the domain does not evaluate to 1. That's weird, because when I integrate it: Assuming[x $B":(J Reals && mu > 0 && sigma > 0 && modamp > 0 && modlambda > 0 && modphase $B":(J Reals, Integrate[ ModulatedGaussian[x, mu, sigma, modamp, modlambda, modphase], {x, - Infinity, Infinity}]] I get 1, e.g. 1 + (modamp*Sin[modphase + (2*mu*Pi)/modlambda])/E^(( 2*Pi^2*sigma^2)/modlambda^2) /. {modlambda -> 1, modphase -> 0, mu -> 1, sigma -> 1} I accept that you're right, but I don't see why! > Hence the usual methods for generating pseudo random numbers > cannot be relied on in this instance. > > >My original method is: > > >MonteCarloSample[distfn_, distmin_, distmax_] := Module[{a, b}, > >While[(a = Random[Real, {distmin, distmax}]; b = Random[]; b > > >distfn[a])]; a] > > >MonteCarloEnsemble[numberofsamples_, > >distfn_, distmin_, distmax_] := > Table[MonteCarloSample[distfn, distmin, > >distmax], {numberofsamples}] > > >pts1 = MonteCarloEnsemble[10^4, ModulatedGaussian[#, 0.0, 1.0, 1.0, > >0.6, 0] &, -5, 5]; > > Normally, when someone speaks of getting random numbers from > a specified density function they mean the the probability of > getting a number less than a specified value is equal to the > cumulative distribution function evaluated at that specified > value. What you are doing above is distinctly different. You > are selecting random points conditioned on having the value > of a expression (intended to be the density function?) at > that point less than the random another random point. This is > not equivalent to generating random numbers with a specified > density function. Yes, but I thought that since the sampled points match the density function that it was a valid method. Is there a bias or something that isn't visible? > > If all you need is to sample an expression at random points > over its domain then simply doing > > f[Random[Real,{a,b}] will do when a and b are finite. No, that's not what I'm trying to do. I'm trying to create a set of sample points whose density varies according to the density function I'm using. If f is the inverse CDF then your method works - whether f is analytic or an interpolation. I've tried it for the modulated Gaussian case and it appears to work for me, at least when using an Interpolation for the inverse CDF. > OTOH, if you do want random numbers from a specified density > function it is possible to achieve this when the cumulative > distribution function cannot be easily evaluated. But rather > than attempt to explain the possibilities here I stongly > suggest you consult established texts. An excellent starting > point is Knuth's Seminumerical Algorithms Vol 2. Yes, I have it front of me now! My sampling method above appears to me to be a form of von Neumann's rejection method, so I can't see why it isn't valid statistically. To look at the difference between psuedo- and quasi- sequences, I am also trying: QuasiRandomMonteCarloEnsemble[numberofsamples_,distfn_,distmin_,distmax_]:= Module[{a,b,samples={},sequencenumber}, For[sequencenumber=1,Length[samples]<numberofsamples,sequencenumber++, {a,b}=Halton[sequencenumber,2]; If[b>distfn[distmin+(distmax-distmin)a],samples, AppendTo[samples,distmin+(distmax-distmin)a]]; ]; samples] where Halton[n_, s_] := vanderCorput[n, Prime[Range[s]]]; and vanderCorput[n_, b_] := IntegerDigits[n,b].(b^Range[-Floor[Log[b,n]+1],-1]); Attributes[vanderCorput] = Listable; I think this produces values which are distributed in x with more uniformity (lower discrepancy) than the Psuedo case, but I'm happy to be shown that it's wrong by a better example! ;) Of course, I can then replace the sequence with another one, e.g. Faure etc. > > A couple of final thoughts. Creating a psuedo random number > generator with the desired properties is anything but a > trivial task. It is extremely easy to create a very bad > generator. Bad meaning the numbers do generated have few if > any of the desired properties. Knuth discusses several > examples of bad generators. Absolutely! It's a good book. Hywel