computing PDF of transformations of random variables

*To*: mathgroup at smc.vnet.net*Subject*: [mg120165] computing PDF of transformations of random variables*From*: Andrzej Kozlowski <akoz at mimuw.edu.pl>*Date*: Mon, 11 Jul 2011 06:56:26 -0400 (EDT)

I have been looking carefully at the new functions related to probability introduced in version 8. On the whole, I find them a huge enhancement of Mathematica's functionality and a real time saver in many tedious transformations that I have been doing more or less "by hand" until now. However, there are still some "algorithmic" things that are fairly easy to do by hand and yet I can't find a way to do them using the new functionality (I can write my own programs that do what I want and have done so but I am trying to find out if it can be done more simply using these new functions). Well, one such thing that Mathematica seems unable to do (unless I am missing something, which at this point is quite likely) is to find explicitly the pdf of a one to one transformation of a n-dimensional random vector of n-variate random variables, with known density functions. Yet it is well known that the density function is obtained essentially by substitution and multiplication by the jacobian of the transformation. This should be easy for Mathematica to do automatically but it appears to be unable to do so. Here is a classical example. We start by transforming a 2-dimensional random variable of the following kind: D1 = TransformedDistribution[{r*Sin[\[Theta]], r*Cos[\[Theta]]}, {r \[Distributed] RayleighDistribution[1], \[Theta] \[Distributed] UniformDistribution[{0, 2*Pi}]}]; In other words, r and [\[Theta] are independent random variables, with r having PDF[RayleighDistribution[1], x] Piecewise[{{x/E^(x^2/2), x > 0}}, 0] The joint distribution is also easily calculated: PDF[ ProductDistribution[RayleighDistribution[1], UniformDistribution[{0, 2*Pi}]], {x, y}] Piecewise[{{x/(E^(x^2/2)*(2*Pi)), 0 <= y <= 2*Pi && x > 0}}, 0] So, since the jacobian of the transformation = u=r*Sin[\[Theta]],v=r*Cos[\[Theta]] is r, it is easy to see that the joint pdf of u and v is 1/(2Pi) Exp[-(v^2+w^2)/2], for -Infinity<v<Infinity, -Infinity<w<Infinity. That ought to be the pdf of D1. Now, Mathematica can compute the expectation of D1: Expectation[z, z \[Distributed] D1] {0, 0} It can sample D1: RandomVariate[D1, {10}] {{-0.08105334473181992, -0.6021689928854447}, \ {-1.3727642558209814, -0.022855412336997796}, {0.2386588659762092, 1.2513810769441778}, {-0.5753925770868258, -0.22703300315646122}, {-0.9134787777750948, \ 0.33419824268914516}, {-0.8215179202829789, 0.5632699123625521}, {-0.29365791035701794, -0.617190246608638}, {-0.6465635906987145, 1.0430902012150944}, {1.3232633414020958, -1.3387876692780605}, {-0.09878643084100817, -0.9432106449090337}} It can produce a 3d histogram: SmoothHistogram3D[RandomVariate[D1, 10^4]] but it is unable to compute the PDF of D1: pdf = PDF[D1, {x, y}] PDF[ TransformedDistribution[{\[FormalX]1 Sin[\[FormalX]2], \[FormalX]1 \ Cos[\[FormalX]2]}, {\[FormalX]1 \[Distributed] RayleighDistribution[1], \[FormalX]2 \[Distributed] UniformDistribution[{0, 2 \[Pi]}]}], {x, y}] This took no time at all, which seems to mean that Mathematica never even tried to do anything here. But yet if I ask it to compute the CDF it seems to try: CDF[D1, {x, y}] $Aborted Unfortunately this seems to go on forever so I had to Abort. Why does Mathematica appear to try to compute this but not the PDF? Here is another gripe. When Mathematica returns an answer like the one for pdf above, i.e. pdf = PDF[D1, {x, y}] PDF[ TransformedDistribution[{\[FormalX]1 Sin[\[FormalX]2], \[FormalX]1 \ Cos[\[FormalX]2]}, {\[FormalX]1 \[Distributed] RayleighDistribution[1], \[FormalX]2 \[Distributed] UniformDistribution[{0, 2 \[Pi]}]}], {x, y}] it is unclear whether the output is in some sense usable or not. It seems that it is not, in which case it would be better to say so: e.g. "Mathematica could not find an explicit form of the density function" or something of that kind. There is no mention of any such problems in the "possible issues" section of the documentation. So, as it is, one is tempted to try something like this: Plot3D[pdf, {x, -3, 3}, {y, -3, 3}] // Timing This takes a fairly long time to produce an empty graph, so perhaps Mathematica does attempt to find the PDF after all? (Trying to plot an undefined function produces the same empty plot much faster). This illustrates an annoying property of many relatively new functions in Mathematica: often when the input is returned unevaluated one often has no idea if Mathematica failed to find an explicit form of the answer, or whether the input itself was incorrect. For example, in this particular case, evaluating the function with an incorrect number of variables will also return the input: In[267]:= pdf = PDF[D1, {x}] Out[267]= PDF[ TransformedDistribution[{\[FormalX]1*Sin[\[FormalX]2], \[FormalX]1* Cos[\[FormalX]2]}, {\[FormalX]1 \[Distributed] RayleighDistribution[1], \[FormalX]2 \[Distributed] UniformDistribution[{0, 2*Pi}]}], {x}] Andrzej Kozlowski