Re: defining averages over unknown PDF
- To: mathgroup at smc.vnet.net
- Subject: [mg131998] Re: defining averages over unknown PDF
- From: Itai Seggev <itais at wolfram.com>
- Date: Tue, 12 Nov 2013 23:37:49 -0500 (EST)
- References: <20131105041659.58DD96A05@smc.vnet.net>
I've done some digging into this. The normal way to define custom differentiation rules is by defining an UpValue for your function that involves Derivative, like

f /: Derivative[n_][f] := (-1)^n f (* f is secretly Exp[-x] *)

This way, all differentiation functions (of which there are D, Dt, Derivative, and several more, internal and user-visible) share equally in the rule. When you write your rule in terms of D and differentiate av[...] directly, D starts evaluating, the rule fires, and everything is fine. When av is inside another function, the chain rule code fires and says that

Derivative[1][f[av]] := Derivative[1][f][av[#]] * Derivative[1][av][#] &

Then this code looks up any rules for Derivative[1][av], finds none, and proceeds to use automatic differentiation, getting you back to your initial state.

I'm afraid I don't have a really great solution for you. D is really intended to differentiate functions, but av is an operator in disguise, which makes this quite difficult to work with. Getting a good operator calculus into the system is a problem I and others have thought about, but it is a long-term project which will not be solved in the short term. Sorry!

On Wed, Nov 06, 2013 at 12:33:43AM -0500, Sune Jespersen wrote:
> Hi Itai
>
> Thanks for your reply. Yes, you are of course right and I realized the same thing shortly after my post. In fact I implemented a solution quite similar to yours,
> In[1]:= av /: D[av[f___], x_] := av[D[f, x]]
> In[2]:= av[y_ + z_] := av[y] + av[z]
> In[3]:= av[c_ y_] := c av[y] /; FreeQ[c, x]
> In[4]:= av[c_] := c /; FreeQ[c, x]
> In[5]:= D[av[x y], x]
> Out[5]= y
> In[6]:= D[av[Exp[-x y]], x]
> Out[6]= -y av[E^(-x y)]
> But it seems that it still has the problem when it needs to apply the chain rule, i.e.
> In[9]:= D[Log[av[Exp[-b x]]], b]
> Out[9]= -((E^(-b x) x)/av[E^(-b x)])
> instead of
> -(av[(E^(-b x) x)]/av[E^(-b x)])
> This seems a bit strange to me, because somehow it must reach a point where it needs to evaluate a derivative, where my rule applies. Perhaps you can offer some insight on this?
>
> On 5 Nov, 2013, at 18:19 , Itai Seggev <itais at wolfram.com> wrote:
>
> > On Mon, Nov 04, 2013 at 11:16:59PM -0500, Sune wrote:
> >> Dear all.
> >>
> >> I want to do some symbolic manipulations of an expression involving averages over a stochastic variable with an unknown density. Therefore, I figured I could define a function av, which would correspond to the average over this unknown parameter density function.
> >> I did as follows:
> >> av[y_ + z_, x_] := av[y, x] + av[z, x]
> >> av[c_ y_, x_] := c av[y, x] /; FreeQ[c, x]
> >> av[c_, x_] := c /; FreeQ[c, x]
> >>
> >> So these are basic properties of the average over the distribution of X. Some things work okay, for example
> >> In[52]:= av[Exp[-x y], x]
> >> Out[52]= av[E^(-x y), x]
> >> and
> >> In[79]:= D[av[-x y, x], x]
> >> Out[79]= -y
> >> and
> >> In[80]:= D[av[-x y, x], y]
> >> Out[80]= -av[x, x].
> >>
> >> However, the most vital part for my calculations does not work:
> >> In[81]:= D[av[Exp[-x y], x], y]
> >> Out[81]= -E^(-x y) x
> >>
> >> It should have been av[-Exp[-x y] x,x].
> >>
> >> Any clues to what I'm doing wrong? I'm thinking that I need to specify some rules for differentiation, but I don't know how. But then I'm wondering how come it got the other expressions for differentiation right.
> >
> > Ahh, the subtle treacheries of partial differentiation. Note that by your
> > definition,
> >
> > In[71]:= av[Exp[-x y] + h, x] - av[Exp[-x y], x]
> >
> > Out[71]= h
> >
> > So that
> >
> > In[72]:= Limit[(av[Exp[-x y] + h, x] - av[Exp[-x y], x])/h, h -> 0]
> >
> > Out[72]= 1
> >
> > So both your "correct" and "incorrect" answers are consistent with the chain
> > rule and the above computation of partial derivatives. So why is D
> > computing the partial derivative in such a stupid way? Well, it isn't, at
> > least not directly. D correctly computes the partial derivative as
> >
> > f'[x] * Derivative[1, 0][av][f[x], x] + Derivative[0, 1][av][f[x],x]
> >
> > But now Derivative helpfully tries to compute these partials using pure functions,
> > and then your definitions kick in, giving 1 and 0 for the partials. In
> > particular, your third definition means av[#1,#2]& === #1, and you're doomed.
> >
> > So you want to abort the automatic differentiation rules with your own custom
> > rule, which you can do with the following syntax:
> >
> > av /: D[av[f_, x_], y_] /; x =!= y := av[D[f, y], x]
> >
> > In[65]:= D[av[Exp[-x y], x], y]
> >
> > Out[65]= -av[E^(-x y) x, x]
> >
> > --
> > Itai Seggev
> > Mathematica Algorithms R&D
> > 217-398-0700
> >

--
Itai Seggev
Mathematica Algorithms R&D
217-398-0700
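
For readers following the thread, the contrast between the top-level case and the chain-rule case can be reproduced in a short session. This is only a sketch assembled from definitions quoted in the thread (the two-argument av[expr, var] form and the custom D rule from the earlier reply). It is not a fix for the chain-rule problem, and the comment on the second input restates the explanation given in this message rather than a recorded output.

```mathematica
(* Sketch only: definitions copied from the thread, using av[expr, var] *)
ClearAll[av];

(* Linearity of the average over the stochastic variable x *)
av[y_ + z_, x_] := av[y, x] + av[z, x]
av[c_ y_, x_] := c av[y, x] /; FreeQ[c, x]
av[c_, x_] := c /; FreeQ[c, x]

(* Custom rule from the earlier reply: move D inside av when differentiating
   with respect to a variable other than the averaging variable *)
av /: D[av[f_, x_], y_] /; x =!= y := av[D[f, y], x]

(* av is the outermost function here, so the UpValue fires;
   the thread reports -av[E^(-x y) x, x] for this input *)
D[av[Exp[-x y], x], y]

(* av is nested inside Log, so D applies the chain rule and goes through
   Derivative[1, 0][av] and Derivative[0, 1][av], for which no rule exists;
   per the explanation in this message, the custom rule is not expected
   to fire for this input *)
D[Log[av[Exp[-x y], x]], y]
```

Comparing the two results side by side shows where the rule written in terms of D stops applying, which is the behaviour the reply at the top of this message describes.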
- References:
- defining averages over unknown PDF
- From: Sune <sunenj@gmail.com>