Re: Derivative of Dot[]

- *To*: mathgroup at smc.vnet.net
- *Subject*: [mg91102] Re: [mg91055] Derivative of Dot[]
- *From*: Andrzej Kozlowski <akoz at mimuw.edu.pl>
- *Date*: Wed, 6 Aug 2008 05:05:33 -0400 (EDT)
- *References*: <200808050759.DAA09628@smc.vnet.net>

On 5 Aug 2008, at 09:59, Eitan Grinspun wrote:

> I would like to compute the gradient F' of a scalar-valued function F
> of a vector-valued argument, without knowing a-priori the dimensions
> of the vector.
>
> I am having some trouble with a very simple case.
>
> Consider the following function:
>
> F[x_] := Dot[x,x]
>
> Evaluating this function works as expected: F[{1,2}] evaluates to 5.
>
> I differentiate this function w.r.t. its sole argument, F' evaluates
> to 1.#1+#1.1&
>
> This is reasonable, and as expected. I would think that, since the
> argument of Dot must be a list, the derivative of Dot would have been
> designed to return something that is useful. While I presume that this
> is the case, I have been unable to move ahead.
>
> I evaluate the derivative at {1,2}: F'[{1, 2}] returns 1.{1,2}+{1,2}.1
>
> The Dot is not defined for scalar arguments, and therefore Simplify
> does not reduce this further. I could of course program a rule so that
> Dot[1,x_]->x, but my intent here is to understand why the derivative
> of Dot was designed the way it was---presumably there is a reason, and
> there is a proper way to make use of the derivative.
>
> Once I have the derivative, I should be able to contract it with a
> (tangent) vector to obtain the (linearized) change the (scalar)
> function value:
>
> F'[{1, 2}].{3,4}
>
> Alas, this returns (1.{1,2}+{1,2}.1).{3,4} which does not simplify
> (even after Distribute) because Dot does not operate on scalar
> arguments.
>
> I'd like some help in understanding how to use Derivative with Dot (it
> was evidently designed to be used, or there would not be a rule built
> in).
>
> Sincerely,
>
> Eitan

The problem is that you are drawing a wrong conclusion (that the formula you get must have some meaning) from a correct premise (that Mathematica has a meaningful built-in rule for differentiating Dot). The built-in rule is:

    D[X[t] . Y[t], t]

    X[t] . Derivative[1][Y][t] + Derivative[1][X][t] . Y[t]

where of course X[t] and Y[t] should be vectors of the same dimension dependent on t. It is not valid for scalars. But what you are doing in effect is defining:

    X[t_] := t; Y[t_] = t;

and then getting

    D[X[t] . Y[t], t]

    1 . t + t . 1

which is meaningless, since X[t] and Y[t] cannot be scalars: the 1 that appears in each term is the derivative of a scalar t, not of a vector.

Andrzej Kozlowski
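For the gradient asked about in the original question, one possible approach (a sketch, not something proposed in the reply above) is to differentiate with respect to a list of symbolic components using the vector form of D. The names n, v, vars and grad below are made up for illustration, and the dimension has to be fixed before differentiating, so this does not address the "unknown dimension" part of the question.

```mathematica
(* Sketch only: n, v, vars and grad are hypothetical names. *)
F[x_] := Dot[x, x]

n = 2;                       (* dimension chosen for the example *)
vars = Array[v, n];          (* symbolic components {v[1], v[2]} *)

(* D[f, {list}] differentiates with respect to each element of list *)
grad = D[F[vars], {vars}]    (* {2 v[1], 2 v[2]} *)

(* evaluate the gradient at the point {1, 2} *)
gradAt = grad /. Thread[vars -> {1, 2}]   (* {2, 4} *)

(* contract with the tangent vector {3, 4} *)
gradAt . {3, 4}              (* 22 *)
```

Here every argument of Dot is a genuine vector, so the expression evaluates fully and no scalar 1 ends up inside Dot.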

**References**:
- **Derivative of Dot[]**
  - *From:* "Eitan Grinspun" <eitan@grinspun.com>