Re: Derivative of Dot[]

• To: mathgroup at smc.vnet.net
• Subject: [mg91106] Re: [mg91055] Derivative of Dot[]
• From: Andrzej Kozlowski <akoz at mimuw.edu.pl>
• Date: Wed, 6 Aug 2008 05:06:18 -0400 (EDT)
• References: <200808050759.DAA09628@smc.vnet.net> <44ED75D6-E949-487E-B1B4-911269BFBFE7@mimuw.edu.pl> <7e09277c0808051107s6d27fddcuccdf7a22b4d0eff1@mail.gmail.com>

```
I think the rule for D[X[t] . Y[t], t] is necessary, rather than
"useful". Compare these two

In[1]:= D[Dot[X[t] , Y[t]], t]
Out[1]= X[t] . Derivative[1][Y][t] + Derivative[1][X][t] . Y[t]

In[2]:= D[dot[X[t], Y[t]], t]
Out[2]= Derivative[1][Y][t]*Derivative[0, 1][dot][X[t],
Y[t]] + Derivative[1][X][t]*Derivative[1, 0][dot][
X[t], Y[t]]

If there were no built-in rule for D and Dot, then exactly the same
thing that happens for dot would happen for Dot, which would be
somewhat less convenient (you would need to define
Derivative[1, 0][Dot] and Derivative[0, 1][Dot] yourself).
Other than that, I do not know of any obvious application. Note,
however, that Derivative is always first converted to D[ ], so any
rules that are applied to Derivative and Dot are actually derived from
the rules for D and Dot. I don't see any direct use for them, and I
think they are basically a slightly unfortunate side effect.

Andrzej Kozlowski

On 5 Aug 2008, at 20:07, Eitan Grinspun wrote:

> I agree with what you write, and I am aware of this. But: can you show
> me one actually effective use of Derivative applied to Dot (anything
> of your choice). In particular, I mean one where the arguments are not
> explicitly labeled, i.e., using Derivative or prime, but not using D.
>
> I cannot find one useful example, the way it is set up.
>
> Sincerely,
>
> Eitan
>
> On Tue, Aug 5, 2008 at 11:53 AM, Andrzej Kozlowski
> <akoz at mimuw.edu.pl> wrote:
>>
>> On 5 Aug 2008, at 09:59, Eitan Grinspun wrote:
>>
>>> I would like to compute the gradient F' of a scalar-valued function F
>>> of a vector-valued argument, without knowing a priori the dimensions
>>> of the vector.
>>>
>>> I am having some trouble with a very simple case.
>>>
>>> Consider the following function:
>>>
>>> F[x_] := Dot[x,x]
>>>
>>> Evaluating this function works as expected: F[{1,2}] evaluates to 5.
>>>
>>> I differentiate this function w.r.t. its sole argument, F' evaluates
>>> to 1.#1+#1.1&
>>>
>>> This is reasonable, and as expected. I would think that, since the
>>> argument of Dot must be a list, the derivative of Dot would have been
>>> designed to return something that is useful. While I presume that this
>>> is the case, I have been unable to move ahead.
>>>
>>> I evaluate the derivative at {1,2}: F'[{1, 2}] returns
>>> 1.{1,2}+{1,2}.1
>>>
>>> Dot is not defined for scalar arguments, and therefore Simplify
>>> does not reduce this further. I could of course program a rule so that
>>> Dot[1,x_]->x, but my intent here is to understand why the derivative
>>> of Dot was designed the way it was---presumably there is a reason, and
>>> there is a proper way to make use of the derivative.
>>>
>>> Once I have the derivative, I should be able to contract it with a
>>> (tangent) vector to obtain the (linearized) change in the (scalar)
>>> function value:
>>>
>>> F'[{1, 2}].{3,4}
>>>
>>> Alas, this returns (1.{1,2}+{1,2}.1).{3,4} which does not simplify
>>> (even after Distribute) because Dot does not operate on scalar
>>> arguments.
>>>
>>> I'd like some help in understanding how to use Derivative with Dot
>>> (it was evidently designed to be used, or there would not be a rule
>>> built in).
>>>
>>> Sincerely,
>>>
>>> Eitan
>>>
>>
>>
>> The problem is that you are deriving a wrong conclusion (that the
>> formula you get must have some meaning) from a correct premise (that
>> Mathematica has some meaningful built-in rules for differentiating
>> Dot). The built-in rule is:
>>
>> D[X[t] . Y[t], t]
>> X[t] . Derivative[1][Y][t] + Derivative[1][X][t] . Y[t]
>>
>> where of course X[t] and Y[t] should be vectors of the same dimension
>> dependent on t. It is not valid for scalars. But what you are doing in
>> effect is defining:
>>
>> X[t_] := t; Y[t_] := t;
>>
>> and then getting
>>
>> D[X[t] . Y[t], t]
>> 1 . t + t . 1
>>
>> which is meaningless since X[t] and Y[t] cannot be scalars.
>>
>>
>> Andrzej Kozlowski
>>
>>

```
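For the concrete case discussed in this thread (F[x_] := Dot[x, x] evaluated at {1, 2} and contracted with {3, 4}), one possible workaround is to differentiate with D against an explicit list of components rather than applying Derivative to Dot. The following is only a minimal sketch, not something proposed in the messages above: the names n, x, vars, grad and gradAt are hypothetical, and the dimension n is fixed in advance, so this does not address the original wish to leave the dimension unspecified.

```mathematica
(* Assumption: the dimension is fixed at n = 2 purely for illustration. *)
n = 2;

(* Hypothetical symbolic components {x[1], x[2]} standing in for the vector. *)
vars = Array[x, n];

(* The function from the original question. *)
F[v_] := Dot[v, v]

(* D with a doubly nested variable list returns the gradient vector. *)
grad = D[F[vars], {vars}]                  (* {2 x[1], 2 x[2]} *)

(* Evaluate the gradient at the point {1, 2}. *)
gradAt = grad /. Thread[vars -> {1, 2}]    (* {2, 4} *)

(* Contract with the tangent vector {3, 4} to get the linearized change. *)
gradAt . {3, 4}                            (* 22 *)
```

Because F[vars] expands to an ordinary scalar expression before D is applied, the built-in rule for differentiating Dot is never invoked, and no terms of the form 1 . {1, 2} appear.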