MathGroup Archive 2011


Re: Findminimum too slow for iterative reweighted least squares

  • To: mathgroup at smc.vnet.net
  • Subject: [mg123576] Re: Findminimum too slow for iterative reweighted least squares
  • From: "Oleksandr Rasputinov" <oleksandr_rasputinov at hmamail.com>
  • Date: Sun, 11 Dec 2011 03:50:18 -0500 (EST)
  • Delivered-to: l-mathgroup@mail-archive0.wolfram.com
  • References: <201112091055.FAA03827@smc.vnet.net> <jbvjq3$j2f$1@smc.vnet.net>

On Sat, 10 Dec 2011 12:37:23 -0000, Oliver Ruebenkoenig  
<ruebenko at wolfram.com> wrote:

>
>
> On Fri, 9 Dec 2011, Alberto Maydeu-Olivares wrote:
>
>> Hi, I'm trying to use FindMinimum to minimize a weighted least-squares
>> function where the weight matrix needs to be updated at each
>> iteration. The weight matrix involves the inverse of a symbolic
>> matrix. This should be fast if FindMinimum first evaluates the
>> symbolic matrix numerically and then takes the inverse. Given how long
>> it takes, it is obviously not doing that. I wonder if anyone can
>> suggest a way to force FindMinimum to do so. I tried Evaluate, of
>> course, but it does not work. I attach a toy example so that you can
>> try it: finding the minimum takes a full second, and a realistic-size
>> job takes minutes when it should not take more than a couple of seconds.
>>
>> Thanks a lot for your help.
>>
>>
>> model = {l1^2 + ps, l2*l1, l2^2 + ps, l3*l2, l3*l1, l3^2 + ps};
>> theta = {l1, l2, l3, ps};
>> j = Outer[D, model, theta];
>> data = {2., .42, 3., .56, .48, 1.};
>> startval = Transpose[{theta, {.5, .5, .5, .5}}];
>> e = data - model;
>> mat = {{2 (l1^2 + ps)^2, 2 l1 l2 (l1^2 + ps), 2 l1^2 l2^2,
>>    2 l2 l3 (l1^2 + ps), 2 l1 l2^2 l3,
>>    2 l2^2 l3^2}, {2 l1 l2 (l1^2 + ps),
>>    ps (l2^2 + ps) + l1^2 (2 l2^2 + ps), 2 l1 l2 (l2^2 + ps),
>>    l1 l3 (l1^2 + l2^2 + ps), l2 l3 (l1^2 + l2^2 + ps),
>>    2 l1 l2 l3^2}, {2 l1^2 l2^2, 2 l1 l2 (l2^2 + ps), 2 (l2^2 + ps)^2,
>>     2 l1^2 l2 l3, 2 l1 l3 (l2^2 + ps),
>>    2 l1^2 l3^2}, {2 l2 l3 (l1^2 + ps), l1 l3 (l1^2 + l2^2 + ps),
>>    2 l1^2 l2 l3, l2^2 l3^2 + (l1^2 + ps) (l3^2 + ps),
>>    l1 l2 (2 l3^2 + ps), 2 l2 l3 (l3^2 + ps)}, {2 l1 l2^2 l3,
>>    l2 l3 (l1^2 + l2^2 + ps), 2 l1 l3 (l2^2 + ps),
>>    l1 l2 (2 l3^2 + ps), l1^2 l3^2 + (l2^2 + ps) (l3^2 + ps),
>>    2 l1 l3 (l3^2 + ps)}, {2 l2^2 l3^2, 2 l1 l2 l3^2, 2 l1^2 l3^2,
>>    2 l2 l3 (l3^2 + ps), 2 l1 l3 (l3^2 + ps), 2 (l3^2 + ps)^2}};
>>
>> (* brute-force approach, as a benchmark *)
>> w = Inverse[mat];
>> irls = FindMinimum[e.w.e, Evaluate[Sequence @@ startval],
>>    Gradient -> -2 e.w.j]; // Timing
>>
>> (* this should work; in fact, it takes almost twice as long *)
>> w := Inverse[mat];
>> irls = FindMinimum[Evaluate[e.w.e], Evaluate[Sequence @@ startval],
>>    Gradient -> Evaluate[-2 e.w.j]]; // Timing
>>
>>
>
> Alberto,
>
> try
>
> ewe = e.w.e;
> ewj = -2 e.w.j;
> irls = FindMinimum[ewe, Evaluate[Sequence @@ startval],
>      Gradient -> ewj]; // AbsoluteTiming
>
> which is somewhat faster. Using Method -> "Newton" gets rid of the
> messages but takes longer. Perhaps a Hessian would help. Or use, say,
> Method -> "ConjugateGradient", which is again faster but has messages.
> You could use ParallelTry with different methods.
>
> Oliver
>

Well-spotted; I had forgotten to include an expression for -2 e.w.j in my  
reply. However, simplifying w before using it still gives you the biggest  
gain in this example. I'll use compiled functions for e.w.e and -2 e.w.j  
as well:

AbsoluteTiming[
  w = Simplify@Inverse[mat]; (* simplify the symbolic inverse once, up front *)
  (* compile the objective and gradient into machine-precision functions of theta *)
  ewe = Compile[Evaluate@Thread[{theta, _Real, 0}], Evaluate[e.w.e]];
  ewj = Compile[Evaluate@Thread[{theta, _Real, 0}], Evaluate[-2 e.w.j]];
  irls = FindMinimum[ewe[Sequence @@ theta],
     Evaluate[Sequence @@ startval],
     Gradient -> ewj[Sequence @@ theta], Method -> "Newton"] //
    AbsoluteTiming
]

{1.1875000,
  {0.0468750,
   {0.174938, {l1 -> 0.455422, l2 -> 1.2666, l3 -> 0.34468, ps -> 1.34602}}}}

versus (un-simplified, un-compiled):

AbsoluteTiming[
  w = Inverse[mat];
  ewe = e.w.e;
  ewj = -2 e.w.j;
  irls = FindMinimum[ewe, Evaluate[Sequence @@ startval],
     Gradient -> ewj, Method -> "Newton"] // AbsoluteTiming
]

{2.8750000,
  {2.2812500,
   {0.174938, {l1 -> 0.455422, l2 -> 1.2666, l3 -> 0.34468, ps -> 1.34602}}}}

So simplifying w and compiling the expressions used in FindMinimum reduces
the time taken for the minimization to practically nothing (0.047 s versus
2.28 s here), while the simplification and compilation themselves take less
than half the time that FindMinimum alone would have done. Of course, if
higher than machine precision is needed to find the minimum, compilation
won't work, but simplification can still be useful. (Using compiled
functions produces messages when FindMinimum attempts to evaluate the
minimand and gradient symbolically, but these are harmless and can be
ignored, or suppressed with Quiet.)
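
In the higher-precision case, the simplified (but un-compiled) symbolic
forms can be fed to FindMinimum directly. A minimal sketch, assuming
WorkingPrecision -> 20; the exact (rational) start values are an assumption
too, chosen so that raising the working precision does not provoke
precision warnings:

w = Simplify@Inverse[mat];
ewe = e.w.e;
ewj = -2 e.w.j;
(* rational start values, so precision can be raised freely *)
startvalExact = Transpose[{theta, {1/2, 1/2, 1/2, 1/2}}];
FindMinimum[ewe, Evaluate[Sequence @@ startvalExact],
  Gradient -> ewj, Method -> "Newton", WorkingPrecision -> 20]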

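Oliver's ParallelTry suggestion could be sketched roughly as follows; the
method list is illustrative, and whether DistributeDefinitions is needed
depends on how the subkernels are set up:

(* race several methods in parallel; the first completed result wins.
   Assumes parallel kernels are available (LaunchKernels[] if not). *)
DistributeDefinitions[ewe, ewj, startval];
ParallelTry[
  FindMinimum[ewe, Evaluate[Sequence @@ startval],
    Gradient -> ewj, Method -> #] &,
  {"Newton", "QuasiNewton", "ConjugateGradient"}]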

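Finally, the behaviour Alberto originally asked for, evaluating the
symbolic matrix numerically first and only then taking the inverse, can be
forced by making the objective a black box that evaluates only for numeric
arguments. A minimal sketch, with the helper name irlsObj assumed; note
that FindMinimum must then estimate the gradient by finite differences:

irlsObj[v1_?NumericQ, v2_?NumericQ, v3_?NumericQ, v4_?NumericQ] :=
  Module[{rules, ev, wv},
   rules = Thread[theta -> {v1, v2, v3, v4}];
   ev = e /. rules;            (* numeric residual vector *)
   wv = Inverse[mat /. rules]; (* matrix is numeric before inversion *)
   ev.wv.ev];

FindMinimum[irlsObj[l1, l2, l3, ps], Evaluate[Sequence @@ startval]]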
