MathGroup Archive 2001


Re: Re: about ConstrainedMin

  • To: mathgroup at smc.vnet.net
  • Subject: [mg29888] Re: [mg29855] Re: [mg29806] about ConstrainedMin
  • From: Daniel Lichtblau <danl at wolfram.com>
  • Date: Tue, 17 Jul 2001 01:00:33 -0400 (EDT)
  • References: <200107140536.BAA18344@smc.vnet.net>
  • Sender: owner-wri-mathgroup at wolfram.com

qing.cheng at icos.be wrote:
> 
> Thank you, Mark.
> 
> But I would like to know what ConstrainedMin does internally, not only
> its usage.
> The problem arose from one of our applications: we need to measure the
> position of each pin in a leaded component (an electronic chip). Based
> on these individual points, we need to calculate a plane reflecting the
> physical plane on which the component can 'sit' stably, the so-called
> seating plane. One way we try to achieve this is to convert the problem
> to an LP problem, with the objective of minimizing the sum of distances
> between the measurement points and the plane.
> We implemented a simplex method based on the algorithm in Numerical
> Recipes to solve this linear problem, and found it did not work very
> well for ">=" type constraints. I brought the same problem to
> Mathematica and found that BasicSimplex failed in the same way as our C
> implementation, while ConstrainedMin found a good solution. We now
> transform the data before passing them to the simplex algorithm so that
> all the constraints are "<=", and it works that way. But I would still
> like to know how ConstrainedMin improves on BasicSimplex. (The
> Mathematica book by Stephen Wolfram says on page 1061 that
> ConstrainedMax and related functions use an enhanced version of the
> simplex algorithm.)
> 
> Could you give me some more information or suggestions about it?
> 
> Best Regards.
> 
> /Qing
> [...]


I do not know the details of the Mathematica implementation of the
simplex algorithm. I will make a few comments about other aspects of
this particular problem, in case any are relevant, e.g. for obtaining
better or faster results.

As worded, the problem above appears to be one of total least squares.
That is, the task is to minimize perpendicular distances from the points
to a plane (ordinary least squares would instead minimize the squares of
vertical discrepancies; we address that later).

First note that a plane may be represented by an equation of the form
a*x+b*y+c*z==d. Provided the x-coefficient is not zero we can normalize
so that it is one. We'll work with an example. We take the plane below,
find some points on it, and perturb them. We then hope to recover a
plane with approximately the same parameters.

plane = 3*x + 2*y + z - 10;        (* the "true" plane: 3x + 2y + z == 10 *)
npoints = 10;
SeedRandom[1111];
xyvals = 2*Table[Random[], {npoints}, {2}];       (* random {x,y} in [0,2]^2 *)
zvals = Map[10 - 3*#[[1]] - 2*#[[2]] &, xyvals];  (* z values exactly on the plane *)
points = Map[Flatten, Transpose[{xyvals, zvals}]];  (* assemble {x,y,z} triples *)
offsets = Table[.4*Random[] - .2, {npoints}, {3}];  (* noise in [-0.2, 0.2] *)
fuzzedpoints = points + offsets;
Map[10 - 3*#[[1]] - 2*#[[2]] - #[[3]] &, fuzzedpoints];  (* residuals (output suppressed) *)

Given the plane a*x+b*y+c*z==d and the point {x0,y0,z0}, the square of
the distance from the point to the plane is given by
(a*x0+b*y0+c*z0-d)^2/(a^2+b^2+c^2). If we want to minimize the sum of
these squared distances we may discard the common denominator (strictly
speaking, the denominator still depends on the parameters, so this
minimizes algebraic residuals rather than true distances, but for data
lying close to a plane the results are similar). We form the resulting
sum as below.

newpoints = Map[Append[#, -1] &, fuzzedpoints];  (* rows {x, y, z, -1} *)
vec = {1, b, c, d};        (* parameters, with a normalized to 1 *)
dotprods = newpoints.vec;  (* residual x + b*y + c*z - d for each point *)
sumsquares = dotprods.dotprods;

Now we can use FindMinimum to get the best values for the parameters (at
least assuming we can give it reasonable starting values).

In[16]:= bestparams = FindMinimum[sumsquares, {b,1}, {c,1}, {d,1}]
Out[16]= {0.117926, {b->0.654815, c->0.357271, d->3.50226}}

We'll check the result.

In[17]:= bestplane = x + b*y + c*z - d /. bestparams[[2]]
Out[17]= -3.50226 + x + 0.654815 y + 0.357271 z

In[18]:= Expand[3*bestplane]
Out[18]= -10.5068 + 3 x + 1.96445 y + 1.07181 z

We could instead minimize the sum of distances rather than of squared
distances.

sumdistances = Apply[Plus, Map[Abs, dotprods]];  (* L1 objective: sum of absolute residuals *)

Since Abs is not differentiable at zero, FindMinimum cannot use
derivatives here; instead I used its secant method by giving two initial
values for each variable.

In[26]:= bestparamsl1 = FindMinimum[sumdistances, {b,.5,1}, {c,.5,1},
{d,1,2}]
Out[26]= {0.879003, {b -> 0.636147, c -> 0.34802, d -> 3.42449}}

In[27]:= bestplanel1 = x + b*y + c*z - d /. bestparamsl1[[2]]
Out[27]= -3.42449 + x + 0.636147 y + 0.34802 z

In[28]:= Expand[3*bestplanel1]
Out[28]= -10.2735 + 3 x + 1.90844 y + 1.04406 z

I found that it took some experimentation to make this work; "bad"
initial values gave bad results. Likewise for avoiding Abs and using
instead:

sumdistances = Apply[Plus, Map[Sqrt[#^2]&, dotprods]]  (* same values as Abs, but Mathematica can differentiate it symbolically *)

This latter form does not present differentiability problems, hence does
not need two initial values. But I did not get good results unless I
started reasonably close to the actual solution.
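
For instance, one might restart from near the least-squares solution
found above (these particular starting values are illustrative, not
canonical):

FindMinimum[Apply[Plus, Map[Sqrt[#^2] &, dotprods]],
  {b, .65}, {c, .36}, {d, 3.5}]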

Another scenario is that you might want to treat x and y values as
"exact" and z values as subject to experimental error, and then
determine a best-fitting plane of the form z == a*x + b*y - d (the sign
on d reflects the -1 appended to each data point above). This is
readily cast as an ordinary linear least-squares optimization problem.
First we separate the z values from the rest of the data.

vec2 = {a, b, d};
pointsnoz = newpoints[[All, {1, 2, 4}]];  (* rows {x, y, -1} *)
zvals = newpoints[[All, 3]];  (* note: overwrites the earlier exact zvals with the perturbed ones *)

pointsnoz can be regarded as a matrix by which we multiply vec2; were
the data to lie exactly on a plane we would have zvals == pointsnoz.vec2
for appropriate values of the parameters in vec2. In the exactly
determined case one can multiply by an inverse matrix to solve for
{a,b,d}. For a least-squares solution, one instead may multiply by the
(Moore-Penrose) pseudo-inverse.

In[43]:= bestparams2 = Thread[vec2 -> PseudoInverse[pointsnoz] . zvals]
Out[43]= {a -> -2.68791, b -> -1.81179, d -> -9.63997}

In[44]:= bestplane2 = z - (a*x + b*y - d) /. bestparams2
Out[44]= -9.63997 + 2.68791 x + 1.81179 y + z

Note that there are ways to do this that avoid explicit computation of
PseudoInverse. In version 4.1 of Mathematica the "further notes" for
QRDecomposition and SingularValues demonstrate some such methods.
Moreover one can formulate the problem as a single call to Fit.
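
For example (a sketch: Fit takes the data as {x, y, z} triples together
with a basis of functions, and returns the least-squares combination
approximating z):

fitplane = Fit[fuzzedpoints, {1, x, y}, {x, y}]

This gives the same fitted plane as the pseudo-inverse computation
above, up to roundoff.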

One reason to prefer this second approach is that it is a lot faster. If
you have 1000 data points the first (FindMinimum) method will take
several seconds and the second a split second. This is probably not
applicable for the stated problem (can one work with 1000 pins in a
single chip? What would the connection be like?).
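
One can check the speed difference on synthetic data; here is a sketch
for the linear-algebra route (the sample size and noise level are
arbitrary):

bigxy = 2*Table[Random[], {1000}, {2}];
bigz = Map[10 - 3*#[[1]] - 2*#[[2]] + (.4*Random[] - .2) &, bigxy];
bigdata = MapThread[Append, {bigxy, bigz}];
Timing[Fit[bigdata, {1, x, y}, {x, y}]]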

I was initially at a loss to see how this might be formulated as a
linear programming problem. Then one possibility came to mind: constrain
the plane so that it lies "under" the data. In this way we get distances
without recourse to Abs, and we have a legitimate LP problem.

constraints = Map[# >= 0 &, newpoints.vec];  (* every residual nonnegative: the plane lies "under" the data *)

In[46]:= objfunc = Apply[Plus, newpoints.vec]
Out[46]= 12.9605 + 8.96492 b + 45.3205 c - 10 d

In[47]:= cbestparams = ConstrainedMin[objfunc, constraints, {b, c, d}]
Out[47]= {1.10822, {b -> 0.602984, c -> 0.353536, d -> 3.32804}}

In[48]:= cbestplane = x + b*y + c*z - d /. cbestparams[[2]]
Out[48]= -3.32804 + x + 0.602984 y + 0.353536 z

In[49]:= Expand[3*cbestplane]
Out[49]= -9.98412 + 3 x + 1.80895 y + 1.06061 z

I believe there is a way to formulate the total least squares problem so
that one can use linear algebra techniques, e.g. SingularValues, but I
do not know how to do that myself. One advantage that may make such an
approach attractive (other than speed, if you have a truly monstrous
chip) is that these methods are immune to the problems FindMinimum can
have with respect to starting values.
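
For what it is worth, here is a sketch of one standard linear algebra
formulation (principal components of the centered data; this may or may
not be the SingularValues method alluded to above). The normal of the
total least squares plane is the eigenvector belonging to the smallest
eigenvalue of the scatter matrix of the centered points.

centroid = Apply[Plus, fuzzedpoints]/npoints;
centered = Map[# - centroid &, fuzzedpoints];
{evals, evecs} = Eigensystem[Transpose[centered].centered];
normal = Last[evecs];          (* eigenvector of the smallest eigenvalue *)
tlsplane = normal.{x, y, z} - normal.centroid;
Expand[tlsplane/normal[[1]]]   (* normalize the x coefficient to 1, as before *)

Unlike the FindMinimum approaches, this needs no starting values.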


Daniel Lichtblau
Wolfram Research

