MathGroup Archive: May 2008 [00826]

[Date Index] [Thread Index] [Author Index]

Re: Re: How to plot a graph, using a distance matrix

To: mathgroup at smc.vnet.net
Subject: [mg89089] Re: [mg89035] Re: How to plot a graph, using a distance matrix
From: danl at wolfram.com
Date: Sun, 25 May 2008 03:23:51 -0400 (EDT)
References: <g15qni$pdj$1@smc.vnet.net>

> On May 23, 12:11 am, Eric Heiman <ehei... at mailinator.com> wrote:
>> My dilemna is as such:
>>
>> I have a matrix (it happens to be 21x21, but don't worry about that)
>> which contains distances between points.
>> So column 1 has distances from point 1 to every other point, with row 1
>> being 0 (because the distance to itself is zero).
>>
>> What I am wondering is how I would be able to get mathematica to plot a
>> graph of these points.
>>
>> Thanks in advance!
>
> First the algebra:
>
> Let C = (-1/2)(I - uu'/n)B(I - uu'/n), where
> B is the matrix of squared distances among the n points,
> I is the identity matrix,
> u is a column vector whose elements are all 1s,
> and ' denotes transposition.
>
> If the points lie in an m-dimensional space
> then C will be positive semidefinite, with rank m.
> Let F be any factoring of such a C, so that FF' = C.
> Then F contains the coordinates of the points.

At this step (and pulling a rabbit out of a hat), one could do:

ff = Optimization`ModifiedCholeskyDecomposition[C];
ff2 = ff[[1]][[ff[[2]]]]

Now take Chop[ff2] as point coordinates.

> This is a well-known result in multidimensional scaling.
>
> In Mathematica code, if dis is the matrix of distances, let
>
> {vals,vex} = Eigensystem[-.5 (#-Mean@#&) /@
>              Transpose[#-Mean@#& /@ (dis^2)]];
>
> The first m elements of vals should be positive;
> the rest should be approximate zeros. Then
>
> f = Transpose[Sqrt[Take[vals,m]]*Take[vex,m]]
>
> will contain the coordinates of the points. However, this assumes
> that the eigenvectors are orthonormal, which is usually true, but I
> have not been able to find it in the documentation, so caveat user.

As one might expect, both methods break down, in a sense, when any triple
of the distances fails to satisfy triangle inequality. That is to say, the
methods work fine but since there is no real valued point coordinates that
work in such cases, they give results that fail to work, but in different
ways.

Here is a simple setup to test this. We use three points. Note that one
distance depends on a parameter, k. For the given setting the triad will
satisfy triangle inequality.

k = 3/2;
dis = {{0, 1, 2}, {1, 0, k}, {2, k, 0}};

n = 3;
ii = IdentityMatrix[n];
bb = dis^2;
u = Table[{1}, {n}];
cc = (-1/2.) (ii - u.Transpose[u]/n).bb.(ii - u.Transpose[u]/n);

ff = Optimization`ModifiedCholeskyDecomposition[cc];
ff2 = ff[[1]][[ff[[2]]]]

{vals, vex} =
  Eigensystem[-.5 (# - Mean@# &) /@
     Transpose[# - Mean@# & /@ (dis^2)]];

m = Length[Select[vals, Chop[#] != 0 &]]
f = Transpose[Sqrt[Take[vals, m]]*Take[vex, m]]

Here one can check that the distances bear some resemblence to those input.

Outer[Norm[Subtract[##]] &, ff2, ff2, 1]
Outer[Norm[Subtract[##]] &, f, f, 1]

Now suppose instead we use k=1/2, or k. Then triangle inequality will
be violated.

With this scenario the eigensystem approach will give complex values. This
is sensible because one can find complex-valued solutions to the system of
algebraic equations that corresponds to the pairwise distances. To see
this correctly one should use the ordinary dot products rather than Norm,
as the latter will use complex conjugation and thus not be faithful to the
algebra.

Map[Chop[Sqrt[#.#]] &, Outer[Subtract, f, f, 1], {2}]

The Cholesky decomposition will, I believe, give a real valued least
squares solution. That is to say, distances computed from the set of point
coordinates will be optimally close to those of the distance matrix.
Optimal, in this setting, is measured by some sum of squares discrepancy;
just don't ask me what squares get summed.

In a different direction, and getting back to the original question,
suppose the "distances" are not meant to be Euclidean. One still might
want a sensible graph layout, in two or three dimensional space. For that
scenario I would suggest GraphPlot, using as (symmetric, weighted) graph
the matrix of distances.

Daniel Lichtblau
Wolfram Research

Prev by Date: Re: Re: Problems with ListPlot

Next by Date: Re: Range of Use of Mathematica

Previous by thread: Re: Re: How to plot a graph, using a distance matrix

Next by thread: Re: The FinancialData Function