MathGroup Archive 2008


Re: Question - Cluster Analysis

  • To: mathgroup at smc.vnet.net
  • Subject: [mg89827] Re: Question - Cluster Analysis
  • From: Jean-Marc Gulliet <jeanmarc.gulliet at gmail.com>
  • Date: Sun, 22 Jun 2008 03:21:39 -0400 (EDT)
  • Organization: The Open University, Milton Keynes, UK
  • References: <g3ihn2$fb6$1@smc.vnet.net>

Steve Mahoney wrote:

> I have a question regarding hierarchical cluster analysis. One measure to determine the number of clusters is the so-called elbow criterion, which is just a list of the single distances between the clusters.
> 
> These values are inside the results from the Agglomerate function, but I am not able to extract them to a list. 
> 
> data = {{1, 2, 2}, {1, 3, 3}, {2, 4, 2}, {5, 4, 3}, {5, 4, 4}, {7, 6, 7}};
> 
> Agglomerate[data]
> 
> Cluster[Cluster[
>   Cluster[Cluster[{1, 2, 2}, {1, 3, 3}, 2, 1, 1], {2, 4, 2}, 3, 2, 1], Cluster[{5, 4, 3}, {5, 4, 4}, 1, 1, 1], 10, 3, 2], {7, 6, 7}, 17, 5, 1]
> 
> --> now I have to extract the values manually
> 
> elbowcriterium = ListLinePlot[{17, 10, 3, 2, 1}]
> 
> --> suggesting three clusters

Hi Steve,

You can extract the relevant information with the function 
*Cases[]*. According to the online help [1],

"Cluster[c1, c2, d, n1, n2]
represents a merger of the clusters c1 and c2 with dissimilarity d and 
n1 and n2 data elements respectively."

Therefore, we look for expressions with the head Cluster from the 
outermost (0) to the innermost level (Infinity) of the expression 
returned by Agglomerate[] and take their third elements.

     Cases[c, Cluster[c1_, c2_, d_, n1_, n2_] -> d, {0, Infinity}]

For instance,

In[1]:= data = {{1, 2, 2}, {1, 3, 3}, {2, 4, 2}, {5, 4, 3}, {5, 4, 4}, {7, 6, 7}};
Needs["HierarchicalClustering`"]
c = Agglomerate[data];
elbow = Sort[
   Cases[c, Cluster[c1_, c2_, d_, n1_, n2_] -> d, {0, Infinity}],
   Greater]
elbowcriterium = ListLinePlot[elbow]

Out[4]= {17, 10, 3, 2, 1}

Out[5]= <... graphics deleted ...>

(Note that Sort with Greater is needed only if you want the list in 
decreasing order; Cases[] returns the values in the order in which it 
finds them.)
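
As a variation on the extraction above, here is a minimal sketch that uses 
a delayed rule (:>) and anonymous blanks, so that a global value assigned 
to d, c1, c2, n1 or n2 elsewhere in the session cannot interfere with the 
match. The names heights and drops are my own, not part of the package, 
and the last step assumes a Mathematica version that provides 
Differences[].

```mathematica
Needs["HierarchicalClustering`"]
data = {{1, 2, 2}, {1, 3, 3}, {2, 4, 2}, {5, 4, 3}, {5, 4, 4}, {7, 6, 7}};
c = Agglomerate[data];

(* :> (RuleDelayed) evaluates d only after a match; _ skips the unused slots *)
heights = Cases[c, Cluster[_, _, d_, _, _] :> d, {0, Infinity}];

(* merge dissimilarities in decreasing order, as in Out[4] above *)
elbow = Sort[heights, Greater]

(* size of the drop between successive values (hypothetical helper) *)
drops = -Differences[elbow]
```

Looking at drops alongside the plot is only one possible way to read the 
elbow; where to cut remains a judgment call.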

Regards,
- Jean-Marc

[1] "Cluster", 
http://reference.wolfram.com/mathematica/HierarchicalClustering/ref/Cluster.html

