Re: Question - Cluster Analysis
- To: mathgroup at smc.vnet.net
- Subject: [mg89827] Re: Question - Cluster Analysis
- From: Jean-Marc Gulliet <jeanmarc.gulliet at gmail.com>
- Date: Sun, 22 Jun 2008 03:21:39 -0400 (EDT)
- Organization: The Open University, Milton Keynes, UK
- References: <g3ihn2$fb6$1@smc.vnet.net>
Steve Mahoney wrote:

> I have a question regarding hierarchical cluster analysis. One measure to
> determine the number of clusters is the so-called elbow criterion, which is
> just a list of the single distances between the clusters.
>
> These values are inside the results from the Agglomerate function, but I am
> not able to extract them to a list.
>
> data = {{1, 2, 2}, {1, 3, 3}, {2, 4, 2}, {5, 4, 3}, {5, 4, 4}, {7, 6, 7}};
>
> Agglomerate[data]
>
> Cluster[Cluster[
>   Cluster[Cluster[{1, 2, 2}, {1, 3, 3}, 2, 1, 1], {2, 4, 2}, 3, 2, 1],
>   Cluster[{5, 4, 3}, {5, 4, 4}, 1, 1, 1], 10, 3, 2], {7, 6, 7}, 17, 5, 1]
>
> --> now I have to extract the values manually
>
> elbowcriterium = ListLinePlot[{17, 10, 3, 2, 1}]
>
> --> suggesting three clusters

Hi Steve,

You can extract the relevant information with the function *Cases[]*.
According to the online help [1], "Cluster[c1, c2, d, n1, n2] represents a
merger of the clusters c1 and c2 with dissimilarity d and n1 and n2 data
elements respectively."

Therefore, we look for expressions with the head Cluster from the outermost
level (0) to the innermost level (Infinity) of the expression returned by
Agglomerate[] and take their third elements.

    Cases[c, Cluster[c1_, c2_, d_, n1_, n2_] -> d, {0, Infinity}]

For instance,

In[1]:= data = {{1, 2, 2}, {1, 3, 3}, {2, 4, 2}, {5, 4, 3}, {5, 4, 4}, {7, 6, 7}};
        Needs["HierarchicalClustering`"]
        c = Agglomerate[data];
        elbow = Sort[
          Cases[c, Cluster[c1_, c2_, d_, n1_, n2_] -> d, {0, Infinity}],
          Greater]
        elbowcriterium = ListLinePlot[elbow]

Out[4]= {17, 10, 3, 2, 1}

Out[5]= <... graphics deleted ...>

(Note that Sort is needed if you want the list in decreasing order.)

Regards,
- Jean-Marc

[1] "Cluster",
http://reference.wolfram.com/mathematica/HierarchicalClustering/ref/Cluster.html
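
P.S. If you need this more than once, the Cases call above can be wrapped in a small helper. What follows is only a sketch: the name mergeDistances is made up for illustration and is not part of the HierarchicalClustering package. It differs from the one-liner in two small ways. It uses :> (RuleDelayed) instead of ->, so that a value already assigned to the symbol d elsewhere in the session cannot end up in the result, and it replaces the unused pattern names with plain blanks.

```mathematica
(* Load the package first so that Cluster resolves to the package symbol. *)
Needs["HierarchicalClustering`"]

(* mergeDistances is a hypothetical helper name, not a package function.
   The level specification {0, Infinity} includes level 0, the whole
   expression. A bare Infinity would cover levels 1 and deeper only, so
   the outermost merge would be left out. *)
mergeDistances[c_Cluster] :=
  Sort[Cases[c, Cluster[_, _, d_, _, _] :> d, {0, Infinity}], Greater]

data = {{1, 2, 2}, {1, 3, 3}, {2, 4, 2}, {5, 4, 3}, {5, 4, 4}, {7, 6, 7}};
elbow = mergeDistances[Agglomerate[data]]

ListLinePlot[elbow]
```

With the same data as above, elbow should come out identical to the list shown in Out[4].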