Visualize the lowest-level nodes of a hierarchical clustering using dendrogram
I am using linkage to create an agglomerative hierarchical clustering for a dataset of about 5000 instances. I want to visualize the "bottom" merges in the hierarchy, that is, the nodes close to the leaves with the smallest distance measures.
Unfortunately, the rendering function dendrogram prefers to display the "top" nodes from the most recent merges in the algorithm. By default, it shows the top 30 nodes, collapsing the bottom of the tree. I can change the value of P to show more nodes, but I would need to show all 5000+ leaves to see the lowest clustering levels, at which point the graph is no longer readable.
MCVE
For example, starting with the linkage example:
openExample('stats/CompareClusterAssignmentsToClustersExample')
run CompareClusterAssignmentsToClustersExample
dendrogram(Z, 'Orient', 'Left', 'Labels', species);
This produces a dendrogram with only the top 30 nodes visible. Each numerically labeled node stands for a collapsed lower part of the tree.
I can increase the number of visible nodes to include all leaves, at the expense of readability:
dendrogram(Z, size(Z,1), 'Orient', 'Left', 'Labels', species);
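As a side note on the P argument: passing 0 for P draws the complete tree, and the second output of dendrogram reports which displayed leaf each observation was collapsed into. A minimal sketch, assuming the fisheriris data behind the example above and an average-linkage tree (the example's actual linkage settings may differ; membersOfLeaf1 is just an illustrative name):

```matlab
load fisheriris                     % provides meas (150x4) and species
Z = linkage(meas, 'average');       % assumed linkage method, for illustration only

figure
% Collapsed view: at most 30 displayed leaf nodes.
% T(i) is the displayed leaf node that observation i was collapsed into.
[~, T] = dendrogram(Z, 30, 'Orientation', 'left');
membersOfLeaf1 = find(T == 1);      % observations hidden inside displayed node 1

figure
% P = 0 draws every leaf of the tree.
dendrogram(Z, 0, 'Orientation', 'left', 'Labels', species);
```

Inspecting T at least shows what each numbered node hides, even though it does not by itself zoom in on the lowest merges.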
What I would like
I would like a zoomed-in view of the full dendrogram above, showing only the 30 closest clusters.
What I tried
I tried passing only the first 30 rows of Z to the function,
dendrogram(Z(1:30,:), 'Orient', 'Left');
but that throws an "Index exceeds matrix dimensions" error when one of the rows refers to a cluster at row > 30.
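The error follows from how linkage numbers its nodes: with m observations, indices 1..m are leaves and index m+i is the cluster created in row i of Z. A matrix with only k rows is read as a tree over k+1 leaves, so it cannot contain any index above 2*k. A minimal sketch on toy data (the data and the variable names are assumptions for illustration, not the poster's dataset):

```matlab
% Toy data: points 1 and 4 are close, and points 2 and 5 are close
X = [0 0; 5 5; 10 0; 0 1; 5 6];
Z = linkage(X, 'single');
m = size(Z, 1) + 1;          % number of original observations (leaves)

k = 2;                       % keep only the k earliest merges
Zk = Z(1:k, :);
idx = Zk(:, 1:2);            % node indices referenced by the kept rows

% A k-row linkage matrix describes k+1 leaves, so its largest
% legal node index is (k+1) + (k-1) = 2*k.
maxAllowed = 2*k;
outOfRange = idx(idx > maxAllowed);

% Indices above m would point at clusters formed by earlier rows of Z
refersToCluster = idx(idx > m);

fprintf('%d of %d indices in Zk exceed %d\n', ...
    numel(outOfRange), numel(idx), maxAllowed);
```

So a truncated Z fails even when no kept row refers to a later cluster, because the original leaf numbers themselves fall outside the range a 30-row matrix allows.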
I've also tried using the dendrogram property Reorder, but I'm having a hard time finding a valid ordering that sorts the clusters from nearest to furthest.
% The Z matrix is in order from closest cluster to furthest,
% so I can use it to create an ordering
Y = reshape(Z(:, 1:2)', 1, []);
Y = Y(Y<151);  % keep leaf indices only (150 observations)
dendrogram(Z, 30, 'Orient', 'Left', 'Labels', species, 'Reorder', Y);
I get the error:
In the requested order of nodes, some data points belonging to the same leaf on the graph are separated by points belonging to other leaves. Try using a different order.
Perhaps such an ordering is impossible when the entire tree is drawn, because branches would cross, but I hope a valid ordering exists if I only look at part of the tree and ignore the clusters at higher levels.
Question
How can I improve my rendering to show the lowest level clusters in the dendrogram?