Show dendrogram node values in R

I am using the hclust function:

points <- data.frame(ID = c('A','B','C','D','E'), 
                 x = c(3,4,2.1,4,7), 
                 y = c(6.1,2,5,6,3))
d <- dist(as.matrix(points[, 2:3])) 
clusters <- hclust(d,method = "complete")
plot(clusters, labels=points$ID)

      

Is there a way to show the heights at which the points are joined, that is, the value at each node of the dendrogram?

I want my plot to look like the picture.

Note: the values shown on the dendrogram are not correct.

Dendrogram



2 answers


My R package TBEST has a function that adds annotations in two colors to an hclust plot. For convenience, I am pasting the code below so you can use it independently of any package.

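hc2axes <- function(x) {
    A <- x$merge                 # merge matrix: one row per join
    n <- nrow(A) + 1             # number of observations
    x.axis <- c()
    y.axis <- x$height           # height of each join
    x.tmp <- rep(0, 2)
    # zz[i] is the left-to-right plot position of observation i
    zz <- match(1:length(x$order), x$order)
    for (i in 1:(n - 1)) {
        # Negative entries are single observations; positive entries
        # refer to the cluster formed in that earlier row.
        ai <- A[i, 1]
        if (ai < 0) {
            x.tmp[1] <- zz[-ai]
        } else {
            x.tmp[1] <- x.axis[ai]
        }
        ai <- A[i, 2]
        if (ai < 0) {
            x.tmp[2] <- zz[-ai]
        } else {
            x.tmp[2] <- x.axis[ai]
        }
        # A join is placed midway between its two children.
        x.axis[i] <- mean(x.tmp)
    }
    return(data.frame(x.axis = x.axis, y.axis = y.axis))
}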



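plot_height <- function(hc, height, col = c(2, 3), print.num = TRUE,
                        float = 0.01, cex = NULL, font = NULL) {
    axes <- hc2axes(hc)          # x/y position of every join
    usr <- par("usr")            # plot region: c(x1, x2, y1, y2)
    wid <- usr[4] - usr[3]       # vertical extent, used to scale the offset
    bp <- as.character(round(height, 2))
    rn <- as.character(1:length(height))
    # The last (top) join carries a legend instead of its own value.
    bp[length(bp)] <- "height"
    rn[length(rn)] <- "edge #"
    # Heights go to the left of each join, lifted slightly by float * wid.
    text(x = axes[, 1], y = axes[, 2] + float * wid, bp,
         col = col[1], pos = 2, offset = 0.3, cex = cex, font = font)
    if (print.num) {
        # Join numbers go to the right of each join.
        text(x = axes[, 1], y = axes[, 2], rn, col = col[2],
             pos = 4, offset = 0.3, cex = cex, font = font)
    }
}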

      



Once you have defined these two functions, plot your dendrogram as before and add one line to annotate it:

plot(clusters, labels = points$ID)
plot_height(clusters, height = clusters$height, print.num = FALSE)

You can also display the branch numbers by setting print.num = TRUE.
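To see what hc2axes works from, it can help to look at the three components of the hclust object that it reads. The following is only a sketch using base R and the data from the question; the name leaf_pos is introduced here for illustration and plays the role of zz inside hc2axes.

```r
# Data and clustering from the question.
points <- data.frame(ID = c("A", "B", "C", "D", "E"),
                     x = c(3, 4, 2.1, 4, 7),
                     y = c(6.1, 2, 5, 6, 3))
d <- dist(as.matrix(points[, 2:3]))
clusters <- hclust(d, method = "complete")

# One row per join. Negative entries are original observations,
# positive entries refer to the cluster formed in that earlier row.
print(clusters$merge)

# Height of each join, in the same row order as merge.
# These are the values that plot_height writes next to the joins.
print(round(clusters$height, 2))

# Left-to-right order of the leaves in the plot.
print(clusters$order)

# Plot position of each observation (illustrative name; hc2axes calls this zz).
leaf_pos <- match(seq_along(clusters$order), clusters$order)
print(leaf_pos)
```

Each join is then drawn at the mean of the x positions of its two children, which is what the loop in hc2axes computes.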



Here is one method, using the dendextend package.

First, convert to a hanging dendrogram:

library(dendextend)
dend <- as.dendrogram(clusters) %>% hang.dendrogram()
dend <- dend %>% set_labels(points$ID[dend %>% labels()])

      

Now we find the x, y values for all internal nodes (excluding the root):

xy <- dend %>% get_nodes_xy()
is_internal_node <- is.na(dend %>% get_nodes_attr("leaf"))
is_internal_node[which.max(xy[,2])] <- FALSE
xy <- xy[is_internal_node,]
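As a rough cross-check that does not need dendextend, the heights of the internal nodes can also be read straight from a base R dendrogram. This is a sketch under stated assumptions: internal_heights is a hypothetical helper written for this illustration, not a dendextend function, and it only recovers the heights, not the x positions.

```r
# Data and clustering from the question.
points <- data.frame(ID = c("A", "B", "C", "D", "E"),
                     x = c(3, 4, 2.1, 4, 7),
                     y = c(6.1, 2, 5, 6, 3))
clusters <- hclust(dist(as.matrix(points[, 2:3])), method = "complete")
dend <- as.dendrogram(clusters)

# Hypothetical helper: walk the tree and collect the "height"
# attribute of every node that is not a leaf.
internal_heights <- function(node) {
  if (is.leaf(node)) {
    return(numeric(0))
  }
  child_heights <- unlist(lapply(seq_along(node), function(i) {
    internal_heights(node[[i]])
  }))
  c(attr(node, "height"), child_heights)
}

h <- internal_heights(dend)

# Drop the root (the highest node), as the dendextend code above does.
h_no_root <- h[-which.max(h)]
print(round(sort(h_no_root), 2))
```

The collected heights are the same values stored in clusters$height, only in tree-traversal order instead of merge order.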

      



Now plot the dendrogram and draw the labels with a slight offset:

plot(dend)
text(xy[,1]+.2, xy[,2]+.2, labels=format(xy[,2], digits=2), col="red")

      

This gives the following graph
