Tukey's post-hoc on ggplot boxplot

I think I'm pretty close, but I get an error when I try to draw the plot at the end. My goal is to place letters above each box showing the statistical relationship between the time points. I found two discussions of this on this site and can reproduce their results, but I cannot apply the approach to my own dataset.

Packages

library(ggplot2)
library(multcompView)
library(plyr)

      

Here is my data:

dput(WaterConDryMass)
structure(list(ChillTime = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L), .Label = c("Pre_chill", 
"6", "13", "24", "Post_chill"), class = "factor"), dmass = c(0.22, 
0.19, 0.34, 0.12, 0.23, 0.33, 0.38, 0.15, 0.31, 0.34, 0.45, 0.48, 
0.59, 0.54, 0.73, 0.69, 0.53, 0.57, 0.39, 0.8)), .Names = c("ChillTime", 
"dmass"), row.names = c(NA, -20L), class = "data.frame")

      

ANOVA and Tukey Post-hoc

Model4 <- aov(dmass~ChillTime, data=WaterConDryMass)
tHSD <- TukeyHSD(Model4, ordered = FALSE, conf.level = 0.95)
plot(tHSD , las=1 , col="brown" )

      

Functions:

generate_label_df <- function(TUKEY, flev){

  # Extract labels and factor levels from Tukey post-hoc 
  Tukey.levels <- TUKEY[[flev]][,4]
  Tukey.labels <- multcompLetters(Tukey.levels)['Letters']
  plot.labels <- names(Tukey.labels[['Letters']])

  boxplot.df <- ddply(WaterConDryMass, flev, function (x) max(fivenum(x$y)) + 0.2)

  # Create a data frame out of the factor levels and Tukey homogenous group letters
  plot.levels <- data.frame(plot.labels, labels = Tukey.labels[['Letters']],
                            stringsAsFactors = FALSE) 

  # Merge it with the labels
  labels.df <- merge(plot.levels, boxplot.df, by.x = 'plot.labels', by.y = flev, sort = FALSE)
  return(labels.df)
}  

      

Boxplot:

ggplot(WaterConDryMass, aes(x = ChillTime, y = dmass)) +
  geom_blank() +
  theme_bw() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  labs(x = 'Time (weeks)', y = 'Water Content (DM %)') +
  ggtitle(expression(atop(bold("Water Content"), atop(italic("(Dry Mass)"), "")))) +
  theme(plot.title = element_text(hjust = 0.5, face='bold')) +
  annotate(geom = "rect", xmin = 1.5, xmax = 4.5, ymin = -Inf, ymax = Inf, alpha = 0.6, fill = "grey90") +
  geom_boxplot(fill = 'green2', stat = "boxplot") +
  geom_text(data = generate_label_df(tHSD), aes(x = plot.labels, y = V1, label = labels)) +
  geom_vline(aes(xintercept=4.5), linetype="dashed") +
  theme(plot.title = element_text(vjust=-0.6))

      

Error:

Error in HSD[[flev]] : invalid subscript type 'symbol'

      

1 answer


I think I found the tutorial you are following, or something very similar. You are probably best off copying and pasting all of this into your workspace, function and all, so that a few small differences don't trip you up.

I mainly followed the tutorial ( http://www.r-graph-gallery.com/84-tukey-test/ ) and added a few necessary tweaks at the end. It adds a few extra lines of code, but it works.



generate_label_df <- function(TUKEY, variable){

  # Extract labels and factor levels from Tukey post-hoc 
  Tukey.levels <- TUKEY[[variable]][,4]
  Tukey.labels <- data.frame(multcompLetters(Tukey.levels)['Letters'])

  #I need to put the labels in the same order as in the boxplot :
  Tukey.labels$treatment=rownames(Tukey.labels)
  Tukey.labels=Tukey.labels[order(Tukey.labels$treatment) , ]
  return(Tukey.labels)
}

model <- lm(WaterConDryMass$dmass ~ WaterConDryMass$ChillTime)
ANOVA <- aov(model)

# Tukey test to study each pair of treatments:
TUKEY <- TukeyHSD(x = ANOVA, 'WaterConDryMass$ChillTime', conf.level = 0.95)

# Generate the labels using the function
labels <- generate_label_df(TUKEY, "WaterConDryMass$ChillTime")

# Rename columns for merging
names(labels) <- c('Letters', 'ChillTime')

# Obtain the letter position on the y axis using the group means
yvalue <- aggregate(. ~ ChillTime, data = WaterConDryMass, mean)

# Merge the data frames
final <- merge(labels, yvalue)

ggplot(WaterConDryMass, aes(x = ChillTime, y = dmass)) +
  geom_blank() +
  theme_bw() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  labs(x = 'Time (weeks)', y = 'Water Content (DM %)') +
  ggtitle(expression(atop(bold("Water Content"), atop(italic("(Dry Mass)"), "")))) +
  theme(plot.title = element_text(hjust = 0.5, face='bold')) +
  annotate(geom = "rect", xmin = 1.5, xmax = 4.5, ymin = -Inf, ymax = Inf, alpha = 0.6, fill = "grey90") +
  geom_boxplot(fill = 'green2', stat = "boxplot") +
  geom_text(data = final, aes(x = ChillTime, y = dmass, label = Letters),vjust=-3.5,hjust=-.5) +
  geom_vline(aes(xintercept=4.5), linetype="dashed") +
  theme(plot.title = element_text(vjust=-0.6))
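
As a side note on the original error: it most likely comes from calling generate_label_df(tHSD) without the second argument, so that the function indexes the Tukey result with a missing value rather than a name. The sketch below uses base R only and rebuilds the data from the question's dput() output. It shows that TukeyHSD() returns a list with one matrix per model term and that column 4 is the adjusted p-value. The variable names term_names, tukey_cols and p_adj are mine, for illustration only.

```r
# Rebuild the data from the question's dput() output (base R only).
chill_levels <- c("Pre_chill", "6", "13", "24", "Post_chill")
WaterConDryMass <- data.frame(
  ChillTime = factor(rep(chill_levels, each = 4), levels = chill_levels),
  dmass = c(0.22, 0.19, 0.34, 0.12, 0.23, 0.33, 0.38, 0.15, 0.31, 0.34,
            0.45, 0.48, 0.59, 0.54, 0.73, 0.69, 0.53, 0.57, 0.39, 0.8)
)

# Same model and Tukey call as in the question.
Model4 <- aov(dmass ~ ChillTime, data = WaterConDryMass)
tHSD <- TukeyHSD(Model4, ordered = FALSE, conf.level = 0.95)

# TukeyHSD() returns a list with one matrix per model term,
# so the element has to be selected by name (a character string).
term_names <- names(tHSD)
tukey_cols <- colnames(tHSD[["ChillTime"]])

# Column 4 ("p adj") holds the adjusted p-values, one per pair of levels.
p_adj <- tHSD[["ChillTime"]][, "p adj"]

print(term_names)
print(tukey_cols)
print(round(p_adj, 3))
```

If you would rather keep your own function, calling it as generate_label_df(tHSD, "ChillTime") should get past that error. Note that it also refers to x$y, whereas the column in your data is called dmass, so that line would probably need changing too.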

      
