Labels for boxes in ggplot2 boxplot
I would like to display a label above each box in a boxplot I created with ggplot2.
For example:
library(ggplot2)
library(tibble)
# Example data
test = c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B")
patient = c(1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3)
result = c(5, 7, 2, 4, 6, 7, 3, 5, 5, 6, 2, 3)
data <- tibble(test, patient, result)
# Labels I want to include
Alabs = c(1, 3, 500)
Blabs = c(8, 16, -32)
# Plot data
ggplot(data, aes(x = factor(patient), y = result, color = factor(test))) +
  geom_boxplot(outlier.shape = 1)
This gives a plot with one box per test for each patient.
I would like to print the first item of Alabs above the red box for the first patient, the second item of Alabs above the red box for the second patient, the first item of Blabs above the blue box for the first patient, and so on.
How can I do this?
I would make a separate dataset for the labels.
labs = tibble(test = rep(LETTERS[1:2], each = 3),
              patient = c(1, 2, 3, 1, 2, 3),
              labels = c(1, 3, 500, 8, 16, -32))
  test  patient labels
  <chr>   <dbl>  <dbl>
1 A           1      1
2 A           2      3
3 A           3    500
4 B           1      8
5 B           2     16
6 B           3    -32
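If the labels already live in the Alabs and Blabs vectors from the question, the same dataset can be built from them instead of retyping the values. This is only a sketch, and it assumes each vector is ordered by patient (1, 2, 3):

```r
library(tibble)

# Label vectors from the question
Alabs <- c(1, 3, 500)
Blabs <- c(8, 16, -32)

# One row per test/patient combination.
# Assumption: position i in each vector belongs to patient i.
labs <- tibble(
  test    = rep(c("A", "B"), times = c(length(Alabs), length(Blabs))),
  patient = as.numeric(c(seq_along(Alabs), seq_along(Blabs))),
  labels  = c(Alabs, Blabs)
)

print(labs)
```

Keeping patient as a double matches the type of the patient column in the example data, so the two datasets can be joined on it later.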
The dataset above has all the information about the x-axis variable and the dodging variable. What it lacks is where to put the text along the y-axis. To put the labels above the boxes, we can compute the maximum for each combination of factors plus a small value for the y position (while geom_text has a useful nudge_y argument, it doesn't work while dodging).
I compute a summary for each group with dplyr and then join the y-position values to the label dataset.
library(dplyr)
labeldat = data %>%
  group_by(test, patient) %>%
  summarize(ypos = max(result) + .25) %>%
  inner_join(., labs)
You can now add a geom_text layer using the label dataset. To dodge the labels the same way as the boxes, use position_dodge. To keep the letters from showing up in the legend, I use show.legend = FALSE.
ggplot(data, aes(x = factor(patient), y = result, color = test)) +
  geom_boxplot(outlier.shape = 1) +
  geom_text(data = labeldat, aes(label = labels, y = ypos),
            position = position_dodge(width = .75),
            show.legend = FALSE)
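As a possible variation on the approach above (not part of the original answer), the vertical offset can be handled with the vjust parameter of geom_text instead of adding a constant to the maximum. The sketch below rebuilds the example data so it runs on its own; the value -0.5 is an arbitrary choice for illustration:

```r
library(ggplot2)
library(dplyr)
library(tibble)

# Example data and labels, as in the question
data <- tibble(
  test    = rep(c("A", "B"), each = 6),
  patient = rep(rep(1:3, each = 2), times = 2),
  result  = c(5, 7, 2, 4, 6, 7, 3, 5, 5, 6, 2, 3)
)
labs <- tibble(
  test    = rep(c("A", "B"), each = 3),
  patient = rep(1:3, times = 2),
  labels  = c(1, 3, 500, 8, 16, -32)
)

# y position is the plain group maximum; no constant is added here
labeldat <- data %>%
  group_by(test, patient) %>%
  summarize(ypos = max(result), .groups = "drop") %>%
  inner_join(labs, by = c("test", "patient"))

# vjust < 0 shifts the text upwards relative to its y position
p <- ggplot(data, aes(x = factor(patient), y = result, color = test)) +
  geom_boxplot(outlier.shape = 1) +
  geom_text(data = labeldat, aes(label = labels, y = ypos),
            position = position_dodge(width = 0.75),
            vjust = -0.5, show.legend = FALSE)

print(p)
```

Because vjust is measured relative to the text height rather than in data units, the gap stays the same if the y-axis range changes. A label sitting at the very top of the panel may need extra room, which expand_limits(y = ...) should provide.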
A bit of a hack is to put the labels into the same data frame:
# Line up the labels so each test/patient pair gets one; each label is drawn at the result value of the row it sits on
data$labs = c(NA, 1, NA, 3, NA, 500, NA, 8, NA, 16, NA, -32)
# Set the x position for each label
data$lab_x = c(NA, 0.75, NA, 1.75, NA, 2.75, NA, 1.25, NA, 2.25, NA, 3.25)
Then run ggplot:
ggplot(data, aes(x = factor(patient), y = result, color = factor(test))) +
  geom_boxplot(outlier.shape = 1) +
  geom_text(aes(label = labs, x = lab_x))