How can I stack error bars in a stacked bar plot using geom_errorbar?
I want to stack error bars in a stacked bar chart using geom_errorbar / ggplot.
In my ggplot call, I tried both position="stack" and position="identity". Neither of them worked.
Here is my ggplot statement:
ggplot(DF, aes(x=factor(year), y=proportion, fill=response)) +
  facet_grid(. ~ sex) +
  theme(legend.position="none") +
  geom_bar(position="stack", stat="identity") +
  geom_errorbar(aes(ymin=ci_l, ymax=ci_u),
                width=.2, # Width of the error bars
                position="identity")
Here is the result I am getting. Notice that the error bars on the right do not line up with the tops of the bars.
Here's the dataframe I used in this example:
DF <- data.frame(sex=c("men","women","men","women","men","women"),
                 proportion=c(0.33,0.32,0.24,0.29,0.12,0.16),
                 ci_l=c(0.325,0.322,0.230,0.284,0.114,0.155),
                 ci_u=c(0.339,0.316,0.252,0.311,0.130,0.176),
                 year=c(2008,2008,2013,2013,2013,2013),
                 response=c("Yes","Yes",
                            "Yes, entire the journey","Yes, entire the journey",
                            "Yes, part of the journey","Yes, part of the journey")
)
What's going on here is that ggplot doesn't stack the error bars (their bounds would need to be summed up), so you'll have to do it manually (and it seems that Hadley thinks it's not a good idea and won't add this feature).
So, do it manually:
DF$ci_l[DF$response == "Yes, part of the journey"] <-
  with(DF, ci_l[response == "Yes, part of the journey"] +
           ci_l[response == "Yes, entire the journey"])
DF$ci_u[DF$response == "Yes, part of the journey"] <-
  with(DF, ci_u[response == "Yes, part of the journey"] +
           ci_u[response == "Yes, entire the journey"])
Now:
ggplot(DF, aes(x=factor(year), y=proportion)) +
  facet_grid(. ~ sex) +
  geom_bar(stat="identity", aes(fill=response)) +
  geom_errorbar(aes(ymin=ci_l, ymax=ci_u),
                width=.2, # Width of the error bars
                position="identity")
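A quick check of that adjustment with the numbers for men in 2013, assuming (as the code above implies) that "Yes, entire the journey" is the lower segment of the stack. This is only base-R arithmetic on values copied from the question, not part of the original answer, and the variable names are made up for illustration.

```r
# Values for men in 2013, copied from the question's data frame.
entire <- c(proportion = 0.24, ci_l = 0.230, ci_u = 0.252)
part   <- c(proportion = 0.12, ci_l = 0.114, ci_u = 0.130)

# Height of the stacked bar: the two proportions added together.
stack_top <- entire[["proportion"]] + part[["proportion"]]

# Bounds after the manual adjustment: the lower segment's bounds
# are added to the upper segment's bounds.
adj_l <- part[["ci_l"]] + entire[["ci_l"]]
adj_u <- part[["ci_u"]] + entire[["ci_u"]]

# Compare the adjusted bounds with the top of the stacked bar.
print(c(lower = adj_l, top = stack_top, upper = adj_u))
```

The summed bounds (0.344 and 0.382) bracket the top of the stacked bar (0.36), but the resulting interval is wider than the one given for the upper segment alone (0.016), because the widths of both intervals are added together.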
The problem here is that geom_errorbar simply draws error bars at the y values you give it; it doesn't know anything about the geom_bar layer, which applies a vertical offset to some of the data. So you need to account for the fact that, for one of your responses, the plotted values are shifted upward by the value of the other response. For the example given, this can be done as follows:
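# Vertical offset per row: 0 for rows 1-4, and the "entire the journey"
# proportion (rows 3-4) for the "part of the journey" rows (5-6)
DF$vadj <- c(rep(0,2), rep(c(0,1,0), each=2) * DF$proportion)[1:6]
ggplot(DF, aes(x=factor(year), y=proportion, fill=response)) +
  facet_grid(. ~ sex) +
  geom_bar(stat='identity') +
  geom_errorbar(aes(ymin=ci_l+vadj, ymax=ci_u+vadj), width=.2)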
The setup technique here is admittedly not particularly elegant, and if you need to generalize, keep in mind that it depends heavily on the specific structure of the data frame (i.e. it would need to be changed if the rows were ordered differently). But it should get your error bars where you want them.
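If the row-order dependence is a concern, one possible generalization is to compute the offset as a cumulative sum within each bar. The following is a sketch under some assumptions rather than a definitive solution: it assumes ggplot2 2.2.0 or later (for geom_col and position_stack(reverse = TRUE)), the factor level order is chosen here for illustration, and vadj is the same helper column name used above. As far as I know, recent ggplot2 versions stack in the reverse of the grouping order by default, so the sketch asks for reverse = TRUE to put the first level at the bottom and keep the offsets consistent with the drawing order.

```r
# Example data, values copied verbatim from the question.
DF <- data.frame(
  sex        = rep(c("men", "women"), times = 3),
  proportion = c(0.33, 0.32, 0.24, 0.29, 0.12, 0.16),
  ci_l       = c(0.325, 0.322, 0.230, 0.284, 0.114, 0.155),
  ci_u       = c(0.339, 0.316, 0.252, 0.311, 0.130, 0.176),
  year       = c(2008, 2008, 2013, 2013, 2013, 2013),
  response   = rep(c("Yes", "Yes, entire the journey",
                     "Yes, part of the journey"), each = 2)
)

# Assumption: fix the stacking order explicitly via factor levels
# (first level = bottom segment), then sort rows to match it.
DF$response <- factor(DF$response,
                      levels = c("Yes", "Yes, entire the journey",
                                 "Yes, part of the journey"))
DF <- DF[order(DF$sex, DF$year, DF$response), ]

# Offset of a segment = summed height of the segments below it in the
# same bar: cumulative sum within sex/year minus the segment's own height.
DF$vadj <- ave(DF$proportion, DF$sex, DF$year, FUN = cumsum) - DF$proportion

# Draw the plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(DF, aes(x = factor(year), y = proportion, fill = response)) +
    facet_grid(. ~ sex) +
    # reverse = TRUE stacks the first factor level at the bottom
    geom_col(position = position_stack(reverse = TRUE)) +
    geom_errorbar(aes(ymin = ci_l + vadj, ymax = ci_u + vadj), width = 0.2)
  print(p)
}
```

Because the offsets come from a grouped cumulative sum, the same code should keep working if rows are reordered or further response levels are added, as long as the factor levels reflect the intended stacking order.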