Subset / filter in dplyr chain with ggplot2
I would like to make a slopegraph along the lines (no pun intended) of this one. Ideally, I would like to do it all in a dplyr-style chain, but I hit a snag when I try to subset the data to add specific geom_text labels. Here's a toy example:
library(tidyverse)
# make tbl:
df <- tibble(
  area = rep(c("Health", "Education"), 6),
  sub_area = rep(c("Staff", "Projects", "Activities"), 4),
  year = c(rep(2016, 6), rep(2017, 6)),
  value = rep(c(15000, 12000, 18000), 4)
) %>% arrange(area)
# plot:
df %>% filter(area == "Health") %>%
  ggplot() +
  geom_line(aes(x = as.factor(year), y = value,
                group = sub_area, color = sub_area), size = 2) +
  geom_point(aes(x = as.factor(year), y = value,
                 group = sub_area, color = sub_area), size = 2) +
  theme_minimal(base_size = 18) +
  # the `.` in the next call is what triggers the error
  geom_text(data = dplyr::filter(., year == 2016 & sub_area == "Activities"),
            aes(x = as.factor(year), y = value,
                color = sub_area, label = area), size = 6, hjust = 1)
But this gives me:
Error in filter_(.data, .dots = lazyeval::lazy_dots(...)) :
  object '.' not found
Using subset instead of dplyr::filter gives me a similar error. The closest thing I found on SO / Google is this question, which addresses a slightly different issue.
What is the correct way to subset the data in a chain?
Edit: My reprex is a simplified example; in my real work I have one long chain. Mike's comment below works for the first case, but not the second.
If you wrap your plotting code in {...}, you can use . to specify exactly where the previously computed result is inserted:
library(tidyverse)
df <- tibble(
  area = rep(c("Health", "Education"), 6),
  sub_area = rep(c("Staff", "Projects", "Activities"), 4),
  year = c(rep(2016, 6), rep(2017, 6)),
  value = rep(c(15000, 12000, 18000), 4)
) %>% arrange(area)
df %>% filter(area == "Health") %>% {
  ggplot(.) +    # add . to specify to insert results here
    geom_line(aes(x = as.factor(year), y = value,
                  group = sub_area, color = sub_area), size = 2) +
    geom_point(aes(x = as.factor(year), y = value,
                   group = sub_area, color = sub_area), size = 2) +
    theme_minimal(base_size = 18) +
    geom_text(data = dplyr::filter(., year == 2016 & sub_area == "Activities"),    # and here
              aes(x = as.factor(year), y = value,
                  color = sub_area, label = area), size = 6, hjust = 1)
}
While this plot is probably not quite the one you want, at least it runs, so you can adjust it from there.
What's happening: Normally %>% passes the result of the left-hand side (LHS) to the first parameter of the right-hand side (RHS). However, if you wrap the RHS in curly braces, %>% will only pass the result to wherever you explicitly put a . placeholder. This formulation is useful for nested sub-pipelines or other complicated calls (like a ggplot chain) that cannot be sorted out just by redirecting with a single . in the call. For details see help('%>%', 'magrittr').
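To make the difference concrete, here is a minimal sketch using plain vectors instead of a plot (the variable names are only for illustration). It also shows why the original attempt fails: %>% binds more tightly than +, so everything written after the first + lies outside the pipe, and a . placed there is never defined.

```r
library(magrittr)

x <- c(1, 4, 9)

# Default: the LHS becomes the first argument of the RHS call.
first_arg <- x %>% sqrt()              # same as sqrt(x)

# Braces: the LHS is only available as `.`, wherever it is written,
# and it can be used more than once.
avg <- x %>% { sum(.) / length(.) }    # same as sum(x) / length(x)

# Precedence: this parses as (x %>% sum()) + 1, so anything after
# the + is evaluated outside the pipe.
shifted <- x %>% sum() + 1
```

In the ggplot case the same parsing applies: only ggplot() receives the piped data, while the geom_text(...) call added with + is an ordinary expression in which . does not exist.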
Writing:
geom_text(data = df[df$year == 2016 & df$sub_area == "Activities",],...
instead of
geom_text(data = dplyr::filter(., year == 2016 & sub_area == "Activities"),...
makes it work, but you still have a problem with the text position (you should be able to find help on SO for that easily).
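Note that the df[...] form needs the data to exist under a name, and it reads from the unfiltered df, so it does not carry over to one long chain. As a possible alternative, not taken from the answers here: ggplot2 layers accept a function for data, which is called with the plot's own data and must return a data frame. The sketch below assumes a ggplot2 version with that feature and reuses the toy data from the question; moving the shared aesthetics into ggplot() is a stylistic choice of this sketch.

```r
library(tidyverse)

df <- tibble(
  area = rep(c("Health", "Education"), 6),
  sub_area = rep(c("Staff", "Projects", "Activities"), 4),
  year = c(rep(2016, 6), rep(2017, 6)),
  value = rep(c(15000, 12000, 18000), 4)
) %>% arrange(area)

p <- df %>%
  filter(area == "Health") %>%
  ggplot(aes(x = as.factor(year), y = value,
             group = sub_area, color = sub_area)) +
  geom_line() +
  geom_point(size = 2) +
  theme_minimal(base_size = 18) +
  # `data` is a function here: ggplot2 calls it with the plot data
  # (the already filtered "Health" rows) and uses what it returns.
  geom_text(
    data = function(d) dplyr::filter(d, year == 2016, sub_area == "Activities"),
    aes(label = area), size = 6, hjust = 1
  )

p
```

Because the subsetting function receives whatever data reached ggplot(), no . placeholder or named intermediate object is needed.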