Subset / filter in dplyr chain with ggplot2

I would like to make a slopegraph along the lines (no pun intended) of this. Ideally, I would like to do it all in a dplyr-style chain, but I hit a snag when I try to subset the data to add specific geom_text labels. Here's a toy example:

# make tbl:

library(tidyverse)

df <- tibble(
  area = rep(c("Health", "Education"), 6),
  sub_area = rep(c("Staff", "Projects", "Activities"), 4),
  year = c(rep(2016, 6), rep(2017, 6)),
  value = rep(c(15000, 12000, 18000), 4)
) %>% arrange(area)


# plot: 

df %>% filter(area == "Health") %>% 
  ggplot() + 
  geom_line(aes(x = as.factor(year), y = value, 
            group = sub_area, color = sub_area), size = 2) + 
  geom_point(aes(x = as.factor(year), y = value, 
            group = sub_area, color = sub_area), size = 2) +
  theme_minimal(base_size = 18) + 
  geom_text(data = dplyr::filter(., year == 2016 & sub_area == "Activities"), 
  aes(x = as.factor(year), y = value, 
  color = sub_area, label = area), size = 6, hjust = 1)

      

But it gives me:

Error in filter_(.data, .dots = lazyeval::lazy_dots(...)) : object '.' not found

Using subset instead of dplyr::filter gives me a similar error. What I found on SO / Google is this question, which addresses a slightly different issue.

What is the correct way to subset the data in a chain?

Edit: My reprex is a simplified example; in my real work I have one long chain. Mike's comment below works for the first case, but not the second.



2 answers


If you wrap your plotting code in {...}, you can use . to specify exactly where the previously calculated results are inserted:

library(tidyverse)

df <- tibble(
  area = rep(c("Health", "Education"), 6),
  sub_area = rep(c("Staff", "Projects", "Activities"), 4),
  year = c(rep(2016, 6), rep(2017, 6)),
  value = rep(c(15000, 12000, 18000), 4)
) %>% arrange(area)

df %>% filter(area == "Health") %>% {
    ggplot(.) +    # add . to specify to insert results here
        geom_line(aes(x = as.factor(year), y = value, 
                      group = sub_area, color = sub_area), size = 2) + 
        geom_point(aes(x = as.factor(year), y = value, 
                       group = sub_area, color = sub_area), size = 2) +
        theme_minimal(base_size = 18) + 
        geom_text(data = dplyr::filter(., year == 2016 & sub_area == "Activities"),    # and here
                  aes(x = as.factor(year), y = value, 
                      color = sub_area, label = area), size = 6, hjust = 1)
}

      



While this plot is probably not the one you really want, at least it works, so you can edit it.

What's happening: Normally %>% passes the result of the left-hand side (LHS) to the first parameter of the right-hand side (RHS). However, if you wrap the RHS in curly braces, %>% will only pass the result to wherever you explicitly put a dot (.). This formulation is useful for nested sub-pipelines or other complicated calls (like a ggplot chain) that can't otherwise be sorted out just by redirecting with a dot. In the original code, the geom_text() call is added with +, which binds less tightly than %>%, so it sits outside the pipe and no object named . exists there. For details see help('%>%', 'magrittr').
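To see the three behaviours in isolation, here is a minimal sketch that uses only magrittr and plain integer vectors instead of a plot. The variable names outside, nested and braced are made up for the illustration; the first case reproduces the same kind of "object '.' not found" failure as in the question.

```r
library(magrittr)

# 1. %>% binds more tightly than +, so this parses as
#    (1:3 %>% sum()) + length(.)
#    and the dot sits outside the pipe, where no object named . exists.
outside <- tryCatch(
  1:3 %>% sum() + length(.),
  error = function(e) "error"
)

# 2. Without braces, a dot that appears only inside a nested call does not
#    stop the LHS from also being inserted as the first argument:
#    this is c(1:3, length(1:3)).
nested <- 1:3 %>% c(length(.))

# 3. With braces, the LHS goes only where the dot is written:
#    this is c(length(1:3)).
braced <- 1:3 %>% { c(length(.)) }
```

Case 1 is what happens to geom_text() in the question, and case 3 is what the braces in this answer switch on.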



Writing:

geom_text(data = df[df$year == 2016 & df$sub_area == "Activities",],...

instead of

geom_text(data = dplyr::filter(., year == 2016 & sub_area == "Activities"),...

makes it work, but you will still have problems with the text position (you should easily be able to find help on SO for that).
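If the real chain is long and you would rather not refer back to df, another possibility (not from the answers here, so treat it as a sketch) is to pass a function as the data argument of the layer. ggplot2 documents that such a function is called with the plot data, which here is whatever was piped into ggplot(). This assumes a reasonably recent ggplot2; the object name p and the argument name d are only for illustration, and the aesthetics are moved into ggplot() to keep the example short.

```r
library(dplyr)
library(ggplot2)

df <- tibble(
  area = rep(c("Health", "Education"), 6),
  sub_area = rep(c("Staff", "Projects", "Activities"), 4),
  year = c(rep(2016, 6), rep(2017, 6)),
  value = rep(c(15000, 12000, 18000), 4)
) %>% arrange(area)

p <- df %>%
  filter(area == "Health") %>%
  ggplot(aes(x = as.factor(year), y = value,
             group = sub_area, color = sub_area)) +
  geom_line() +
  geom_point() +
  geom_text(
    # A function given as data receives the plot data (the filtered tibble),
    # so no dot placeholder is needed inside the layer.
    data = function(d) filter(d, year == 2016, sub_area == "Activities"),
    aes(label = area), hjust = 1
  )
```

Because the subsetting happens inside the layer, it follows whatever filter comes earlier in the chain, which the df[...] version above does not.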
