Conditional and grouped dplyr mutate

Let's say I have the following data on the number of socks per drawer:

>socks
year  drawer_nbr  sock_total
1990    1           2
1991    1           2
1990    2           3
1991    2           4
1990    3           2
1991    3           1

      

I would like a binary variable that identifies whether the number of socks in each drawer has grown: 1 if it has increased and 0 if not. The result should be:

>socks
drawer_nbr  growth
  <dbl>     <factor>
    1          0  
    2          1
    3          0

      

I get hung up on comparing sock_total in one year versus sock_total in another year. I know that I need to use dplyr::summarise(), but I am having difficulty with what goes inside this function.



2 answers


If you are comparing 1991 to 1990, you can subset sock_total by year inside summarise():



socks %>% 
    group_by(drawer_nbr) %>% 
    summarise(growth = +(sock_total[year == 1991] - sock_total[year == 1990] > 0))
# A tibble: 3 x 2
#  drawer_nbr growth
#       <int>  <int>
#1          1      0
#2          2      1
#3          3      0
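
The unary `+` in the snippet above coerces the logical result of the comparison to an integer (TRUE becomes 1, FALSE becomes 0). Below is a minimal sketch of the same idea using as.integer(), which some readers may find easier to follow. It assumes each drawer has exactly one row for 1990 and one for 1991; otherwise the comparison would not yield exactly one value per drawer. The socks tibble is rebuilt from the table in the question so the example is self-contained.

```r
library(dplyr)

# Data rebuilt from the table in the question
socks <- tibble(
  year       = c(1990, 1991, 1990, 1991, 1990, 1991),
  drawer_nbr = c(1, 1, 2, 2, 3, 3),
  sock_total = c(2, 2, 3, 4, 2, 1)
)

result <- socks %>%
  group_by(drawer_nbr) %>%
  summarise(
    # as.integer() does the same job as the unary `+`: TRUE -> 1L, FALSE -> 0L
    # Assumes exactly one 1990 row and one 1991 row per drawer
    growth = as.integer(sock_total[year == 1991] > sock_total[year == 1990])
  )

result
```

If growth should be a factor, as in the desired output shown in the question, wrapping the expression in factor() is one option.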

      



You can use a combination of dplyr and tidyr:

library(tidyr)
library(dplyr)

socks %>%
  group_by(drawer_nbr) %>% 
  spread(year, sock_total) %>%
  mutate(growth = `1991` - `1990`)

      



Or, if you want the growth to be binary:

socks %>%
  group_by(drawer_nbr) %>% 
  spread(year, sock_total) %>%
  mutate(growth = ifelse((`1991` - `1990`) > 0,
                         1, 0))
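
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). Here is a sketch of the binary version written that way, assuming tidyr >= 1.0.0 is installed. group_by() is left out because, once the years sit in separate columns, the comparison is already evaluated row by row. The socks tibble is rebuilt from the table in the question.

```r
library(dplyr)
library(tidyr)

# Data rebuilt from the table in the question
socks <- tibble(
  year       = c(1990, 1991, 1990, 1991, 1990, 1991),
  drawer_nbr = c(1, 1, 2, 2, 3, 3),
  sock_total = c(2, 2, 3, 4, 2, 1)
)

wide <- socks %>%
  # One row per drawer, one column per year (named `1990` and `1991`)
  pivot_wider(names_from = year, values_from = sock_total) %>%
  # 1 if the 1991 total exceeds the 1990 total, 0 otherwise
  mutate(growth = as.integer(`1991` > `1990`))

wide
```

To match the two-column result asked for in the question, a final select(drawer_nbr, growth) can be appended to the pipeline.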

      
