Copy until value changes in R

I have a data frame that contains multiple lines, see example.

df <- data.frame(rbind(c('1','CAR','Jan'),
                       c('2','3','4'),
                       c('5','6','7'),
                       c('8','CAR','Feb'),
                       c('9','10', '11'),
                       c('12','13','14')))

      

I would like to copy the value that appears after CAR (January and February) and copy it to the new column X4 until CAR appears again. The number of rows is not always the same between CARs, the number of columns is the same.

The data should look like this:

data.frame(rbind(c('1','CAR','Jan','Jan' ),
                     c('2','3','4','Jan'),
                     c('5','6','7','Jan'),
                     c('8','CAR','Feb','Feb'),
                     c('9','10','11','Feb'),
                     c('11','12','12','Feb')))

      

I've tried different options (ifelse, if, for loop) but none of them gives the correct result.

Do you have any hints on how to solve this? thanks in advance Eric

+3


source to share


3 answers


Here's a uglier, basic version of David's answer:



df$X4 <- unlist(tapply(
  df$X3,
  cumsum(df$X2=="CAR"),
  function(x){y <- levels(x)[x[1]]; rep(y,length(x))}
))

      

+5


source


Here's another solution data.table

library(data.table)
setDT(df)[, X4 := X3[1L], by = cumsum(X2 == "CAR")]
df
#    X1  X2  X3  X4
# 1:  1 CAR Jan Jan
# 2:  2   3   4 Jan
# 3:  5   6   7 Jan
# 4:  8 CAR Feb Feb
# 5:  9  10  11 Feb
# 6: 12  13  14 Feb

      



We can also do a similar thing using dplyr

(but it will add a column as well indx

)

library(dplyr)
df %>% 
  group_by(indx = cumsum(X2 == "CAR")) %>%
  mutate(X4 = X3[1L])

      

+7


source


You may try

library(data.table)
library(zoo)
setDT(df)[X2=='CAR', X4:= X3][, X4:= na.locf(X4)]
#   X1  X2  X3  X4
#1:  1 CAR Jan Jan
#2:  2   3   4 Jan
#3:  5   6   7 Jan
#4:  8 CAR Feb Feb
#5:  9  10  11 Feb
#6: 12  13  14 Feb

      

+6


source







All Articles