Number consecutive runs of the same value in one column in R

I have a data frame with multiple columns. I need to label the sequence in col2 so that every time the value changes from a to b or from b to a, the following rows are grouped under a new label. The expected result is shown in the Desired column:

testdf <- data.frame(mydate = seq(as.Date('2012-01-01'), 
                                  as.Date('2012-01-10'), by = 'day'),
                     col1 = 1:10,
                     col2 = c("a","a","b","b","a","b","a","b","a","a"),
                     Desired= c(1,1,2,2,3,4,5,6,7,7))

      

       mydate col1 col2 Desired
1  2012-01-01    1    a       1
2  2012-01-02    2    a       1
3  2012-01-03    3    b       2
4  2012-01-04    4    b       2
5  2012-01-05    5    a       3
6  2012-01-06    6    b       4
7  2012-01-07    7    a       5
8  2012-01-08    8    b       6
9  2012-01-09    9    a       7
10 2012-01-10   10    a       7

Is there a way to solve this without for loops? The dataset has over 1 million rows.


2 answers


You can try this:



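# factor() is needed in R >= 4.0.0, where data.frame() keeps strings as character;
# as.numeric() on the raw character column would return NA.
# diff() != 0 marks every change of value; cumsum() counts those changes.
output <- c(0, cumsum(diff(as.numeric(factor(testdf$col2))) != 0)) + 1
#> output
#[1] 1 1 2 2 3 4 5 6 7 7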
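
The same idea can be written without converting to numbers at all. The following is a minimal base R sketch, assuming col2 has at least one element and contains no NA values; the names is_new_run, run_id and run_id_rle are only illustrative. It compares each value with the one before it, and also shows an equivalent formulation with rle():

```r
# Same col2 values as in the question.
col2 <- c("a", "a", "b", "b", "a", "b", "a", "b", "a", "a")

# TRUE wherever a value differs from the previous one.
# The first element always starts a new run.
is_new_run <- c(TRUE, col2[-1] != col2[-length(col2)])

# The running count of run starts is the group label.
run_id <- cumsum(is_new_run)

# Equivalent formulation: rle() finds the runs, and each run's
# index is repeated as many times as the run is long.
r <- rle(col2)
run_id_rle <- rep(seq_along(r$lengths), r$lengths)

print(run_id)
#> [1] 1 1 2 2 3 4 5 6 7 7
```

Both versions are fully vectorized, so they should scale to a column with over a million rows without an explicit loop.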

      



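Here is a dplyr alternative. Grouping by col2 is not needed, because the label depends on row order: compare each value with the previous one and count the changes.



library(dplyr)

# lag() shifts col2 down by one row; the default makes the first comparison FALSE
testdf %>% mutate(first = cumsum(col2 != lag(col2, default = col2[1])) + 1)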
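
If a recent dplyr is installed, the run labels can also be produced with a dedicated helper. This is a hedged sketch that assumes dplyr 1.1.0 or later, where consecutive_id() is available; the column name run_id is only illustrative:

```r
library(dplyr)  # assumption: dplyr >= 1.1.0, which provides consecutive_id()

# Reduced version of the question's data frame.
testdf <- data.frame(
  col1 = 1:10,
  col2 = c("a", "a", "b", "b", "a", "b", "a", "b", "a", "a")
)

# consecutive_id() starts at 1 and increments each time col2 changes
# from one row to the next, so no group_by() is involved.
result <- testdf %>% mutate(run_id = consecutive_id(col2))

print(result$run_id)
```

On older dplyr versions, a comparison of each value against lag(col2) combined with cumsum() gives the same labels.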

      
