Comparing rows in R

I have the following data frame:

    R_Number    A  
    1           0  
    2           15  
    3           10  
    4           11  
    5           12  
    6           18  
    7           19  
    8           15  
    9           17  
    10          11  

      

Now I need to create another column B that holds the result of comparing values in A. The comparison is not between two consecutive rows: row 1 is compared with row 4, likewise row 2 is compared with row 5, and so on until the end of the data. Comparison condition:

    if (A[1] >= 15 && A[4] <= 12) {
      B[1] <- 1
    } else if (A[1] <= 0 && A[4] >= 10) {
      B[1] <- 2
    } else {
      B[1] <- 0
    }

      

For rows near the end of the data, such as rows 8 and 9, there is no row three positions further down to compare with, so the value should be 0.

In addition, the result of comparing rows 1 and 4 is stored in row 1; similarly, the result of comparing rows 2 and 5 is stored in row 2.

Thus, the resulting data frame should look like this:

    R_Number    A       B  
    1           0       2
    2           15      1
    3           10      0 
    4           11      0
    5           12      0
    6           18      0
    7           19      1
    8           15      0
    9           17      0
    10          11      0

      



2 answers


First shift the variable (here by three rows, so each value lines up with the one three rows ahead), then compute the new column from the shifted values. Something like this:

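    library(Hmisc)  # provides Lag()

    df <- data.frame(R_Number = 1:10, A = c(0, 15, 10, 11, 12, 18, 19, 15, 17, 11))
    # A negative shift looks ahead: A_Lag[i] is A[i + 3], NA for the last three rows
    A_Lag <- Lag(df$A, -3)
    # First condition contributes 1, second contributes 2; NA comparisons are dropped, giving 0
    df$B <- rowSums(cbind(df$A >= 15 & A_Lag <= 12, (df$A <= 0 & A_Lag >= 10) * 2), na.rm = TRUE)
    df$B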

      



I tried to avoid if statements. The function Lag comes from the Hmisc package.

    > df$B
     [1] 2 1 0 0 0 0 1 0 0 0
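
For readers who would rather not install Hmisc, the same look-ahead can be sketched in base R. This is only an illustration, not part of the answer above: the helper name lead_by is hypothetical, and the missing comparisons at the end are handled explicitly with is.na instead of na.rm.

```r
# Rebuild the data from the question.
df <- data.frame(R_Number = 1:10,
                 A = c(0, 15, 10, 11, 12, 18, 19, 15, 17, 11))

# Hypothetical helper: shift x up by k positions, padding the tail with NA.
lead_by <- function(x, k) {
  n <- length(x)
  if (k >= n) return(rep(NA, n))
  c(x[(k + 1):n], rep(NA, k))
}

A_lead <- lead_by(df$A, 3)

# Rows without a partner three positions ahead compare against NA;
# treat those comparisons as "no match" so that B becomes 0.
cond1 <- !is.na(A_lead) & df$A >= 15 & A_lead <= 12
cond2 <- !is.na(A_lead) & df$A <= 0 & A_lead >= 10
df$B <- ifelse(cond1, 1, ifelse(cond2, 2, 0))
df$B
```

The nested ifelse mirrors the if / else if / else chain from the question, but works on whole vectors at once.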

      



As per @nicola's comment, I tried to solve your problem. I recreated your original dataframe:

    df <- data.frame(R_Number = 1:10, A = c(0, 15, 10, 11, 12, 18, 19, 15, 17, 11), B = 0)

      

Then I used an if statement inside a for loop:



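    # Only rows that have a partner three rows further down are compared;
    # seq_len() yields an empty loop when the data frame has fewer than four rows.
    for (i in seq_len(max(nrow(df) - 3, 0))) {
      if (df$A[i] >= 15 && df$A[i + 3] <= 12) {
        df$B[i] <- 1
      } else if (df$A[i] <= 0 && df$A[i + 3] >= 10) {
        df$B[i] <- 2
      } else {
        df$B[i] <- 0
      }
    }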

      

In the last edit, I fixed the problem that occurred when the length of the data frame changed, so this is now a general solution!
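
To make the "general" part concrete, the loop could be wrapped in a function along these lines. This is a minimal sketch under a few assumptions: the name compare_ahead and the offset argument are hypothetical, and the input is assumed to contain no NA values (the if conditions would fail on them).

```r
# Hypothetical wrapper around the loop; `offset` is how many rows ahead to look.
compare_ahead <- function(a, offset = 3) {
  b <- numeric(length(a))                # every row starts at 0
  n_pairs <- max(length(a) - offset, 0)  # rows that actually have a partner
  for (i in seq_len(n_pairs)) {
    if (a[i] >= 15 && a[i + offset] <= 12) {
      b[i] <- 1
    } else if (a[i] <= 0 && a[i + offset] >= 10) {
      b[i] <- 2
    }
  }
  b
}

res_full  <- compare_ahead(c(0, 15, 10, 11, 12, 18, 19, 15, 17, 11))
res_short <- compare_ahead(c(20, 5))  # too short for any comparison
res_full
res_short
```

Because the result vector starts at 0, rows without a partner need no special case, and inputs shorter than the offset simply return all zeros.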
