Replace / change values in a boolean vector (pattern matching)

The question looks simple, but I can't work out how to do it in R. I want to modify a boolean vector depending on the patterns of its values. There are two stages of modification:

  • If a single FALSE is surrounded by TRUE values on both sides, switch it to TRUE.
  • If a run of consecutive TRUE values is shorter than three, switch the whole run to FALSE.

Everything else should remain as it is. Here's an example:

# input  
x = c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE,
    FALSE, TRUE, TRUE, FALSE, FALSE, TRUE,  TRUE,  TRUE,  TRUE, FALSE)

# output
xo = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, 
   TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE,  TRUE,  TRUE,  TRUE, FALSE)

      

cbind(x,xo)

gives

          x    xo
 [1,] FALSE FALSE
 [2,]  TRUE FALSE
 [3,] FALSE FALSE
 [4,] FALSE FALSE
 [5,]  TRUE FALSE
 [6,]  TRUE FALSE
 [7,] FALSE FALSE
 [8,] FALSE FALSE
 [9,]  TRUE  TRUE
[10,]  TRUE  TRUE
[11,]  TRUE  TRUE
[12,] FALSE  TRUE
[13,]  TRUE  TRUE
[14,]  TRUE  TRUE
[15,] FALSE FALSE
[16,] FALSE FALSE
[17,]  TRUE  TRUE
[18,]  TRUE  TRUE
[19,]  TRUE  TRUE
[20,]  TRUE  TRUE
[21,] FALSE FALSE

      

I don't want to use a for loop because it is slow and I would need a lot of if statements.

Is there a better way to get this to work?

+3




3 answers


Here's one approach:

#sample data 
x <- c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE,
    FALSE, TRUE, TRUE, FALSE, FALSE, TRUE,  TRUE,  TRUE,  TRUE, FALSE)

      

First, find the indices of the FALSE values that need to be changed to TRUE: look for FALSE values that both follow a TRUE and are followed by a TRUE.

tochange <- 
  intersect(
    intersect(
     which(x == FALSE),   # not strictly necessary
     which(diff(x) == 1)  # FALSEs followed by a TRUE
     ),
    which(diff(x) == -1) + 1 # FALSEs that follow a TRUE
    )
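To see why the two diff() conditions pick out an isolated FALSE, here is a small sketch on a toy vector (y and the names rises, falls, isolated are made up for illustration and are not part of the answer). It assumes only base R.

```r
# Toy vector, invented for illustration (not the data from the question)
y <- c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE)

# Logicals are treated as 0/1 when differenced:
# +1 marks a FALSE -> TRUE step, -1 marks a TRUE -> FALSE step
d <- diff(y)

rises <- which(d == 1)       # positions of a FALSE whose next element is TRUE
falls <- which(d == -1) + 1  # positions of a FALSE whose previous element is TRUE

# A position in both sets is a single FALSE with TRUE on both sides
isolated <- intersect(rises, falls)
isolated  # 2
```

Positions 5 and 6 form a run of two FALSE values, so each of them meets only one of the two conditions and is left alone.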

      



Change those values:

x[tochange] <- TRUE

      

Then find the runs of TRUE (and FALSE) that are shorter than 3 and set them to FALSE. Short FALSE runs are already FALSE, so including them changes nothing.

xrle <- rle(x)

xrle$values[xrle$lengths < 3] <-  FALSE

newx <- inverse.rle(xrle) # thanks to Frank for pointing out inverse.rle!

# [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
#[10]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE
#[19]  TRUE  TRUE FALSE
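If rle() is unfamiliar, this small sketch shows what it returns and how inverse.rle() rebuilds the vector after the run values are edited. The vector y is a made-up example, not the data from the question.

```r
# Toy vector, invented for illustration
y <- c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE)

r <- rle(y)
r$lengths  # 2 1 3 1
r$values   # TRUE FALSE TRUE FALSE

# Overwrite the value of every run shorter than 3
r$values[r$lengths < 3] <- FALSE

# Expand the edited runs back into a full-length vector
out <- inverse.rle(r)
out  # FALSE FALSE FALSE TRUE TRUE TRUE FALSE
```

Only the values are edited, never the lengths, so the result always has the same length as the input.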

      

+3




You can try rle (thanks to @Frank for the modification):



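# Step 1: flip single-FALSE runs to TRUE, skipping the first and last runs
xtmp <- inverse.rle(within.list(rle(x),{
    n    <- length(values)
    values[lengths == 1 & !values & ! seq_len(n) %in% c(1,n)] <- TRUE
}))

# Step 2: set TRUE runs shorter than 3 to FALSE
res <- inverse.rle(within.list(rle(xtmp),
    values[lengths < 3 & values] <- FALSE
))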

identical(xo,res) # TRUE
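The seq_len(n) %in% c(1,n) condition is there because a single FALSE at either end of the vector has a TRUE on only one side. A rough sketch of the same step without within.list, on a made-up vector y (the names single_false and interior are hypothetical helpers):

```r
# Toy vector with a single FALSE at the start, in the middle, and at the end
y <- c(FALSE, TRUE, TRUE, FALSE, TRUE, FALSE)

r <- rle(y)
n <- length(r$values)

single_false <- r$lengths == 1 & !r$values   # runs consisting of one FALSE
interior     <- !(seq_len(n) %in% c(1, n))   # runs that are neither first nor last

# Only an interior single FALSE has a TRUE run on both sides
r$values[single_false & interior] <- TRUE

filled <- inverse.rle(r)
filled  # FALSE TRUE TRUE TRUE TRUE FALSE
```

The FALSE values at positions 1 and 6 stay FALSE, while the one at position 4 is filled in.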

      

+3




Try:

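make_true <- function(x) {
  # collapse the vector into a string of 0s and 1s
  string <- paste(as.numeric(x), collapse='')
  # the lookahead finds every "101", including overlapping matches
  ans <- gregexpr('(?=(101))', string, perl=TRUE)
  # the 0 sits one position after the start of each match
  x[as.numeric(ans[[1]])+1L] <- TRUE
  res <- rle(x)
  res$values[res$lengths < 3] <- FALSE
  inverse.rle(res)
}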

      

The function takes advantage of the fact that TRUE and FALSE can be coerced to the numbers 1 and 0, so the vector can be searched as a string. The search pattern is "101", wrapped in a lookahead.
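The lookahead in '(?=(101))' matters when two single FALSE values sit close together, as in TRUE FALSE TRUE FALSE TRUE. A small sketch on a made-up string s shows the difference between the plain pattern and the lookahead version:

```r
# Two overlapping "101" patterns share the middle "1"
s <- "10101"

# A plain pattern consumes the characters it matches, so the second "101" is missed
plain <- as.integer(gregexpr("101", s)[[1]])
plain  # 1

# A lookahead matches without consuming, so every starting position is reported
ahead <- as.integer(gregexpr("(?=(101))", s, perl = TRUE)[[1]])
ahead  # 1 3

# Adding 1 to each start gives the positions of the zeros to flip
ahead + 1L  # 2 4
```

With the plain pattern only the first FALSE would be switched to TRUE in such a case.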

+1

