Replace / change values in a boolean vector (pattern matching)
The question looks simple, but I can't figure out how to do it in R. I want to modify a boolean vector based on patterns in its values. There are two modification steps:
- If a single FALSE is surrounded by TRUE values, switch it to TRUE.
- If there are fewer than three consecutive TRUE values, switch them to FALSE.
Everything else should remain as it is. Here's an example:
# input
x = c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE,
FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)
# output
xo = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE,
TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)
cbind(x,xo)
gives
x xo
[1,] FALSE FALSE
[2,] TRUE FALSE
[3,] FALSE FALSE
[4,] FALSE FALSE
[5,] TRUE FALSE
[6,] TRUE FALSE
[7,] FALSE FALSE
[8,] FALSE FALSE
[9,] TRUE TRUE
[10,] TRUE TRUE
[11,] TRUE TRUE
[12,] FALSE TRUE
[13,] TRUE TRUE
[14,] TRUE TRUE
[15,] FALSE FALSE
[16,] FALSE FALSE
[17,] TRUE TRUE
[18,] TRUE TRUE
[19,] TRUE TRUE
[20,] TRUE TRUE
[21,] FALSE FALSE
I don't want to use a for loop because it is slow and I would have to do a lot of if statements.
Is there a better way to get this to work?
Here's the approach:
#sample data
x <- c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE,
FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)
First, find the indices of the FALSE values that should become TRUE: look for FALSE values that both follow a TRUE and are followed by a TRUE.
tochange <-
  intersect(
    intersect(
      which(x == FALSE),        # not strictly necessary
      which(diff(x) == 1)       # FALSEs followed by a TRUE
    ),
    which(diff(x) == -1) + 1    # FALSEs that follow a TRUE
  )
Change the values
x[tochange] <- TRUE
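To see why the two diff() conditions single out an isolated FALSE, here is a minimal sketch on a small made-up vector (y and the helper names are illustrative, not part of the question):

```r
# Hypothetical example vector, chosen to contain one isolated FALSE (position 2)
# and one run of two FALSEs (positions 5 and 6)
y <- c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE)

# diff() coerces logicals to 0/1: +1 marks a FALSE -> TRUE step,
# -1 marks a TRUE -> FALSE step
d <- diff(y)

followed_by_true <- which(d == 1)       # i such that y[i] is FALSE and y[i + 1] is TRUE
follows_true     <- which(d == -1) + 1  # i such that y[i - 1] is TRUE and y[i] is FALSE

# Only a single FALSE between two TRUEs satisfies both conditions
isolated <- intersect(followed_by_true, follows_true)
isolated
```

Position 6 is followed by a TRUE and position 5 follows a TRUE, but neither satisfies both conditions, so the two-element FALSE run is left alone.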
Then find the runs of TRUE (and FALSE) that are shorter than 3 and set them to FALSE. Resetting short FALSE runs is harmless, since they are already FALSE.
xrle <- rle(x)
xrle$values[xrle$lengths < 3] <- FALSE
newx <- inverse.rle(xrle) # thanks to Frank for pointing out inverse.rle!
# [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
#[10] TRUE TRUE TRUE TRUE TRUE FALSE FALSE TRUE TRUE
#[19] TRUE TRUE FALSE
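For reuse, the two steps can be bundled into one function and checked against the expected output from the question. This is only a sketch: the name fill_and_prune is made up here, and it drops the which(x == FALSE) condition that the comment above marks as not strictly necessary.

```r
# Hypothetical wrapper around the two steps above (name is an assumption)
fill_and_prune <- function(x) {
  d <- diff(x)
  # Step 1: a FALSE that follows a TRUE and is followed by a TRUE
  gaps <- intersect(which(d == 1), which(d == -1) + 1)
  x[gaps] <- TRUE
  # Step 2: runs shorter than 3 become FALSE
  r <- rle(x)
  r$values[r$lengths < 3] <- FALSE
  inverse.rle(r)
}

# Input and expected output copied from the question
x <- c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE,
       FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)
xo <- c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE,
        TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)

result <- fill_and_prune(x)
identical(result, xo) # TRUE
```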
You can try rle
(thanks to @Frank for the modification)
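# Step 1: turn single-FALSE runs into TRUE, skipping the first and last run
xtmp <- inverse.rle(within.list(rle(x), {
  n <- length(values)
  values[lengths == 1 & !values & !seq_len(n) %in% c(1, n)] <- TRUE
}))
# Step 2: set TRUE runs shorter than 3 to FALSE
res <- inverse.rle(within.list(rle(xtmp),
  values[lengths < 3 & values] <- FALSE
))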
identical(xo,res) # TRUE
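It may help to look at what rle() returns and why the first and last runs are excluded. A minimal sketch on a made-up vector (y, interior and fill are illustrative names, not from the answer):

```r
# Hypothetical example: starts and ends with a single FALSE
y <- c(FALSE, TRUE, TRUE, FALSE, TRUE, FALSE)

r <- rle(y)
r$lengths   # 1 2 1 1 1
r$values    # FALSE TRUE FALSE TRUE FALSE

# A run can only be surrounded by TRUEs if it is neither the first nor the last run
n <- length(r$values)
interior <- !(seq_len(n) %in% c(1, n))

# Single-FALSE runs in the interior are the ones to fill
fill <- r$lengths == 1 & !r$values & interior
r$values[fill] <- TRUE

filled <- inverse.rle(r)
filled      # FALSE TRUE TRUE TRUE TRUE FALSE
```

Because runs alternate between TRUE and FALSE, any interior FALSE run has a TRUE run on each side, so checking the run length and position is enough.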
Try:
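make_true <- function(x) {
  # Encode the vector as a string of 0s and 1s
  string <- paste(as.numeric(x), collapse = '')
  # The lookahead finds every start of "101", including overlapping ones
  ans <- gregexpr('(?=(101))', string, perl = TRUE)
  # The isolated FALSE sits one position after each match start
  x[as.numeric(ans[[1]]) + 1L] <- TRUE
  # Set runs shorter than 3 to FALSE
  res <- rle(x)
  res$values[res$lengths < 3] <- FALSE
  inverse.rle(res)
}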
The function relies on the fact that TRUE and FALSE can be coerced to the numbers 1 and 0. The search pattern is "101", that is, a single 0 between two 1s.
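The lookahead matters when patterns overlap, as in "10101". A small sketch on a made-up vector (y, s, starts and plain are illustrative names) compares the lookahead pattern with a plain "101" search:

```r
# Hypothetical example with two overlapping "101" patterns at the start
y <- c(TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE)

s <- paste(as.numeric(y), collapse = "")
s   # "101011001"

# Zero-width lookahead: matches consume no characters, so overlaps are found
starts <- as.integer(gregexpr("(?=(101))", s, perl = TRUE)[[1]])
starts        # 1 3

# Plain pattern: the first match consumes characters 1 to 3, so the second is missed
plain <- as.integer(gregexpr("101", s)[[1]])
plain         # 1

# The FALSE to flip is one position after each match start
starts + 1L   # 2 4
```

When nothing matches, gregexpr() returns -1, so the index becomes 0 and the assignment in make_true leaves x unchanged.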