Avoid looping in string replacement?

I have data, a character vector (I end up breaking it down, so I don't care if it stays a vector or if it's treated as a single string), a pattern vector, and a replacement vector. I want every pattern in the data to be replaced with the corresponding replacement. I did it with stringr and a for loop, but is there a more R-shaped way to do this?

library(stringr)
start_string <- sample(letters[1:10], 10)
my_pattern <- c("a", "b", "c", "z")
my_replacement <- c("[this was an a]", "[this was a b]", "[this was a c]", "[no z!]")
str_replace(start_string, pattern = my_pattern, replacement = my_replacement)
# bad lengths, doesn't work

str_replace(paste0(start_string, collapse = ""),
    pattern = my_pattern, replacement = my_replacement)
# vector output, not what I want in this case

my_result <- start_string
for (i in seq_along(my_pattern)) {
    my_result <- str_replace(my_result,
        pattern = my_pattern[i], replacement = my_replacement[i])
}
> my_result
 [1] "[this was a c]"  "[this was an a]" "e"               "g"               "h"               "[this was a b]" 
 [7] "d"               "j"               "f"               "i"   

# This is what I want, but is there a better way?

      

In my case, I know that every pattern will occur at most once, but not every pattern will occur. I know that I could use str_replace_all if patterns can appear more than once; I hope the solution provides this option as well. I also need a solution that uses my_pattern and my_replacement so that it can be part of a function with these vectors as arguments.


2 answers


I'm willing to bet there are other ways to do this, but my first thought was gsubfn:

my_repl <- function(x){
    switch(x,a = "[this was an a]",
             b = "[this was a b]",
             c = "[this was a c]",
             z = "[this was a z]")
}

library(gsubfn)    
start_string <- sample(letters[1:10], 10)
gsubfn("a|b|c|z",my_repl,x = start_string)

      

If the patterns you are searching for are valid names for list elements, this will work as well:

names(my_replacement) <- my_pattern
gsubfn("a|b|c|z",as.list(my_replacement),start_string)

      

Edit



But honestly, if I really had to do this a lot in my own code, I would probably just write a for loop wrapped in a function. Here's a simple version using sub and gsub rather than the functions from stringr:

vsub <- function(pattern,replacement,x,all = TRUE,...){
  FUN <- if (all) gsub else sub
  for (i in seq_len(min(length(pattern),length(replacement)))){
    x <- FUN(pattern = pattern[i],replacement = replacement[i],x,...)
  }
  x
}

vsub(my_pattern,my_replacement,start_string)
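
If the explicit loop is the only objection, the same successive replacement can be written as a fold with base R's Reduce. This is only a sketch: vsub_reduce is a hypothetical name, not part of any package, and it still applies the patterns one after another in the order given.

```r
# Hypothetical helper: fold sub()/gsub() over the pattern indices.
# Behaves like the loop version; replacements are still applied in order.
vsub_reduce <- function(pattern, replacement, x, all = TRUE, ...) {
  FUN <- if (all) gsub else sub
  n <- min(length(pattern), length(replacement))
  # `acc` is the partially replaced vector; `x` is the initial value
  Reduce(function(acc, i) FUN(pattern[i], replacement[i], acc, ...),
         seq_len(n), x)
}

res <- vsub_reduce(c("a", "b", "c", "z"),
                   c("[this was an a]", "[this was a b]",
                     "[this was a c]", "[no z!]"),
                   c("a", "d", "b"))
print(res)
```

Whether this reads better than the for loop is a matter of taste; it does not change the behaviour.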

      

But, of course, one of the reasons there is no well-known built-in function for this is probably that successive replacements like this can be quite fragile, because they are so order-dependent:

vsub(rev(my_pattern),rev(my_replacement),start_string)
 [1] "i"                                          "[this w[this was an a]s [this was an a] c]"
 [3] "[this was an a]"                            "g"                                         
 [5] "j"                                          "d"                                         
 [7] "f"                                          "[this w[this was an a]s [this was an a] b]"
 [9] "h"                                          "e"      
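
The output above depends on a random sample, so here is a small deterministic sketch of the same effect. The vectors x, pats and reps and the helper sequential are made up for illustration; the point is only that a replacement produced by one step can be picked up as a match by a later step.

```r
x    <- c("a", "b")
pats <- c("a", "b")
reps <- c("b", "c")

# Illustrative helper: apply the replacements one after another, in order
sequential <- function(pats, reps, x) {
  for (i in seq_along(pats)) {
    x <- gsub(pats[i], reps[i], x, fixed = TRUE)
  }
  x
}

forward  <- sequential(pats, reps, x)            # a -> b first, then b -> c
backward <- sequential(rev(pats), rev(reps), x)  # b -> c first, then a -> b

print(forward)   # "c" "c"
print(backward)  # "b" "c"
```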

      



Here's an option based on gregexpr, regmatches and regmatches<-. Be aware that there are limits on the length of the regexps that can be matched, so it won't work if you try to match too many long patterns with it.



replaceSubstrings <- function(patterns, replacements, X) {
    pat <- paste(patterns, collapse="|")
    m <- gregexpr(pat, X)
    regmatches(X, m) <- 
        lapply(regmatches(X,m),
               function(XX) replacements[match(XX, patterns)])
    X
}

## Try it out
patterns <- c("cat", "dog")
replacements <- c("tiger", "coyote")
sentences <- c("A cat", "Two dogs", "Raining cats and dogs")
replaceSubstrings(patterns, replacements, sentences)
## [1] "A tiger"                    "Two coyotes"               
## [3] "Raining tigers and coyotes"
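
To make the mechanics a bit more concrete, here is a step-by-step sketch of what happens inside the function for a single sentence. The variable names x, found and idx are only for illustration. It also shows a caveat that follows from the use of match(): the lookup is by exact equality, so the function appears to assume that the patterns are literal strings rather than true regular expressions.

```r
patterns     <- c("cat", "dog")
replacements <- c("tiger", "coyote")
x <- "Raining cats and dogs"

# One alternation regex, "cat|dog", finds every match in a single pass
m <- gregexpr(paste(patterns, collapse = "|"), x)
found <- regmatches(x, m)[[1]]
print(found)                 # "cat" "dog"

# match() looks each matched substring up in `patterns` by exact equality
idx <- match(found, patterns)
print(replacements[idx])     # "tiger" "coyote"

# Caveat: a real regex pattern is not equal to the text it matched,
# so the lookup would yield NA for a pattern such as "c.t"
print(match("cat", "c.t"))   # NA
```

Because all matches are located in the original text before anything is replaced, the result does not depend on the order of the patterns in the way a sequential loop does.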

      
