Trying to avoid looping with sapply (for gsub)
I'm trying to avoid the for loop in the following code by using sapply, if at all possible. The loop solution works great for me; I'm just trying to learn more R and explore as many methods as possible.
Purpose: I have a vector i and two vectors, sf (search) and rp (replace). For each element of i, it is necessary to loop over sf and replace each match with the corresponding element of rp.
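i = c("1 6 5 4", "7 4 3 1")
sf = c("1", "2", "3")           # search strings
rp = c("one", "two", "three")   # replacements, paired with sf by position
funn <- function(i) {
  # apply each (search, replace) pair in turn to the whole vector
  for (j in seq_along(sf)) i = gsub(sf[j], rp[j], i, fixed = TRUE)
  return(i)
}
print(funn(i))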
Result (correct):
[1] "one 6 5 4" "7 4 three one"
I would like to do the same, but with sapply:
#Trying to avoid a for loop in a fun
#funn1 <- function(i) {
# i = gsub(sf,rp,i,fixed=T)
# return(i)
#}
#print(sapply(i,funn1))
Apparently the above code won't work, because gsub only uses the first element of sf. This is my first time using sapply, so I'm not really sure how to convert the "inner" implicit loop into a vectorized solution. Any help (even a statement that this is not possible) is appreciated!
(I am aware of mgsub, but that is not the solution I'm looking for here. I would like to keep gsub.)
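For reference, a minimal sketch of why the commented-out attempt falls short, assuming current base R behavior: gsub is vectorized over its x argument only, so a pattern or replacement of length greater than one is cut down to its first element (with a warning). The name res is only for illustration.

```r
i  <- c("1 6 5 4", "7 4 3 1")
sf <- c("1", "2", "3")
rp <- c("one", "two", "three")

# Passing the whole vectors: gsub warns and silently falls back to
# sf[1] and rp[1], so only "1" -> "one" is applied.
res <- suppressWarnings(gsub(sf, rp, i, fixed = TRUE))
print(res)

# Same result as using the first pair explicitly
print(identical(res, gsub(sf[1], rp[1], i, fixed = TRUE)))
```

Wrapping this in sapply does not help, since sapply iterates over the elements of i, not over the (sf, rp) pairs.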
EDIT: complete code, including the required packages, the solutions from the answers below, and timings:
#timing
library(microbenchmark)
library(functional)
i = rep(c("1 6 5 4","7 4 3 1"),10000)
sf = rep(c("1","2","3"),100)
rp = rep(c("one","two","three"),100)
#Loop
funn <- function(i) {
for (j in seq_along(sf)) i = gsub(sf[j],rp[j],i,fixed=T)
return(i)
}
t1 = proc.time()
k = funn(i)
t2 = proc.time()
#print(k)
print(microbenchmark(funn(i),times=10))
#mapply
t3 = proc.time()
mapply(function(u,v) i<<-gsub(u,v,i), sf, rp)
t4 = proc.time()
#print(i)
print(microbenchmark(mapply(function(u,v) i<<-gsub(u,v,i), sf, rp),times=10))
#Curry
t5 = proc.time()
Reduce(Compose, Map(function(u,v) Curry(gsub, pattern=u, replacement=v), sf, rp))(i)
t6 = proc.time()
print(microbenchmark(Reduce(Compose, Map(function(u,v) Curry(gsub, pattern=u, replacement=v), sf, rp))(i), times=10))
#4th option
n <- length(sf)
sf <- setNames(sf,1:n)
rp <- setNames(rp,1:n)
t7 = proc.time()
Reduce(function(x,j) gsub(sf[j],rp[j],x,fixed=TRUE),c(list(i),as.list(1:n)))
t8 = proc.time()
print(microbenchmark(Reduce(function(x,j) gsub(sf[j],rp[j],x,fixed=TRUE),c(list(i),as.list(1:n))),times=10))
#Usual proc.time
print(t2-t1)
print(t4-t3)
print(t6-t5)
print(t8-t7)
Time:
Unit: milliseconds
expr min lq mean median uq max neval
funn(i) 143 143 149 145 147 165 10
Unit: seconds
expr min lq mean median uq max neval
mapply(function(u, v) i <<- gsub(u, v, i), sf, rp) 4.1 4.2 4.4 4.3 4.4 4.9 10
Unit: seconds
expr min lq mean median uq max neval
Reduce(Compose, Map(function(u, v) Curry(gsub, pattern = u, replacement = v), sf, rp))(i) 1.6 1.6 1.7 1.7 1.7 1.7 10
Unit: milliseconds
expr min lq mean median uq max neval
Reduce(function(x, j) gsub(sf[j], rp[j], x, fixed = TRUE), c(list(i), as.list(1:n))) 141 144 147 145 146 162 10
user system elapsed
0.15 0.00 0.15
user system elapsed
4.49 0.03 4.52
user system elapsed
1.68 0.02 1.68
user system elapsed
0.19 0.00 0.18
So, indeed, in this case the for loop gives the best timing (on par with the fourth, Reduce-based option) and is, in my opinion, the simplest and possibly the most elegant solution. I'm sticking with the loop.
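One caveat when reading these numbers: the loop and the fourth option call gsub with fixed=TRUE, while the mapply and Curry variants do not, so the latter two go through regular-expression matching. That may account for part of the gap. A rough sketch for isolating that effect with base R only; loop_fixed and loop_regex are illustrative names, the input sizes are reduced, and no particular timings are implied:

```r
i  <- rep(c("1 6 5 4", "7 4 3 1"), 1000)
sf <- rep(c("1", "2", "3"), 10)
rp <- rep(c("one", "two", "three"), 10)

# Same loop twice: once with literal matching, once with regex matching
loop_fixed <- function(x) {
  for (j in seq_along(sf)) x <- gsub(sf[j], rp[j], x, fixed = TRUE)
  x
}
loop_regex <- function(x) {
  for (j in seq_along(sf)) x <- gsub(sf[j], rp[j], x)
  x
}

# The results agree here because the patterns contain no regex metacharacters
stopifnot(identical(loop_fixed(i), loop_regex(i)))

# Compare elapsed time for repeated runs of each variant
print(system.time(for (k in 1:20) loop_fixed(i)))
print(system.time(for (k in 1:20) loop_regex(i)))
```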
Thanks, everyone. All suggestions are appreciated and upvoted.
One approach has the advantage of being concise, but it is clearly not functional programming, since it has the side effect of modifying i:
mapply(function(u,v) i<<-gsub(u,v,i), sf, rp)
#> i
#[1] "one 6 5 4" "7 4 three one"
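To make the side effect visible, here is a small sketch that keeps a copy of the input before running the same mapply call. The names original and res are only for illustration, and fixed = TRUE is added to match the loop in the question.

```r
i  <- c("1 6 5 4", "7 4 3 1")
sf <- c("1", "2", "3")
rp <- c("one", "two", "three")

original <- i   # copy of the input, because i is about to be overwritten

# <<- assigns to the i found in the enclosing environment on every call
res <- mapply(function(u, v) i <<- gsub(u, v, i, fixed = TRUE), sf, rp)

# The return value of mapply holds every intermediate state,
# one column per (search, replace) pair
print(res)

# i itself now holds the final state and no longer equals the input
print(i)
print(identical(i, original))
```

Because the useful result lives in the modified i, and not in the value returned by mapply, running the call a second time starts from the already-replaced vector.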
Or here is a simple functional programming approach:
library(functional)
Reduce(Compose, Map(function(u,v) Curry(gsub, pattern=u, replacement=v), sf, rp))(i)
#[1] "one 6 5 4" "7 4 three one"
What happens is that Map(function(u,v) Curry(gsub, pattern=u, replacement=v), sf, rp) creates a list of functions that respectively replace 1 with one, 2 with two, etc. These functions are then composed and applied to i, giving the desired result.
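The same idea can be sketched in base R without the functional package, using plain closures in place of Curry and a small two-function composer in place of Compose. The names replacers, compose2 and replace_all are hypothetical helpers for this illustration, and fixed = TRUE is added to match the loop in the question.

```r
i  <- c("1 6 5 4", "7 4 3 1")
sf <- c("1", "2", "3")
rp <- c("one", "two", "three")

# One closure per (pattern, replacement) pair; each maps a character
# vector to a character vector. force() pins u and v at creation time.
replacers <- Map(function(u, v) {
  force(u); force(v)
  function(x) gsub(u, v, x, fixed = TRUE)
}, sf, rp)

# Compose two functions so that f runs first, then g
compose2 <- function(f, g) {
  force(f); force(g)
  function(x) g(f(x))
}

# Fold the list of closures into a single function and apply it
replace_all <- Reduce(compose2, replacers)
result <- replace_all(i)
print(result)
```

Unlike the mapply version, this leaves i untouched and returns the replaced vector as an ordinary value.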