Efficient Matrix Matrix Operation in R
I have two matrices, M1 and M2. For each row of M1, I want the maximum element-wise product of that row with each row of M2.
I have tried the following implementation which gives the result that I want.
set.seed(1)
st_time = Sys.time()
M1 = matrix(runif(1000*10), nrow=1000, ncol=10)
M2 = matrix(runif(10000*10), nrow=10000, ncol=10)
score = apply(M1, 1, function(x){
  w = M2 %*% diag(x)
  row_max = apply(w, 1, max)
  return(row_max)
})
required_output = t(score)
Sys.time() - st_time
It takes 16 seconds on my machine. Is there a faster implementation? Thanks!
2 answers
Using a for loop gives me a pretty good speedup:
set.seed(1)
M1 = matrix(runif(1000*10), nrow=1000, ncol=10)
M2 = matrix(runif(10000*10), nrow=10000, ncol=10)
st_time = Sys.time()
tm = t(M2)
out = matrix(0, nrow=nrow(M1), ncol=nrow(M2))
for(i in seq_len(nrow(M1))){
  out[i, ] = matrixStats::colMaxs(M1[i, ] * tm)
}
Sys.time() - st_time
#Time difference of 1.835793 secs # was ~28secs with yours on my laptop
all.equal(required_output, out)
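The loop works because multiplying a length-10 vector by the transposed matrix recycles the vector down every column, which gives the same products as `M2 %*% diag(x)`. Here is a small check of that equivalence, as a sketch only: the tiny matrices `A` and `B` are made up for illustration, and base R's `apply(., 2, max)` stands in for `matrixStats::colMaxs` so it runs without extra packages.

```r
set.seed(42)
A = matrix(runif(3 * 4), nrow = 3, ncol = 4)   # stands in for M1
B = matrix(runif(5 * 4), nrow = 5, ncol = 4)   # stands in for M2

# Original approach: scale the columns of B by each row of A via diag()
slow = t(apply(A, 1, function(x) apply(B %*% diag(x), 1, max)))

# Loop approach: x * t(B) recycles x down every column of t(B)
tB = t(B)
fast = matrix(0, nrow = nrow(A), ncol = nrow(B))
for (i in seq_len(nrow(A))) {
  fast[i, ] = apply(A[i, ] * tB, 2, max)
}

# Both give a 3 x 5 matrix of row-pair maxima
same = isTRUE(all.equal(slow, fast))
print(same)
```

Avoiding `diag(x)` matters because it builds a fresh 10 x 10 matrix and does a full matrix multiplication for every row, while the recycled product is a single element-wise pass.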
Running the computation in parallel also gives a speedup. On my machine, the serial version takes 15 seconds and the parallel version just under 4 seconds.
Load the package
# Comes with R
library(parallel)
# Make the cluster
# 8 cores, see detectCores()
cl = makeCluster(8)
Then we need to explicitly export M2 to the workers
clusterExport(cl, "M2")
and proceed as usual
score_par = function() {
parApply(cl, M1, 1, function(x){
w = M2 %*% diag(x)
row_max = apply(w, 1, max)
return(row_max)
})
}
system.time(score_par())
# Shut down the workers when finished
stopCluster(cl)
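The two ideas can probably be combined: split the rows of M1 into blocks and let each worker run the recycling-based loop on its block. The following is a rough sketch under stated assumptions, not a benchmarked solution. `block_max` is a hypothetical helper, the matrices are shrunk so it runs quickly, 2 workers is an arbitrary choice, and base `apply` is used in place of `matrixStats::colMaxs` so the workers need no extra packages.

```r
library(parallel)

set.seed(1)
M1 = matrix(runif(200 * 10), nrow = 200, ncol = 10)
M2 = matrix(runif(500 * 10), nrow = 500, ncol = 10)
tm = t(M2)

# Hypothetical helper: maxima for a block of rows of M1 against all rows of M2
block_max = function(rows, M1, tm) {
  out = matrix(0, nrow = length(rows), ncol = ncol(tm))
  for (k in seq_along(rows)) {
    out[k, ] = apply(M1[rows[k], ] * tm, 2, max)
  }
  out
}

cl = makeCluster(2)                            # arbitrary worker count
chunks = splitIndices(nrow(M1), length(cl))    # one block of row indices per worker
parts = parLapply(cl, chunks, block_max, M1 = M1, tm = tm)
stopCluster(cl)

# Stitch the blocks back together and compare with a serial run
par_out = do.call(rbind, parts)
serial_out = block_max(seq_len(nrow(M1)), M1, tm)
print(all.equal(par_out, serial_out))
```

Passing `M1` and `tm` as arguments to `parLapply` avoids a separate `clusterExport` call; whether the parallel overhead pays off will depend on the matrix sizes and the number of cores.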