Why is `speedglm` slower than `glm`?
I am trying to use `speedglm` to fit a GLM faster than `glm`, but why is it even slower?
set.seed(0)
n <- 1e3
p <- 1e3
x <- matrix(runif(n * p), nrow = n)
y <- sample(0:1, n, replace = TRUE)
ptm <- proc.time()
fit=glm(y~x,family=binomial())
print(proc.time() - ptm)
# user system elapsed
# 10.71 0.07 10.78
library(speedglm)
ptm <- proc.time()
fit=speedglm(y~x,family=binomial())
print(proc.time() - ptm)
# user system elapsed
# 15.11 0.12 15.25
The efficiency of `speedglm` over `glm` lies in how it reduces the n * p model matrix to a p * p matrix. If you have n = p, there is no effective reduction. What you really want to test is the case n >> p.
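As a sketch of the n >> p regime (the dimensions below are chosen only for illustration, and this assumes `speedglm` is installed):

```r
library(speedglm)

set.seed(0)
n <- 1e5                                  # many more rows than columns
p <- 100
x <- matrix(runif(n * p), nrow = n)
y <- sample(0:1, n, replace = TRUE)

# time both fits; with n >> p, speedglm should now come out ahead
system.time(fit1 <- glm(y ~ x, family = binomial()))
system.time(fit2 <- speedglm(y ~ x, family = binomial()))
```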
A comparison of the computational complexity of each iteration of Fisher scoring is even more revealing.
`glm` uses QR factorization of the n * p model matrix, which takes 2np^2 - (2/3)p^3 FLOPs, while `speedglm` forms the cross product of the n * p matrix and then QR-factorizes the resulting p * p matrix, which takes np^2 + (4/3)p^3 FLOPs. When n >> p, `speedglm` therefore does only about half as much computation as `glm`. In addition, the blocked caching strategy `speedglm` uses makes better use of the computer hardware, giving it high performance.
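Plugging illustrative n >> p dimensions (here n = 1e5, p = 100, values chosen only for this check) into those two FLOP formulas shows the factor-of-two saving directly:

```r
# FLOP counts per Fisher scoring iteration in the n >> p regime
n <- 1e5
p <- 100
glm_flops      <- 2 * n * p^2 - (2/3) * p^3   # QR of the n * p matrix
speedglm_flops <- n * p^2 + (4/3) * p^3       # cross product + QR of p * p
speedglm_flops / glm_flops                    # about 0.5: half the work
```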
If you have n = p, you immediately see that `glm` takes (4/3)p^3 FLOPs while `speedglm` takes p^3 + (4/3)p^3 FLOPs, which is more expensive! In this case the matrix cross product is pure overhead.
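The same formulas applied to the n = p case from the question quantify that overhead:

```r
# FLOP counts per iteration when n = p, as in the question's benchmark
n <- 1e3
p <- 1e3
glm_flops      <- 2 * n * p^2 - (2/3) * p^3   # reduces to (4/3) p^3
speedglm_flops <- n * p^2 + (4/3) * p^3       # p^3 + (4/3) p^3
speedglm_flops / glm_flops                    # 1.75: 75% more work
```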