lm() gives different results from lm.ridge(lambda = 0)

I'm looking for ideas before doing anything so presumptuous as contacting the R core team to tell them they have a bug!

I have a regression from which I extract coefficients like this:

lm(y ~ x1 * x2, data=ds)$coef


This gives: x1 = 0.40, x2 = 0.37, x1 * x2 = 0.09

When I do the same regression in SPSS, I get: beta(x1) = 0.40, beta(x2) = 0.37, beta(x1 * x2) = 0.14. So the difference is in the interaction term.

x1 and x2 are correlated at around 0.75 (yes, yes, I know - this model wasn't my idea, but it's already been published), so it's quite possible that collinearity is at play.
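For what it's worth, this is roughly how I've been eyeballing the collinearity (kappa() on the design matrix is just one crude diagnostic; a large condition number would suggest the interaction estimate is numerically fragile):

cor(ds$x1, ds$x2)  # ~0.75, as noted above

X <- model.matrix(y ~ x1 * x2, data = ds)
kappa(X)  # condition number of the design matrix, interaction column included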

Now, of course, we all want to point and laugh at SPSS, but I thought I would try lm.ridge() to see if I can figure out where the problem is. So, first, run lm.ridge() with lambda = 0 (i.e. no ridge penalty at all) and check that we get the same thing as with lm():

MASS::lm.ridge(y ~ x1 * x2, lambda = 0, data = ds)$coef


x1 = 0.40, x2 = 0.37, x1 * x2 = 0.14

Oh my God. lm.ridge() agrees with SPSS and lm() does not.

What's really weird is that I had assumed lm.ridge() just piggybacks on lm() anyway, so in the specific case where lambda = 0 and there is no ridge penalty, I would expect identical results.
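(A quick way to check that assumption is just to print both functions; if I'm reading the source correctly, lm.ridge() never calls lm() at all:)

library(MASS)
body(lm)        # delegates to lm.fit(), i.e. a QR decomposition
body(lm.ridge)  # centres and scales the model matrix itself, then solves via SVD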

Unfortunately, there are 34,000 cases in the dataset, so it won't be easy to create a minimal reprex.
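In the meantime, maybe I can fake it with simulated data that mimics the setup (everything below is made up to match the correlation and coefficients described above, so treat it as a sketch):

set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.75 * x1 + sqrt(1 - 0.75^2) * rnorm(n)  # cor(x1, x2) comes out near 0.75
y  <- 0.40 * x1 + 0.37 * x2 + 0.10 * x1 * x2 + rnorm(n)
sim <- data.frame(y, x1, x2)

coef(lm(y ~ x1 * x2, data = sim))
coef(MASS::lm.ridge(y ~ x1 * x2, lambda = 0, data = sim))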

Any ideas what else I can try? (It isn't stopping me from getting work done, but I'm curious now!)
