How to get RMSE from lm result?

I know there is a slight difference between $sigma and the concept of root mean squared error. So I'm wondering what is the easiest way to get the RMSE from an lm fit in R?

res <- lm(price ~ carat + cut + color + clarity + depth + table + x + y + z,
          data = randomData)

length(coefficients(res)) returns 24, so I can no longer write out my model by hand. So how can I compute the RMSE from the fit returned by lm?

+8




4 answers


Residual sum of squares:

RSS <- c(crossprod(res$residuals))

      

Mean square error:

MSE <- RSS / length(res$residuals)

      

Root MSE:

RMSE <- sqrt(MSE)

      



Pearson estimated residual variance (as used by summary.lm):

sig2 <- RSS / res$df.residual

      

Statistically, the MSE is the maximum likelihood estimator of the residual variance, but it is biased (downward). The Pearson estimator is the restricted maximum likelihood (REML) estimator of the residual variance, and it is unbiased.
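As a quick check of the two estimators above, here is a minimal sketch on the built-in mtcars data. The dataset and the model formula are assumptions chosen only for illustration, since the question's randomData is not available here.

```r
# Illustrative fit on a built-in dataset (not the data from the question)
fit <- lm(mpg ~ wt + hp, data = mtcars)

rss  <- sum(residuals(fit)^2)         # residual sum of squares
n    <- length(residuals(fit))        # number of observations used in the fit
rmse <- sqrt(rss / n)                 # divides by n (ML estimate, biased downward)
rse  <- sqrt(rss / df.residual(fit))  # divides by the residual degrees of freedom

# The df-based value matches the residual standard error stored by summary()
all.equal(rse, summary(fit)$sigma)

# Dividing by n always gives the smaller of the two values
rmse < rse
```

The two numbers get closer as the sample size grows relative to the number of coefficients.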


Comment

  • Given two vectors x and y, c(crossprod(x, y)) is equivalent to sum(x * y), but much faster. c(crossprod(x)) is also faster than sum(x ^ 2).
  • sum(x) / length(x) is also faster than mean(x).
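The equivalences in the comment can be checked numerically with a short sketch like the one below. The vectors are random data made up for the example, and the timing lines are only a rough comparison: the actual speed difference depends on the machine and on the BLAS library R is linked against.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
x <- rnorm(1e5)
y <- rnorm(1e5)

# Same values, up to floating-point rounding
all.equal(c(crossprod(x, y)), sum(x * y))
all.equal(c(crossprod(x)), sum(x^2))
all.equal(sum(x) / length(x), mean(x))

# Rough timing comparison; results vary by machine
system.time(for (i in 1:200) c(crossprod(x)))
system.time(for (i in 1:200) sum(x^2))
```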
+17




To get the RMSE in one line, using only functions from base, I would use:



sqrt(mean(res$residuals^2))
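If this is needed for several models, the one-liner could be wrapped in a small function. The name rmse_lm below is hypothetical (it is not part of base R), and the mtcars model is only an illustration.

```r
# Hypothetical helper (not part of base R): RMSE of a fitted model,
# dividing the sum of squared residuals by the number of residuals
rmse_lm <- function(fit) {
  r <- residuals(fit)
  sqrt(mean(r^2))
}

fit <- lm(mpg ~ wt + hp, data = mtcars)  # illustrative model only
rmse_lm(fit)
```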

      

+7




I think the other answers may be wrong. The MSE of a regression is the SSE divided by (n - k - 1), where n is the number of data points and k is the number of estimated coefficients, not counting the intercept.

Simply taking the mean square of the residuals (as other answers suggested) is equivalent to dividing by n instead of (n - k - 1).

I would calculate the RMSE as sqrt(sum(res$residuals^2) / res$df.residual).

The denominator, res$df.residual, gives the residual degrees of freedom, which is (n - k - 1). Take a look at this for reference: https://www3.nd.edu/~rwilliam/stats2/l02.pdf
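To see where (n - k - 1) comes from, here is a small sketch with a factor predictor, assuming no aliased (NA) coefficients. The mtcars model is an illustration only. Note that k counts estimated coefficients rather than variables: a factor contributes one coefficient per level beyond the first.

```r
# Illustrative model only: one numeric predictor and one 3-level factor
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)

n <- nobs(fit)              # number of observations
k <- length(coef(fit)) - 1  # estimated coefficients, excluding the intercept

# Residual degrees of freedom equal n - k - 1
fit$df.residual == n - k - 1

# RMSE with the degrees-of-freedom denominator
sqrt(sum(fit$residuals^2) / fit$df.residual)
```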

+1




Just do

sigma(res)

and you've got it.
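Note that sigma() returns the residual standard error, which divides by the residual degrees of freedom rather than by n. If the n-based RMSE is wanted instead, it can be recovered by rescaling, as in this minimal sketch (the mtcars model is an assumption for illustration):

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)  # illustrative model only

s  <- sigma(fit)        # residual standard error, divides RSS by df
n  <- nobs(fit)         # number of observations
df <- df.residual(fit)  # residual degrees of freedom

# Rescale to the RMSE that divides by n
rmse <- s * sqrt(df / n)
all.equal(rmse, sqrt(mean(residuals(fit)^2)))
```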

0








