# How to get RMSE from lm result?

I know there is a slight difference between `$sigma` and the concept of **mean squared error**. So I'm wondering, what is the easiest way to get the RMSE from an `lm` fit in **R**?

```
res <- lm(price ~ carat + cut + color + clarity + depth +
            table + x + y + z, data = randomData)
length(coefficients(res))
```

This model contains 24 coefficients, so I can no longer write it out by hand. How can I estimate the RMSE from the coefficients returned by `lm`?

---

Residual sum of squares:

```
RSS <- c(crossprod(res$residuals))
```

Mean square error:

```
MSE <- RSS / length(res$residuals)
```

Root MSE:

```
RMSE <- sqrt(MSE)
```

Pearson estimate of the residual variance (as returned by `summary.lm`):

```
sig2 <- RSS / res$df.residual
```

Statistically, MSE is the maximum likelihood estimator of the residual variance, but it is biased (downward). The Pearson estimate is the restricted maximum likelihood (REML) estimator of the residual variance, and it is unbiased.
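
Here is a minimal end-to-end sketch of the steps above; the built-in `mtcars` data and the formula are stand-ins, since the question's `randomData` is not available:

```
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

RSS  <- c(crossprod(fit$residuals))   # residual sum of squares
MSE  <- RSS / length(fit$residuals)   # divide by n (ML estimate, biased)
RMSE <- sqrt(MSE)
sig2 <- RSS / fit$df.residual         # divide by n - p (unbiased)

# sig2 is exactly the squared "residual standard error" from summary.lm
all.equal(sig2, summary(fit)$sigma^2)  # TRUE
```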

**Comment**

- Given two vectors `x` and `y`, `c(crossprod(x, y))` is equivalent to `sum(x * y)` but much faster. Likewise, `c(crossprod(x))` is faster than `sum(x ^ 2)`, and `sum(x) / length(x)` is faster than `mean(x)`.
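
To verify the speed claim on your own machine, here is a minimal sketch using base R's `system.time` (exact timings are machine- and BLAS-dependent, and the vector size is an arbitrary choice):

```
x <- rnorm(1e7)

# both compute the sum of squares; crossprod dispatches to BLAS
system.time(for (i in 1:10) s1 <- sum(x ^ 2))
system.time(for (i in 1:10) s2 <- c(crossprod(x)))

all.equal(s1, s2)  # TRUE: same value, up to floating-point tolerance
```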

---

I think the other answers may be wrong. The MSE of a regression is the SSE divided by (n - k - 1), where n is the number of data points and k is the number of model parameters, not counting the intercept.

Simply taking the mean of the squared residuals (as the other answer suggests) amounts to dividing by n instead of (n - k - 1).

I would calculate the RMSE as `sqrt(sum(res$residuals^2) / res$df.residual)`.

The denominator, `res$df.residual`, is the residual degrees of freedom, which equals (n - k - 1). See this for reference: https://www3.nd.edu/~rwilliam/stats2/l02.pdf
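
A minimal sketch contrasting the two denominators, again using the built-in `mtcars` data as a stand-in for the question's data:

```
res <- lm(mpg ~ wt + hp + qsec, data = mtcars)

n       <- length(res$residuals)
rmse_n  <- sqrt(sum(res$residuals^2) / n)                # divide by n
rmse_df <- sqrt(sum(res$residuals^2) / res$df.residual)  # divide by n - k - 1

rmse_n < rmse_df                         # TRUE: dividing by n understates the error
all.equal(rmse_df, summary(res)$sigma)   # TRUE: matches the residual standard error
```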
