R lm: dynamically create regressions
I have a set of dependent variables y1, y2, ..., a set of independent variables x1, x2, ..., and a set of controls d1, d2, .... All of them are inside a data.table, let's call it data.
I need to do something along the lines of
out1 <- lm(y1 ~ x1, data=data)
out2 <- lm(y1 ~ x1 + d1 + d2, data=data)
....
This is of course not very nice, so I was thinking about storing all these regressions in a list rather than repeating the call. Something along the lines of
myRegressions <- list('out1' = y1 ~ x1, 'out2' = y1 ~ x1 + d1 + d2)
output <- NULL
for (reg in myRegressions)
{
output[reg] <- lm(myRegressions[[reg]])
}
This will of course not work: I cannot create the list since the syntax is not valid outside lm(). What's a good approach here instead?
Using the built-in data frame anscombe, try this:
formulas = list(y1 ~ x1, y2 ~ x2)
lapply(formulas, function(fo) do.call("lm", list(fo, data = quote(anscombe))))
giving:
[[1]]

Call:
lm(formula = y1 ~ x1, data = anscombe)

Coefficients:
(Intercept)           x1
     3.0001       0.5001

[[2]]

Call:
lm(formula = y2 ~ x2, data = anscombe)

Coefficients:
(Intercept)           x2
      3.001        0.500
Note that the Call: part of each result shows the exact formula and data name that were used, which will be useful if there are many components in the output list.
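As a minimal sketch of why the do.call() form is used here, the snippet below fits the same models twice: once with do.call() and a quoted data name, and once with a plain lapply(). The list names out1 and out2 are borrowed from the question and the formulas are only illustrative. With a plain lapply(), the stored call typically contains placeholders such as X[[i]] rather than the formula itself.

```r
# Named formula list, fitted against the built-in anscombe data.
# The names out1/out2 mirror the question; the formulas are illustrative.
formulas <- list(out1 = y1 ~ x1, out2 = y2 ~ x2)

# do.call() with a quoted data name stores a readable call in each fit.
fits <- lapply(formulas, function(fo) {
  do.call("lm", list(fo, data = quote(anscombe)))
})

# A plain lapply() fits the same models but records a generic call.
plain <- lapply(formulas, lm, data = anscombe)

names(fits)        # list names carry over from the formula list
fits$out1$call     # shows the formula and the data name
plain$out1$call    # compare: the formula is not spelled out here
```

Because the list is named, individual fits can be pulled out with fits$out1 or fits[["out2"]], which is close to what the loop in the question was trying to achieve.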
You can use paste0 and as.formula to create formulas and then just pass them to lm(), e.g.
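regressors <- c("x1", "x1 + x2", "x1 + x2 + x3")
# seq_along() avoids the 1:0 trap when the vector is empty
for (i in seq_along(regressors)) {
  # build "y1~<regressors>" as a string, then convert it to a formula
  print(as.formula(paste0("y1", "~", regressors[i])))
}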
This prints the formulas. To fit the models, save the formulas in a list and loop over that list with something like
lapply(stored_formulas, function(x) { lm(x, data=yourData) })
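Putting the two steps together, a rough end-to-end sketch might look like the following. It assumes the built-in mtcars data and its columns mpg, wt, hp and qsec as stand-ins for yourData and your own variables; stored_formulas is the list name used above.

```r
# Build formulas from strings, store them, then fit each one.
# mtcars and the column names below are stand-ins for your own data.
response   <- "mpg"
regressors <- c("wt", "wt + hp", "wt + hp + qsec")

# paste0() is vectorised, so all formula strings are built in one call.
formula_strings <- paste0(response, " ~ ", regressors)

# as.formula() converts each string into a formula object.
stored_formulas <- lapply(formula_strings, as.formula)
names(stored_formulas) <- formula_strings

# Fit one model per stored formula; the list names carry over to the fits.
fits <- lapply(stored_formulas, function(f) lm(f, data = mtcars))

# Number of estimated coefficients per model (intercept included).
sapply(fits, function(m) length(coef(m)))
```

Naming the list by the formula string keeps it obvious which fit belongs to which specification when there are many of them.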