Replicating Stata marginlist argument using R margins package?
I cannot reproduce one specific use of the Stata margins command in R:
margins var1, over(var2)
I tried to do this with the margins package in R.
To give a reproducible example, I used the mtcars dataset and exported it from R to Stata, so we use the same dataset in both programs:
R code:
library(foreign)
library(margins)
write.dta(mtcars, "mtcars.dta")
Stata code:
use "mtcars.dta", clear
Create an example linear regression model in both programs
Stata code:
quietly regress mpg cyl i.am c.wt##c.hp
R code:
x <- lm(mpg ~ cyl + factor(am) + hp * wt, data = mtcars)
Model output (not shown) is identical for the two programs
Compare the table of average marginal effects for each variable in the model
Stata code and output:
margins, dydx(*)
Average marginal effects Number of obs = 32
Model VCE: OLS
Expression : Linear prediction, predict()
dy/dx w.r.t. : cyl 1.am wt hp
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cyl | -.3708001 .5293674 -0.70 0.490 -1.45893 .7173301
1.am | -.0709546 1.374981 -0.05 0.959 -2.897268 2.755359
wt | -3.868994 .9170145 -4.22 0.000 -5.753944 -1.984043
hp | -.0249882 .0120345 -2.08 0.048 -.0497254 -.000251
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
R code and output:
xmarg <- margins(x)
summary(xmarg)
factor AME SE z p lower upper
am1 -0.0710 1.3750 -0.0516 0.9588 -2.7659 2.6240
cyl -0.3708 0.5294 -0.7005 0.4836 -1.4083 0.6667
hp -0.0250 0.0120 -2.0764 0.0379 -0.0486 -0.0014
wt -3.8690 0.9170 -4.2191 0.0000 -5.6663 -2.0717
As you can see, the two outputs are very similar, as expected when using the R margins package.
Problem 1: Marginal predictions OVER the value of a variable
Stata code and output:
margins, over(cyl)
Predictive margins Number of obs = 32
Model VCE: OLS
Expression : Linear prediction, predict()
over : cyl
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cyl |
4 | 26.56699 .6390379 41.57 0.000 25.25342 27.88055
6 | 20.04662 .5797511 34.58 0.000 18.85492 21.23831
8 | 15.02406 .5718886 26.27 0.000 13.84853 16.19959
------------------------------------------------------------------------------
R code and output:
aggregate(fitted~cyl, data = xmarg, FUN = mean)
cyl fitted
1 4 26.56699
2 6 20.04662
3 8 15.02406
In the example above, the marginal predictions are identical in R and Stata. However, is there a way (other than doing it manually) to generate the delta-method standard error for each marginal prediction, as in the Stata table above?
Problem 2: Marginal predictions for a specific variable
Stata code and output:
margins am
Predictive margins Number of obs = 32
Model VCE : OLS
Expression : Linear prediction, predict()
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
am |
0 | 20.11945 .6819407 29.50 0.000 18.7177 21.5212
1 | 20.0485 .9052764 22.15 0.000 18.18767 21.90932
------------------------------------------------------------------------------
R code and output:
aggregate(fitted~am, data = xmarg, FUN = mean)
am fitted
1 0 17.14737
2 1 24.39231
In this example, I tried to replicate Stata's "marginlist" argument of the margins command by subsetting the dataset after prediction. This does not seem to be the right way. How can we replicate these results from Stata?
Problem 3: Marginal predictions for one variable OVER the values of another
Reproducing this result is my main goal!
Stata code and output:
margins am, over(cyl)
Predictive margins Number of obs = 32
Model VCE : OLS
Expression : Linear prediction, predict()
over : cyl
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cyl#am |
4 0 | 26.61859 1.246074 21.36 0.000 24.05725 29.17993
4 1 | 26.54763 .7034599 37.74 0.000 25.10165 27.99362
6 0 | 20.07703 .6449805 31.13 0.000 18.75125 21.4028
6 1 | 20.00607 1.144518 17.48 0.000 17.65348 22.35866
8 0 | 15.0342 .6228319 24.14 0.000 13.75395 16.31445
8 1 | 14.96324 1.257922 11.90 0.000 12.37754 17.54894
------------------------------------------------------------------------------
R code and output:
aggregate(fitted ~ am + cyl, data = xmarg, FUN = mean)
am cyl fitted
1 0 4 22.83306
2 1 4 27.96721
3 0 6 19.06359
4 1 6 21.35732
5 0 8 15.08720
6 1 8 14.64519
As you can see, the point estimates are now substantially different, and again there is no table of standard errors. Solving Problem 1 and Problem 2 above will probably solve Problem 3.
For these problems, you want the prediction package, which margins itself depends on. It is not currently possible to get standard errors for the average predictions, but you can at least get average predictions identical to Stata's using the following.
The key intuition about Stata's margins command is this:
margins x1
is equivalent to
margins, at(x1 = (...))
where ... stands for all possible values of x1. Either expression creates counterfactual datasets in which x1 is fixed at a given value for all cases in the data, and then runs the model's prediction on that temporary, counterfactual version of the dataset.
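The counterfactual-dataset idea can be sketched in base R with the question's model. The helper name avg_pred_at is hypothetical (it is not part of margins or prediction); it only fixes one variable at a single value in every row and averages the result of predict().

```r
# Model from the question
x <- lm(mpg ~ cyl + factor(am) + hp * wt, data = mtcars)

# Hypothetical helper: set `var` to `value` in every row,
# then average the model predictions over that counterfactual data
avg_pred_at <- function(model, data, var, value) {
  cf <- data            # temporary counterfactual copy
  cf[[var]] <- value    # same value for all observations
  mean(predict(model, newdata = cf))
}

# Counterpart of `margins am`: one average prediction per value of am
am_values <- sort(unique(mtcars$am))
avg_by_am <- sapply(am_values, function(v) avg_pred_at(x, mtcars, "am", v))
names(avg_by_am) <- am_values
avg_by_am
```

With the question's data this should come out at about 20.12 for am = 0 and 20.05 for am = 1, the margins Stata reports for margins am.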
The over() option is a subsetting procedure:
margins, over(x1)
splits the data based on the values of x1 and then performs model prediction on each subset. You can combine this with at, but it is a little odd to think about. For example:
margins, over(x1) at(x2 = (1 2))
fixes x2 to 1 for all observations, then splits the data by x1, then generates predictions for each subset and averages them. It then repeats this for the counterfactual version in which x2 is set to 2 for all observations.
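The subsetting logic can be written out in base R as a rough sketch, again with the question's model. Here cyl plays the role of x1 and am the role of x2; the object names are illustrative only.

```r
# Model from the question
x <- lm(mpg ~ cyl + factor(am) + hp * wt, data = mtcars)

# over(cyl): split the data by cyl, average the predictions within each subset
by_cyl <- split(mtcars, mtcars$cyl)
over_cyl <- sapply(by_cyl, function(d) mean(predict(x, newdata = d)))

# over(cyl) combined with at(am = (0 1)):
# fix am for every row of a subset, then average within that subset
over_cyl_at_am <- sapply(by_cyl, function(d) {
  sapply(c(`0` = 0, `1` = 1), function(v) {
    d$am <- v
    mean(predict(x, newdata = d))
  })
})

over_cyl
over_cyl_at_am   # rows: am value, columns: cyl value
```

The first result should match the Stata table for margins, over(cyl), and the second the point estimates for margins am, over(cyl).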
In R, prediction::prediction() gives you the equivalent of at() through its at argument, and the equivalent of over() by passing subsets of the data to its data argument.
So for your Problem 2:
> prediction::prediction(x, at = list(am = c(0,1)))
Average predictions for 32 observations:
at(am) value
0 20.12
1 20.05
And for your Problem 3:
> prediction::prediction(x, at = list(am = c(0,1)), data = subset(mtcars, cyl == 4))
Average predictions for 11 observations:
at(am) value
0 26.62
1 26.55
> prediction::prediction(x, at = list(am = c(0,1)), data = subset(mtcars, cyl == 6))
Average predictions for 7 observations:
at(am) value
0 20.08
1 20.01
> prediction::prediction(x, at = list(am = c(0,1)), data = subset(mtcars, cyl == 8))
Average predictions for 14 observations:
at(am) value
0 15.03
1 14.96
In none of these cases can you replicate the Stata output by simply calling predict(x) and aggregating the predictions, because the predictions are made on counterfactual datasets.
And again, the variances are not currently implemented (as of August 2018).
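Until standard errors are available, they can be computed by hand for a linear model. The sketch below is a manual workaround, not a feature of prediction or margins. The helper name avg_pred_se is hypothetical. It assumes the covariates are treated as fixed, which is my understanding of Stata's default delta-method variance, so the results are worth checking against the Stata tables before relying on them. Because the average prediction of an lm is linear in the coefficients, its gradient is simply the column means of the counterfactual design matrix.

```r
# Model from the question
x <- lm(mpg ~ cyl + factor(am) + hp * wt, data = mtcars)

# Hypothetical helper: average prediction and delta-method SE
# with `var` fixed at `value` for every row of `data`
avg_pred_se <- function(model, data, var, value) {
  cf <- data
  cf[[var]] <- value
  tt <- delete.response(terms(model))
  mf <- model.frame(tt, cf, xlev = model$xlevels)
  X  <- model.matrix(tt, mf, contrasts.arg = model$contrasts)
  xbar <- colMeans(X)   # gradient of the average prediction w.r.t. the coefficients
  est  <- sum(xbar * coef(model))
  se   <- sqrt(drop(t(xbar) %*% vcov(model) %*% xbar))
  c(estimate = est, se = se)
}

# One row per value of am (counterpart of `margins am`)
res <- t(sapply(c(`0` = 0, `1` = 1), function(v) avg_pred_se(x, mtcars, "am", v)))
res
```

Passing a subset such as subset(mtcars, cyl == 4) as data gives the over()-style version of the same calculation.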
I had the same problem and found the following workaround. The thread is old, of course, but I thought my solution would be easier to find if I added it here.
I simulated data for the dependent variable "dv", which is explained by the variables "level" and "treat" and their interaction.
- Data simulation
N <- 1000
uid <- rep(1:N)
treat <- rep(1:10, each = N/10)
level <- rep(1:100, each = N/100)
err <- rnorm(N, 0, 1)
hdv <- 40 + 2 * treat + .25 * level - .05 * treat * level + err
dv <- ifelse(hdv > 47, 1, 0)
dat <- data.frame(dv = dv, treat = treat, level = level, hdv = hdv)
- Estimation
Since the dependent variable is binary, I estimate a logit model. As is well known, interaction terms in a logit (as in any nonlinear model) cannot be interpreted directly. This is why I want the marginal effect of "level" over the values of "treat".
logit <- glm(dv ~ treat*level, family = binomial(link = "logit"), data = dat)
- Marginal effects
R can in fact compute marginal effects with confidence intervals on a subset of the data, as in
hmpr7 <- summary(margins(logit, variables = "level", data = dat[dat$treat == 7,]))
Below is a (somewhat clumsy) way to do it for ALL treatments:
hmpr <- list()
for (i in 1:10) {
hmpr[[i]] <- summary(margins(logit, variables = "level", data = dat[dat$treat == i,]))
}
# the result is a list. For further use it is transformed into a data.frame
mpr <- data.frame(matrix(unlist(hmpr), nrow=length(hmpr), byrow=T))
# in this process, all variables are classified as factors. This is changed here
mpr <- data.frame(lapply(mpr, function(x) as.numeric(as.character(x))))
# only the variables of interest for the graph are kept
mpr <- mpr[,c(2, 6, 7)]
# meaningful names are assigned to the variables
mpr <- setNames(mpr, c("pred", "lower", "upper"))
# treatment classifier is added to rows
mpr$treat <- rep(1:10)
- Plotting the result (as in Stata's marginsplot)
plot(mpr$pred ~ mpr$treat, ylim = range(c(mpr$lower, mpr$upper)), pch = 19, xlab = "treatment", ylab = "marginal effect + 95% CI", main = "marginal effect of level by treatment")
arrows(mpr$treat, mpr$lower, mpr$treat, mpr$upper, length = .05, angle = 90, code = 3)
abline(h = 0, col = "red")
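The loop and the reshaping steps above can probably be written more compactly by stacking the summary data frames directly. This is only a sketch: the set.seed() call and the name mpr2 are my additions, and it relies on the column names (AME, lower, upper) that summary() of a margins object prints in the question.

```r
library(margins)

# Simulated data and logit model as in the answer (the seed is an added assumption)
set.seed(1)
N <- 1000
treat <- rep(1:10, each = N / 10)
level <- rep(1:100, each = N / 100)
hdv <- 40 + 2 * treat + .25 * level - .05 * treat * level + rnorm(N)
dat <- data.frame(dv = as.numeric(hdv > 47), treat = treat, level = level)
logit <- glm(dv ~ treat * level, family = binomial(link = "logit"), data = dat)

# One summary row per treatment value, stacked into a single data frame
mpr2 <- do.call(rbind, lapply(sort(unique(dat$treat)), function(i) {
  s <- as.data.frame(summary(margins(logit, variables = "level",
                                     data = dat[dat$treat == i, ])))
  cbind(treat = i, s[, c("AME", "lower", "upper")])
}))
mpr2
```

Selecting columns by name avoids the conversion to a matrix and back, so the numeric columns keep their type and no positional indexing is needed.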