Replicating Stata marginlist argument using R margins package?

I cannot reproduce a specific use of the Stata command margins in R:

margins var1, over(var2)

I tried to do this with the margins package in R.

To give a reproducible example, I used the mtcars dataset and exported it from R to Stata, so we use the same dataset in both programs:

R code:

library(foreign)  # provides write.dta()
write.dta(mtcars, "mtcars.dta")


Stata code:

use "mtcars.dta", clear


Create an example linear regression model in both programs

Stata code:

quietly regress mpg cyl i.am c.wt##c.hp


R code:

x <- lm(mpg ~ cyl + factor(am) + hp * wt, data = mtcars)


Model output (not shown) is identical for the two programs.

Compare the table of average marginal effects for each variable in the model.

Stata code and output:

margins, dydx(*)

Average marginal effects                          Number of obs   =         32
Model VCE    : OLS

Expression   : Linear prediction, predict()
dy/dx w.r.t. : cyl 1.am wt hp

             |            Delta-method
             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
         cyl |  -.3708001   .5293674    -0.70   0.490     -1.45893    .7173301
        1.am |  -.0709546   1.374981    -0.05   0.959    -2.897268    2.755359
          wt |  -3.868994   .9170145    -4.22   0.000    -5.753944   -1.984043
          hp |  -.0249882   .0120345    -2.08   0.048    -.0497254    -.000251
Note: dy/dx for factor levels is the discrete change from the base level.


R code and output:

xmarg <- margins(x)

factor     AME     SE       z      p   lower   upper
    am1 -0.0710 1.3750 -0.0516 0.9588 -2.7659  2.6240
    cyl -0.3708 0.5294 -0.7005 0.4836 -1.4083  0.6667
     hp -0.0250 0.0120 -2.0764 0.0379 -0.0486 -0.0014
     wt -3.8690 0.9170 -4.2191 0.0000 -5.6663 -2.0717


As you can see, the two outputs are very similar, as expected when using the R package margins.


Problem 1: Marginal predictions OVER the value of a variable

Stata code and output:

margins, over(cyl)

Predictive margins                                Number of obs   =         32
Model VCE: OLS

Expression   : Linear prediction, predict()
over         : cyl

             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
         cyl |
          4  |   26.56699   .6390379    41.57   0.000     25.25342    27.88055
          6  |   20.04662   .5797511    34.58   0.000     18.85492    21.23831
          8  |   15.02406   .5718886    26.27   0.000     13.84853    16.19959


R code and output:

aggregate(fitted~cyl, data = xmarg, FUN = mean)
  cyl   fitted
1   4 26.56699
2   6 20.04662
3   8 15.02406


In the two examples above, the marginal predictions are identical in R and Stata. However, is there a way (other than computing it manually) to generate the delta-method standard error for each marginal prediction, as in the Stata table above?

Problem 2: Marginal predictions for a specific variable

Stata code and output:

margins am

Predictive margins                                Number of obs   =         32
Model VCE    : OLS

Expression   : Linear prediction, predict()

             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
          am |
          0  |   20.11945   .6819407    29.50   0.000      18.7177     21.5212
          1  |    20.0485   .9052764    22.15   0.000     18.18767    21.90932


R code and output:

aggregate(fitted~am, data = xmarg, FUN = mean)
  am   fitted
1  0 17.14737
2  1 24.39231


In this example, we are trying to replicate Stata's "marginlist" argument to the margins command by subsetting the dataset after prediction. This doesn't seem like the right way. How can we replicate these results from Stata?

Problem 3: Marginal predictions for one variable over the values of another

Reproducing this result is my main goal!

Stata code and output:

margins am, over(cyl)

Predictive margins                                Number of obs   =         32
Model VCE    : OLS

Expression   : Linear prediction, predict()
over         : cyl

             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
      cyl#am |
        4 0  |   26.61859   1.246074    21.36   0.000     24.05725    29.17993
        4 1  |   26.54763   .7034599    37.74   0.000     25.10165    27.99362
        6 0  |   20.07703   .6449805    31.13   0.000     18.75125     21.4028
        6 1  |   20.00607   1.144518    17.48   0.000     17.65348    22.35866
        8 0  |    15.0342   .6228319    24.14   0.000     13.75395    16.31445
        8 1  |   14.96324   1.257922    11.90   0.000     12.37754    17.54894


R code and output:

aggregate(fitted ~ am + cyl, data = xmarg, FUN = mean)
  am cyl   fitted
1  0   4 22.83306
2  1   4 27.96721
3  0   6 19.06359
4  1   6 21.35732
5  0   8 15.08720
6  1   8 14.64519


As you can see, the point estimates are now substantially different, and again there is no table of standard errors. Solving problems 1 and 2 above will probably solve problem 3.



2 answers

For these problems, you want the prediction package, which margins depends on. It is not currently possible to get standard errors for the mean predictions, but you can at least get mean predictions identical to Stata's using the following.

The key intuition about Stata's margins command is this:

margins x1


is equivalent to

margins, at(x1 = (...))


where ... are all possible values of x1. Either of these expressions creates counterfactual datasets in which x1 is fixed at a given value for all cases in the data, and then computes model predictions on that temporary, counterfactual version of the dataset.
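To make the counterfactual idea concrete, here is a minimal base-R sketch. It is not how margins or prediction are implemented internally, just the same logic done by hand: refit the model from the question, set am to a single value for every row of mtcars, and average the predictions. The helper name avg_pred_at is made up for this illustration.

```r
# Same model as in the question
fit <- lm(mpg ~ cyl + factor(am) + hp * wt, data = mtcars)

# Hypothetical helper: average prediction with `am` fixed at `value` for all rows
avg_pred_at <- function(value, model, data) {
  counterfactual <- data
  counterfactual$am <- value          # overwrite am for every observation
  mean(predict(model, newdata = counterfactual))
}

# One counterfactual dataset per value of am
margins_am <- sapply(c(`0` = 0, `1` = 1), avg_pred_at, model = fit, data = mtcars)
print(round(margins_am, 2))
```

The two averages should agree with the "margins am" table from Stata shown in the question.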

The over() option is a subsetting procedure:

margins, over(x1)


splits the data based on the value of x1 and then performs model prediction on each subset. You can combine this with at(), but it's a little weird to think about. For example:

margins, over(x1) at(x2 = (1 2))


fixes x2 to 1 for all observations, then splits the data by x1, then generates predictions for each subset and averages them. It then repeats this for the counterfactual version in which x2 is set to 2 for all observations.
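The combination of over() and at() can also be sketched by hand in base R. The following is an illustration under the assumption that the model from the question is used, with cyl in the role of x1 and am in the role of x2; the helper over_at is a hypothetical name, not a function from any package.

```r
# Same model as in the question
fit <- lm(mpg ~ cyl + factor(am) + hp * wt, data = mtcars)

# Hypothetical helper mimicking `margins, over(over) at(at_var = (at_values))`
over_at <- function(model, data, over, at_var, at_values) {
  subsets <- split(data, data[[over]])        # the over() step: one subset per value
  res <- sapply(subsets, function(d) {
    sapply(at_values, function(v) {
      d[[at_var]] <- v                        # the at() step: fix the variable
      mean(predict(model, newdata = d))       # average prediction in this subset
    })
  })
  rownames(res) <- paste0(at_var, "=", at_values)
  res
}

res <- over_at(fit, mtcars, over = "cyl", at_var = "am", at_values = c(0, 1))
print(round(res, 2))
```

Rows correspond to the counterfactual values of am and columns to the cyl subsets, so the matrix should line up with Stata's "margins am, over(cyl)" table from the question.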

In R, prediction::prediction() gives you the equivalent of at() through its at argument. It also gives you the equivalent of over() if you pass subsets of the data to its data argument.

So for your problem 2:

> prediction::prediction(x, at = list(am = c(0,1)))
Average predictions for 32 observations:
 at(am) value
      0 20.12
      1 20.05


And for your problem 3:

> prediction::prediction(x, at = list(am = c(0,1)), data = subset(mtcars, cyl == 4))
Average predictions for 11 observations:
 at(am) value
      0 26.62
      1 26.55
> prediction::prediction(x, at = list(am = c(0,1)), data = subset(mtcars, cyl == 6))
Average predictions for 7 observations:
 at(am) value
      0 20.08
      1 20.01
> prediction::prediction(x, at = list(am = c(0,1)), data = subset(mtcars, cyl == 8))
Average predictions for 14 observations:
 at(am) value
      0 15.03
      1 14.96


In none of these cases can you replicate the Stata output by simply calling predict(x) and aggregating the predictions, because the predictions are made on counterfactual datasets.

And again, the variances are not currently implemented (as of August 2018).
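Until that is available, the standard errors can be computed by hand for a linear model. The sketch below rests on one assumption that is mine, not the answer's: for lm(), the average prediction is a linear function of the coefficients, so its variance is xbar' V xbar, where xbar holds the column means of the counterfactual design matrix and V is vcov() of the model. This should correspond to the delta-method column in the Stata tables, but it has not been checked against them here. The helper margin_se is a hypothetical name.

```r
# Same model as in the question
fit <- lm(mpg ~ cyl + factor(am) + hp * wt, data = mtcars)

# Hypothetical helper: average prediction and its standard error with
# `var` fixed at `value` for all rows of `data` (linear models only)
margin_se <- function(model, data, var, value) {
  cf <- data
  cf[[var]] <- value
  # Design matrix of the counterfactual data, using the model's factor levels
  X <- model.matrix(delete.response(terms(model)), data = cf,
                    xlev = model$xlevels)
  xbar <- colMeans(X)                        # gradient of the average prediction
  margin <- sum(xbar * coef(model))
  se <- sqrt(drop(crossprod(xbar, vcov(model) %*% xbar)))
  c(margin = margin, se = se)
}

res_am <- rbind(`am=0` = margin_se(fit, mtcars, "am", 0),
                `am=1` = margin_se(fit, mtcars, "am", 1))
print(res_am)
```

Passing a subset such as subset(mtcars, cyl == 4) as data gives the over() version. For nonlinear models such as a logit, the gradient is no longer just the column means, so this shortcut does not carry over unchanged.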



I had the same problem and found the following workaround. The thread is old, of course, but I thought my solution would be easier to find if I added it here.

I simulated data for the dependent variable "dv", which is explained by the variables "level" and "treat" and their interaction.

  1. Data simulation

    N <- 1000
    uid <- rep(1:N)
    treat <- rep(1:10, each = N/10)
    level <- rep(1:100, each = N/100)
    err <- rnorm(N, 0, 1)
    hdv <- 40 + 2 * treat + .25 * level - .05 * treat * level + err
    dv <- ifelse(hdv > 47, 1, 0)
    dat <- data.frame(dv = dv, treat = treat, level = level, hdv = hdv)

  2. Estimation

Since the dependent variable is binary, I estimate a logit model. As is well known, interaction terms in a logit (as in any nonlinear model) cannot be interpreted directly. This is why I want the marginal effect of "level" over the values of "treat".

logit <- glm(dv ~ treat*level, family = binomial(link = "logit"), data = dat)


  3. Marginal effects

R can in fact compute marginal effects with confidence intervals on a subset of the data, as in

hmpr7 <- summary(margins(logit, variables = "level", data = dat[dat$treat == 7,]))


Below is a (somewhat clumsy) way to do it for ALL treatments:

hmpr <- list()
for (i in 1:10) {
  hmpr[[i]] <- summary(margins(logit, variables = "level", data = dat[dat$treat == i, ]))
}
# the result is a list. For further use it is transformed into a data.frame
mpr <- data.frame(matrix(unlist(hmpr), nrow = length(hmpr), byrow = TRUE))
# in this process, all variables are classified as factors. This is changed here
mpr <- data.frame(lapply(mpr, function(x) as.numeric(as.character(x))))
# only the variables of interest for the graph are kept (AME, lower, upper)
mpr <- mpr[, c(2, 6, 7)]
# meaningful names are assigned to the variables
mpr <- setNames(mpr, c("pred", "lower", "upper"))
# treatment classifier is added to rows
mpr$treat <- 1:10


  4. Plotting the result (as in Stata's marginsplot)

    plot(mpr$pred ~ mpr$treat, ylim = range(c(mpr$lower, mpr$upper)), pch = 19, xlab = "treatment", ylab = "marginal effect + 95% CI", main = "marginal effect of level by treatment")

    arrows(mpr$treat, mpr$lower, mpr$treat, mpr$upper, length = .05, angle = 90, code = 3)

    abline(h = 0, col = "red")


