How to report APA style Bayesian linear (mixed) models using rstanarm?
I am currently struggling with how to report following the APA-6 guidelines, the conclusion rstanarm::stan_lmer()
.
First, I would put the mixed model in a frequency-based approach and then try to do the same using a Bayesian framework.
Here's the reproducible code to get the data:
library(tidyverse)
library(neuropsychology)
library(rstanarm)
library(lmerTest)
df <- neuropsychology::personality %>%
select(Study_Level, Sex, Negative_Affect) %>%
mutate(Study_Level=as.factor(Study_Level),
Negative_Affect=scale(Negative_Affect)) # I understood that scaling variables is important
Now we come to a linear mixed model for the "traditional" way of testing the influence of Sex (male / female) on the negative influence (negative mood) with the level of education (years of study) as a random factor.
fit <- lmer(Negative_Affect ~ Sex + (1|Study_Level), df)
summary(fit)
The conclusion is as follows:
Linear mixed model fit by REML t-tests use Satterthwaite approximations to degrees of
freedom [lmerMod]
Formula: Negative_Affect ~ Sex + (1 | Study_Level)
Data: df
REML criterion at convergence: 3709
Scaled residuals:
Min 1Q Median 3Q Max
-2.58199 -0.72973 0.02254 0.68668 2.92841
Random effects:
Groups Name Variance Std.Dev.
Study_Level (Intercept) 0.04096 0.2024
Residual 0.94555 0.9724
Number of obs: 1327, groups: Study_Level, 8
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 0.01564 0.08908 4.70000 0.176 0.868
SexM -0.46667 0.06607 1321.20000 -7.064 2.62e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
SexM -0.149
To report this, I would say that “we set up a linear mixed model with negative impact as an outcome variable, sex as a predictor, and study rate entered as a random effect. In this model, male level resulted in a significant reduction in negative affect (beta = -0.47, t (1321) = - 7.06, p <0.001).
Is it correct?
Then try to fit the model in a Bayesian structure with rstanarm
:
fitB <- stan_lmer(Negative_Affect ~ Sex + (1|Study_Level),
data=df,
prior=normal(location=0, scale=1),
prior_intercept=normal(location=0, scale=1),
prior_PD=F)
print(fitB, digits=2)
This returns:
stan_lmer
family: gaussian [identity]
formula: Negative_Affect ~ Sex + (1 | Study_Level)
------
Estimates:
Median MAD_SD
(Intercept) 0.02 0.10
SexM -0.47 0.07
sigma 0.97 0.02
Error terms:
Groups Name Std.Dev.
Study_Level (Intercept) 0.278
Residual 0.973
Num. levels: Study_Level 8
Sample avg. posterior predictive
distribution of y (X = xbar):
Median MAD_SD
mean_PPD 0.00 0.04
------
For info on the priors used see help('prior_summary.stanreg').
I think it median
is the median of the posterior distribution of the coefficient and the mad_sd
equivalent of the standard deviation. These parameters are close to beta and standard error of the participant model, which is encouraging. However, I don't know how to formalize and put the output in words.
Also, if I do the model summary ( summary(fitB, probs=c(.025, .975), digits=2)
), I get other back distribution functions:
...
Estimates:
mean sd 2.5% 97.5%
(Intercept) 0.02 0.11 -0.19 0.23
SexM -0.47 0.07 -0.59 -0.34
...
Something like the next good thing?
“we set up a linear mixed model in a Bayesian framework with negative impact as the outcome variable, gender as a predictor, and study level entered as a random effect. The priority for coefficient and intercept were set to normal (mean value = 0, sd = 1). In this model, the features of the posterior distribution of the coefficient associated with the male level indicate a decrease in negative affect (mean = -0.47, sd = 0.11, 95% CI [-0.59, -0.34]).
Thank you for your help.
source to share
The following is a personal opinion that may or may not be acceptable to the journal of psychology.
To report this, I would say that “we set up a linear mixed model with negative impact as an outcome variable, sex as a predictor, and study rate entered as a random effect. In this model, male level resulted in a significant reduction in negative affect (beta = -0.47, t (1321) = - 7.06, p <0.001).
Is it correct?
This is considered correct from the point of view of the sentry.
The key concepts from a Bayesian perspective are that (conventionally for a model, of course)
- There is a 0.5 chance that the true effect is less than the posterior median and a 0.5 chance that the true effect will be greater than the posterior median. Frequent people tend to see the posterior median shape as the numerical optimum.
- The function
posterior_interval
gives reliable intervals around the median with a default probability of 0.9 (although a lower number gives more accurate bounds estimates). So you can legitimately say that there is a 0.9 chance that the true effect is between these bounds. Frequent people tend to see confidence intervals as reliable intervals. - The function
as.data.frame
will give you access to raw gorse, somean(as.data.frame(fitB)$male > 0)
it makes it likely that the expected difference in results between men and women in the same study will be positive. Frequent people tend to think of these probabilities as p-values.
For a Bayesian approach, I would say
We are approaching a linear model using a negative-impact Monte Carlo Markov chain because the outcome variable, gender as a predictor, and intercept were allowed to vary with the level of study.
And then talk about the grades using the three concepts above.
source to share