Repeated measures ANOVA: ezANOVA syntax vs aov vs lme
This question concerns both syntax and semantics, so there is also a (not yet answered) duplicate on Cross Validated: https://stats.stackexchange.com/questions/113324/repeated-measures-anova-ezanova-vs-aov-vs-lme-syntax
In a machine learning context, I evaluated 4 classifiers on the same 5 datasets, so each classifier returned one performance measurement for each of datasets 1 to 5. Now I want to know whether the classifiers differ significantly in their performance. Here is some toy data:
Performance<-c(2,3,3,2,3,1,2,2,1,1,3,1,3,2,3,2,1,2,1,2)
Dataset<-factor(c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5))
Classifier<-factor(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4))
data<-data.frame(Classifier,Dataset,Performance)
Following a tutorial, I ran a one-way repeated measures ANOVA. I treated performance as the dependent variable, the classifiers as subjects, and the datasets as the within-subjects factor. Using aov, I wrote:
model <- aov(Performance ~ Classifier + Error(factor(Dataset)), data=data)
summary(model)
This gives the following output:
Error: factor(Dataset)
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  4    2.5   0.625
Error: Within
           Df Sum Sq Mean Sq F value Pr(>F)
Classifier  3    5.2  1.7333   4.837 0.0197 *
Residuals  12    4.3  0.3583
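As a sanity check, the Within-stratum F value above can be reproduced by hand. The following is a minimal base R sketch, assuming the balanced toy data from the question; the variable names (`ss_classifier`, `ss_resid`, and so on) are illustrative and not part of any package.

```r
# Toy data from the question: one score per classifier/dataset pair.
Performance <- c(2,3,3,2,3, 1,2,2,1,1, 3,1,3,2,3, 2,1,2,1,2)
Dataset     <- factor(rep(1:5, times = 4))
Classifier  <- factor(rep(1:4, each = 5))

grand <- mean(Performance)
k <- nlevels(Classifier)  # 4 levels of the effect being tested
n <- nlevels(Dataset)     # 5 blocks, used as the error stratum in the aov call

# Sums of squares for a balanced design.
ss_total      <- sum((Performance - grand)^2)
ss_classifier <- n * sum((tapply(Performance, Classifier, mean) - grand)^2)
ss_dataset    <- k * sum((tapply(Performance, Dataset, mean) - grand)^2)
ss_resid      <- ss_total - ss_classifier - ss_dataset

df_classifier <- k - 1
df_resid      <- (k - 1) * (n - 1)

# F test for Classifier against the within-stratum residual.
f_value <- (ss_classifier / df_classifier) / (ss_resid / df_resid)
p_value <- pf(f_value, df_classifier, df_resid, lower.tail = FALSE)

print(c(SS_classifier = ss_classifier, SS_resid = ss_resid,
        F = f_value, p = p_value))
```

The sums of squares 5.2 and 4.3 on 3 and 12 degrees of freedom are the same quantities that appear in the Classifier and Residuals rows of the aov table.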
I get similar results using a linear mixed effects model:
library(nlme)  # provides lme()
model <- lme(Performance ~ Classifier, random = ~1|Dataset/Classifier, data=data)
result <- anova(model)
Then I tried to reproduce the results using ezANOVA, in order to get Mauchly's test for sphericity:
library(ez)  # provides ezANOVA()
ezANOVA(data=data, dv=.(Performance), wid=.(Classifier), within=.(Dataset), detailed=TRUE, type=3)
This gives the following output:
       Effect DFn DFd  SSn SSd         F          p p<.05       ges
1 (Intercept)   1   3 80.0 5.2 46.153846 0.00652049     * 0.8938547
2     Dataset   4  12  2.5 4.3  1.744186 0.20497686       0.2083333
This clearly does not match the previous result from aov / lme. However, when I swap Dataset and Classifier in the ezANOVA call, I get the expected results.
Now I am wondering whether my tutorial is wrong (in how it sets up aov) or whether I misunderstood the ezANOVA syntax. Also, why do I get Mauchly's test results only when I rewrite the ezANOVA call, but not in the first case?
Since you want to compare classifiers, not datasets, the within factor is the classifier and the within ID (wid) is the dataset. Therefore, the correct syntax for your ezANOVA example is:
ezANOVA(data=data, dv=.(Performance), within=.(Classifier), wid=.(Dataset), detailed=TRUE)
Btw, no need to specify the type of sums of squares. Since you only have one factor, all types of sums of squares will give the same results.
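To see that the mismatch comes only from swapping the two roles, both specifications can be fitted with base R's aov. This is a minimal sketch, assuming the toy data from the question; `within_effect` is a hypothetical helper written for this illustration, not a function from any package.

```r
# Toy data from the question.
Performance <- c(2,3,3,2,3, 1,2,2,1,1, 3,1,3,2,3, 2,1,2,1,2)
Dataset     <- factor(rep(1:5, times = 4))
Classifier  <- factor(rep(1:4, each = 5))
data <- data.frame(Classifier, Dataset, Performance)

# Roles as in this answer: Dataset is the subject, Classifier the within factor.
fit_ok <- aov(Performance ~ Classifier + Error(Dataset), data = data)

# Roles as in the question's ezANOVA call: Classifier is the subject,
# Dataset the within factor.
fit_swapped <- aov(Performance ~ Dataset + Error(Classifier), data = data)

# Hypothetical helper: pull the tested effect (first row) out of the
# "Error: Within" stratum of a summary.aovlist object.
within_effect <- function(fit) {
  tab <- summary(fit)[["Error: Within"]][[1]]
  tab[1, c("Df", "F value", "Pr(>F)")]
}

f_ok      <- within_effect(fit_ok)[["F value"]]
f_swapped <- within_effect(fit_swapped)[["F value"]]

print(within_effect(fit_ok))       # effect of Classifier
print(within_effect(fit_swapped))  # effect of Dataset
```

The first fit tests Classifier and should agree with the aov table in the question, while the second tests Dataset and should agree with the Dataset row of the question's ezANOVA output.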