What is the difference between cor and cor.test in R?
I have a data frame whose columns represent different samples of an experiment. I want to find the correlation between these samples, that is, between samples V2 and V3, between V2 and V4, and so on. This is the data frame:
> head(t1)
V2 V3 V4 V5 V6
1 0.12725011 0.051021886 0.106049328 0.09378767 0.17799444
2 0.86096784 1.263327211 3.073650624 0.75607466 0.92244361
3 0.45791031 0.520207274 1.526476608 0.67499102 0.49817761
4 0.00000000 0.001139721 0.003158557 0.00000000 0.00000000
5 0.13383965 0.098943019 0.099922146 0.13871867 0.09750611
6 0.01016334 0.010187671 0.025410170 0.00000000 0.02369374
> nrow(t1)
[1] 23367
If I run the cor function on this data frame to get the correlation between samples (columns), I get NA for every pair of samples:
> cor(t1, method= "spearman")
V2 V3 V4 V5 V6
V2 1 NA NA NA NA
V3 NA 1 NA NA NA
V4 NA NA 1 NA NA
V5 NA NA NA 1 NA
V6 NA NA NA NA 1
But if I run this:
> cor.test(t1[,1],t1[,2], method="spearman")$estimate
rho
0.92394
the result is different. Why is this so? What is the correct way to get the correlation between these samples? Thank you in advance.
Your data contains NA values.
From ?cor:
If use is "everything", NAs will propagate conceptually, i.e., a resulting value will be NA whenever one of its contributing observations is NA.
From ?cor.test:
na.action: a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").
On my system:
getOption("na.action")
[1] "na.omit"
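To make the difference concrete, here is a minimal sketch on made-up data (the data frame d below is hypothetical, not the asker's t1). It assumes that the goal is to make cor treat missing values the way cor.test does, by using only complete pairs of observations through the use argument:

```r
# Hypothetical toy data with a single NA in column V3
set.seed(1)
d <- data.frame(V2 = runif(20), V3 = runif(20), V4 = runif(20))
d$V3[5] <- NA

# Default use = "everything": every pair involving V3 comes out as NA
m_default <- cor(d, method = "spearman")

# Use only the rows that are complete for each pair of columns
m_pairwise <- cor(d, method = "spearman", use = "pairwise.complete.obs")

# cor.test drops the incomplete pair before estimating rho
rho <- cor.test(d$V2, d$V3, method = "spearman")$estimate

print(m_default)
print(m_pairwise)
print(rho)
```

With use = "pairwise.complete.obs", each entry of the matrix is computed from the rows that are complete for that particular pair of columns, so the V2/V3 entry should agree with the cor.test estimate. use = "complete.obs" instead drops every row that has an NA in any column, which keeps the same set of rows for all pairs.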
Use which(!is.finite(as.matrix(t1))) to find problematic values (is.finite has no method for data frames, hence the as.matrix) and which(is.na(t1)) to find NA values. cor returns NaN if there are Inf values in your data.
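As a rough sketch of that check, again on a made-up data frame d rather than the asker's t1, the following locates the non-finite entries and counts them per column. The variable names are only illustrative:

```r
# Hypothetical toy data frame containing one NA and one Inf
d <- data.frame(V2 = c(0.1, NA, 0.3, 0.4), V3 = c(1, 2, Inf, 4))

# is.finite() does not accept a data frame, so convert to a matrix first
m <- as.matrix(d)

# Row and column positions of every value that is NA, NaN, Inf or -Inf
bad <- which(!is.finite(m), arr.ind = TRUE)

# Number of NA values and of infinite values in each column
na_per_col  <- colSums(is.na(m))
inf_per_col <- colSums(is.infinite(m))

print(bad)
print(na_per_col)
print(inf_per_col)
```

arr.ind = TRUE returns row and column indices instead of positions in the flattened matrix, which is usually easier to read for a data set with many rows.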