What is the difference between cor and cor.test in R
I have a data frame whose columns represent different samples of an experiment. I want to find the correlation between these samples: between V2 and V3, between V2 and V4, and so on. This is the data frame:
> head(t1)
V2 V3 V4 V5 V6
1 0.12725011 0.051021886 0.106049328 0.09378767 0.17799444
2 0.86096784 1.263327211 3.073650624 0.75607466 0.92244361
3 0.45791031 0.520207274 1.526476608 0.67499102 0.49817761
4 0.00000000 0.001139721 0.003158557 0.00000000 0.00000000
5 0.13383965 0.098943019 0.099922146 0.13871867 0.09750611
6 0.01016334 0.010187671 0.025410170 0.00000000 0.02369374
> nrow(t1)
[1] 23367
If I run the cor function on this data frame to get the correlation between samples (columns), I get NA for every pair of samples:
> cor(t1, method= "spearman")
V2 V3 V4 V5 V6
V2 1 NA NA NA NA
V3 NA 1 NA NA NA
V4 NA NA 1 NA NA
V5 NA NA NA 1 NA
V6 NA NA NA NA 1
But if I run this:
> cor.test(t1[,1],t1[,2], method="spearman")$estimate
rho
0.92394
the result is different: I get a number instead of NA. Why is this so? What is the correct way to get the correlation between these samples? Thank you in advance.
Your data contains NA values.
From ?cor:
If use is "everything", NAs will propagate conceptually, i.e., a resulting value will be NA whenever one of its contributing observations is NA.
From ?cor.test:
na.action: a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").
On my system:
getOption("na.action")
[1] "na.omit"
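To illustrate the difference, here is a minimal sketch on a small made-up data frame (the `toy` data and its values are assumptions for illustration, not your data). With its default `use = "everything"`, `cor` propagates the NA, whereas the `use` argument lets you drop incomplete observations, which is effectively what `cor.test` does:

```r
# Hypothetical toy data: column a has one missing value
toy <- data.frame(
  a = c(1, 2, 3, 4, NA, 6),
  b = c(2, 1, 4, 3, 5, 7)
)

# Default use = "everything": the NA propagates to every pair involving column a
cor(toy, method = "spearman")

# Listwise deletion: drop every row that has an NA in any column
r_complete <- cor(toy, method = "spearman", use = "complete.obs")

# Pairwise deletion: for each pair of columns, use the rows complete for that pair
r_pairwise <- cor(toy, method = "spearman", use = "pairwise.complete.obs")

# cor.test discards incomplete pairs itself, so its estimate matches
r_test <- unname(cor.test(toy$a, toy$b, method = "spearman")$estimate)
```

With only two columns the listwise and pairwise results coincide. With several columns, as in `t1`, `"pairwise.complete.obs"` keeps more data per pair, while `"complete.obs"` computes every entry of the matrix from the same set of rows.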
Use which(!is.finite(as.matrix(t1))) to find problem values (is.finite does not accept a data frame directly, hence the as.matrix) and which(is.na(t1)) to find NA values. cor can return NaN if there are Inf values in your data.
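A minimal sketch of locating such values, again on a made-up data frame (`toy` and its contents are assumptions for illustration). It converts to a matrix first, because `is.finite` has no data frame method, and uses `arr.ind = TRUE` so the positions come back as row/column pairs rather than as a flat index:

```r
# Hypothetical toy data with one NA and one Inf
toy <- data.frame(
  a = c(1, 2, NA, 4),
  b = c(0.5, Inf, 1.5, 2)
)

# is.finite() does not work on a data frame, so convert to a matrix first
m <- as.matrix(toy)

# Row and column positions of every value that is NA, NaN, Inf or -Inf
bad <- which(!is.finite(m), arr.ind = TRUE)

# Number of NA values in each column (is.na() does accept a data frame)
na_per_column <- colSums(is.na(toy))
```

Here `bad` has one row per problem value, so `nrow(bad)` gives a quick count of how many entries need attention before calling `cor`.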