What is the difference between cor and cor.test in R?

I have a data frame whose columns represent different samples from an experiment. I want to find the correlation between these samples: between V2 and V3, between V2 and V4, and so on. This is the data frame:

> head(t1)
      V2          V3          V4         V5         V6
1 0.12725011 0.051021886 0.106049328 0.09378767 0.17799444
2 0.86096784 1.263327211 3.073650624 0.75607466 0.92244361
3 0.45791031 0.520207274 1.526476608 0.67499102 0.49817761
4 0.00000000 0.001139721 0.003158557 0.00000000 0.00000000
5 0.13383965 0.098943019 0.099922146 0.13871867 0.09750611
6 0.01016334 0.010187671 0.025410170 0.00000000 0.02369374
> nrow(t1)
[1] 23367

If I run cor on this data frame to get the correlation between samples (columns), I get NA for every pair of samples:

> cor(t1, method= "spearman")
   V2 V3 V4 V5 V6
V2  1 NA NA NA NA
V3 NA  1 NA NA NA
V4 NA NA  1 NA NA
V5 NA NA NA  1 NA
V6 NA NA NA NA  1

But if I run this:

> cor.test(t1[,1],t1[,2], method="spearman")$estimate
rho 
0.92394 

the result is different. Why is this so? What is the correct way to get the correlation between these samples? Thank you in advance.

1 answer


Your data contains NA values.

From ?cor:

If use is "everything", NAs will propagate conceptually, i.e., a resulting value will be NA whenever one of its contributing observations is NA.

From ?cor.test:

na.action: a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").

On my system:

getOption("na.action")
[1] "na.omit"
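
To make the difference concrete, here is a minimal sketch. It assumes a small made-up data frame d (hypothetical, not the asker's t1) in which one column has a missing value. With the default use = "everything", cor returns NA for every pair involving that column. Passing use = "pairwise.complete.obs" drops incomplete pairs for each pair of columns, which should agree with what cor.test reports for the same two columns.

```r
# Hypothetical toy data, not the asker's t1: column b has one missing value.
d <- data.frame(
  a = c(1, 2, 3, 4, 5, 6),
  b = c(2, 1, NA, 5, 4, 6),
  c = c(6, 5, 4, 3, 2, 1)
)

# Default use = "everything": any pair involving b comes back as NA.
m_default <- cor(d, method = "spearman")

# Drop incomplete observations separately for each pair of columns.
m_pairwise <- cor(d, method = "spearman", use = "pairwise.complete.obs")

# cor.test() on two vectors discards incomplete pairs before computing rho.
rho <- unname(cor.test(d$a, d$b, method = "spearman")$estimate)

print(m_default)
print(m_pairwise)
print(rho)
```

Whether "pairwise.complete.obs" or "complete.obs" (drop every row containing any NA) is the better choice depends on how the missing values arose, so treat this as one option rather than the definitive fix.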

Use which(!is.finite(as.matrix(t1))) to find problem values and which(is.na(t1)) to find NA values. cor returns NaN if there are Inf values in your data.
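
A short sketch of that check, again on a hypothetical toy data frame d rather than the asker's t1. It counts NA values per column and reports the row and column of every non-finite entry. The arr.ind = TRUE argument is an addition here; it makes the positions easier to read than linear indices.

```r
# Hypothetical toy data frame with one NA and one Inf; not the asker's t1.
d <- data.frame(a = c(1, NA, 3), b = c(4, 5, Inf))

# Count missing values per column (is.na() has a data frame method).
na_per_column <- colSums(is.na(d))

# is.finite() has no data frame method, so convert to a matrix first.
# arr.ind = TRUE returns row/column positions instead of linear indices.
bad <- which(!is.finite(as.matrix(d)), arr.ind = TRUE)

print(na_per_column)
print(bad)
```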
