How to use `cor.test` to correlate specific columns?

I have the following sample data:

A<-rnorm(100)
B<-rnorm(100)
C<-rnorm(100)

v1<-as.numeric(c(1:100))
v2<-as.numeric(c(2:101))
v3<-as.numeric(c(3:102))
v2[50]<-NA
v3[60]<-NA
v3[61]<-NA

df<-data.frame(A,B,C,v1,v2,v3)

      

As you can see, df has one NA in column 5 and two NAs in column 6. Now I would like to compute a correlation matrix between columns 1 and 3 on the one hand and columns 2, 4, 5 and 6 on the other. Using the cor function in R:

cor(df[ , c(1,3)], df[ , c(2,4,5,6)], use="complete.obs")

#             B         v1         v2         v3
# A -0.007565203 -0.2985090 -0.2985090 -0.2985090
# C  0.032485874  0.1043763  0.1043763  0.1043763

      

It works. I would like to have both the estimate and the p-value, so I switched to cor.test:

cor.test(df[ ,c(1,3)], df[ , c(2,4,5,6)], na.action = "na.exclude")$estimate

      

This does not work: cor.test complains that "x" and "y" must have the same length. The error occurs with or without NAs in the data. It looks like cor.test, unlike cor, does not accept a request to correlate specific sets of columns. Is there a solution to this problem?

1 answer


You can use outer to run the test between all pairs of columns. Inside the function, X and Y are the selected columns of df expanded to 8 columns each, one per pair.

outer(df[, c(1,3)], df[, c(2,4,5,6)], function(X, Y){
    mapply(function(...) cor.test(..., na.action = "na.exclude")$estimate,
           X, Y)
})

      



You even get output in the same form as cor:

           B          v1          v2          v3
A 0.07844426  0.01829566  0.01931412  0.01528329
C 0.11487140 -0.14827859 -0.14900301 -0.15534569
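
Since the question asks for p-values as well as estimates, the same outer/mapply pattern can be reused to pull a different component out of the cor.test result. Below is a minimal sketch: the helper name pairwise_cor_test and the set.seed call are my own additions for illustration and are not part of the original post. It also drops na.action, on the assumption that the non-formula method of cor.test ignores that argument and discards incomplete pairs on its own.

```r
# Reproducible version of the question's data (set.seed is an addition).
set.seed(1)
df <- data.frame(A = rnorm(100), B = rnorm(100), C = rnorm(100),
                 v1 = as.numeric(1:100),
                 v2 = as.numeric(2:101),
                 v3 = as.numeric(3:102))
df$v2[50] <- NA
df$v3[c(60, 61)] <- NA

# Hypothetical helper: run cor.test on every pair of columns and
# extract one component of each result ("estimate" or "p.value").
pairwise_cor_test <- function(x, y, what) {
  outer(x, y, function(X, Y) {
    # X and Y arrive as lists of columns, already expanded pair by pair
    mapply(function(a, b) cor.test(a, b)[[what]], X, Y)
  })
}

est  <- pairwise_cor_test(df[, c(1, 3)], df[, c(2, 4, 5, 6)], "estimate")
pval <- pairwise_cor_test(df[, c(1, 3)], df[, c(2, 4, 5, 6)], "p.value")

est
pval
```

Both results are 2 x 4 matrices with rows A, C and columns B, v1, v2, v3. If the assumption about NA handling holds, est should agree with cor(df[, c(1,3)], df[, c(2,4,5,6)], use = "pairwise.complete.obs"), which treats NAs differently from the use = "complete.obs" call in the question.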

      
