Double loop to fill the correlation matrix

I have a dataset like this

set.seed(1)
a = abs(rnorm(10, mean = 0, sd= 1))
b = abs(rnorm(10, mean = 0, sd= 1))
c = abs(rnorm(10, mean = 0, sd= 1))
d = abs(rnorm(10, mean = 0, sd= 1))
df = as.data.frame(cbind(a, b, c, d))

      

and I want to get a table like this:

   c   d
a 0.5 0.1
b 0.8 0.3

      

where cols and rows are variables and cells are the correlation coefficients between variables.

This is what I am doing:

for(j in df[, 1:2])           {
for(i in df[, 3:4]) {

  k=abs(cor.test(j, i, method = c( "spearman"))$estimate)
  cat(k, '\n')
  y <- rbind(y, k)
}}
y

      

and get

rho
k 0.175757576
k 0.006060606
k 0.151515152
k 0.054545455

      

I also tried the approach from the post "Using a double loop to fill a matrix in R":

mat<-matrix(list(c(NA,NA)), nrow=2, ncol=2)
for(j in df[, 1:2])           {
  for(i in df[, 3:4]) {

    mat[i,j][[1]]=abs(cor.test(j, i, method = c( "spearman"))$estimate)

  }}
mat

      

and I get

     [,1]      [,2]     
[1,] Logical,2 Logical,2
[2,] Logical,2 Logical,2

      

How can I fill in the table? Or can I fill it without a loop?

  • The real dataset has many variables, so I cannot use tools like ggpairs.


2 answers


I would calculate the correlation matrix for df once and then subset whatever combinations I need. This way you do not need to run cor multiple times.



m = cor(df, method = "spearman")
m[row.names(m) %in% c("a","b"), colnames(m) %in% c("c","d")]
#           c           d
#a 0.05454545 -0.40606061
#b 0.75757576  0.05454545
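
The same subset can also be taken by indexing with the row and column names directly. This is only a sketch, assuming the columns are named a, b, c and d as in the question; abs() is added because the question asks for absolute values, and drop = FALSE keeps the result a matrix even if only one row or column is requested.

```r
# Rebuild the example data from the question
set.seed(1)
a <- abs(rnorm(10))
b <- abs(rnorm(10))
c <- abs(rnorm(10))
d <- abs(rnorm(10))
df <- data.frame(a, b, c, d)

# Full Spearman correlation matrix, computed once
m <- cor(df, method = "spearman")

# Index by name: rows a and b, columns c and d
sub <- abs(m[c("a", "b"), c("c", "d"), drop = FALSE])
sub
```

Note that name indexing returns rows and columns in the order requested, whereas %in% keeps the order they have in m.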

      



The function cor() can do this directly:

set.seed(1)
a = abs(rnorm(10, mean = 0, sd= 1))
b = abs(rnorm(10, mean = 0, sd= 1))
c = abs(rnorm(10, mean = 0, sd= 1))
d = abs(rnorm(10, mean = 0, sd= 1))
#### df = as.data.frame(cbind(a, b, c, d)) # not used
cor(cbind(a,b), cbind(c,d))
# > cor(cbind(a,b), cbind(c,d))
#           c          d
# a 0.5516642 -0.3918783
# b 0.8200195  0.1474773

      

You can wrap it in abs() to get the desired result:

abs(cor(cbind(a,b), cbind(c,d)))
# > abs(cor(cbind(a,b), cbind(c,d)))
#           c         d
# a 0.5516642 0.3918783
# b 0.8200195 0.1474773

      

Spearman:



abs(cor(cbind(a,b), cbind(c,d), method = "spearman"))
# > abs(cor(cbind(a,b), cbind(c,d), method = "spearman"))
#            c          d
# a 0.05454545 0.40606061
# b 0.75757576 0.05454545

      

If you want to use your dataframe, you can do:

df = as.data.frame(cbind(a, b, c, d))
rm(a,b,c,d) ### to be sure that a, ..., d are from the dataframe.
with(df, abs(cor(cbind(a,b), cbind(c,d), method = "spearman")))

      

or

abs(cor(df[,c("a", "b")], df[,c("c","d")], method = "spearman"))
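
If a loop is still wanted, for example to keep using cor.test(), the attempt in the question likely fails because for (j in df[, 1:2]) iterates over the column vectors themselves, so i and j hold data rather than positions and mat[i, j] does not address a single cell. A minimal sketch, looping over column names instead and filling a preallocated numeric matrix (row_vars and col_vars are hypothetical names introduced here for illustration):

```r
# Rebuild the example data from the question
set.seed(1)
a <- abs(rnorm(10))
b <- abs(rnorm(10))
c <- abs(rnorm(10))
d <- abs(rnorm(10))
df <- data.frame(a, b, c, d)

row_vars <- c("a", "b")  # variables that become rows (illustrative names)
col_vars <- c("c", "d")  # variables that become columns

# Preallocate a numeric matrix with named rows and columns
mat <- matrix(NA_real_,
              nrow = length(row_vars), ncol = length(col_vars),
              dimnames = list(row_vars, col_vars))

# Loop over names, so each cell can be addressed as mat[row, col]
for (r in row_vars) {
  for (k in col_vars) {
    mat[r, k] <- abs(cor.test(df[[r]], df[[k]], method = "spearman")$estimate)
  }
}
mat
```

The result should match the abs(cor(...)) calls above; the loop is only worth it when something else from cor.test(), such as the p-value, is needed per pair.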

      
