Looping over corresponding columns of two matrices and applying a function in R

So I have two matrices. Let's call them controls and patients. Each row represents a sample and each column represents the concentration of a specific protein. It looks like this:

             V1    V2    V3    V4     V5     V6    V7    V8    V9     V10    V11
sample1 1533.34  9.88  6.82 17.88  70.75 350.07 20.67 13.96 10.17  711.02 114.06
sample2 2311.30 12.74  6.82 17.88  80.71 505.96 34.36 19.66 18.70  863.70 181.43
sample3 1314.83 11.39 18.12 41.26 104.36 278.17 40.25 27.12 41.34 1100.00 160.83

      

This is just a small subset; the real data has more values. I want to compare this table to another, comparable table, column by column. Side question: is it correct to use a t-test in this case if the data are normally distributed? Anyway, I've tried the apply() function:

apply(controls,2,function(x) t.test(x, patients)$p.value)

      

I do get some results, but I doubt that I used the function correctly. Does it pair up the corresponding columns of the two tables as it should, or have I used it incorrectly?

EDIT: No, this is definitely not right, because the mean for the column from the second table always stays the same.

3 answers


Try

mapply(function(x, y) t.test(x, y)$p.value,
       as.data.frame(controls), as.data.frame(patients))
#        V1        V2        V3        V4        V5        V6        V7        V8
# 0.8481788 1.0000000 0.4605294 1.0000000 0.6436604 1.0000000 1.0000000 1.0000000
#        V9       V10       V11
# 1.0000000 1.0000000 1.0000000

      

This assumes that "controls" and "patients" are matrices.
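To illustrate why the apply() call from the question misbehaves, here is a minimal sketch on made-up data (the names ctrl and pat and their values are hypothetical, not the asker's data). Passing the whole second matrix to t.test() pools all of its entries into one vector, so its mean is identical for every column. Indexing both matrices by column pairs them up as intended and should agree with the mapply() call.

```r
# Hypothetical toy data: 4 samples x 3 proteins per group.
set.seed(1)
ctrl <- matrix(rnorm(12, mean = 10), nrow = 4,
               dimnames = list(NULL, c("V1", "V2", "V3")))
pat  <- matrix(rnorm(12, mean = 12), nrow = 4,
               dimnames = list(NULL, c("V1", "V2", "V3")))

# Approach from the question: every ctrl column is tested against ALL of pat,
# so the "mean of y" estimate is the same for each column.
pooled_means <- apply(ctrl, 2, function(x) {
  t.test(x, pat)$estimate[["mean of y"]]
})

# Column-wise approach: column j of ctrl against column j of pat.
paired_cols <- sapply(seq_len(ncol(ctrl)), function(j) {
  t.test(ctrl[, j], pat[, j])$p.value
})
names(paired_cols) <- colnames(ctrl)

# The mapply() form; as.data.frame() makes it iterate over columns
# instead of over single matrix elements.
via_mapply <- mapply(function(x, y) t.test(x, y)$p.value,
                     as.data.frame(ctrl), as.data.frame(pat))
```

Printing pooled_means shows one repeated value, which is the symptom described in the question's edit, while paired_cols and via_mapply hold one p-value per protein.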



data

controls <- structure(c(1253, 2311.3, 1314.83, 9.88, 12.74, 11.39, 20.8,
6.82, 18.12, 17.88, 17.88, 41.26, 70.75, 53.5, 104.36, 350.07,
505.96, 278.17, 20.67, 34.36, 40.25, 13.96, 19.66, 27.12, 10.17,
18.7, 41.34, 711.02, 863.7, 1100, 114.06, 181.43, 160.83),
.Dim = c(3L, 11L), .Dimnames = list(c("sample1", "sample2", "sample3"),
c("V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10", "V11")))

patients <- structure(c(1533.34, 2311.3, 1314.83, 9.88, 12.74, 11.39, 6.82,
6.82, 18.12, 17.88, 17.88, 41.26, 70.75, 80.71, 104.36, 350.07,
505.96, 278.17, 20.67, 34.36, 40.25, 13.96, 19.66, 27.12, 10.17,
18.7, 41.34, 711.02, 863.7, 1100, 114.06, 181.43, 160.83),
.Dim = c(3L, 11L), .Dimnames = list(c("sample1", "sample2", "sample3"),
c("V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10", "V11")))

      

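Assuming "patients" and "controls" are data frames with the same column names, try (this needs dplyr 1.0.0 or later):

# cur_column() is the name of the column currently being summarised,
# so each patients column is tested against the same-named controls column.
results <- dplyr::summarise(
  patients,
  dplyr::across(dplyr::everything(),
                ~ t.test(.x, controls[[dplyr::cur_column()]])$p.value)
)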

      

I'm having trouble parsing your question, but I may have an answer.

If you are looking for the proportion of matching entries per column between two matrices of the same size, there is a very simple way to do it:

colMeans(controls==patients)

      

To find the absolute number of matches per column:

colSums(controls==patients)
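As a small illustration of what these two calls return, here is a sketch on two made-up matrices (a and b are hypothetical, chosen so that exactly one entry differs):

```r
# Hypothetical 3 x 2 matrices that differ only in row 3 of column 1.
a <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)
b <- matrix(c(1, 2, 0, 4, 5, 6), nrow = 3)

# a == b is a logical matrix; TRUE counts as 1 when summed or averaged.
match_counts <- colSums(a == b)   # number of equal entries per column
match_rates  <- colMeans(a == b)  # proportion of equal entries per column
```

Note that this only counts exactly equal entries; it does not perform a statistical test, and exact comparison with == can be fragile for computed floating-point values.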

      
