Show me if the maximum value will carry over to another column

Question

I've prepared some sample data, so let's take a look at it:

> dput(example1)
structure(list(Fr1 = c(0.2, 0, 0, 0, 0, 0), Fr2 = c(0.7, 0, 0, 
0, 0, 0), Fr3 = c(1, 0.35, 0, 0, 0, 0), Fr4 = c(0.1, 1, 0, 0, 
0.5, 0), Fr5 = c(0, 0.4, 0, 0, 1, 0), Fr6 = c(0, 0, 0, 0, 0.3, 
0), Fr7 = c(0, 0, 0, 0.7, 0, 0), Fr8 = c(0, 0, 0, 1, 0, 0), Fr9 = c(0, 
0, 0, 1, 0, 0), Fr10 = c(0, 0, 0, 0.65, 0, 0.7), Fr11 = c(0, 
0, 0, 0.2, 0, 1)), .Names = c("Fr1", "Fr2", "Fr3", "Fr4", "Fr5", 
"Fr6", "Fr7", "Fr8", "Fr9", "Fr10", "Fr11"), row.names = c("Mazda RX4", 
"Mazda RX4 Wag", "Datsun 710", "Hornet 4 Drive", "Hornet Sportabout", 
"Valiant"), class = "data.frame")

> dput(example2)
structure(list(Fr1 = c(1, 0, 0, 0, 0, 0), Fr2 = c(0.7, 0, 0, 
0, 0, 0), Fr3 = c(0.2, 0, 0, 0, 0, 0), Fr4 = c(0.1, 0, 0, 0, 
0.5, 0), Fr5 = c(0, 0.1, 0, 0, 1, 0), Fr6 = c(0, 0, 0, 0, 0.3, 
0), Fr7 = c(0, 0.8, 0, 0.7, 0, 0), Fr8 = c(0, 1, 0, 1, 0, 0), 
    Fr9 = c(0, 0.3, 0, 1, 0, 0), Fr10 = c(0, 0, 0, 0.65, 0, 0.7
    ), Fr11 = c(0, 0, 0, 0.2, 0, 1)), .Names = c("Fr1", "Fr2", 
"Fr3", "Fr4", "Fr5", "Fr6", "Fr7", "Fr8", "Fr9", "Fr10", "Fr11"
), row.names = c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710", 
"Hornet 4 Drive", "Hornet Sportabout", "Valiant"), class = "data.frame")

So, we have two data frames that I would like to compare. As you can see, every cell holds a number from 0 to 1. The number 1 is the maximum, and it must appear at least once in each row. What matters to me is which column holds the maximum in each row, and whether it is in the same column in the other data frame.

Example 1:

                  Fr1 Fr2  Fr3 Fr4 Fr5 Fr6 Fr7 Fr8 Fr9 Fr10 Fr11
Mazda RX4         0.2 0.7 1.00 0.1 0.0 0.0 0.0   0   0 0.00  0.0
Mazda RX4 Wag     0.0 0.0 0.35 1.0 0.4 0.0 0.0   0   0 0.00  0.0
Datsun 710        0.0 0.0 0.00 0.0 0.0 0.0 0.0   0   0 0.00  0.0
Hornet 4 Drive    0.0 0.0 0.00 0.0 0.0 0.0 0.7   1   1 0.65  0.2
Hornet Sportabout 0.0 0.0 0.00 0.5 1.0 0.3 0.0   0   0 0.00  0.0
Valiant           0.0 0.0 0.00 0.0 0.0 0.0 0.0   0   0 0.70  1.0

Example 2:

                  Fr1 Fr2 Fr3 Fr4 Fr5 Fr6 Fr7 Fr8 Fr9 Fr10 Fr11
Mazda RX4           1 0.7 0.2 0.1 0.0 0.0 0.0   0 0.0 0.00  0.0
Mazda RX4 Wag       0 0.0 0.0 0.0 0.1 0.0 0.8   1 0.3 0.00  0.0
Datsun 710          0 0.0 0.0 0.0 0.0 0.0 0.0   0 0.0 0.00  0.0
Hornet 4 Drive      0 0.0 0.0 0.0 0.0 0.0 0.7   1 1.0 0.65  0.2
Hornet Sportabout   0 0.0 0.0 0.5 1.0 0.3 0.0   0 0.0 0.00  0.0
Valiant             0 0.0 0.0 0.0 0.0 0.0 0.0   0 0.0 0.70  1.0

To keep things simple, only the first and second rows differ in this example, but in my actual data any of the 3000 rows might differ. As I said, a row can contain more than one "maximum", but usually no more than two, i.e. the number 1 appears twice.

As output, I need the name of each row and whether its maximum has moved to another column (YES) or stayed in the same column (NO). Can this be done?

To show you that the first two rows differ between these datasets:

[Images not preserved: Example 1 and Example 2]

Edit:

Real data:

structure(list(X10 = c(0, 0, 0, 0, 0, 0), X33.95 = c(0, 0, 0, 
0, 0, 0), X58.66 = c(0, 0, 0, 0, 0, 0.164279901), X84.42 = c(0, 
0, 0, 0, 0, 0), X110.21 = c(0.04925863, 0, 0, 0, 0, 0), X134.16 = c(0.4981384, 
0, 0, 0, 0, 0), X164.69 = c(1, 0, 1, 0, 0, 0), X199.1 = c(0.367449159, 
0, 0, 0, 1, 0), X234.35 = c(0.19587217, 0, 0, 0.96458515, 0.93848979, 
0), X257.19 = c(0, 0, 0, 0.77155521, 0, 0), X361.84 = c(0, 0, 
1, 0.76396661, 0, 0), X432.74 = c(0, 0, 0.81609991, 0.33773581, 
0, 0), X506.34 = c(0, 0, 0.81609991, 0.1390399, 0, 0), X581.46 = c(0, 
0, 0.96019504, 0.86300673, 0, 0), X651.71 = c(0, 0, 0, 0.77764596, 
0, 0), X732.59 = c(0, 0, 1, 0.45950141, 0, 0), X817.56 = c(0, 
0, 0, 0.14639304, 0, 0), X896.24 = c(0, 0.4013747, 0, 0.800272, 
0, 0), X971.77 = c(0, 0.32393615, 0, 0.74026623, 0, 0), X1038.91 = c(0, 
0.4168461, 0, 0.6808022, 0, 0), NA..1 = c(0, 0.8750537, 0, 1, 
0, 0), NA..2 = c(0, 1, 0, 0, 0, 0), NA..3 = c(0, 0.6069765, 0, 
1, 0, 0), NA..4 = c(0, 0.53831215, 0, 0.65073089, 0, 0)), .Names = c("X10", 
"X33.95", "X58.66", "X84.42", "X110.21", "X134.16", "X164.69", 
"X199.1", "X234.35", "X257.19", "X361.84", "X432.74", "X506.34", 
"X581.46", "X651.71", "X732.59", "X817.56", "X896.24", "X971.77", 
"X1038.91", "NA..1", "NA..2", "NA..3", "NA..4"), row.names = c(NA, 
6L), class = "data.frame")

New edit:

I do not understand...

> apply(alcr_ready,2,is.numeric)
     NA.      X10   X33.95   X58.66   X84.42  X110.21  X134.16  X164.69   X199.1  X234.35  X257.19  X361.84  X432.74 
   FALSE    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE 
 X506.34  X581.46  X651.71  X732.59  X817.56  X896.24  X971.77 X1038.91    NA..1    NA..2    NA..3    NA..4 
   FALSE    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE

Checking again:

> class(alcr_ready[2,2])
[1] "numeric"

Edit again:

'data.frame':   2188 obs. of  25 variables:
 $ NA.     : Factor w/ 2890 levels "AT1G01050","AT1G01080",..: 1 2 3 4 5 6 7 10 11 12 ...
 $ X10     : num  0 0 0 0 0 0 0 0 0 0 ...
 $ X33.95  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ X58.66  : num  0 0 0 0 0 ...
 $ X84.42  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ X110.21 : num  0.0493 0 0 0 0 ...
 $ X134.16 : num  0.498 0 0 0 0 ...
 $ X164.69 : num  1 0 1 0 0 0 0 0 0 0 ...
 $ X199.1  : num  0.367 0 0 0 1 ...
 $ X234.35 : num  0.196 0 0 0.965 0.938 ...
 $ X257.19 : num  0 0 0 0.772 0 ...
 $ X361.84 : num  0 0 1 0.764 0 ...
 $ X432.74 : num  0 0 0.816 0.338 0 ...
 $ X506.34 : num  0 0 0.816 0.139 0 ...
 $ X581.46 : num  0 0 0.96 0.863 0 ...
 $ X651.71 : num  0 0 0 0.778 0 ...
 $ X732.59 : num  0 0 1 0.46 0 ...
 $ X817.56 : num  0 0 0 0.146 0 ...
 $ X896.24 : num  0 0.401 0 0.8 0 ...
 $ X971.77 : num  0 0.324 0 0.74 0 ...
 $ X1038.91: num  0 0.417 0 0.681 0 ...
 $ NA..1   : num  0 0.875 0 1 0 ...
 $ NA..2   : num  0 1 0 0 0 0 0 0 0 0 ...
 $ NA..3   : num  0 0.607 0 1 0 ...
 $ NA..4   : num  0 0.538 0 0.651 0 ...

Attempted code:

> indx1 <- max.col(alcr_ready, 'first')==max.col(tps_ready, 'first')
Warning messages:
1: In max.col(alcr_ready, "first") : NAs introduced by coercion
2: In max.col(tps_ready, "first") : NAs introduced by coercion
> indx2 <- max.col(alcr_ready, 'last')==max.col(tbl_tps, 'last')
Warning messages:
1: In max.col(alcr_ready, "last") : NAs introduced by coercion
2: In max.col(tbl_tps, "last") : NAs introduced by coercion
3: In max.col(alcr_ready, "last") == max.col(tbl_tps, "last") :
  longer object length is not a multiple of shorter object length

+3

r

Shaxi liver Dec 16 14 at 12:16

2 answers

If you want to know whether the maximum of each row is in the same column in both datasets, you can do:

# find the column(s) holding the maximum in each dataset (for ties, the column numbers are separated by ";"):
max1 <- apply(example1, 1, function(x) paste(which(x == max(x)), collapse = ";"))
max2 <- apply(example2, 1, function(x) paste(which(x == max(x)), collapse = ";"))

# compare the two vectors (the last two lines are probably the most interesting):
all(max1 == max2)    # TRUE if no row's maximum moved
any(max1 == max2)    # TRUE if at least one row kept its maximum column(s)
sum(max1 != max2)    # number of rows whose maximum moved
which(max1 != max2)  # which rows those are

In your example:

> max1
                Mazda RX4             Mazda RX4 Wag                Datsun 710            Hornet 4 Drive         Hornet Sportabout                   Valiant 
                      "3"                       "4" "1;2;3;4;5;6;7;8;9;10;11"                     "8;9"                       "5"                      "11" 
> max2
                Mazda RX4             Mazda RX4 Wag                Datsun 710            Hornet 4 Drive         Hornet Sportabout                   Valiant 
                      "1"                       "8" "1;2;3;4;5;6;7;8;9;10;11"                     "8;9"                       "5"                      "11" 
> which(max1!=max2)
    Mazda RX4 Mazda RX4 Wag 
            1             2 
> sum(max1!=max2)
[1] 2
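
The question asks for the row name together with YES/NO. One way to get there from the vectors above is sketched below; it rebuilds the sample data from the question so that it runs on its own. The helper names make_df and max_cols and the output columns row and moved are made up for illustration and are not part of the original answer.

```r
# Rebuild the two sample data frames from the question.
rows <- c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710",
          "Hornet 4 Drive", "Hornet Sportabout", "Valiant")
cols <- paste0("Fr", 1:11)

# Hypothetical helper: fill a 6 x 11 data frame row by row.
make_df <- function(values) {
  as.data.frame(matrix(values, nrow = length(rows), byrow = TRUE,
                       dimnames = list(rows, cols)))
}

# Rows 3 to 6 are identical in both examples.
shared <- c(rep(0, 11),                                 # Datsun 710
            0, 0, 0, 0, 0, 0, 0.7, 1, 1, 0.65, 0.2,     # Hornet 4 Drive
            0, 0, 0, 0.5, 1, 0.3, 0, 0, 0, 0, 0,        # Hornet Sportabout
            0, 0, 0, 0, 0, 0, 0, 0, 0, 0.7, 1)          # Valiant
example1 <- make_df(c(0.2, 0.7, 1, 0.1, 0, 0, 0, 0, 0, 0, 0,
                      0, 0, 0.35, 1, 0.4, 0, 0, 0, 0, 0, 0, shared))
example2 <- make_df(c(1, 0.7, 0.2, 0.1, 0, 0, 0, 0, 0, 0, 0,
                      0, 0, 0, 0, 0.1, 0, 0.8, 1, 0.3, 0, 0, shared))

# Same idea as above: all maximum columns of a row, joined with ";".
max_cols <- function(df) {
  apply(df, 1, function(x) paste(which(x == max(x)), collapse = ";"))
}
max1 <- max_cols(example1)
max2 <- max_cols(example2)

# One line per row: its name and whether the set of maximum columns changed.
result <- data.frame(row = names(max1),
                     moved = unname(ifelse(max1 != max2, "YES", "NO")),
                     stringsAsFactors = FALSE)
print(result)
```

With this comparison a row counts as moved whenever its set of maximum columns differs at all, so a row that goes from two maxima to one would also be reported as YES.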

+2

Cath Dec 16 14 at 12:24

akrun · Accepted Answer · 2014-12-16T12:25:21+0000

Maybe:

 c('NO', 'YES')[(max.col(example1, 'first')==max.col(example2, 'first'))+1]
# [1] "NO"  "NO"  "YES" "YES" "YES" "YES"

If YES denotes that the max has moved to another column, the labels should be reversed:

c('YES', 'NO')[(max.col(example1, 'first')==max.col(example2, 'first'))+1]
# [1] "YES" "YES" "NO"  "NO"  "NO"  "NO"
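
The one-liners above rely on two things: max.col() with an explicit ties method, and indexing a label vector with a logical vector plus 1. A small sketch on a made-up matrix (the values and variable names are only for illustration) shows both pieces separately:

```r
# Made-up matrix: row 1 has its maximum twice (columns 2 and 3).
m <- matrix(c(0, 1, 1, 0.5,
              1, 0, 0, 0), nrow = 2, byrow = TRUE)

# Column of the first and of the last maximum in each row.
first_max <- max.col(m, ties.method = "first")  # 2 1
last_max  <- max.col(m, ties.method = "last")   # 3 1

# Adding 1 to a logical vector gives an index: FALSE -> 1, TRUE -> 2.
same <- c(TRUE, FALSE)
labels <- c("YES", "NO")[same + 1]  # TRUE (same column) picks "NO"
print(labels)
```

Without an explicit ties method, max.col() breaks ties at random, which is why 'first' or 'last' is passed in the answer.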

If a row can contain up to two 1s:

 indx1 <- max.col(example1, 'first')==max.col(example2, 'first')
 indx2 <- max.col(example1, 'last')==max.col(example2, 'last')
  c('YES', 'NO')[(indx1|indx2)+1]
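
The warnings shown in the question ("NAs introduced by coercion" and "longer object length is not a multiple of shorter object length") look consistent with two things: max.col() converts its input to a matrix, so a non-numeric ID column such as NA. turns everything into text, and the two tables being compared do not have the same number of rows. The following is a minimal sketch under those assumptions, not a confirmed diagnosis. The data frames a and b, the column name id, and all values are hypothetical stand-ins for the real tables.

```r
# Hypothetical stand-ins: an ID column (a factor) plus numeric fraction columns.
a <- data.frame(id = c("g1", "g2", "g3"),
                X10 = c(1, 0, 0.2), X20 = c(0.5, 1, 1), X30 = c(0, 1, 0.3),
                stringsAsFactors = TRUE)
b <- data.frame(id = c("g3", "g1", "g2", "g4"),
                X10 = c(1, 0.1, 0, 0), X20 = c(0.4, 1, 1, 1), X30 = c(0, 0, 0.2, 0),
                stringsAsFactors = TRUE)

# Keep only the IDs present in both tables, in one common order.
ids <- intersect(as.character(a$id), as.character(b$id))

# Select matching rows and drop every non-numeric column before max.col().
a_num <- a[match(ids, as.character(a$id)), sapply(a, is.numeric), drop = FALSE]
b_num <- b[match(ids, as.character(b$id)), sapply(b, is.numeric), drop = FALSE]

# Treat the maximum as unchanged if either the first or the last maximum matches.
same_first <- max.col(a_num, "first") == max.col(b_num, "first")
same_last  <- max.col(a_num, "last")  == max.col(b_num, "last")

result <- data.frame(id = ids,
                     moved = c("YES", "NO")[(same_first | same_last) + 1],
                     stringsAsFactors = FALSE)
print(result)
```

The same matrix conversion would also explain why apply(alcr_ready, 2, is.numeric) reports FALSE for every column while class(alcr_ready[2,2]) is "numeric": apply() works on the converted character matrix, whereas sapply() checks each column of the data frame as it is.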
