Shorten nested ifelse

Given the following data set, suppose we want to compare x1 with each of x2 to x5 in turn. The following could be used:

set.seed(1)
library(data.table)
TDT <- data.table(x1 = round(rnorm(100,0.75,0.3),2),
                  x2 = round(rnorm(100,0.75,0.3),2),
                  x3 = round(rnorm(100,0.75,0.3),2),
                  x4 = round(rnorm(100,0.75,0.3),2),
                  x5 = round(rnorm(100,0.75,0.3),2))

TDT[,compare := ifelse(x1 < x2,1,ifelse(x1 < x3,2,ifelse(x1 < x4,3,ifelse(x1 < x5,4,5))))]

      

So if x1 < x2, then compare == 1, and so on.

Now, in my actual data I have many more columns to compare x1 with. Is there a way to write this more concisely, i.e. without nested ifelse?



2 answers


We can do this by using Map and max.col in data.table:



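TDT[, compare := {
  # logical columns: is x1 smaller than each of x2..x5?
  d1 <- as.data.table(Map(function(x) x1 < x, .SD))
  # index of the first TRUE per row; rows with no TRUE at all are set to 5
  max.col(d1, "first") * (c(5, 1)[((Reduce(`+`, d1) != 0) + 1)])
}, .SDcols = x2:x5]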

#OP code
v1 <- TDT[, ifelse(x1 < x2,1,ifelse(x1 < x3,2,ifelse(x1 < x4,3,ifelse(x1 < x5,4,5))))]
identical(v1, TDT$compare)
#[1] TRUE
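
To make the idea behind the max.col approach easier to follow, here is a minimal base R sketch on toy data. The values and the names others, less, first_true and fallback are made up for illustration and are not part of the answer above. It builds the logical matrix, takes the first TRUE per row, and substitutes a fallback where a row has no TRUE.

```r
# Toy data (hypothetical values, not taken from the question)
x1 <- c(0.5, 0.9, 0.7)
others <- data.frame(x2 = c(0.6, 0.8, 0.1),
                     x3 = c(0.1, 0.95, 0.2),
                     x4 = c(0.2, 0.3, 0.3))

# Logical matrix: TRUE where x1 is smaller than the given column
less <- sapply(others, function(col) x1 < col)

# Column index of the first TRUE in each row
# (for a row with no TRUE, max.col still returns 1, so it needs fixing)
first_true <- max.col(less, ties.method = "first")

# Rows without any TRUE get the fallback category
fallback <- ncol(others) + 1L
compare <- ifelse(rowSums(less) > 0, first_true, fallback)
print(compare)
```

Using ncol(others) + 1 as the fallback avoids hard-coding the 5, so the same lines should carry over when more comparison columns are added.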

      



This saves a little typing and is easy to read.

TDT[, compare := dplyr::case_when(
      x1 < x2 ~ 1,
      x1 < x3 ~ 2,
      x1 < x4 ~ 3,
      x1 < x5 ~ 4,
      TRUE ~ 5)]

      

If you have so many columns that you don't want to mention them by name, you can use:



apply(TDT, 1, function (x) which(x[1] < x[2:5])[1]) 

      

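where x[2:5] should be replaced with the appropriate set of columns. Note that this returns NA, rather than 5, for rows where x1 is not smaller than any of the other columns.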
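
As a rough sketch of how the column set could be chosen by name and how rows without a match could be mapped to the fallback category, assuming the columns follow an x<number> naming pattern (the toy data and the names df, cmp_cols and idx are hypothetical):

```r
# Toy data (hypothetical values, not taken from the question)
df <- data.frame(x1 = c(0.5, 0.9, 0.7),
                 x2 = c(0.6, 0.8, 0.1),
                 x3 = c(0.1, 0.95, 0.2),
                 x4 = c(0.2, 0.3, 0.3))

# Select the comparison columns by name pattern instead of by position
cmp_cols <- setdiff(grep("^x[0-9]+$", names(df), value = TRUE), "x1")

# Position of the first column larger than x1, or NA if there is none
idx <- apply(df[, c("x1", cmp_cols)], 1,
             function(r) which(r[1] < r[-1])[1])

# Map NA to the fallback category (number of comparison columns + 1)
idx[is.na(idx)] <- length(cmp_cols) + 1L
idx <- unname(idx)
print(idx)
```

apply works row by row, so on wide or long tables it will likely be slower than a vectorised approach.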
