Nested ifelse: improved syntax

Description

Function

ifelse () allows you to filter the values ​​in a vector through a series of tests, each of which performs different actions in case of a positive result. For example, let xx

's say data.frame is like this:

xx <- data.frame(a=c(1,2,1,3), b=1:4)
xx

      

ab
1 1
2 2
1 3
3 4

Suppose you want to create a new column c , from column b , but depending on the values ​​in column a in the following:

For each line

  • if the value in column a is 1, the value in column c is the same as the value in column b .
  • if the value in column a is 2, the value in column c is 100 times the value in column b .
  • otherwise, the value in column c is the negative value in column b .

Using ifelse () the solution could be:

xx$c <- ifelse(xx$a==1, xx$b, 
               ifelse(xx$a==2, xx$b*100,
                      -xx$b))
xx

      

abc
1 1 1
2 2 200
1 3 3
3 4 -4

Problem 1

An aesthetic problem arises when the number of tests increases, say four tests:

xx$c <- ifelse(xx$a==1, xx$b, 
           ifelse(xx$a==2, xx$b*100,
                  ifelse(xx$a==3, ...,
                         ifelse(xx$a==4, ...,
                                ...))))

      

I found a partial solution to the problem in this page , which is to define the functions if.else_ (), i_ (), e_ () as follows:

library(lazyeval)
i_ <- function(if_stat, then) {
    if_stat <- lazyeval::expr_text(if_stat)
    then    <- lazyeval::expr_text(then)
    sprintf("ifelse(%s, %s, ", if_stat, then)
}

e_ <- function(else_ret) {
    else_ret <- lazyeval::expr_text(else_ret)
    else_ret
}

if.else_ <- function(...) {
    args <- list(...)

    for (i in 1:(length(args) - 1) ) {
        if (substr(args[[i]], 1, 6) != "ifelse") {
            stop("All but the last argument, need to be if.then_ functions.", call. = FALSE)
        }
    }
    if (substr(args[[length(args)]], 1, 6) == "ifelse"){
        stop("Last argument needs to be an else_ function.", call. = FALSE)
    }
    args$final <- paste(rep(')', length(args) - 1), collapse = '')
    eval_string <- do.call('paste', args)
    eval(parse(text = eval_string))
}

      

So the problem mentioned in the Description can be rewritten as follows:

xx <- data.frame(a=c(1,2,1,3), b=1:4)
xx$c <- if.else_(
    i_(xx$a==1, xx$b),
    i_(xx$a==2, xx$b*100),
    e_(-xx$b)
) 
xx

      

abc
1 1 1
2 2 200
1 3 3
3 4 -4

And the code for the four tests would be simple:

xx$c <- if.else_(
    i_(xx$a==1, xx$b),
    i_(xx$a==2, xx$b*100),
    i_(xx$a==3, ...), # dots meaning actions for xx$a==3
    i_(xx$a==4, ...), # dots meaning actions for xx$a==4
    e_(...)           # dots meaning actions for any other case
) 

      

Problem 2 and question

This code appears to solve the problem. Then I wrote the following test function:

test.ie <- function() {
    dd <- data.frame(a=c(1,2,1,3), b=1:4)
    if.else_(
        i_(dd$a==1, dd$b),
        i_(dd$a==2, dd$b*100),
        e_(-dd$b)
    ) # it should give c(1, 200, 3, -4)
}

      

When I tried the test:

 test.ie()

      

it says the following error message:

Error in ifelse (dd $ a == 1, dd $ b, ifelse (dd $ a == 2, dd $ b * 100, -dd $ b)):
 object 'dd' not found

Question

Since the if.else _ () syntax constructor is not meant to be run from the console alone, is there a way to "know" the variables from the function that calls it?

Note

In "The best way to replace a long ifelse structure in R ", a similar problem has been posted. However, this solution focuses on constructing a new table column with the given constant values (slots "then" or "else" of the ifelse () function), whereas my case concerns the syntax problem where slots "then" or "else" can even be expressions in terms of other elements or data.frame variables.

+3


source to share


3 answers


With full respect for the OP's wonderful effort to improve nesting ifelse()

, I prefer a different approach that I find easy to write, concise, convenient, and fast:

xx <- data.frame(a=c(1L,2L,1L,3L), b=1:4)

library(data.table)
# coerce to data.table, and set the default first
setDT(xx)[, c:= -b]
xx[a == 1L, c := b]        # 1st special case
xx[a == 2L, c := 100L*b]   # 2nd special case, note use of integer 100L
# xx[a == 3L, c := ...]    # other cases
# xx[a == 4L, c := ...]
#...

xx
#   a b   c
#1: 1 1   1
#2: 2 2 200
#3: 1 3   3
#4: 3 4  -4     

      

Note that for the second special case, b

multiply by an integer constant 100L

to ensure that the right-hand sides of all types are integers to avoid double conversion.


Edit 2: This can also be written in an even more concise (but still maintainable) way as a one-liner:



setDT(xx)[, c:= -b][a == 1L, c := b][a == 2L, c := 100*b][]

      

data.table

chaining works here because it c

updates in place, so that subsequent expressions act on all rows xx

, even if the previous expression was a selective update of a subset of rows.


Edit 1: This approach can be implemented with an R base as well:

xx <- data.frame(a=c(1L,2L,1L,3L), b=1:4)

xx$c <- -xx$b
idx <- xx$a == 1L; xx$c[idx] <- xx$b[idx]
idx <- xx$a == 2L; xx$c[idx] <- 100 * xx$b[idx]

xx
#  a b   c
#1 1 1   1
#2 2 2 200
#3 1 3   3
#4 3 4  -4

      

+1


source


I think you can use dplyr::case_when

internally dplyr::mutate

to achieve this.

library(dplyr)

df <- tibble(a=c(1,2,1,3), b=1:4)

df %>% 
  mutate(
    foo = case_when(
      .$a == 1 ~ .$b,
      .$a == 2 ~ .$b * 100L,
      TRUE   ~ .$b * -1L
    )
  )

#> # A tibble: 4 x 3
#>       a     b   foo
#>   <dbl> <int> <int>
#> 1     1     1     1
#> 2     2     2   200
#> 3     1     3     3
#> 4     3     4    -4

      



In an upcoming update, dplyr 0.6.0

you don't need to use akward bypass .$

and you can simply use:

df %>% 
  mutate(
    foo = case_when(
      a == 1 ~ b,
      a == 2 ~ b * 100L,
      TRUE   ~ b * -1L
    )
  )

      

+8


source


Taking into account MrFlick's recommendation , I recoded the if.else _ () function as follows:

if.else_ <- function(...) {
    args <- list(...)

    for (i in 1:(length(args) - 1) ) {
        if (substr(args[[i]], 1, 6) != "ifelse") {
            stop("All but the last argument, need to be if.then_ functions.", call. = FALSE)
        }
    }
    if (substr(args[[length(args)]], 1, 6) == "ifelse"){
        stop("Last argument needs to be an else_ function.", call. = FALSE)
    }
    args$final <- paste(rep(')', length(args) - 1), collapse = '')
    eval_string <- do.call('paste', args)
    eval(parse(text = eval_string), envir = parent.frame())
}

      

The test.ie () function now works correctly

test.ie()

      

[1] 1 200 3 -4

+2


source







All Articles