Exceeding the limit with ifelse statements
Problem: I wrote a giant piece of code with over 100 nested ifelse statements, only to find out that there is a limit on how deeply ifelse statements can be nested: exceeding 50 throws an error. Anyway, I know there must be a more efficient way to do what I am trying to do.
Purpose: I am trying to write a function that recodes many string variants (see the example below) into clean categories. I use str_detect to get TRUE / FALSE and then assign the correct category based on the answer. How can I do this without more than 100 ifelse statements (I have a lot more categories)?
Example:
library(dplyr)
library(stringr)

mydf <- tibble(answer = sample(1:5, 10, replace = TRUE),
               location = c("at home", "home", "in a home",
                            "school", "my school", "School", "Work", "work",
                            "working", "work usually"))

loc_function <- function(x) {
  home <- "home"
  school <- "school"
  work <- "work"
  ifelse(str_detect(x, regex(home, ignore_case = TRUE)), "At Home",
    ifelse(str_detect(x, regex(school, ignore_case = TRUE)), "At School",
      ifelse(str_detect(x, regex(work, ignore_case = TRUE)), "At Work", x)))
}
### Using function to clean up messy strings (and recode first column too) into clean categories
mycleandf <- mydf %>%
  as_tibble() %>%
  mutate(answer = ifelse(answer >= 2, 1, 0)) %>%
  mutate(location = loc_function(location)) %>%
  select(answer, location)
mycleandf
# A tibble: 10 x 2
   answer location
    <dbl> <chr>
 1      1 At Home
 2      1 At Home
 3      1 At Home
 4      1 At School
 5      1 At School
 6      1 At School
 7      1 At Work
 8      0 At Work
 9      1 At Work
10      0 At Work
You can put your patterns in a named vector (note Other = "": the empty pattern matches every string, so it acts as the fallback when none of your other patterns match):
patterns <- c("At Home" = "home", "At School" = "school", "At Work" = "work", "Other" = "")
Then loop over the patterns and check whether each string contains each pattern:
match <- sapply(patterns, grepl, mydf$location, ignore.case = TRUE)
Finally, create a new column holding the name of the first pattern that matches each string; if none of the real patterns match, it falls back to Other:
mydf$clean_loc <- colnames(match)[max.col(match, ties.method = "first")]
mydf
# A tibble: 10 x 3
#    answer location     clean_loc
#     <int> <chr>        <chr>
#  1      3 at home      At Home
#  2      3 home         At Home
#  3      3 in a home    At Home
#  4      3 school       At School
#  5      2 my school    At School
#  6      4 School       At School
#  7      5 Work         At Work
#  8      1 work         At Work
#  9      2 working      At Work
# 10      1 work usually At Work
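The example data above never reaches the Other fallback, so here is a minimal self-contained sketch of the same approach in base R that does. The wrapper name recode_location and the input vector x are made up for illustration and are not part of the answer; "gym" is included only because it matches none of the real patterns.

```r
# Named vector: names are the clean categories, values are the patterns.
# The empty pattern "" matches every string, so "Other" is always TRUE.
patterns <- c("At Home" = "home", "At School" = "school",
              "At Work" = "work", "Other" = "")

# Hypothetical helper wrapping the steps described above
recode_location <- function(x, patterns) {
  # One logical column per pattern, one row per element of x
  hits <- sapply(patterns, grepl, x, ignore.case = TRUE)
  # sapply() returns a plain vector when x has length 1, so force a matrix
  hits <- matrix(hits, nrow = length(x),
                 dimnames = list(NULL, names(patterns)))
  # The first TRUE column wins, so real patterns take priority over "Other"
  names(patterns)[max.col(hits, ties.method = "first")]
}

x <- c("at home", "My School", "working", "gym")
result <- recode_location(x, patterns)
print(result)
```

Because ties are broken by taking the first matching column, the order of the entries in patterns decides which category wins when a string matches more than one pattern, and Other has to stay last.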
Instead of nesting the conditions, you could apply them one after another, using a for loop:
# Store the find-replace pairs in a data frame
word_map <- data.frame(pattern = c("home", "school", "work"),
                       replacement = c("At Home", "At School", "At Work"),
                       stringsAsFactors = FALSE)
word_map
  pattern replacement
1    home     At Home
2  school   At School
3    work     At Work
# Iterate through the pairs
for (i in seq_len(nrow(word_map))) {
  pattern <- word_map$pattern[i]
  replacement <- word_map$replacement[i]
  # Overwrite matching entries; leave everything else unchanged
  mydf$location <- ifelse(grepl(pattern, mydf$location, ignore.case = TRUE),
                          replacement, mydf$location)
}
mydf
   answer  location
1       4   At Home
2       4   At Home
3       1   At Home
4       5 At School
5       1 At School
6       2 At School
7       5   At Work
8       2   At Work
9       1   At Work
10      3   At Work
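The same loop can also be wrapped in a function so that the original column is not overwritten in place. This is only a sketch: the function name recode_with_map and the sample input are invented for illustration. Since the replacements are applied one after another, a later pattern can in principle also match text produced by an earlier replacement, so the order of the rows in word_map matters.

```r
# Find-replace pairs, as in the answer above
word_map <- data.frame(pattern = c("home", "school", "work"),
                       replacement = c("At Home", "At School", "At Work"),
                       stringsAsFactors = FALSE)

# Hypothetical helper: applies each find-replace pair in turn
recode_with_map <- function(x, word_map) {
  for (i in seq_len(nrow(word_map))) {
    # Which elements contain the current pattern (case-insensitive)?
    hit <- grepl(word_map$pattern[i], x, ignore.case = TRUE)
    # Replace only those elements; everything else is left as-is
    x[hit] <- word_map$replacement[i]
  }
  x
}

x <- c("at home", "School", "work usually", "gym")
cleaned <- recode_with_map(x, word_map)
print(cleaned)
```

Unlike the nested version, adding a category here means adding one row to word_map, so the code never gets any deeper. Strings that match nothing, such as "gym", are returned unchanged.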