Changing values in multiple columns of a data frame using a lookup table

I am trying to change the values in a number of columns at once using a lookup table. All of the columns use the same lookup table. I know how to do this for just one column (I would just use merge), but I'm having trouble doing it for multiple columns.

Below is an example data frame and an example lookup table. My actual data is much larger (~10K columns with 8 rows).

example <- data.frame(a = seq(1,5), b = seq(5,1), c=c(1,4,3,2,5))

lookup <- data.frame(number = seq(1,5), letter = LETTERS[seq(1,5)])

Ideally, I would end up with a data frame that looks like this:

example_of_ideal_output <- data.frame(a = LETTERS[seq(1,5)], b = LETTERS[seq(5,1)], c=LETTERS[c(1,4,3,2,5)])

Of course, in my real data the data frame holds numbers, but the lookup table is much more complex, so I can't just use a function like LETTERS to solve the problem.

Thank you in advance!



2 answers


Here's a solution that processes each column in turn with lapply():

as.data.frame(lapply(example,function(col) lookup$letter[match(col,lookup$number)]));
##   a b c
## 1 A E A
## 2 B D D
## 3 C C C
## 4 D B B
## 5 E A E
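To see why this works, it may help to look at match() on its own: for each element of its first argument it returns the position of the first match in the second argument, or NA when there is no match. Below is a minimal sketch; the vectors keys, vals and x are made up for illustration and are not part of the question's data.

```r
# Hypothetical lookup with non-consecutive keys
keys <- c(10, 20, 30)
vals <- c("low", "mid", "high")

x <- c(30, 10, 99, 20)

pos <- match(x, keys)   # position of each x in keys; NA where x has no match
pos
## [1]  3  1 NA  2

res <- vals[pos]        # indexing with NA yields NA
res
## [1] "high" "low"  NA     "mid"
```

Values missing from the lookup table therefore come back as NA rather than raising an error, which is worth checking for on real data.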

      

Alternatively, if you don't mind switching to a matrix, you can get a "more vectorized" solution, since a matrix lets you call match() and index lookup$letter just once for the entire input:



matrix(lookup$letter[match(as.matrix(example),lookup$number)],nrow(example));
##      [,1] [,2] [,3]
## [1,] "A"  "E"  "A"
## [2,] "B"  "D"  "D"
## [3,] "C"  "C"  "C"
## [4,] "D"  "B"  "B"
## [5,] "E"  "A"  "E"

      

(And of course you can go back to a data.frame via as.data.frame(), although you will have to restore the column names as well if you want them, which can be done with setNames(...,names(example)). But if you really want to stick with a data.frame, my first solution is probably preferable.)
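The round trip back to a data.frame could look roughly like this. It is only a sketch built from the question's example data; the variable names m and result are introduced here for illustration.

```r
example <- data.frame(a = seq(1,5), b = seq(5,1), c = c(1,4,3,2,5))
lookup  <- data.frame(number = seq(1,5), letter = LETTERS[seq(1,5)])

# One match() call over the whole input, reshaped to the original row count
m <- matrix(lookup$letter[match(as.matrix(example), lookup$number)], nrow(example))

# as.data.frame() names the columns V1, V2, ...; setNames() restores the originals
result <- setNames(as.data.frame(m), names(example))
names(result)
## [1] "a" "b" "c"
```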



Using dplyr

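# Named vector: names are the lookup keys, values are the replacements
key <- setNames(lookup$letter, lookup$number)
# Index by name (character), not by position, so keys need not be 1..n
f <- function(x) unname(key[as.character(x)])
library(dplyr)
example %>%
  mutate(across(everything(), f))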
#  a b c
#1 A E A
#2 B D D
#3 C C C
#4 D B B
#5 E A E

      



Or using data.table

library(data.table)
setDT(example)[, lapply(.SD, f), ]
#   a b c
#1: A E A
#2: B D D
#3: C C C
#4: D B B
#5: E A E
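The named-vector lookup behind f depends on how the vector is indexed: a numeric index selects by position, while a character index selects by name. The two only coincide when the keys happen to be 1, 2, 3, and so on. Here is a minimal sketch with a hypothetical lookup table (lookup2, not from the question) whose keys are not consecutive:

```r
# Hypothetical lookup table with keys 10, 20, 30
lookup2 <- data.frame(number = c(10, 20, 30), letter = c("X", "Y", "Z"),
                      stringsAsFactors = FALSE)
named <- setNames(lookup2$letter, lookup2$number)

x <- c(20, 10, 30)

by_position <- named[x]                    # positions 20, 10, 30 do not exist: all NA
by_name <- unname(named[as.character(x)])  # looks up the names "20", "10", "30"
by_name
## [1] "Y" "X" "Z"
```

Converting the index with as.character() is what makes the lookup go by key, so it seems the safer choice whenever the real lookup table is more complex than the example.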

      
