Changing values in multiple columns of a data frame using a lookup table
I am trying to change the values in a number of columns at once using a lookup table. They all use the same lookup table. I know how to do this for just one column (I would just use merge), but I'm having trouble doing it for multiple columns.
Below is an example data frame and an example lookup table. My actual data is much larger (~ 10K columns with 8 rows).
example <- data.frame(a = seq(1,5), b = seq(5,1), c=c(1,4,3,2,5))
lookup <- data.frame(number = seq(1,5), letter = LETTERS[seq(1,5)])
Ideally, I would end up with a data frame that looks like this:
example_of_ideal_output <- data.frame(a = LETTERS[seq(1,5)], b = LETTERS[seq(5,1)], c=LETTERS[c(1,4,3,2,5)])
Of course, in my real data the data frame contains numbers, but the lookup table is much more complex, so I can't just use something like LETTERS to solve the problem.
Thank you in advance!
Here's a solution that works on each column in turn with lapply():
as.data.frame(lapply(example,function(col) lookup$letter[match(col,lookup$number)]));
## a b c
## 1 A E A
## 2 B D D
## 3 C C C
## 4 D B B
## 5 E A E
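One thing to keep in mind with this approach is that match() returns NA for any value that has no entry in the lookup table, so those cells come back as NA. A minimal sketch of that behaviour, using made-up data with non-consecutive keys (example2, lookup2 and recode_col are hypothetical names, not from the question):

```r
# Hypothetical data: keys are not 1..n, and 99 has no lookup entry
example2 <- data.frame(a = c(10, 30, 99), b = c(30, 20, 10))
lookup2  <- data.frame(number = c(10, 20, 30),
                       letter = c("x", "y", "z"),
                       stringsAsFactors = FALSE)

# Same match()-based recoding as above, wrapped in a helper
recode_col <- function(col) lookup2$letter[match(col, lookup2$number)]

result2 <- as.data.frame(lapply(example2, recode_col), stringsAsFactors = FALSE)
result2$a        # "x" "z" NA  (99 is not in lookup2)
anyNA(result2)   # TRUE, so unmatched values are easy to detect
```

Checking anyNA() on the result is a cheap way to confirm that the lookup table covers every value before relying on the recoded data.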
Alternatively, if you don't mind switching to a matrix, you can achieve a "more vectorized" solution, since a matrix lets you call match() and index lookup$letter just once for the entire input:
matrix(lookup$letter[match(as.matrix(example),lookup$number)],nrow(example));
## [,1] [,2] [,3]
## [1,] "A" "E" "A"
## [2,] "B" "D" "D"
## [3,] "C" "C" "C"
## [4,] "D" "B" "B"
## [5,] "E" "A" "E"
(And of course you can go back to a data.frame via as.data.frame(), although you will have to restore the column names as well if you want them, which can be done with setNames(...,names(example)). But if you really want to stick with a data.frame, my first solution is probably preferable.)
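The round trip described in the note above can be sketched as follows, using the example and lookup objects from the question. The intermediate names m and result are only for illustration:

```r
example <- data.frame(a = seq(1, 5), b = seq(5, 1), c = c(1, 4, 3, 2, 5))
lookup  <- data.frame(number = seq(1, 5), letter = LETTERS[seq(1, 5)])

# One match() call over the whole matrix, reshaped to the original row count
m <- matrix(lookup$letter[match(as.matrix(example), lookup$number)], nrow(example))

# Back to a data frame, restoring the original column names
result <- setNames(as.data.frame(m), names(example))
result
```

Depending on the R version, the resulting columns may be character vectors or factors, so wrap them in as.character() if the distinction matters downstream.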
Using dplyr
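library(dplyr)
# Look up by name rather than by position, so the keys need not be 1..n
f <- function(x) unname(setNames(lookup$letter, lookup$number)[as.character(x)])
example %>%
  mutate(across(everything(), f))  # mutate_each() and funs() are deprecated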
# a b c
#1 A E A
#2 B D D
#3 C C C
#4 D B B
#5 E A E
Or using data.table
library(data.table)
# setDT() converts example to a data.table by reference
setDT(example)[, lapply(.SD, f)]
# a b c
#1: A E A
#2: B D D
#3: C C C
#4: D B B
#5: E A E
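Both snippets rely on a named-vector lookup. Indexing a named vector with numbers selects by position, which agrees with the names only when the keys happen to be 1, 2, ..., n as in the example data. A hedged sketch with made-up, non-consecutive keys (lut and keys are hypothetical), converting the keys to character so the lookup goes by name:

```r
# Hypothetical lookup whose keys are not 1..n
lut  <- setNames(c("x", "y", "z"), c(10, 20, 30))
keys <- c(30, 10, 20)

# Numeric indexing is positional: positions 30, 10 and 20 do not exist
by_pos <- unname(lut[keys])
by_pos    # NA NA NA

# Character indexing goes through the names
by_name <- unname(lut[as.character(keys)])
by_name   # "z" "x" "y"
```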