R: passing a data.table column name as a parameter to a function

Suppose I have a data.table defined as:

    > dt <- data.table(x=c(1,2,3,4), y=c("y","n","y","m"), z=c("pickle",3,8,"egg"))

    > dt
        x   y        z 
    1:  1   y   pickle
    2:  2   n        3
    3:  3   y        8
    4:  4   m      egg

      

And the variable

    fn <- "z"

      

I get that I can pull the column out of the data.table like this:

    > dt[,fn, with=FALSE]

      

What I don't know how to do is the data.table equivalent of the following:

    > factorFunction <- function(df, fn) {
      df[,fn] <- as.factor(df[,fn])
      return(df)
     }

      

If I set fn = "x" and call factorFunction(data.frame(dt), fn), it works fine.

So I tried the same approach with a data.table, but it doesn't work:

    > factorFunction <- function(dt, fn) {
      dt[,fn, with=FALSE] <- as.factor(dt[,fn, with=FALSE])
      return(dt)
     }

      

    Error in sort.list(y) : 'x' must be atomic for 'sort.list'
    Have you called 'sort' on a list?



3 answers


You may try

    dt[, (fn) := factor(.SD[[1L]]), .SDcols = fn]
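
The error in the question most likely comes from the fact that `dt[, fn, with=FALSE]` returns a one-column data.table (which is a list), not an atomic vector, so `as.factor()` cannot sort it. A small sketch illustrating the difference, using the data from the question:

```r
library(data.table)

dt <- data.table(x = c(1, 2, 3, 4),
                 y = c("y", "n", "y", "m"),
                 z = c("pickle", 3, 8, "egg"))
fn <- "z"

# Subsetting with with=FALSE keeps the data.table (list) structure
is.data.table(dt[, fn, with = FALSE])

# [[ extracts the underlying atomic vector, which factor() can handle
is.character(dt[[fn]])
is.factor(factor(dt[[fn]]))
```

This is why the answer extracts the column with `.SD[[1L]]` before calling `factor()`.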

      

If there are multiple columns, use lapply(.SD, factor).
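
For the multi-column case, a minimal sketch might look like this (the vector name `cols` is just an illustrative choice, not part of the question):

```r
library(data.table)

dt <- data.table(x = c(1, 2, 3, 4),
                 y = c("y", "n", "y", "m"),
                 z = c("pickle", 3, 8, "egg"))

# Hypothetical vector holding the names of the columns to convert
cols <- c("y", "z")

# Convert every column listed in cols to a factor, by reference
dt[, (cols) := lapply(.SD, factor), .SDcols = cols]

sapply(dt, class)
```

Columns not listed in `cols` (here `x`) are left unchanged.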



Function wrapper

    factorFunction <- function(df, fn) {
      df[, (fn) := factor(.SD[[1L]]), .SDcols = fn]
    }

    str(factorFunction(dt, fn))
    # Classes ‘data.table’ and 'data.frame':    4 obs. of  3 variables:
    #  $ x: num  1 2 3 4
    #  $ y: chr  "y" "n" "y" "m"
    #  $ z: Factor w/ 4 levels "3","8","egg",..: 4 1 2 3

      



Similar to @akrun's answer:



    class(dt[[fn]])
    #[1] "character"

    setFactor <- function(DT, col) {
      # change the column type by reference
      DT[, c(col) := factor(DT[[col]])]
      invisible(NULL)
    }

    setFactor(dt, fn)
    class(dt[[fn]])
    #[1] "factor"
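
Because `:=` modifies the table by reference, the caller's `dt` changes even though the function returns nothing. If the original table should stay untouched, one option is to work on a copy. A minimal sketch, where `factorCopy` is a hypothetical helper name (not from the answers here) and `set()` is used in place of `:=`:

```r
library(data.table)

dt <- data.table(x = c(1, 2, 3, 4),
                 y = c("y", "n", "y", "m"),
                 z = c("pickle", 3, 8, "egg"))
fn <- "z"

# Hypothetical helper: returns a modified copy, leaves the input as-is
factorCopy <- function(DT, col) {
  out <- copy(DT)                                  # deep copy of the table
  set(out, j = col, value = factor(out[[col]]))    # update the copy by reference
  out
}

dt2 <- factorCopy(dt, fn)

class(dt[[fn]])   # "character": the original is unchanged
class(dt2[[fn]])  # "factor"
```

The trade-off is the extra memory for the copy, which is what the by-reference version avoids.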

      



I don't recommend this, as it is very unidiomatic:

    factorFunction <- function(df, col) {
      df[, col] <- factor(df[[col]])
      df
    }

      

On a positive note, it works with both base R data.frames and data.tables:

    df <- setDF(copy(dt))

    class(df[[fn]]) # character
    df <- factorFunction(df, fn)
    class(df[[fn]]) # factor

    class(dt[[fn]]) # character
    dt <- factorFunction(dt, fn)
    class(dt[[fn]]) # factor

      
