R: use a string to refer to a column

I would like to subset a data frame by referring to a column with a string and selecting the values of that column that satisfy a condition. Given the following code

 employee <- c('John Doe','Peter Gynn','Jolie Hope')
 salary <- c(21000, 23400, 26800)
 startdate <- as.Date(c('2010-11-1','2008-3-25','2007-3-14'))
 employ.data <- data.frame(employee, salary, startdate)
 salary_string <- "salary"

      

I want to get all salaries over 23000, using salary_string to reference the column name.

I tried without success:

set <- subset(employ.data, salary_string > 23000)
set2 <- employ.data[, employ.data$salary_string > 23000]

      

This doesn't seem to work because salary_string is of type character, but I need some sort of "column name object". Using as.name(salary_string) doesn't work either. I know I can get the subset using

set <- subset(employ.data, salary > 23000)

      

But my goal is to use a column name that is of type character (salary_string), once with subset(employ.data, ...) and once with employ.data[, ...].

+3




3 answers


Short answer: don't use subset(), but something like



employ.data[employ.data[salary_string]>23000,]
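
As a minimal sketch of why the $ attempt in the question fails and how [[ ]] avoids it (the data frame is rebuilt here only so the snippet runs on its own; the names missing_col, keep and set2 are just for illustration): $ takes its argument literally, so employ.data$salary_string looks for a column actually called salary_string, while [[ ]] evaluates its argument and therefore accepts a character variable.

```r
# Rebuild the data frame from the question
employ.data <- data.frame(
  employee  = c("John Doe", "Peter Gynn", "Jolie Hope"),
  salary    = c(21000, 23400, 26800),
  startdate = as.Date(c("2010-11-1", "2008-3-25", "2007-3-14"))
)
salary_string <- "salary"

# `$` does not evaluate its argument: there is no column named "salary_string"
missing_col <- employ.data$salary_string   # NULL

# `[[` evaluates salary_string and returns the salary column as a plain vector
keep <- employ.data[[salary_string]] > 23000

# Use the logical vector as a row index and keep all columns
set2 <- employ.data[keep, ]
```

Note that the condition goes in the row position (before the comma); the question's second attempt put it in the column position.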

      

+5




Here's another idea:

dplyr::filter(employ.data, get(salary_string) > 23000)

      



Which gives:

#    employee salary  startdate
#1 Peter Gynn  23400 2008-03-25
#2 Jolie Hope  26800 2007-03-14
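
A possible variation, assuming a reasonably recent dplyr is installed: get() searches the calling environment as well as the data frame, so it may pick up an unrelated object with the same name. The .data pronoun restricts the lookup to the columns of the data frame. The sketch below rebuilds the question's data so it runs on its own; set3 is just an illustrative name.

```r
library(dplyr)

# Rebuild the data frame from the question
employ.data <- data.frame(
  employee  = c("John Doe", "Peter Gynn", "Jolie Hope"),
  salary    = c(21000, 23400, 26800),
  startdate = as.Date(c("2010-11-1", "2008-3-25", "2007-3-14"))
)
salary_string <- "salary"

# .data[[...]] looks the name up among the data frame's columns only
set3 <- filter(employ.data, .data[[salary_string]] > 23000)
```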

      

+3




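To show you how to achieve the result with subset():

The problem you are having is that subset() uses non-standard evaluation: its second argument is evaluated as an expression, so salary_string > 23000 compares the string "salary" itself with 23000 rather than the column values. Here's one way to substitute your string into a subset() call.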

## set up an unevaluated call
e <- call(">", as.name(salary_string), 23000)
## evaluate it in subset()
subset(employ.data, eval(e))
#     employee salary  startdate
# 2 Peter Gynn  23400 2008-03-25
# 3 Jolie Hope  26800 2007-03-14

      

Or, as Stephen suggests, the following will work well.

subset(employ.data, eval(as.name(salary_string)) > 23000)
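
Another way to sketch the same idea is to build the whole subset() call first and evaluate it afterwards, here with bquote() from base R. This is only an illustration under the question's setup (the data frame is rebuilt so the snippet runs on its own; cl and result are illustrative names), not a claim that it is preferable to the versions above.

```r
# Rebuild the data frame from the question
employ.data <- data.frame(
  employee  = c("John Doe", "Peter Gynn", "Jolie Hope"),
  salary    = c(21000, 23400, 26800),
  startdate = as.Date(c("2010-11-1", "2008-3-25", "2007-3-14"))
)
salary_string <- "salary"

# .() splices the symbol `salary` into the otherwise unevaluated call,
# producing: subset(employ.data, salary > 23000)
cl <- bquote(subset(employ.data, .(as.name(salary_string)) > 23000))

# Evaluate the constructed call
result <- eval(cl)
```

Printing cl before evaluating it is a convenient way to check that the string was turned into the column name you expected.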

      

+2

