R use string to refer to column

Question

I would like to subset a data frame by referring to a column with a string and selecting the rows whose values in that column satisfy a condition. Given the following code:

 employee <- c('John Doe','Peter Gynn','Jolie Hope')
 salary <- c(21000, 23400, 26800)
 startdate <- as.Date(c('2010-11-1','2008-3-25','2007-3-14'))
 employ.data <- data.frame(employee, salary, startdate)
 salary_string <- "salary"

I want to get all salaries over 23000 using salary_string to reference the column name.

I tried without success:

set <- subset(employ.data, salary_string > 23000)
set2 <- employ.data[, employ.data$salary_string > 23000]

This doesn't seem to work because salary_string is of type character, but I need some sort of "column name object". Using as.name(salary_string) doesn't work either. I know I can get the subset using

set <- subset(employ.data, salary > 23000)

But my goal is to use a column name that is of type character (salary_string), once with subset(employ.data, ...) and once using employ.data[, ...].

+3

string r subset

Simon, Apr 29 '15 at 22:27

3 answers

Here's another idea:

dplyr::filter(employ.data, get(salary_string) > 23000)

Which gives:

#    employee salary  startdate
#1 Peter Gynn  23400 2008-03-25
#2 Jolie Hope  26800 2007-03-14
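
The get() lookup used above is not specific to dplyr. As a minimal base R sketch, assuming employ.data and salary_string are defined as in the question (they are rebuilt here so the example runs on its own), get() can resolve the column from its name inside subset() as well:

```r
# Rebuild the question's data so the example is self-contained.
employ.data <- data.frame(
  employee  = c("John Doe", "Peter Gynn", "Jolie Hope"),
  salary    = c(21000, 23400, 26800),
  startdate = as.Date(c("2010-11-1", "2008-3-25", "2007-3-14"))
)
salary_string <- "salary"

# get() looks up the object named by the string. Inside subset() the
# columns of the data frame are searched first, so this finds the column.
high_earners <- subset(employ.data, get(salary_string) > 23000)
print(high_earners)
```

One caveat worth keeping in mind: get() returns whatever object carries that name, so if the string does not match a column it may pick up a variable of the same name from the calling environment instead of raising an error.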

+3

Steven Beaupré, Apr 29 '15 at 22:50

To show you how to achieve the result with subset():

The problem you are having is that subset() uses non-standard evaluation. Here's one way to substitute your string into a subset() call.

## set up an unevaluated call
e <- call(">", as.name(salary_string), 23000)
## evaluate it in subset()
subset(employ.data, eval(e))
#     employee salary  startdate
# 2 Peter Gynn  23400 2008-03-25
# 3 Jolie Hope  26800 2007-03-14

Or, as Steven suggests, the following also works well.

subset(employ.data, eval(as.name(salary_string)) > 23000)
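
To make the non-standard evaluation issue concrete, the sketch below shows what the two failed attempts from the question actually evaluate. It is an illustration of the failure mode under the question's setup (data rebuilt here so it runs on its own), not part of the original answer.

```r
# Rebuild the question's data so the example is self-contained.
employ.data <- data.frame(
  employee  = c("John Doe", "Peter Gynn", "Jolie Hope"),
  salary    = c(21000, 23400, 26800),
  startdate = as.Date(c("2010-11-1", "2008-3-25", "2007-3-14"))
)
salary_string <- "salary"

# Attempt 1: subset() evaluates `salary_string > 23000` as written. There is
# no column called salary_string, so R falls back to the character variable
# and compares the string "salary" with 23000. The result is a single
# logical value that never looks at the salary column.
cmp <- salary_string > 23000
print(length(cmp))

# Attempt 2: `$` takes the name literally, so it looks for a column named
# "salary_string", which does not exist, and returns NULL.
print(is.null(employ.data$salary_string))

# as.name() builds the symbol `salary`, but the symbol still has to be
# evaluated against the data frame, which is what eval() does here.
col_values <- eval(as.name(salary_string), employ.data)
print(col_values > 23000)
```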

+2

Rich Scriven, Apr 29 '15 at 22:41

cryo111 · Accepted Answer · 2015-04-29T22:32:07+0000

Short answer: don't use subset, but something like

employ.data[employ.data[salary_string]>23000,]
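
A closely related form uses [[, which extracts the column as a plain vector, so the row index is an ordinary logical vector. The sketch below wraps that in a small function; the name rows_over is hypothetical and introduced only for illustration, and it assumes the column holds values that can be compared with the threshold.

```r
# Rebuild the question's data so the example is self-contained.
employ.data <- data.frame(
  employee  = c("John Doe", "Peter Gynn", "Jolie Hope"),
  salary    = c(21000, 23400, 26800),
  startdate = as.Date(c("2010-11-1", "2008-3-25", "2007-3-14"))
)
salary_string <- "salary"

# Hypothetical helper: keep the rows whose value in column `col`
# (given as a string) exceeds `threshold`.
rows_over <- function(df, col, threshold) {
  stopifnot(is.character(col), length(col) == 1, col %in% names(df))
  keep <- df[[col]] > threshold
  # which() drops NA comparisons instead of returning all-NA rows.
  df[which(keep), , drop = FALSE]
}

result <- rows_over(employ.data, salary_string, 23000)
print(result)
```

Because the column name is passed as an ordinary character argument, this form behaves the same inside functions and loops, where subset() and other non-standard evaluation tools tend to be harder to reason about.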
