Replacing specific values in a data frame with NA
Suppose I have a data.frame:
names <- c("John", "Mark", "Larry", "Will", "Kate", "Daria", "Tom")
gender <- c("M", "M", "M", "M", "F", "F", "M")
mark <- c(1, 2, 3, 1, 2, 3, 1)
df <- data.frame(names, gender, mark)
df
  names gender mark
1  John      M    1
2  Mark      M    2
3 Larry      M    3
4  Will      M    1
5  Kate      F    2
6 Daria      F    3
7   Tom      M    1
I can't figure out how to replace certain values with NA. For example, I want mark for Kate, Daria and Tom to be NA:
  names gender mark
1  John      M    1
2  Mark      M    2
3 Larry      M    3
4  Will      M    1
5  Kate      F   NA
6 Daria      F   NA
7   Tom      M   NA
2 answers
Try
df <- within(df, mark <- replace(mark, names %in% c('Kate', 'Daria', 'Tom'), NA))
df
#  names gender mark
#1  John      M    1
#2  Mark      M    2
#3 Larry      M    3
#4  Will      M    1
#5  Kate      F   NA
#6 Daria      F   NA
#7   Tom      M   NA
or
df$mark[df$names %in% c('Kate', 'Daria', 'Tom')] <- NA
or
is.na(df$mark) <- df$names %in% c('Kate', 'Daria', 'Tom')
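The three variants above should leave the same values in mark. A minimal sketch that checks this on the data from the question; the names target, df1, df2 and df3 are introduced here for illustration only:

```r
# Rebuild the example data from the question
names <- c("John", "Mark", "Larry", "Will", "Kate", "Daria", "Tom")
gender <- c("M", "M", "M", "M", "F", "F", "M")
mark <- c(1, 2, 3, 1, 2, 3, 1)
df <- data.frame(names, gender, mark)

# Names whose mark should become NA (illustrative helper variable)
target <- c("Kate", "Daria", "Tom")

# Variant 1: within() + replace(), returns a modified copy
df1 <- within(df, mark <- replace(mark, names %in% target, NA))

# Variant 2: assign NA through a logical index
df2 <- df
df2$mark[df2$names %in% target] <- NA

# Variant 3: is.na<- with a logical index
df3 <- df
is.na(df3$mark) <- df3$names %in% target

# Compare the resulting mark columns
same_12 <- identical(df1$mark, df2$mark)
same_23 <- identical(df2$mark, df3$mark)
```

Note that within() returns a new data frame, so its result has to be assigned back, while the other two variants modify df in place.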
is.na(df$mark[df$names %in% c('Kate', 'Daria', 'Tom')]) <- TRUE
This is a syntax that I find useful at times. In this case, though, it is not the fastest option: it has the highest median time in the benchmark below.
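To see what the nested form does, it may help to try it on a plain vector. As far as I understand R's replacement-function semantics, is.na(x[idx]) <- TRUE roughly means: extract the subset, mark it as NA, then write it back. That extra round trip is a plausible reason for the difference in speed, though this is not measured here. The vector x, the index idx and the temporary tmp are made up for this sketch:

```r
x <- c(10, 20, 30, 40)
idx <- c(FALSE, TRUE, FALSE, TRUE)

# Nested replacement form, as in the answer
y <- x
is.na(y[idx]) <- TRUE

# Rough manual expansion of the same operation
z <- x
tmp <- z[idx]        # extract the selected elements
is.na(tmp) <- TRUE   # mark all of them as NA
z[idx] <- tmp        # write them back

# Both routes should give the same vector
same <- identical(y, z)
```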
Benchmark
big.df1 <- data.frame(names = rep(names, 1e3),
gender = rep(gender, 1e3),
mark = rep(mark, 1e3))
big.df4 <- big.df3 <- big.df2 <- big.df1
library(microbenchmark)  # provides microbenchmark()
microbenchmark(
plafort = is.na(big.df1$mark[big.df1$names %in% c('Kate', 'Daria', 'Tom')]) <- TRUE,
akrun1 = within(big.df2, mark <- replace(mark, names %in% c('Kate', 'Daria', 'Tom'), NA)),
akrun2 = big.df3$mark[big.df3$names %in% c('Kate', 'Daria', 'Tom')] <- NA,
akrun3 = is.na(big.df4$mark) <- big.df4$names %in% c('Kate', 'Daria', 'Tom')
)
#
# Unit: microseconds
#     expr     min       lq     mean   median       uq       max neval
#  plafort 389.623 408.9660 484.6090 426.9275 540.8135   777.272   100
#   akrun1 287.381 319.3570 388.3125 357.2530 419.8220   661.214   100
#   akrun2 193.035 204.2860 627.6559 227.7735 327.8440 37325.194   100
#   akrun3 208.431 221.6555 274.1615 235.2740 287.3825  1110.445   100
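One possible variation on the benchmark, sketched under some assumptions: wrapping each approach in a small function lets you check that all four give the same result, and each call then works on its own copy of the data instead of modifying big.df1 to big.df4 in place. The function names f_nested, f_within, f_subset and f_isna are hypothetical, and the timing step only runs if the microbenchmark package is installed:

```r
names <- c("John", "Mark", "Larry", "Will", "Kate", "Daria", "Tom")
gender <- c("M", "M", "M", "M", "F", "F", "M")
mark <- c(1, 2, 3, 1, 2, 3, 1)
big <- data.frame(names = rep(names, 1e3),
                  gender = rep(gender, 1e3),
                  mark = rep(mark, 1e3))
target <- c("Kate", "Daria", "Tom")

# Each function modifies a local copy and returns it (hypothetical names)
f_nested <- function(d) { is.na(d$mark[d$names %in% target]) <- TRUE; d }
f_within <- function(d) within(d, mark <- replace(mark, names %in% target, NA))
f_subset <- function(d) { d$mark[d$names %in% target] <- NA; d }
f_isna   <- function(d) { is.na(d$mark) <- d$names %in% target; d }

# Check that all four approaches yield the same mark column
results <- lapply(list(f_nested, f_within, f_subset, f_isna),
                  function(f) f(big)$mark)
all_same <- all(vapply(results[-1], identical, logical(1), results[[1]]))

# Time the functions only when microbenchmark is available
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(
    nested = f_nested(big),
    within = f_within(big),
    subset = f_subset(big),
    isna   = f_isna(big)
  ))
}
```

Timings from this version include the cost of the function call and of copying the modified column, so they are not directly comparable to the numbers above.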