Use replace_na conditionally
I want to conditionally replace missing Revenue values on or before July 16, 2017 with zero using tidyverse.
My data
library(tidyverse)
library(lubridate)
df<- tribble(
~Date, ~Revenue,
"2017-07-01", 500,
"2017-07-02", 501,
"2017-07-03", 502,
"2017-07-04", 503,
"2017-07-05", 504,
"2017-07-06", 505,
"2017-07-07", 506,
"2017-07-08", 507,
"2017-07-09", 508,
"2017-07-10", 509,
"2017-07-11", 510,
"2017-07-12", NA,
"2017-07-13", NA,
"2017-07-14", NA,
"2017-07-15", NA,
"2017-07-16", NA,
"2017-07-17", NA,
"2017-07-18", NA,
"2017-07-19", NA,
"2017-07-20", NA
)
df$Date <- ymd(df$Date)
Date until which I want to conditionally replace NAs
max.date <- ymd("2017-07-16")
The result that I desire
# A tibble: 20 × 2
Date Revenue
<date> <dbl>
1 2017-07-01 500
2 2017-07-02 501
3 2017-07-03 502
4 2017-07-04 503
5 2017-07-05 504
6 2017-07-06 505
7 2017-07-07 506
8 2017-07-08 507
9 2017-07-09 508
10 2017-07-10 509
11 2017-07-11 510
12 2017-07-12 0
13 2017-07-13 0
14 2017-07-14 0
15 2017-07-15 0
16 2017-07-16 0
17 2017-07-17 NA
18 2017-07-18 NA
19 2017-07-19 NA
20 2017-07-20 NA
The only way I could work this out was to split the df into multiple parts, update the NAs, and then rbind the whole batch back together.
Can someone please help me do this efficiently using tidyverse?
We can use mutate on the "Revenue" column and replace NA with 0, using a logical condition that checks whether the element is NA and "Date" is less than or equal to "max.date"
df %>%
mutate(Revenue = replace(Revenue, is.na(Revenue) & Date <= max.date, 0))
# A tibble: 20 x 2
# Date Revenue
# <date> <dbl>
# 1 2017-07-01 500
# 2 2017-07-02 501
# 3 2017-07-03 502
# 4 2017-07-04 503
# 5 2017-07-05 504
# 6 2017-07-06 505
# 7 2017-07-07 506
# 8 2017-07-08 507
# 9 2017-07-09 508
#10 2017-07-10 509
#11 2017-07-11 510
#12 2017-07-12 0
#13 2017-07-13 0
#14 2017-07-14 0
#15 2017-07-15 0
#16 2017-07-16 0
#17 2017-07-17 NA
#18 2017-07-18 NA
#19 2017-07-19 NA
#20 2017-07-20 NA
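Since the question title asks about replace_na, here is a minimal sketch of the same idea written with tidyr::replace_na inside dplyr::if_else. The names df_small and cutoff are hypothetical stand-ins for the question's df and max.date, and the four rows are a made-up subset chosen so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Hypothetical four-row stand-in for the question's data
df_small <- tibble(
  Date    = as.Date(c("2017-07-11", "2017-07-15", "2017-07-16", "2017-07-17")),
  Revenue = c(510, NA, NA, NA)
)

# Hypothetical name for the cutoff date (max.date in the question)
cutoff <- as.Date("2017-07-16")

# On or before the cutoff, fill NA with 0; after it, keep Revenue unchanged
result <- df_small %>%
  mutate(Revenue = if_else(Date <= cutoff, replace_na(Revenue, 0), Revenue))

print(result)
```

Both branches of if_else are doubles, so the column type is preserved. Rows after the cutoff keep their NA, which matches the desired result in the question.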
It can also be done with data.table by specifying the logical condition in i and assigning (:=) 0 to the "Revenue" column
library(data.table)
setDT(df)[is.na(Revenue) & Date <= max.date, Revenue := 0]
Or using base R
df$Revenue[is.na(df$Revenue) & df$Date <= max.date] <- 0