Counting dates that don't exist

I am working on a dataframe that contains 2 columns like this:

    time        frequency
  2014-01-06       13
  2014-01-07       30
  2014-01-09       56

      

My problem is that I am interested in counting the days on which the frequency is 0. The data is pulled using RPostgreSQL / RSQLite, so a date has no row at all when there is no value (i.e. a date only appears if its frequency is at least 1). Is there an easy way to count the dates that don't actually exist in the dataframe? For example, for the date range 2014-01-01 to 2014-01-10, I would like the count to be 7.

My only idea was brute force: create a separate data frame containing every date (note that this is 4+ years of dates, which would be a large table to begin with), merge the two data frames, and count the number of NA values. I'm sure there is a more elegant solution than that.

Thanks!



1 answer


Sort the dates and then look for gaps.

start <- as.Date("2014-01-01")
time <- as.Date(c("2014-01-06", "2014-01-07","2014-01-09"))
end <- as.Date("2014-01-10")

time <- sort(unique(time))

# Include start and end dates, so the missing dates are 1/1-1/5, 1/8, 1/10
d <- c(time[1]- start,
       diff(time) - 1,
       end - time[length(time)] )

d # [1] 5 0 1 1
sum(d) # 7 missing days

      



And now, which days are missing...

(gaps <- data.frame(gap_starts = c(start,time+1)[d>0],
                    gap_length = d[d>0]))
#   gap_starts gap_length
# 1 2014-01-01          5
# 2 2014-01-08          1
# 3 2014-01-10          1    

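for (g in seq_len(nrow(gaps))) {
  # Separate names so the outer `start` and base::length() are not shadowed
  gap_start  <- gaps$gap_starts[g]
  gap_length <- as.integer(gaps$gap_length[g])
  # Loop over plain day counts, then convert each one back to a Date
  for (i in seq(as.integer(gap_start), length.out = gap_length)) {
    print(as.Date(i, origin = "1970-01-01"))
  }
}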
# [1] "2014-01-01"
# [1] "2014-01-02"
# [1] "2014-01-03"
# [1] "2014-01-04"
# [1] "2014-01-05"
# [1] "2014-01-08"
# [1] "2014-01-10"
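
As a cross-check, here is a minimal sketch of the "brute force" idea from the question. It assumes the same `start`, `end` and `time` values as the example above; `all_days` and `missing_days` are names introduced here for illustration. Four years of daily dates is only about 1,500 elements, so building the full sequence should be cheap, and no merge is needed.

```r
# Assumed inputs, copied from the example above
start <- as.Date("2014-01-01")
end   <- as.Date("2014-01-10")
time  <- as.Date(c("2014-01-06", "2014-01-07", "2014-01-09"))

# Every calendar day in the range
all_days <- seq(start, end, by = "day")

# Days in the range that never appear in the data
# (%in% with logical indexing keeps the Date class)
missing_days <- all_days[!all_days %in% time]

length(missing_days)  # 7
format(missing_days)  # the missing dates as character strings
```

With a data frame like the one in the question, `time` would presumably be replaced by something like `as.Date(df$time)`, where `df` is a hypothetical name for that data frame.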

      
