Adding missing times

Question

I have a table that gives the date and time I received data and how much data arrived in each thirty-minute interval. My problem is that some half-hour intervals are missing, and I want to insert those rows with a Count of 0.

Here's an example of what the table looks like:

Date-Time           Count
2017-07-13 17:30:00 111
2017-07-13 18:00:00 85
2017-07-13 20:00:00 127
2017-07-13 20:30:00 515

I want it to have 18:30:00 0, etc.

I'm not sure how to go about this; if anyone has an idea, that would be great.

Here's what I tried to do:

starttime <- df[1,`Date-Time`]

for (i in df){
  time <- starttime + 30
  new_dt$datetime <- ifelse(df[i] = time, df$datetime, time)
  new_dt$count <- ifelse(df[i] = time, df$count, 0)
}

r

Davie D Jul 25 17 at 17:58

3 answers

Andrew Brēza · Answer 1 · 2017-07-25T18:24:45+0000

Let's create some dummy data first.

library(tidyverse)
library(lubridate)

time_series <- tibble(
  DateTime = c(
    "2017-07-13 17:30:00",
    "2017-07-13 18:00:00",
    "2017-07-13 20:00:00",
    "2017-07-13 20:30:00"
  ),
  Count = c(111, 85, 127, 515)
) %>%
  mutate(DateTime = ymd_hms(DateTime))

Now, let's find the earliest and latest timestamps in the data.

from <- min(time_series$DateTime)
to <- max(time_series$DateTime)

Finally, create a sequence of date-times running from the earliest timestamp (from) to the latest (to) in 30-minute intervals. We then left-join the existing data onto this sequence and replace any missing Count values with zero.

tibble(DateTime = seq(from = from, to = to, by = 1800)) %>%
  left_join(time_series) %>%
  mutate(Count = ifelse(is.na(Count), 0, Count))
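
A note on by = 1800: for date-time values, seq() reads a numeric step as seconds, so 1800 is the same half-hour step as the string "30 min". A minimal base-R check, assuming UTC timestamps (the time zone is my choice, not something stated in the question):

```r
# Endpoints taken from the example data; tz = "UTC" is an assumption
from <- as.POSIXct("2017-07-13 17:30:00", tz = "UTC")
to   <- as.POSIXct("2017-07-13 20:30:00", tz = "UTC")

by_seconds <- seq(from = from, to = to, by = 1800)      # numeric step, in seconds
by_string  <- seq(from = from, to = to, by = "30 min")  # same step, spelled out

length(by_seconds)            # 7 half-hour slots, endpoints included
all(by_seconds == by_string)  # TRUE
```

Either spelling should work in the pipeline above; the string form just makes the interval easier to read.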

Dave gruenewald · Answer 2 · 2017-07-25T19:03:59+0000

While this works, I think your best bet is to use the padr package:

library(dplyr)
library(padr)

pad_df <- df %>% 
  pad(interval = '30 mins')

If you prefer 0 to NA, just:

pad_df[is.na(pad_df)] <- 0
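
One caveat: the line above zeroes every NA in the data frame, not only the ones in the rows padr added. If the real table has other columns with genuine missing values, it may be safer to touch Count alone. A small base-R sketch; the stand-in pad_df and its Note column are made up for illustration:

```r
# Hypothetical stand-in for a padded result, with an extra column
pad_df <- data.frame(
  Date.Time = as.POSIXct("2017-07-13 18:00:00", tz = "UTC") + 1800 * 0:2,
  Count     = c(85, NA, NA),
  Note      = c("ok", NA, "late"),
  stringsAsFactors = FALSE
)

# Replace NA in Count only; Note keeps its NA
pad_df$Count[is.na(pad_df$Count)] <- 0
pad_df
```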

The padr package also has a function, thicken, for when you need to move quickly and smoothly to a lower frequency.
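
For comparison, here is what moving to a lower frequency looks like without padr. This is not thicken itself, just a base-R sketch of the same idea (rolling the half-hourly counts up to hourly totals), with UTC assumed for the timestamps:

```r
# Example data from the question; tz = "UTC" is an assumption
dat <- data.frame(
  Date.Time = as.POSIXct(
    c("2017-07-13 17:30:00", "2017-07-13 18:00:00",
      "2017-07-13 20:00:00", "2017-07-13 20:30:00"),
    tz = "UTC"
  ),
  Count = c(111, 85, 127, 515)
)

# Truncate each timestamp to the start of its hour, then sum per hour
dat$Hour <- as.POSIXct(trunc(dat$Date.Time, units = "hours"))
hourly <- aggregate(Count ~ Hour, data = dat, FUN = sum)
hourly
```

Hours with no data at all (19:00 here) simply do not appear in the result, so padding would still be needed afterwards if you want them as zeros.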

See the padr vignette.

Rui barradas · Answer 3 · 2017-07-25T18:26:26+0000

First of all, I changed the column name Date-Time to Date.Time.

#dput(dat)
dat <-
structure(list(Date.Time = structure(c(1499963400, 1499965200, 
1499972400, 1499974200), class = c("POSIXct", "POSIXt"), tzone = ""), 
    Count = c(111L, 85L, 127L, 515L)), .Names = c("Date.Time", 
"Count"), row.names = c(NA, -4L), class = "data.frame")

Now the trick is to use seq.POSIXct to create a df with only one column, then merge the two dfs.

tmp <- data.frame(
    Date.Time = seq(min(dat$Date.Time), max(dat$Date.Time), by = "30 min"))
tmp
            Date.Time
1 2017-07-13 17:30:00
2 2017-07-13 18:00:00
3 2017-07-13 18:30:00
4 2017-07-13 19:00:00
5 2017-07-13 19:30:00
6 2017-07-13 20:00:00
7 2017-07-13 20:30:00

merge(dat, tmp, all.y = TRUE)
            Date.Time Count
1 2017-07-13 17:30:00   111
2 2017-07-13 18:00:00    85
3 2017-07-13 18:30:00    NA
4 2017-07-13 19:00:00    NA
5 2017-07-13 19:30:00    NA
6 2017-07-13 20:00:00   127
7 2017-07-13 20:30:00   515
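
The merged result still has NA where the question asked for 0. A minimal follow-up in base R, rebuilding the example with UTC timestamps (my assumption; the dput above uses the local time zone):

```r
# Example data from the question; tz = "UTC" is an assumption
dat <- data.frame(
  Date.Time = as.POSIXct(
    c("2017-07-13 17:30:00", "2017-07-13 18:00:00",
      "2017-07-13 20:00:00", "2017-07-13 20:30:00"),
    tz = "UTC"
  ),
  Count = c(111L, 85L, 127L, 515L)
)

# Full half-hourly grid between the first and last timestamp
tmp <- data.frame(
  Date.Time = seq(min(dat$Date.Time), max(dat$Date.Time), by = "30 min")
)

# Keep every grid row; slots absent from dat come back with Count = NA
res <- merge(dat, tmp, by = "Date.Time", all.y = TRUE)

# Turn the padded NAs into zeros
res$Count[is.na(res$Count)] <- 0L
res
```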

If you want, you can rm(tmp) afterwards.
