Get or subset the first 5 minutes of each day of data from xts
I would like to subset the first 5 minutes of time series data for each day from the detailed data. However, the first 5 minutes do not occur at the same time every day, so something like xtsobj["T09:00/T09:05"]
will not work, because the start of the first 5 minutes changes. That is, sometimes the data starts at 9:20 am or some other random time in the morning instead of 9 am.
So far, I have been able to subset the first minute of each day using something like:
k <- diff(index(xtsobj))> 10000
xtsobj[c(1, which(k)+1)]
i.e. by finding gaps in the data greater than 10,000 seconds. Going from that to the first 5 minutes of each day is harder because the data is not always evenly spaced. That is, there can be anywhere from 2 to 5 rows between the first minute and the fifth minute, so using something like:
xtsobj[c(1, which(k)+6)]
and then binding the results together
is not always accurate. I was hoping that a function like "first" could be used, but I didn't know how to apply it across multiple days; maybe that would be the optimal solution. Is there a better way to get this information?
Many thanks to the stackoverflow community in advance.
split(xtsobj, "days")
will create a list with an xts object for each day.
Then you can apply head to each day:
lapply(split(xtsobj, "days"), head, 5)
or, more generally,
lapply(split(xtsobj, "days"), function(x) {
x[1:5, ]
})
Finally, you can rbind the days back together if you like.
do.call(rbind, lapply(split(xtsobj, "days"), function(x) x[1:5, ]))
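Note that head(x, 5) and x[1:5, ] pick the first five rows of each day, which only matches the first five minutes when there is exactly one row per minute. Since the question says the spacing is irregular, selecting by elapsed time may be closer to what is wanted. The following is a minimal sketch under stated assumptions: first_window is a hypothetical helper name, the toy index is made up for illustration, and the window is assumed to be half-open (strictly less than 300 seconds after the first observation of each day).

```r
library(xts)

# Hypothetical helper: keep rows whose timestamp is less than `secs`
# seconds after the first timestamp of the object it is given.
first_window <- function(x, secs = 300) {
  x[index(x) < index(x)[1] + secs]
}

# Toy data (assumed): two days starting at different times,
# with irregular spacing between observations.
idx <- as.POSIXct(c("2012-01-02 09:20:00", "2012-01-02 09:22:00",
                    "2012-01-02 09:24:30", "2012-01-02 09:26:00",
                    "2012-01-03 09:00:00", "2012-01-03 09:04:00",
                    "2012-01-03 09:05:00"), tz = "UTC")
xtsobj <- xts(seq_along(idx), order.by = idx, tzone = "UTC")

# Split by day, apply the time window to each day, and bind back together.
first5 <- do.call(rbind, lapply(split(xtsobj, "days"), first_window))
first5
```

Because the cutoff is computed from each day's own first timestamp, this should not depend on the day starting at a fixed clock time or on how many rows fall inside the window.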
How about using the lubridate package? First find the starting point of each day, which you say changes randomly, and then use the function minutes.
So it would be something like:
five_minutes_after = starting_point_each_day + minutes(5)
Then you can use normal xts subsetting by doing something like:
five_min_period = paste(starting_point_each_day, five_minutes_after, sep='/')
xtsobj[five_min_period]
Edit:
@Joshua I think it works, have a look at this example:
library(xts)
library(lubridate)
# 20 observations, one per minute, ending one minute ago
x <- xts(cumsum(rnorm(20, 0, 0.1)), Sys.time() - seq(60, 1200, 60))
starting_point_each_day = index(x[1])
five_minutes_after = index(x[1]) + minutes(5)
# build a "start/end" range string for xts subsetting
five_min_period = paste(starting_point_each_day, five_minutes_after, sep='/')
x[five_min_period]
In my previous example I made a mistake: I put five_min_period in quotes. Is that what you were pointing out, Joshua? Also, maybe the starting point is not needed at all, just:
until5min=paste('/',five_minutes_after,sep="")
x[until5min]
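The example above covers a single day. One way to extend it to many days, sketched here rather than taken from the answer, is to build the range string separately for each day after splitting. The helper name window_by_range and the toy data are hypothetical. The timestamps are formatted explicitly because the default formatting of a midnight timestamp can drop the time part. Also note that an xts range string includes its end point, so an observation exactly 5 minutes after the start is kept.

```r
library(xts)
library(lubridate)

# Hypothetical helper: subset one day's data with a "start/end" range string.
window_by_range <- function(x, mins = 5) {
  fmt <- "%Y-%m-%d %H:%M:%S"
  start <- index(x)[1]
  end <- start + minutes(mins)
  x[paste(format(start, fmt), format(end, fmt), sep = "/")]
}

# Toy data (assumed): two days starting at different times.
idx <- as.POSIXct(c("2012-01-02 09:20:00", "2012-01-02 09:22:00",
                    "2012-01-02 09:24:30", "2012-01-02 09:26:00",
                    "2012-01-03 09:00:00", "2012-01-03 09:04:00",
                    "2012-01-03 09:05:00"), tz = "UTC")
xtsobj <- xts(seq_along(idx), order.by = idx, tzone = "UTC")

# Apply the range subset to each day and bind the pieces back together.
first5_range <- do.call(rbind, lapply(split(xtsobj, "days"), window_by_range))
first5_range
```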