Deleting data between specific dates in R

I can use the polygon function in R to mark on the plot which days I would like to exclude from my data:

require(gamair)
data(cairo)
data1 <- within(cairo, Date <- as.Date(paste(year, month, day.of.month, sep = "-")))
data1 <- data1[,c('Date','temp')]
plot(data1)
dd <- data.frame(year = seq(1995,2005),
                 istart = c(341,355,356,370,371,380,360,400,378,360,360),
                 iend = c(450,400,380,390,420,410,425,450,421,430,400))

dates <- paste(dd[,1], '-01', '-01', sep = '')
istart <- as.Date(dates) + dd[,2]
iend <- as.Date(dates) + dd[,3]

for (i in 1:length(iend)){
  polygon(c(istart[i],iend[i],iend[i],istart[i]),c(0,0,110,110),
          col=rgb(1, 0, 0,0.5), border=NA)
}

      


Now I am wondering whether it is possible to remove these highlighted periods from data1 to generate a new time series, data2, that does not include the highlighted values.

I can remove the individual days specified in istart and iend, but I cannot remove the range of values between those dates. How can I do that?



2 answers


You can try the following code:

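# ret[k] becomes TRUE when row k of data1 falls inside any interval
ret <- rep(FALSE, NROW(data1))
for (i in seq_along(istart)) {
    ret <- ret | ((data1$Date >= istart[i]) & (data1$Date <= iend[i]))
}
# keep only the rows that lie outside every interval
data2 <- data1[!ret, ]
plot(data2, pch = ".")
# redraw the shaded intervals to confirm that they are now empty
for (i in seq_along(iend)) {
  polygon(c(istart[i],iend[i],iend[i],istart[i]),c(0,0,110,110),
          col=rgb(1, 0, 0,0.5), border=NA)
}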

      

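So, for each pair of istart and iend values, the loop builds a logical vector marking the rows of data1 whose Date falls inside that interval (endpoints included) and ORs it into ret. At the end, ret is TRUE for every row that lies in at least one interval. Then all you have to do is select the rows of data1 that are not in any of these intervals.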



(I changed the plotting symbol to pch = "." so that it is easier to see that every value inside the highlighted intervals has been filtered out.)
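
The same masking idea can be tried on a small made-up data set, which makes it easy to check by eye which rows survive. This is only a sketch: the names toy, starts, ends, in_any and kept are hypothetical and not part of the original question, and the per-interval masks are combined with Reduce instead of an explicit loop.

```r
# Toy data: ten consecutive days with made-up values (not the cairo data)
toy <- data.frame(Date = as.Date("2000-01-01") + 0:9, temp = 1:10)

# Two hypothetical intervals to drop; both endpoints are treated as inclusive
starts <- as.Date(c("2000-01-02", "2000-01-07"))
ends   <- as.Date(c("2000-01-03", "2000-01-08"))

# One logical vector per interval: TRUE where the row falls inside it
masks <- lapply(seq_along(starts), function(i) {
  toy$Date >= starts[i] & toy$Date <= ends[i]
})

# OR the per-interval masks together: TRUE if the row is in any interval
in_any <- Reduce(`|`, masks)

# Keep only the rows outside every interval
kept <- toy[!in_any, ]
print(kept)
```

Here the rows for 2 to 3 January and 7 to 8 January are dropped, leaving six of the ten rows.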



Using mapply, you can define the vector of dates that you want to exclude from your data.

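# Build one daily sequence per interval; SIMPLIFY = FALSE keeps them as a list.
# do.call(c, ...) is used instead of unlist(), which would drop the Date class.
exclude <- do.call(c, mapply(function(s, e) seq(s, e, by = "day"),
                             istart, iend, SIMPLIFY = FALSE))
data2 <- data1[!(data1$Date %in% exclude), ]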
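
One detail worth checking when flattening a list of date sequences is whether the result is still a Date vector. The sketch below uses hypothetical intervals (starts, ends and the other names are made up for illustration) to compare unlist() with do.call(c, ...). unlist() returns plain numbers (days since 1970-01-01), so a later comparison with a Date column depends on implicit coercion; keeping the Date class is the safer option.

```r
# Hypothetical intervals, chosen only for illustration
starts <- as.Date(c("2000-01-02", "2000-01-07"))
ends   <- as.Date(c("2000-01-03", "2000-01-09"))

# One daily Date sequence per interval, kept as a list
ranges <- lapply(seq_along(starts), function(i) {
  seq(starts[i], ends[i], by = "day")
})

flat_num  <- unlist(ranges)      # class attribute is dropped: plain numbers
flat_date <- do.call(c, ranges)  # c() on Date objects keeps the Date class

inherits(flat_num, "Date")   # FALSE
inherits(flat_date, "Date")  # TRUE
print(flat_date)
```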

      



Also, there is a shorter way of defining your istart and iend vectors:

istart = seq(as.Date("1995-01-01"), as.Date("2005-01-01"), "years") + c(341,355,356,370,371,380,360,400,378,360,360)
iend = seq(as.Date("1995-01-01"), as.Date("2005-01-01"), "years") + c(450,400,380,390,420,410,425,450,421,430,400)

      
