R comparing the values ββin the previous lines
I have data like this:
Incident.ID.. = c(rep("INCFI0000029582",4), rep("INCFI0000029587",4))
date = c("2014-09-25 08:39:45", "2014-09-25 08:39:48", "2014-09-25 08:40:44", "2014-10-10 23:04:00", "2014-09-25 08:33:32", "2014-09-25 08:34:41", "2014-09-25 08:35:24", "2014-10-10 23:04:00")
status = c("assigned", "in.progress", "resolved", "closed", "assigned", "resolved", "resolved", "closed")
date.diff=c (3, 56, 1347796,0 ,69 ,43, 1348116, 0)
df = data.frame(Incident.ID..,date, status, date.diff, stringsAsFactors = FALSE)
df
Incident.ID.. date status date.diff
1 INCFI0000029582 2014-09-25 08:39:45 assigned 3
2 INCFI0000029582 2014-09-25 08:39:48 in.progress 56
3 INCFI0000029582 2014-09-25 08:40:44 resolved 1347796
4 INCFI0000029582 2014-10-10 23:04:00 closed 0
5 INCFI0000029587 2014-09-25 08:33:32 assigned 69
6 INCFI0000029587 2014-09-25 08:34:41 resolved 43
7 INCFI0000029587 2014-09-25 08:35:24 resolved 1348116
8 INCFI0000029587 2014-10-10 23:04:00 closed 0
And I would like to select only rows with "resolved" status for a specific incident .ID .. when it is not followed by the status of the same incident .ID .. "closed" (there can be only rows with "allowed" or only "closed", so that somehow Incident.ID .. should be the same when comparing).
For example, here in the data for this example, only this line will be selected:
6 INCFI0000029587 2014-09-25 08:34:41 resolved 43
So how can I do this?
+3
source to share
2 answers
Here's a simple approach using dplyr to group the data using the incident id and then filter (select rows) using the "lead" function to find the next row:
library(dplyr)
df %>%
group_by(Incident.ID..) %>%
filter(status == "resolved" & lead(status) != "closed") # you can add %>% ungroup() if required
#Source: local data frame [1 x 4]
#Groups: Incident.ID..
#
# Incident.ID.. date status date.diff
#1 INCFI0000029587 2014-09-25 08:34:41 resolved 43
+3
source to share
library(data.table) #using the development version of data.table
setDT(df)[, .SD[status == "resolved" & shift(status, type = "lead") != "closed"], by = Incident.ID..]
Incident.ID.. date status date.diff
1: INCFI0000029587 2014-09-25 08:34:41 resolved 43
PS updated as per @David comment
+3
source to share