Return all rows of a group if any row in the group contains a given value in R
I have nested data that looks like this:
ID Date Behavior
1 1 FALSE
1 2 TRUE
1 3 TRUE
2 1 TRUE
2 2 FALSE
3 1 TRUE
3 2 TRUE
I would like to return every row for any ID whose group contains at least one FALSE. I expect ID 1 and ID 2 to be returned with all of their rows (3 rows for ID 1 and 2 rows for ID 2).
EDIT: this is what I expect:
ID Date Behavior
1 1 FALSE
1 2 TRUE
1 3 TRUE
2 1 TRUE
2 2 FALSE
I am wondering whether this calls for a for loop or a while loop. Any help is appreciated.
Bonus points for Python code that mimics the R code!
Here's a possible approach using data.table (assuming df is your dataset):
library(data.table)
setDT(df)[, .SD[any(!Behavior)], ID] # you can also replace any(!Behavior) with !all(Behavior)
# ID Date Behavior
# 1: 1 1 FALSE
# 2: 1 2 TRUE
# 3: 1 3 TRUE
# 4: 2 1 TRUE
# 5: 2 2 FALSE
Edit: a more efficient solution from @Arun:
setDT(df)[, if (any(!Behavior)) .SD, ID]
Or a similar approach using dplyr:
library(dplyr)
df %>%
group_by(ID) %>%
filter(any(!Behavior))
# Source: local data table [5 x 3]
# Groups: ID
#
# ID Date Behavior
# 1 1 1 FALSE
# 2 1 2 TRUE
# 3 1 3 TRUE
# 4 2 1 TRUE
# 5 2 2 FALSE
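Since the question also asks for Python, here is a minimal sketch of the same "keep the whole group if any row is FALSE" logic using only the standard library. It assumes the data is held as a list of (ID, Date, Behavior) tuples; the function name groups_with_any_false is made up for illustration and is not part of any library.

```python
# Sample rows mirroring the question's data: (ID, Date, Behavior).
# Storing the data as a list of tuples is an assumption for this sketch.
rows = [
    (1, 1, False),
    (1, 2, True),
    (1, 3, True),
    (2, 1, True),
    (2, 2, False),
    (3, 1, True),
    (3, 2, True),
]


def groups_with_any_false(rows):
    """Return every row whose ID has at least one False Behavior."""
    # First pass: collect the IDs that have at least one False.
    ids_with_false = {id_ for id_, _date, behavior in rows if not behavior}
    # Second pass: keep all rows belonging to those IDs, in original order.
    return [row for row in rows if row[0] in ids_with_false]


result = groups_with_any_false(rows)
for row in result:
    print(*row)
```

As in the R answers, no explicit for or while loop over groups is needed: one pass finds the qualifying IDs and a second pass filters the rows. If the data lives in a pandas DataFrame instead, the closest analogue to the dplyr pipeline is probably df.groupby("ID").filter(lambda g: (~g["Behavior"]).any()), assuming Behavior is a boolean column.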