Return all rows of a group if any row in the group contains a given value in R

I have nested data that looks like this:

ID  Date Behavior
1   1    FALSE
1   2    TRUE
1   3    TRUE
2   1    TRUE
2   2    FALSE
3   1    TRUE
3   2    TRUE

      

I would like to return all rows for every ID that has at least one occurrence of FALSE in Behavior. I expect ID 1 and ID 2 to be returned with all of their rows (3 rows for ID 1 and 2 rows for ID 2).

EDIT: this is what I expect:

ID  Date Behavior
1   1    FALSE
1   2    TRUE
1   3    TRUE
2   1    TRUE
2   2    FALSE

      

I am wondering whether this calls for a for loop or a while loop; any help is appreciated.

Bonus points for Python code that mimics the R code!



2 answers


Here's a possible data.table approach (assuming df is your dataset):

library(data.table)
setDT(df)[, .SD[any(!Behavior)], ID] # you can also replace any(!Behavior) with !all(Behavior)
#    ID Date Behavior
# 1:  1    1    FALSE
# 2:  1    2     TRUE
# 3:  1    3     TRUE
# 4:  2    1     TRUE
# 5:  2    2    FALSE

      

Edit: a more efficient solution from @Arun:

setDT(df)[, if (any(!Behavior)) .SD, ID]

      




Or a similar approach with dplyr:

library(dplyr)
df %>%
  group_by(ID) %>%
  filter(any(!Behavior))

# Source: local data table [5 x 3]
# Groups: ID
# 
#   ID Date Behavior
# 1  1    1    FALSE
# 2  1    2     TRUE
# 3  1    3     TRUE
# 4  2    1     TRUE
# 5  2    2    FALSE

      



Here is a base R approach (assuming your data is in a data.frame named dd):



dd[with(dd, ave(!Behavior, ID, FUN=any)), ]
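
The question also asks for Python code that mimics the R code. A minimal sketch in plain Python follows; it is not from the original answers. It assumes the data is held as a list of (ID, Date, Behavior) tuples, and the function name keep_groups_with_false is made up for illustration. The idea is the same as any(!Behavior) per ID: first find the IDs that have at least one FALSE, then keep every row belonging to those IDs.

```python
# Sample data copied from the question, as (ID, Date, Behavior) tuples.
# Storing the data this way is an assumption made for this sketch.
rows = [
    (1, 1, False),
    (1, 2, True),
    (1, 3, True),
    (2, 1, True),
    (2, 2, False),
    (3, 1, True),
    (3, 2, True),
]


def keep_groups_with_false(rows):
    """Return every row whose ID has at least one Behavior that is False."""
    # Pass 1: collect the IDs with at least one False,
    # the counterpart of any(!Behavior) evaluated per ID.
    ids_with_false = {row_id for row_id, _date, behavior in rows if not behavior}
    # Pass 2: keep all rows of those IDs, preserving the original order.
    return [row for row in rows if row[0] in ids_with_false]


result = keep_groups_with_false(rows)
for row in result:
    print(row)
```

No explicit for or while loop over groups is needed; two passes over the rows are enough. If the data lives in a pandas DataFrame instead, a grouped filter along the lines of df.groupby("ID").filter(lambda g: (~g["Behavior"]).any()) should be the closer analogue of the dplyr answer, though that variant is not tested here.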

      
