Determine which list objects are contained (subset) in another list in R

Thanks for your kind answers to my previous questions. I have two lists: list1 and list2. I would like to know whether each element of list1 is contained in each element of list2. For example:

> list1
[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 3

> list2
[[1]]
[1] 1 2 3

[[2]]
[1] 2 3

[[3]]
[1] 2 3

      

Here are my questions:

1.) How do I ask R to check whether one object is a subset of another object in a list? For example, I would like to check whether list2[[3]] = {2,3} is a subset of list1[[2]] = {2}. When I run list2[[3]] %in% list1[[2]], I get [1] TRUE FALSE. However, that is not what I want. I just want to check whether list2[[3]] is a subset of list1[[2]], i.e. is {2,3} \subset {2} in the set-theoretic sense? I don't want an element-wise check, which is what R seems to do with the %in% operator. Any suggestions?

2.) Is there a way to efficiently perform all pairwise subset comparisons (i.e., list1[[i]] subset list2[[j]] for all combinations i,j)? Would something like outer(list1, list2, func.subset) work once question 1 is answered? Thank you for your feedback!

+3




4 answers


setdiff compares unique values:

length(setdiff(5, 1:5)) == 0

Alternatively, all(x %in% y) will work well.
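As a minimal sketch of the all(x %in% y) idea applied to the data from the question (the helper name is_subset is an assumption made here for illustration, not something from the original answer):

```r
# Data from the question
list1 <- list(1, 2, 3)
list2 <- list(1:3, 2:3, 2:3)

# Hypothetical helper: TRUE if every element of x also occurs in y
is_subset <- function(x, y) all(x %in% y)

list2[[3]] %in% list1[[2]]         # element-wise result: TRUE FALSE
is_subset(list2[[3]], list1[[2]])  # FALSE, because 3 is not in {2}
is_subset(list1[[2]], list2[[3]])  # TRUE, because {2} lies inside {2, 3}
is_subset(integer(0), list1[[2]])  # TRUE, the empty set is a subset of any set
```

Wrapping %in% in all() collapses the element-wise answer into the single TRUE/FALSE that the question asks for.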



To do all the comparisons, something like this will work:

dt <- expand.grid(list1,list2)
dt$subset <- apply(dt,1, function(.v) all(.v[[1]] %in% .v[[2]]) )


  Var1    Var2 subset
1    1 1, 2, 3   TRUE
2    2 1, 2, 3   TRUE
3    3 1, 2, 3   TRUE
4    1    2, 3  FALSE
5    2    2, 3   TRUE
6    3    2, 3   TRUE
7    1    2, 3  FALSE
8    2    2, 3   TRUE
9    3    2, 3   TRUE

      

Note that expand.grid is not the fastest way to do this when dealing with a lot of data (DWin's solution is better in this regard), but it lets you quickly check visually whether it does what you want.

+5




You can use the package sets like this:

library(sets)
is.subset <- function(x, y) as.set(x) <= as.set(y)

outer(list1, list2, Vectorize(is.subset))
#      [,1]  [,2]  [,3]
# [1,] TRUE FALSE FALSE
# [2,] TRUE  TRUE  TRUE
# [3,] TRUE  TRUE  TRUE

      



@Michael's or @DWin's basic version would work just as well, but for the second part of your question, I would say outer is the way to go.
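If you would rather not depend on the sets package, the same outer idea can probably be sketched in base R by iterating over positions instead of list elements. The helper name subset_ij is hypothetical and only used for this illustration:

```r
# Data from the question
list1 <- list(1, 2, 3)
list2 <- list(1:3, 2:3, 2:3)

# Hypothetical helper working on positions i (in list1) and j (in list2)
subset_ij <- function(i, j) all(list1[[i]] %in% list2[[j]])

# outer() expects a vectorized function, hence the Vectorize() wrapper
m <- outer(seq_along(list1), seq_along(list2), Vectorize(subset_ij))
m  # m[i, j] is TRUE when list1[[i]] is a subset of list2[[j]]
```

With the question's data this should give the same matrix as the sets-based version: the first row is TRUE FALSE FALSE and the other two rows are all TRUE.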

+2




is.subset <- function(x,y) {length(setdiff(x,y)) == 0}

      

First, the combinations of list1 elements that are subsets of list2 elements:

> sapply(1:length(list1), function(i1) sapply(1:length(list2), 
                 function(i2) is.subset(list1[[i1]], list2[[i2]]) ) )
      [,1] [,2] [,3]
[1,]  TRUE TRUE TRUE
[2,] FALSE TRUE TRUE
[3,] FALSE TRUE TRUE

      

Then, unsurprisingly, none of the list2 elements (all of length > 1) is a subset of any list1 element (all of length 1):

> sapply(1:length(list1), function(i1) sapply(1:length(list2), 
                 function(i2) is.subset(list2[[i2]], list1[[i1]]) ) )
      [,1]  [,2]  [,3]
[1,] FALSE FALSE FALSE
[2,] FALSE FALSE FALSE
[3,] FALSE FALSE FALSE
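
One thing that is easy to miss with the nested sapply calls: the inner loop runs over list2, so rows index list2 and columns index list1. A small sketch that labels the dimensions makes this explicit (the dimnames and the use of seq_along and vapply are additions made here, not part of the original answer):

```r
# Data from the question and is.subset as defined in this answer
list1 <- list(1, 2, 3)
list2 <- list(1:3, 2:3, 2:3)
is.subset <- function(x, y) length(setdiff(x, y)) == 0

# Outer loop over list1 (columns), inner loop over list2 (rows)
res <- sapply(seq_along(list1), function(i1)
  vapply(seq_along(list2),
         function(i2) is.subset(list1[[i1]], list2[[i2]]),
         logical(1)))

# Label the dimensions so the orientation is visible when printed
dimnames(res) <- list(list2 = seq_along(list2), list1 = seq_along(list1))
res
t(res)  # transposed: rows now index list1, columns index list2
```

seq_along is used instead of 1:length(...) because it behaves sensibly when a list is empty.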

      

+1




Adding to @Michael's answer, here's a neat way to avoid the expand.grid clutter using the AsIs function I():

list2 <- list(1:3,2:3,2:3)
a <- data.frame(list1 = 1:3, I(list2))
a$subset <- apply(a, 1, function(.v) all(.v[[1]] %in% .v[[2]]) )

  list1   list2 subset
1     1 1, 2, 3   TRUE
2     2    2, 3   TRUE
3     3    2, 3   TRUE
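
Note that the data frame above has only three rows, so it compares list1[[i]] with list2[[i]] at the same position rather than all i,j combinations. If that position-wise check is all you need, a sketch with mapply gives the same three values without building a data frame (the variable name pairwise is just a placeholder chosen here):

```r
# Data from the question
list1 <- list(1, 2, 3)
list2 <- list(1:3, 2:3, 2:3)

# Compare list1[[i]] with list2[[i]] only, not every i, j pair
pairwise <- mapply(function(x, y) all(x %in% y), list1, list2)
pairwise
```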

      

0

