Conditional statement in dplyr / tidyverse functions to avoid comparisons between the same factor levels
I have a data frame:
data = read.table(text = "region plot species
1 1A A_B
1 1A A_B
1 1B B_C
1 1C A_B
1 1D C_D
2 2A B_C
2 2A B_C
2 2A E_F
2 2B B_C
2 2B E_F
2 2C E_F
2 2D B_C
3 3A A_B
3 3B A_B", stringsAsFactors = FALSE, header = TRUE)
I want to compare each pair of plot levels within a region and count the unique species the two plots have in common. However, I don't want to compare a plot with itself (i.e., exclude 1A_1A, 1B_1B, 2C_2C, etc.). The result for this example should look like this:
output<-
region plot freq
1 1A_1B 0
1 1A_1C 1
1 1A_1D 0
1 1B_1C 0
1 1B_1D 0
1 1C_1D 0
2 2A_2B 2
2 2A_2C 1
2 2A_2D 1
2 2B_2C 1
2 2B_2D 1
2 2C_2D 0
3 3A_3B 1
I adapted the following code from @HubertL's answer to "Convert Matrix List to One Data Frame" and tried to include an appropriate if/else statement to satisfy this condition:
library(tidyverse)
data %>% group_by(region, species) %>%
filter(n() > 1) %>%
summarize(y = list(combn(plot, 2, paste, collapse="_"))) %>%
unnest %>%
group_by(region, y) %>%
summarize(ifelse(plot[i] = plot[i], freq =
length(unique((species),)
1 answer
You can filter out duplicates by adding filter(!duplicated(plot)):
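data %>%
  group_by(region, species) %>%
  filter(!duplicated(plot)) %>%   # keep each plot once per region/species
  filter(n() > 1) %>%             # a species must occur in at least two plots
  summarize(y = list(combn(plot, 2, paste, collapse = "_"))) %>%
  unnest(y) %>%                   # one row per plot pair
  group_by(region, y) %>%
  summarize(freq = n())           # number of species shared by each pair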
region y freq
<int> <chr> <int>
1 1 1A_1C 1
2 2 2A_2B 2
3 2 2A_2C 1
4 2 2A_2D 1
5 2 2B_2C 1
6 2 2B_2D 1
7 3 3A_3B 1
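Note that this result only lists pairs that share at least one species, so the zero-frequency pairs from the desired output (1A_1B, 2C_2D, and so on) are missing. If you need those rows as well, one possible approach is sketched below in base R rather than dplyr. It builds every plot pair per region first and then counts the shared species with intersect(). The helper name count_shared is hypothetical, and the sketch assumes the three columns are named region, plot and species as in the question.

```r
data <- read.table(text = "region plot species
1 1A A_B
1 1A A_B
1 1B B_C
1 1C A_B
1 1D C_D
2 2A B_C
2 2A B_C
2 2A E_F
2 2B B_C
2 2B E_F
2 2C E_F
2 2D B_C
3 3A A_B
3 3B A_B", stringsAsFactors = FALSE, header = TRUE)

# Hypothetical helper: all plot pairs of one region with their shared-species count
count_shared <- function(d) {
  sp <- split(d$species, d$plot)           # species vectors keyed by plot label
  plots <- names(sp)
  if (length(plots) < 2) return(NULL)      # a single plot has nothing to pair with
  pairs <- combn(plots, 2)                 # 2 x k matrix, never pairs a plot with itself
  data.frame(
    region = d$region[1],
    plot = paste(pairs[1, ], pairs[2, ], sep = "_"),
    # intersect() drops duplicates, so repeated species are counted once
    freq = apply(pairs, 2, function(p) length(intersect(sp[[p[1]]], sp[[p[2]]]))),
    stringsAsFactors = FALSE
  )
}

# Apply the helper per region and stack the pieces into one data frame
result <- do.call(rbind, lapply(split(data, data$region), count_shared))
rownames(result) <- NULL
result
```

Because combn() only generates combinations of two distinct plots, same-plot comparisons such as 1A_1A never appear, so no if/else condition is needed.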