Unique frequency
I'm trying to count the unique ids found for each Gene and list them together with the matching pValue values.
df1 <- read.table(text="
Gene id Seg.mean pValue CNA
Nfib 8410 0.3108 1.381913 gain
Mycl 8410 2.7320 1.182842 gain
Mycl 8410 2.7320 1.846275 gain
Nfib 8411 0.5920 1.381913 gain
Nfib 8411 1.3090 1.381913 gain
Mycl 8412 1.6150 5.765442 gain
Mycl 8411 1.6150 1.846275 gain
",header=TRUE)
Expected output:
Gene  ID              Freq. of id  pValue
Nfib  8410,8411       2            1.381913
Mycl  8410,8411,8412  3            1.182842,1.846275,5.765442
3 answers
Solution:
library(dplyr)
df1 %>%
  group_by(Gene) %>%
  summarise(ID   = paste0(unique(id), collapse = ", "),
            pval = paste0(unique(pValue), collapse = ", "),
            n    = n_distinct(id))
Result:
  Gene  ID                pval                          n
1 Mycl  8410, 8412, 8411  1.182842, 1.846275, 5.765442  3
2 Nfib  8410, 8411        1.381913                      2
Breakdown:
- We want to aggregate by Gene (the unit of analysis), hence group_by(Gene).
- Then create the new variables by collapsing the unique values of each column with paste0(unique(var), collapse=", "). This is applied within each Gene.
- Finally, count the number of distinct ids with n_distinct(id), again within each Gene.
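If you would rather not depend on dplyr, the same three steps can be sketched in base R. This is only a rough equivalent, not the method from the answer above: collapse_unique is a hypothetical helper name introduced here, and the values are sorted before pasting (an assumption on my part) so that the ids come out in the order shown in the expected output.

```r
# Rebuild the example data from the question.
df1 <- read.table(text = "
Gene id Seg.mean pValue CNA
Nfib 8410 0.3108 1.381913 gain
Mycl 8410 2.7320 1.182842 gain
Mycl 8410 2.7320 1.846275 gain
Nfib 8411 0.5920 1.381913 gain
Nfib 8411 1.3090 1.381913 gain
Mycl 8412 1.6150 5.765442 gain
Mycl 8411 1.6150 1.846275 gain
", header = TRUE)

# Hypothetical helper: paste the sorted unique values of a vector
# into a single comma-separated string.
collapse_unique <- function(x) paste(sort(unique(x)), collapse = ",")

# Step 1: split the rows by Gene (the unit of analysis).
by_gene <- split(df1, df1$Gene)

# Steps 2 and 3: for each gene, collapse the unique ids and pValues
# and count the distinct ids, then stack the one-row results.
res <- do.call(rbind, lapply(names(by_gene), function(g) {
  d <- by_gene[[g]]
  data.frame(Gene   = g,
             ID     = collapse_unique(d$id),
             Freq   = length(unique(d$id)),
             pValue = collapse_unique(d$pValue))
}))

print(res)
```

Drop the sort() call if you want the values in order of first appearance, as in the dplyr result.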