Rhs filter with arules / apriori not working
I am using arules::apriori with a binary matrix and only want to create rules that have one specific column in the RHS. The documentation says this is possible, but it doesn't seem to work. It's easy enough to filter post hoc to get this, but I spend a lot of computational time calculating all the rules in the first place.
library(arules)
data = data.frame(matrix(rbinom(10000, 1, 0.6), nrow = 1000))
for (i in 1:ncol(data)) data[, i] = as.factor(data[, i])
dsRules = as(data, "transactions")
rules = apriori(dsRules,
                parameter = list(support = 0.1, minlen = 3, maxlen = 3,
                                 target = "rules", confidence = 0.7),
                appearance = list(rhs = c("X1=1")))
rules now contains 3378 rules
rules.sub = subset(rules, subset = (rhs %pin% "X1=1"))
rules.sub contains 172 rules
In my actual data, I go from millions of results to ~ 4000, which is a huge difference.
It turns out I was reading the documentation wrong. If others stumble upon this:
The documentation describes rhs as character vectors giving the labels of items that can only appear in that position in the rules / itemsets. So my code was saying that X1=1 can only appear in the rhs, not that the rhs can only contain X1=1.
To get around this, restrict all the other items to the lhs, like this:
keep = names(data)
keep = keep[-1]  # remove 1st feature
keepnames = c(paste0(keep, "=1"), paste0(keep, "=0"))
rules = apriori(dsRules,
                parameter = list(support = 0.1, minlen = 3, maxlen = 3,
                                 target = "rules", confidence = 0.7),
                appearance = list(lhs = keepnames))
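A possibly shorter way to express the same constraint is the default element of the appearance list, which sets where every item not named elsewhere may appear. The sketch below assumes an arules version that supports default = "lhs". The set.seed call and the verbose = FALSE control option are additions for repeatability and quieter output, not part of the original code. The number of rules returned depends on the random draw.

```r
library(arules)

set.seed(42)  # assumption: fixed seed so the run is repeatable

# Same toy data as above: 1000 rows, 10 binary columns stored as factors
data <- data.frame(matrix(rbinom(10000, 1, 0.6), nrow = 1000))
data[] <- lapply(data, as.factor)
dsRules <- as(data, "transactions")

# Allow only X1=1 on the rhs; every other item defaults to the lhs
rules <- apriori(dsRules,
                 parameter = list(support = 0.1, minlen = 3, maxlen = 3,
                                  target = "rules", confidence = 0.7),
                 appearance = list(rhs = "X1=1", default = "lhs"),
                 control = list(verbose = FALSE))

# Check: list the distinct consequents that were actually mined
print(unique(labels(rhs(rules))))
```

This avoids building the list of lhs labels by hand, which may be convenient when there are many columns.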