RHS filter with arules::apriori not working

I am using arules::apriori with a binary matrix and only want to generate rules that have one specific column on the RHS. The documentation seems to say this is possible, but it doesn't appear to work. It's easy enough to filter the rules post hoc, but I spend a lot of computation time generating all the rules in the first place.

Example:

library(arules)
data = data.frame(matrix(rbinom(10000,1, 0.6), nrow=1000))
for(i in 1:ncol(data)) data[,i] = as.factor(data[,i])
dsRules = as(data, "transactions")
rules = apriori(dsRules, 
    parameter=list(support = 0.1, minlen = 3, maxlen = 3, target= "rules", confidence = 0.7), 
    appearance = list(rhs = c("X1=1")))

      

rules now contains 3378 rules.

rules.sub = subset(rules, subset = (rhs %pin% "X1=1"))

      

rules.sub contains only 172 rules.

In my actual data, I go from millions of results to ~ 4000, which is a huge difference.



2 answers


Nsfy, there is an easier way to do this. You need to add default='lhs', as in appearance=list(rhs='X1=1', default='lhs'). This restricts the RHS to only X1=1, because every item not explicitly listed may then appear only in the LHS.
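For anyone who wants to try this end to end, here is a minimal sketch that reuses the simulated data from the question. The set.seed call and the variable name trans are my own additions for illustration, and the number of rules you get will depend on the random draw.

```r
library(arules)

set.seed(1)  # assumption: seed added only to make the sketch repeatable
data <- data.frame(matrix(rbinom(10000, 1, 0.6), nrow = 1000))
data[] <- lapply(data, as.factor)   # items become "X1=0", "X1=1", ...
trans <- as(data, "transactions")   # 'trans' is an illustrative name

rules <- apriori(trans,
    parameter = list(support = 0.1, confidence = 0.7,
                     minlen = 3, maxlen = 3, target = "rules"),
    # X1=1 may appear only in the RHS; all other items only in the LHS
    appearance = list(rhs = "X1=1", default = "lhs"))

# Check the consequents: the only label present should be {X1=1}
unique(labels(rhs(rules)))
```

Since the restriction is applied during mining rather than afterwards, the unwanted rules should never be generated, which is where the time saving on large data would come from.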





It turns out I was reading the documentation wrong. In case others stumble upon this:

The documentation describes rhs as a character vector giving the labels of items that can appear only in that place in rules/itemsets. So my code was saying that X1=1 can only appear in the RHS, not that the RHS can only contain X1=1.

To get around this, specify that all other items may appear only in the LHS, like this:

keep = names(data)
keep = keep[-1]  # drop the first feature (X1)
# both levels of every remaining feature are restricted to the LHS
keepnames = c(paste0(keep, "=1"), paste0(keep, "=0"))
rules = apriori(dsRules,
    parameter = list(support = 0.1, minlen = 3, maxlen = 3, target = "rules", confidence = 0.7),
    appearance = list(lhs = keepnames))

      
