How to extract information from apriori results in R (association rules)
I am mining association rules in R and want to extract my results so that I can generate reports. My results look like this:
> inspect(rules[1:3])
lhs rhs support confidence lift
1 {apples} => {oranges} 0.00029 0.24 4.4
2 {apples} => {pears} 0.00022 0.18 45.6
3 {apples} => {pineapples} 0.00014 0.12 1.8
How do I extract the "rhs" here, i.e. a vector containing oranges, pears and pineapples?
Next, how do I extract information from a summary, i.e.
> summary(rules)
The data type is "S4". I have no problem extracting values when the output is a list, etc., but how do I do the equivalent here?
set of 3 rules
rule length distribution (lhs + rhs):sizes
2
3
Min. 1st Qu. Median Mean 3rd Qu. Max.
2 2 2 2 2 2
I want to extract the "3" from "set of 3 rules".
I've gotten as far as using "@" (see "What does the @ symbol mean in R?").
But once I use this, how can I turn my results into a vector, i.e. so that
inspect(rules@rhs)
1 {oranges}
2 {pears}
3 {pineapples}
becomes a character vector of length 3?
inspect returns nothing; it just prints its output. When this happens, you can use the function capture.output
if you want to store the output as strings. For example, to get the rhs:
library(arules)
data(Adult)
rules <- apriori(Adult, parameter = list(support = 0.4))
inspect(rules[1:3])
# lhs rhs support confidence lift
# 1 {} => {race=White} 0.8550428 0.8550428 1
# 2 {} => {native-country=United-States} 0.8974243 0.8974243 1
# 3 {} => {capital-gain=None} 0.9173867 0.9173867 1
## Capture it, and extract rhs
out <- capture.output(inspect(rules[1:3]))
gsub("[^{]+\\{([^}]*)\\}[^{]+\\{([^}]*)\\}.*", "\\2", out)[-1]
# [1] "race=White" "native-country=United-States"
# [3] "capital-gain=None"
However, it looks like you can simply access this information from rules using the function rhs:
str(rhs(rules)@itemInfo)
# 'data.frame': 115 obs. of 3 variables:
# $ labels :Class 'AsIs' chr [1:115] "age=Young" "age=Middle-aged" "age=Senior" "age=Old" ...
# $ variables: Factor w/ 13 levels "age","capital-gain",..: 1 1 1 1 13 13 13 13 13 13 ...
# $ levels : Factor w/ 112 levels "10th","11th",..: 111 63 92 69 30 54 65 82 90 91 ...
In general, use str to see what an object is made of so you can decide how to extract its components.
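As a rough sketch of the accessor route, the snippet below pulls out the pieces the question asks about without parsing printed output. It assumes the arules package and its Adult data set, as in the code above; the variable names (n_rules, rhs_labels, rhs_items, measures) are made up for illustration, and verbose = FALSE is only there to keep the console quiet.

```r
library(arules)
data("Adult")

# Same mining call as above, with progress output switched off.
rules <- apriori(Adult,
                 parameter = list(support = 0.4),
                 control = list(verbose = FALSE))

# The number reported as "set of N rules" in summary(rules).
n_rules <- length(rules)

# Right-hand sides as strings, braces included, e.g. "{race=White}".
rhs_labels <- labels(rhs(rules))

# Right-hand sides as plain item names. apriori produces rules with a single
# item on the rhs, so this should yield one element per rule.
rhs_items <- unlist(as(rhs(rules), "list"))

# Support, confidence, lift, etc. as an ordinary data frame.
measures <- quality(rules)
```

From here, rhs_items and measures are regular R objects that can be written into a report with the usual tools.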
To answer your second question: length(rules)
Now about your first question:
library("arules")
data("Adult")
## Mine association rules.
rules <- apriori(Adult,parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
summary(rules)
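n <- length(rules)           # number of rules (the answer to the second question)
everything <- labels(rules)  # one string per rule, of the form "{lhs} => {rhs}"
# print(everything)
# Split each label on "=> " and keep every second piece, i.e. the rhs.
rhs_strings <- unlist(strsplit(everything, "=> "))[seq(2, 2 * n, by = 2)]
print(rhs_strings)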
Feel free to ask if you have a question; the code may be a little terse :-)
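To make the splitting step easier to follow, here is a small base R sketch on hand-written labels. The three strings mimic the format of the rule labels from the question and are not real output; the names parts, rhs_strings and rhs_items are illustrative only. It also strips the curly braces, which the code above leaves in place.

```r
# Hand-written labels in the "{lhs} => {rhs}" format (illustrative only).
everything <- c("{apples} => {oranges}",
                "{apples} => {pears}",
                "{apples} => {pineapples}")

# Split each label at the arrow; each result holds the lhs and the rhs.
parts <- strsplit(everything, " => ", fixed = TRUE)

# Keep the second piece (the rhs) of every label.
rhs_strings <- vapply(parts, function(p) p[2], character(1))

# Drop the curly braces to get bare item names.
rhs_items <- gsub("[{}]", "", rhs_strings)
print(rhs_items)
```

Note that this approach assumes no item name itself contains " => " or curly braces.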
You can extract the RHS as a character vector of element names (without extraneous text like "=>" or curly braces) like this:
rules@rhs@itemInfo$labels[rules@rhs@data@i+1]
The index values stored in rules@rhs@data@i range from 0 to one less than the number of unique labels. So indexing the labels requires adding 1, to avoid trying to access the 0th element of rules@rhs@itemInfo$labels.
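The off-by-one can be seen on a toy example. The sketch below uses the Matrix package to build a small sparse pattern matrix of the same kind of column-compressed class, with items as rows and rules as columns; the item labels and the matrix contents are invented for illustration and are not taken from a real rules object.

```r
library(Matrix)

# Hypothetical item labels, standing in for rules@rhs@itemInfo$labels.
item_labels <- c("oranges", "pears", "pineapples")

# Items are rows, rules are columns: rule 1 has item 2, rule 2 has item 1,
# rule 3 has item 3. Without an x argument this is a pattern (TRUE/FALSE) matrix.
m <- sparseMatrix(i = c(2, 1, 3), j = c(1, 2, 3), dims = c(3, 3))

# The @i slot stores row indices starting at 0, column by column.
print(m@i)

# R vectors are indexed from 1, hence the + 1.
rhs_items <- item_labels[m@i + 1]
print(rhs_items)
```

This one-to-one mapping only lines up with the rules when every column holds exactly one item, as is the case for the single-item right-hand sides that apriori produces.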