How to extract information from apriori R (association rules)

Question

I am mining association rules in R and want to extract my results so that I can generate reports. My results look like this:

> inspect(rules[1:3])
  lhs         rhs          support confidence lift
1 {apples} => {oranges}    0.00029       0.24  4.4
2 {apples} => {pears}      0.00022       0.18 45.6
3 {apples} => {pineapples} 0.00014       0.12  1.8

How do I extract "rhs" here, i.e. a vector containing oranges, pears and pineapples?

Next, how do I extract information from a summary? The object is of type "S4". I have no problem extracting values when the output is a list, etc., but how do I do the equivalent here?

> summary(rules)
set of 3 rules

rule length distribution (lhs + rhs):sizes
2 
3 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      2       2       2       2       2       2

I want to extract the "3" from "set of 3 rules".

I have gotten as far as using "@" (see "What does the @ symbol mean in R?").

But once I use this, how can I turn my results into a vector, i.e.

inspect(rules@rhs)
1 {oranges}
2 {pears}
3 {pineapples}

becomes a character vector of length 3?

+3

r s4 apriori

shecode 31 jul. 15 at 2:18

3 answers

To answer your second question, use length(rules), which returns the number of rules in the set.

Now about your first question:

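library("arules")
data("Adult")
## Mine association rules.
rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
summary(rules)

## Number of rules
l <- length(rules)

## One label per rule, of the form "{lhs} => {rhs}"
everything <- labels(rules)
#print(everything)

## Split each label on "=> " and keep every second piece, i.e. the rhs
rhs_labels <- unlist(strsplit(everything, "=> "))[seq(2, 2 * l, by = 2)]
print(rhs_labels)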

Feel free to ask if you have a question; the code is a little terse :-)
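The same string-splitting idea can be tried without arules installed. The sketch below is only an illustration: rule_labels is a hypothetical hard-coded vector shaped like the output of labels(rules), and sub()/gsub() are swapped in for strsplit() so the curly braces are removed as well.

```r
# Hypothetical rule labels, shaped like the output of labels(rules)
rule_labels <- c("{apples} => {oranges}",
                 "{apples} => {pears}",
                 "{apples} => {pineapples}")

# Drop everything up to and including the "=> " separator
rhs_with_braces <- sub("^.*=> ", "", rule_labels)

# Strip the curly braces to leave the bare item names
rhs_items <- gsub("[{}]", "", rhs_with_braces)
print(rhs_items)
```

This relies on the label text never containing "=> " or curly braces inside an item name, which is an assumption about the data rather than something the labels guarantee.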

+2

steph 31 jul. 15 at 3:01

You can extract the RHS as a character vector of element names (without extraneous text like "=>" or curly braces) like this:

rules@rhs@itemInfo$labels[rules@rhs@data@i+1]

The index values stored in rules@rhs@data@i range from 0 to one less than the number of unique labels. So indexing the labels requires adding 1, to avoid trying to access the 0th element of rules@rhs@itemInfo$labels.
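The index shift can be seen in isolation with plain vectors. In this minimal sketch, item_labels and zero_based are hypothetical stand-ins for rules@rhs@itemInfo$labels and rules@rhs@data@i. It also assumes each rule has exactly one item on its right-hand side, so that there is one index per rule; as far as I know that holds for rules mined with apriori, but it is worth checking for rules from other sources.

```r
# Hypothetical stand-ins for rules@rhs@itemInfo$labels and rules@rhs@data@i
item_labels <- c("apples", "oranges", "pears", "pineapples")
zero_based  <- c(1L, 2L, 3L)  # 0-based item indices, one per rule

# R vectors are 1-based, so shift every index by one before subsetting
rhs_items <- item_labels[zero_based + 1L]
print(rhs_items)

# Without the shift, a 0 index silently selects nothing
dropped <- item_labels[0]
print(length(dropped))
```

Because indexing with 0 returns an empty vector rather than raising an error, forgetting the + 1 would produce shifted or missing labels without any warning.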

0

MCornejo 11 oct. 17 at 16:44

jenesaisquoi · Accepted Answer · 2015-07-31T03:01:33+0000

inspect returns nothing; it just prints its output. When this happens, you can use the function capture.output if you want to store the output as a string. For example, to get the rhs:

library(arules)
data(Adult)
rules <- apriori(Adult, parameter = list(support = 0.4))
inspect(rules[1:3])
#   lhs    rhs                              support confidence lift
# 1 {}  => {race=White}                   0.8550428  0.8550428    1
# 2 {}  => {native-country=United-States} 0.8974243  0.8974243    1
# 3 {}  => {capital-gain=None}            0.9173867  0.9173867    1

## Capture it, and extract rhs
out <- capture.output(inspect(rules[1:3]))
gsub("[^{]+\\{([^}]*)\\}[^{]+\\{([^}]*)\\}.*", "\\2", out)[-1]
# [1] "race=White"                   "native-country=United-States"
# [3] "capital-gain=None"

However, it looks like you can simply access this information from rules using the function rhs:

str(rhs(rules)@itemInfo)
# 'data.frame': 115 obs. of  3 variables:
#  $ labels   :Class 'AsIs'  chr [1:115] "age=Young" "age=Middle-aged" "age=Senior" "age=Old" ...
#  $ variables: Factor w/ 13 levels "age","capital-gain",..: 1 1 1 1 13 13 13 13 13 13 ...
#  $ levels   : Factor w/ 112 levels "10th","11th",..: 111 63 92 69 30 54 65 82 90 91 ...

In general, use str to see what objects are made of so you can decide how to extract components.
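For S4 objects in general, this kind of inspection can be tried on a toy class using only base R. In the sketch below, the class ToyRules and its slots items and quality are hypothetical names made up for illustration; they are not the actual structure of an arules rules object.

```r
library(methods)  # S4 support; attached by default in interactive sessions

# Hypothetical toy class standing in for an S4 object such as "rules"
setClass("ToyRules",
         representation(items = "character", quality = "data.frame"))

toy <- new("ToyRules",
           items   = c("oranges", "pears", "pineapples"),
           quality = data.frame(support = c(0.00029, 0.00022, 0.00014)))

isS4(toy)           # confirm that this is an S4 object
slotNames(toy)      # list the slots that "@" can reach
str(toy)            # show the nested structure, slot by slot
toy@items           # "@" reads a slot, much as "$" reads a list element
slot(toy, "items")  # the same access written as a function call
```

Where a package exports accessor functions, as arules does with rhs(), those are usually preferable to "@", since slot layouts are internal details that may change between versions.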
