How to extract information from apriori results in R (association rules)

I am mining association rules in R and want to extract my results so that I can generate reports. My results look like this:

> inspect(rules[1:3])
  lhs         rhs          support confidence lift
1 {apples} => {oranges}    0.00029       0.24  4.4
2 {apples} => {pears}      0.00022       0.18 45.6
3 {apples} => {pineapples} 0.00014       0.12  1.8

      

How do I extract "rhs" here, i.e. a vector containing oranges, pears and pineapples?

Next, how do I extract information from a summary? The object is of type "S4". I have no problem extracting values when the output is a list, etc., but how do I do the equivalent here? For example:

> summary(rules)
set of 3 rules

rule length distribution (lhs + rhs):sizes
2 
3 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      2       2       2       2       2       2 

I want to extract the "3" from "set of 3 rules".

I have gotten as far as using "@" (see "What does the @ symbol mean in R?").

But once I use it, how can I turn my results into a vector, i.e. so that

inspect(rules@rhs)
1 {oranges}
2 {pears}
3 {pineapples}

becomes a character vector of length 3?

3 answers


inspect returns nothing; it just prints its output. When that is the case, you can use the function capture.output if you want to store the output as strings. For example, to get the rhs:

library(arules)  # provides apriori(), inspect() and the Adult data set
data(Adult)
rules <- apriori(Adult, parameter = list(support = 0.4))
inspect(rules[1:3])
#   lhs    rhs                              support confidence lift
# 1 {}  => {race=White}                   0.8550428  0.8550428    1
# 2 {}  => {native-country=United-States} 0.8974243  0.8974243    1
# 3 {}  => {capital-gain=None}            0.9173867  0.9173867    1

## Capture it, and extract rhs
out <- capture.output(inspect(rules[1:3]))
gsub("[^{]+\\{([^}]*)\\}[^{]+\\{([^}]*)\\}.*", "\\2", out)[-1]
# [1] "race=White"                   "native-country=United-States"
# [3] "capital-gain=None"           
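
The same capture-and-parse idea can be tried out without arules. This is a minimal base R sketch; show_fruit is a made-up stand-in for any function that prints its result instead of returning it, and is not part of any package.

```r
# Hypothetical stand-in for a print-only function such as inspect().
show_fruit <- function() {
  cat("  rhs\n")
  cat("1 {oranges}\n")
  cat("2 {pears}\n")
  cat("3 {pineapples}\n")
  invisible(NULL)
}

# capture.output() collects every printed line as one string.
out <- capture.output(show_fruit())

# Drop the header line, then keep only the text between the braces.
fruit <- sub(".*\\{([^}]*)\\}.*", "\\1", out[-1])
fruit
```

Parsing printed text like this is fragile, because it depends on the exact print layout, so it is best kept as a fallback for when no accessor function is available.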

      

However, it looks like you can simply access this information from rules using the function rhs, whose itemInfo slot holds the item labels:



str(rhs(rules)@itemInfo)
# 'data.frame': 115 obs. of  3 variables:
#  $ labels   :Class 'AsIs'  chr [1:115] "age=Young" "age=Middle-aged" "age=Senior" "age=Old" ...
#  $ variables: Factor w/ 13 levels "age","capital-gain",..: 1 1 1 1 13 13 13 13 13 13 ...
#  $ levels   : Factor w/ 112 levels "10th","11th",..: 111 63 92 69 30 54 65 82 90 91 ...

      

In general, use str to see how an object is structured so you can decide how to extract its components.
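
As an illustration of extracting components directly rather than parsing printed text, here is a minimal sketch using accessor functions exported by arules (rhs, labels, quality, length). It assumes the arules package is installed and reuses the Adult data; the variable names are only illustrative.

```r
library(arules)
data(Adult)

# verbose = FALSE only silences the mining progress messages.
rules <- apriori(Adult,
                 parameter = list(support = 0.4),
                 control = list(verbose = FALSE))
first3 <- rules[1:3]

# Number of rules, i.e. the "3" in "set of 3 rules".
n_rules <- length(first3)

# Right-hand sides as strings, one per rule, braces included.
rhs_labels <- labels(rhs(first3))

# The same items without braces: coerce to a list, then flatten.
rhs_items <- unlist(as(rhs(first3), "list"))

# Quality measures (support, confidence, lift) as a data frame.
q <- quality(first3)
q$lift
```

Because quality(first3) is an ordinary data frame, its columns can go straight into a report without any string handling.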


To answer your second question: use length(rules), which returns the number of rules in the set.

Now about your first question:



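library("arules")
data("Adult")
## Mine association rules.
rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
summary(rules)

## Number of rules (the "3" in "set of 3 rules").
l <- length(rules)

## One label per rule, of the form "{lhs} => {rhs}".
everything <- labels(rules)
#print(everything)

## Split each label at "=> "; the even-numbered pieces are the right-hand sides.
## (Named rhs_labels rather than cut, to avoid masking base::cut.)
rhs_labels <- unlist(strsplit(everything, "=> "))[seq(2, 2 * l, by = 2)]
print(rhs_labels)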

      

Feel free to ask if you have a question; the code might be a little terse :-)


You can extract the RHS as a character vector of element names (without extraneous text like "=>" or curly braces) like this:

rules@rhs@itemInfo$labels[rules@rhs@data@i+1]

      

The index values stored in rules@rhs@data@i range from 0 to one less than the number of unique labels, so indexing the labels requires adding 1 to avoid trying to access the 0th element of rules@rhs@itemInfo$labels.
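
To see why the +1 is needed, here is a small sketch that mimics this storage with the Matrix package, which arules builds on. The toy labels and matrix are invented for illustration: item_labels stands in for rules@rhs@itemInfo$labels and m for rules@rhs@data, with items as rows and rules as columns.

```r
library(Matrix)

# Hypothetical item labels (stand-in for rules@rhs@itemInfo$labels).
item_labels <- c("apples", "oranges", "pears", "pineapples")

# Sparse pattern matrix: column j marks the item in the rhs of rule j.
# sparseMatrix() takes 1-based indices here but stores them 0-based.
m <- sparseMatrix(i = c(2, 3, 4), j = c(1, 2, 3), dims = c(4, 3))

# The stored row indices are 0-based, hence the + 1 when indexing.
m@i
rhs_items <- item_labels[m@i + 1]
rhs_items

# If a column held several items, m@p (the column pointers) tells
# which entries of m@i belong to which rule.
per_rule <- split(item_labels[m@i + 1],
                  rep(seq_len(ncol(m)), diff(m@p)))
```

The one-line version yields exactly one label per rule only when every rhs contains a single item, which is the case for rules mined with apriori; the split on m@p is a way to keep items grouped by rule when that does not hold.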
