R - How to create a stacking ensemble?

I need to create a stacking ensemble, i.e. combine several base classifiers with a new classifier. Do I take the generalization accuracy of each base classifier:

Naive Bayes: accuracy = 0.61
k-NN (k = 5): accuracy = 0.63
k-NN (k = 10): accuracy = 0.64
Decision tree: accuracy = 0.60
Logistic regression: accuracy = 0.62

and train the new classifier on these five percentages?

Or do I need to concatenate the predictions of the individual models, for example in a table like this:

NB   k = 5  k = 10  dectree   Logistic   TrueLabel    
bob    1      1      bob       FALSE       bob
bob    2      2      john      TRUE        john
bob    1      1      bob       TRUE        bob


If so, does it matter that the outputs are encoded differently? That is, should they all be either bob or john instead of true/false or 1/2?

Which classifier should be used to combine them?


1 answer


To create a stacking ensemble, you need to use the table shown at the end of your question, that is:

NB   k = 5  k = 10  dectree   Logistic   TrueLabel    
bob    1      1      bob       FALSE       bob
bob    2      2      john      TRUE        john
bob    1      1      bob       TRUE        bob


The answer to "should they be either bob or john instead of true or false or 1 or 2?" is that it depends on the model you will use to combine the individual models. Most models in R work with factors, in which case the encodings can stay as they are. Make sure your first and second columns (the ones with numeric values) are also treated as factors; otherwise they will be treated as numbers, and you don't want that (many models create dummy variables from a factor, and if the column is numeric this will not happen). To summarize: use factors for all of the above columns, but read the documentation of the combining model (more on that later) to check that it accepts factors as input.
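As a sketch of the point above (the predictions here are made up, and the column names just mirror the table in the question), coercing every column to a factor in R looks like this:

```r
# Made-up base-model predictions, mirroring the table in the question.
stack_df <- data.frame(
  NB        = c("bob", "bob", "john"),
  knn5      = c(1, 2, 1),
  knn10     = c(1, 2, 1),
  dectree   = c("bob", "john", "bob"),
  Logistic  = c(FALSE, TRUE, TRUE),
  TrueLabel = c("bob", "john", "bob")
)

# Coerce every column to a factor so that 1/2 and TRUE/FALSE are treated
# as categories, not as numbers or logicals.
stack_df[] <- lapply(stack_df, factor)

sapply(stack_df, is.factor)  # should be TRUE for every column
```

This way a downstream model will build dummy variables from the k-NN columns instead of interpreting 1 and 2 as quantities.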



As for the question of which model to use to combine the inputs, the answer is "any model you like". It is common practice to use simple logistic regression, but nothing stops you from choosing something else. The idea is to use the original variables (the ones you used to train the individual models) plus the table above (i.e. the predictions of the individual models) and see whether the new accuracy is better than the individual ones. In the new combined model you can use feature-selection techniques such as forward or backward selection to remove insignificant variables.
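A minimal sketch of such a combiner, assuming a two-class problem and made-up base-model predictions (with more than two classes you would need multinomial regression, e.g. `nnet::multinom`, instead of `glm`):

```r
# Made-up base-model predictions, stored as factors (see above).
stack_df <- data.frame(
  NB        = factor(c("bob","john","bob","john","bob","john","bob","bob","john","bob")),
  knn5      = factor(c(1, 2, 1, 2, 2, 2, 1, 1, 2, 1)),
  dectree   = factor(c("bob","john","john","john","bob","john","bob","bob","john","bob")),
  TrueLabel = factor(c("bob","john","bob","john","bob","john","bob","john","bob","bob"))
)

# Logistic regression as the combining (meta) model: predict the true
# label from the individual models' predictions.
meta <- glm(TrueLabel ~ NB + knn5 + dectree, data = stack_df, family = binomial)

# Probability of the second factor level ("john") for each training row,
# thresholded at 0.5 to get a class prediction.
p    <- predict(meta, type = "response")
pred <- ifelse(p > 0.5, "john", "bob")
```

In practice the base-model predictions fed to the meta-model should come from held-out (e.g. cross-validated) data, not from the same rows the base models were trained on, otherwise the meta-model learns from overfit predictions.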

I hope this answers your questions.
