How to integrate R code into an sklearn Pipeline?
I have a fairly complex setup with several separate model pipelines and a stacking regressor on top of them. Roughly, it looks like this:
from sklearn.pipeline import Pipeline
from sklearn.svm import SVR
# StackingRegressor with regressors=/meta_regressor= follows the mlxtend API
from mlxtend.regressor import StackingRegressor

# GLMNET
glmnet_pipe = Pipeline([
    ("DATA_CLEANER", DataCleaner(demo='HH_F', mode='strict')),
    ("DATA_ENCODING", Encoder(encoder_name='code')),
    ("MODELLING", glm)
])

# XGBoost
xgb_1_pipe = Pipeline([
    ("DATA_CLEANER", DataCleaner(demo='HH_F', mode='strict')),
    ("DATA_ENCODING", Encoder(encoder_name='code')),
    ("SCALE", Normalizer(normalizer=NORMALIZE)),
    ("FEATURE_SELECTION", huber_feature_selector),
    ("MODELLING", xgb_1)
])

# the set of base models
base_models = [glmnet_pipe, xgb_1_pipe]

# a Stacker on top of them
stack = StackingRegressor(
    regressors=base_models,
    meta_regressor=SVR()
)
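The stacked model is then fit and used like any other estimator; here is a minimal usage sketch, where X_train / y_train / X_test are just placeholder names for my data:

# placeholder data names, purely for illustration
stack.fit(X_train, y_train)
predictions = stack.predict(X_test)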
However, I also have a pipeline implemented in R with the forecast package. All of my R results are somewhat "unique" in terms of execution speed and the effort it would take to rewrite them in Python.
Are there any approaches for including R code in an sklearn Pipeline?
So far, I see the following possibility:
import subprocess
import pandas as pd
from sklearn.base import BaseEstimator, RegressorMixin

class CustomRModel(BaseEstimator, RegressorMixin):
    def __init__(self, path, args):
        self.path = path    # path to the R script
        self.args = args    # command-line arguments passed to the script
        self.cmd = ['Rscript', self.path] + self.args

    def fit(self, X, y):
        # call the fitting part of the R script
        subprocess.check_output(self.cmd, universal_newlines=True)
        # read the output CSV written by R into a Python dataframe
        # pd.read_csv(...)
        return self

    def predict(self, X):
        # call the prediction part of the R script
        subprocess.check_output(self.cmd, universal_newlines=True)
        # read the output CSV written by R into a Python dataframe,
        # e.g. predictions = pd.read_csv(...), and return the values
        return predictions
and then use this class as a normal step in a Pipeline.
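For example, a minimal sketch of how it could slot into the setup above; the script path and arguments below are purely hypothetical placeholders:

# hypothetical R script path and arguments, just for illustration
r_forecast = CustomRModel(
    path='forecast_model.R',
    args=['train.csv', 'predictions.csv']
)

r_pipe = Pipeline([
    ("DATA_CLEANER", DataCleaner(demo='HH_F', mode='strict')),
    ("MODELLING", r_forecast)
])

# and feed it to the stacker together with the other base models
base_models = [glmnet_pipe, xgb_1_pipe, r_pipe]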
Or maybe you know of more interesting approaches?