How to pass a keyword argument to the prediction method in an sklearn pipeline
I am using GaussianProcess inside a Pipeline. The predict method of GaussianProcess takes a keyword argument called batch_size, which I need to use to keep it from filling up my memory.
Is there a way to pass this argument to the GaussianProcess instance when predict is called through a configured pipeline?
Here is a minimal example adapted from sklearn documentation to demonstrate what I want:
import numpy as np
from sklearn.gaussian_process import GaussianProcess
from matplotlib import pyplot as pl
np.random.seed(1)
def f(x):
    """The function to predict."""
    return x * np.sin(x)
X = np.atleast_2d([1., 3., 5., 6., 7., 8.]).T
y = f(X).ravel()
gp = GaussianProcess(corr='cubic', theta0=1e-2, thetaL=1e-4, thetaU=1e-1,
                     random_start=100)
gp.fit(X, y)
x = np.atleast_2d(np.linspace(0, 10, 1000)).T
y_pred = gp.predict(x, batch_size=10)
from sklearn import pipeline
steps = [('gp', gp)]
p = pipeline.Pipeline(steps)
# How to pass the batch_size here?
p.predict(x)
While you can pass extra parameters to the pipeline's fit and fit_transform methods, this is not possible for predict. See this line and the following ones in the 0.15 source code.
You can monkeypatch the estimator's predict method with
from functools import partial
gp.predict = partial(gp.predict, batch_size=10)
or, if that doesn't work, patch the final step of the pipeline instance directly (p in the question's code):
p.steps[-1][-1].predict = partial(p.steps[-1][-1].predict, batch_size=10)
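To see why the monkeypatch takes effect even though the pipeline only ever calls predict(X), here is a minimal sketch that does not need scikit-learn at all. ToyEstimator and predict_like_pipeline are hypothetical stand-ins invented for illustration, not part of any library; the only assumption is that the pipeline calls predict on the same estimator instance you patched.

```python
from functools import partial


class ToyEstimator:
    """Hypothetical stand-in for an estimator whose predict takes batch_size."""

    def predict(self, X, batch_size=None):
        # Process X in chunks of batch_size (everything at once if None).
        size = batch_size or len(X) or 1
        self.n_batches_ = 0
        out = []
        for start in range(0, len(X), size):
            out.extend(2 * v for v in X[start:start + size])
            self.n_batches_ += 1
        return out


def predict_like_pipeline(steps, X):
    """Hypothetical helper: calls the final step's predict with X only."""
    return steps[-1][-1].predict(X)


est = ToyEstimator()
steps = [('toy', est)]

# Unpatched: the caller cannot pass batch_size, so one big batch is used.
predict_like_pipeline(steps, list(range(100)))
batches_before = est.n_batches_

# Bind batch_size once. The new instance attribute shadows the class method,
# so every later est.predict(X) call runs with batch_size=10.
est.predict = partial(est.predict, batch_size=10)
result = predict_like_pipeline(steps, list(range(100)))
batches_after = est.n_batches_
```

Keep in mind that the patch lives on that one instance only. Anything that copies the estimator from its constructor parameters (for example sklearn.base.clone, which grid search uses) will most likely drop it.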
You can work around this by subclassing Pipeline so that its predict method accepts keyword arguments (**predict_params) and forwards them to the final estimator's predict.
from sklearn.pipeline import Pipeline

class Pipeline(Pipeline):
    def predict(self, X, **predict_params):
        """Applies transforms to the data, and the predict method of the
        final estimator. Valid only if the final estimator implements
        predict."""
        Xt = X
        for name, transform in self.steps[:-1]:
            Xt = transform.transform(Xt)
        return self.steps[-1][-1].predict(Xt, **predict_params)
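As a rough illustration of how the forwarding behaves, the sketch below reproduces the same predict logic on a tiny stand-in class so it runs without scikit-learn. Doubler, ChunkedSum and MiniPipeline are hypothetical names made up for this example; only the body of predict mirrors the subclass above.

```python
class Doubler:
    """Hypothetical transformer step."""

    def transform(self, X):
        return [2 * v for v in X]


class ChunkedSum:
    """Hypothetical final estimator whose predict accepts batch_size."""

    def predict(self, X, batch_size=None):
        size = batch_size or len(X) or 1
        # Return one partial sum per batch.
        return [sum(X[i:i + size]) for i in range(0, len(X), size)]


class MiniPipeline:
    """Stand-in container using the same predict logic as the subclass."""

    def __init__(self, steps):
        self.steps = steps

    def predict(self, X, **predict_params):
        Xt = X
        # Run every step except the last as a transformer.
        for name, transform in self.steps[:-1]:
            Xt = transform.transform(Xt)
        # Forward the extra keyword arguments to the final estimator.
        return self.steps[-1][-1].predict(Xt, **predict_params)


p = MiniPipeline([('double', Doubler()), ('sum', ChunkedSum())])
whole = p.predict([1, 2, 3, 4])                   # no batch_size: one batch
batched = p.predict([1, 2, 3, 4], batch_size=2)   # forwarded: two batches
```

With the real subclass, the question's call would then simply become p.predict(x, batch_size=10).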