How to parallelize the scikit-learn SVM (SVC) .predict() method?

I recently came across a requirement: I have a trained (.fit()) instance of the scikit-learn SVC classifier and need to .predict() a large number of instances.

Is there a way to parallelize only the .predict() method using any built-in scikit-learn tools?

from sklearn import svm

data_train = [[0,2,3],[1,2,3],[4,2,3]]
targets_train = [0,1,0]

clf = svm.SVC(kernel='rbf', degree=3, C=10, gamma=0.3, probability=True)
clf.fit(data_train, targets_train)

# this can be very large (~ a million records)
to_be_predicted = [[1,3,4]]
clf.predict(to_be_predicted)

      

If anyone knows a solution, I will be more than happy if you can share it.



1 answer


This may contain bugs, but something like this should do the trick. Basically, break your data into blocks and run your model on each block separately in a joblib.Parallel loop.



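import numpy as np
# newer scikit-learn releases no longer ship sklearn.externals.joblib
from joblib import Parallel, delayed

n_cores = 2
to_be_predicted = np.asarray(to_be_predicted)  # a plain list has no .shape
n_samples = to_be_predicted.shape[0]
# integer (start, stop) row boundaries, one block per core
slices = [
    (n_samples * i // n_cores, n_samples * (i + 1) // n_cores)
    for i in range(n_cores)
]

# predict() returns a 1-D array per block, so concatenate instead of vstack;
# empty blocks are skipped because predict() rejects zero-sample input
results = np.concatenate(Parallel(n_jobs=n_cores)(
    delayed(clf.predict)(to_be_predicted[start:stop])
    for start, stop in slices if stop > start
))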
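
The block-wise idea above can also be wrapped in a small reusable function. The following is only a sketch under a few assumptions: parallel_predict and its n_chunks parameter are hypothetical names (not part of scikit-learn or joblib), joblib is installed as a standalone package, and the classifier's predict() returns a 1-D array of labels.

```python
import numpy as np
from joblib import Parallel, delayed, cpu_count


def parallel_predict(clf, X, n_jobs=2, n_chunks=None):
    """Predict X in row blocks (hypothetical helper, not a scikit-learn API)."""
    X = np.asarray(X)
    if n_chunks is None:
        # one block per worker; n_jobs <= 0 falls back to the detected CPU count
        n_chunks = n_jobs if n_jobs > 0 else cpu_count()
    # array_split tolerates uneven splits; empty blocks are dropped
    # because predict() rejects zero-sample input
    chunks = [c for c in np.array_split(X, n_chunks) if len(c) > 0]
    parts = Parallel(n_jobs=n_jobs)(delayed(clf.predict)(c) for c in chunks)
    # Parallel returns results in submission order, so concatenating the
    # 1-D label arrays restores the original row order
    return np.concatenate(parts)
```

With the classifier from the question, parallel_predict(clf, to_be_predicted, n_jobs=2) should return the same labels as clf.predict(to_be_predicted). Whether it is actually faster depends on the data size, since each worker process needs its own copy of the model and its block of data.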

      
