Which coefficients go to which class in multiclass logistic regression in scikit-learn?

I am using scikit-learn's LogisticRegression for a multiclass problem.

logit = LogisticRegression(penalty='l1')
logit = logit.fit(X, y)

      

I am wondering which features are driving this solution.

logit.coef_

      

The above gives me a nice array of shape (n_classes, n_features), but all the class and feature names are gone. For features that's fine, because assuming they are indexed in the same order I passed them in seems safe ...

But for classes this is a problem, since I never explicitly passed the classes in any order. So which classes do coefficient sets (rows) 0, 1, 2, and 3 belong to?

1 answer


The order will be the same as the one returned by logit.classes_ (classes_ is an attribute of the fitted model that holds the unique classes present in y). For string labels, that means sorted (alphabetical) order.

To illustrate, here is LogisticRegression fitted on a random dataset with the following y labels:

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.random.rand(45,5)
y = np.array(['GR3', 'GR4', 'SHH', 'GR3', 'GR4', 'SHH', 'GR4', 'SHH',
              'GR4', 'WNT', 'GR3', 'GR4', 'GR3', 'SHH', 'SHH', 'GR3', 
              'GR4', 'SHH', 'GR4', 'GR3', 'SHH', 'GR3', 'SHH', 'GR4', 
              'SHH', 'GR3', 'GR4', 'GR4', 'SHH', 'GR4', 'SHH', 'GR4', 
              'GR3', 'GR3', 'WNT', 'SHH', 'GR4', 'SHH', 'SHH', 'GR3',
              'WNT', 'GR3', 'GR4', 'GR3', 'SHH'], dtype=object)

lr = LogisticRegression()
lr.fit(X,y)

# This is what you want
lr.classes_

#Out:
#    array(['GR3', 'GR4', 'SHH', 'WNT'], dtype=object)

lr.coef_
#Out:
#    array of shape [n_classes, n_features]

      



So, in the coef_ matrix, row index 0 corresponds to "GR3" (the first class in the classes_ array), row 1 to "GR4", and so on.
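If you want the labels attached to the rows explicitly, one option is to zip classes_ with coef_. The following is only a minimal sketch: the toy data and the names rng, labels and coef_by_class are made up for illustration, and the labels are deliberately passed in non-sorted order to show that classes_ still comes back sorted.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: 4 string classes, 5 random features (illustration only)
rng = np.random.default_rng(0)
labels = np.array(['GR3', 'GR4', 'SHH', 'WNT'], dtype=object)
y = np.tile(labels[::-1], 10)   # 40 labels, passed in reverse (non-sorted) order
X = rng.random((40, 5))

lr = LogisticRegression()
lr.fit(X, y)

# Row i of coef_ belongs to classes_[i], so zipping them pairs
# each class label with its own coefficient vector.
coef_by_class = dict(zip(lr.classes_, lr.coef_))

for label, row in coef_by_class.items():
    print(label, row)
```

The same pairing should also work as a labelled table, e.g. pandas.DataFrame(lr.coef_, index=lr.classes_), if you prefer a dataframe. Note that this row-per-class layout applies to the multiclass case; with only two classes, coef_ has a single row, which corresponds to classes_[1].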

Hope it helps.
