Variance explained by each component in partial least squares (PLS) in sklearn

I am trying to run PLSRegression from sklearn, and I want to keep only the components that explain a given level of variance, as one would do in PCA, for example.

Is there a way to find out how much variance is explained by each component in PLS?

Thanks in advance.



1 answer


I had the same requirement: calculating the variance explained by each component. I am new to PLS and not a native English speaker, so please treat my solution as a reference only.

Background: if you choose "regression" as the "deflation_mode", which is the default option, the estimated Y value in "PLSRegression" can be calculated with this expression [1]:

Y = T Q' + Err

where T is x_scores_ and Q is y_loadings_. This expression gives the estimate of Y from all components together. Therefore, if we want to know how much variance is explained by the first component alone, we can use the first column of x_scores_ and the first column of y_loadings_ to calculate the estimate Y1:

Y1 = T[:, 0] Q[:, 0]' + Err
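
Why the per-component R squared values can simply be added up is worth spelling out. The following is only a sketch for a single response column, under the assumption that the columns of x_scores_ are mutually orthogonal (which is what deflation in regression mode is expected to give). The symbols \tilde{y} (the centered and scaled response), t_a, q_a, and A (the number of components) are notation introduced here, not names from sklearn.

```latex
\tilde{y} = \sum_{a=1}^{A} t_a q_a + e,
\qquad t_a^{\top} t_b = 0 \ \ (a \neq b),
\qquad q_a = \frac{t_a^{\top} \tilde{y}}{t_a^{\top} t_a}

R^2_a
  = 1 - \frac{\lVert \tilde{y} - t_a q_a \rVert^2}{\lVert \tilde{y} \rVert^2}
  = \frac{q_a^2 \, \lVert t_a \rVert^2}{\lVert \tilde{y} \rVert^2}

R^2_{\text{all}}
  = 1 - \frac{\lVert e \rVert^2}{\lVert \tilde{y} \rVert^2}
  = \sum_{a=1}^{A} \frac{q_a^2 \, \lVert t_a \rVert^2}{\lVert \tilde{y} \rVert^2}
  = \sum_{a=1}^{A} R^2_a
```

Because the scores are orthogonal, the cross terms vanish, so each component's share of the explained variance is independent of the others.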



The Python code below calculates R squared for each component.

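import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import r2_score

# X: training predictors (NumPy array), shape (n_samples, n_features)
# Y_true: training responses (NumPy array), shape (n_samples,) or (n_samples, n_targets)
n_components = 3
pls = PLSRegression(n_components=n_components)
pls.fit(X, Y_true)

# PLSRegression centers and scales Y internally, so undo that for each estimate
Y_mean = Y_true.mean(axis=0)
Y_std = Y_true.std(axis=0, ddof=1)

r2_sum = 0
for i in range(n_components):
    # Estimate of Y from component i alone: outer product of its score and loading
    Y_pred = np.outer(pls.x_scores_[:, i], pls.y_loadings_[:, i]) * Y_std + Y_mean
    r2 = round(r2_score(Y_true, Y_pred), 3)
    r2_sum += r2
    print('R2 for %d component: %g' % (i + 1, r2))
print('R2 for all components: %g' % r2_sum)  # Sum of the above
print('R2 for all components: %g' % round(r2_score(Y_true, pls.predict(X)), 3))  # Calculated from the PLSRegression 'predict' function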

      

Output:

R2 for 1 component: 0.633
R2 for 2 component: 0.221
R2 for 3 component: 0.104
R2 for all components: 0.958
R2 for all components: 0.958

      

[1] Be careful with this expression. The terminology and meaning of "scores", "weights" and "loadings" can differ slightly between calculation methods.
