Keep a specific component in PCA

Question

I have a numpy array called "data" that has 500 rows and 500 columns. Using the PCA from sklearn I can compress this down to 500 rows and 15 columns. I reckon that I essentially go from 500 axes and 500 points to 15 axes and 500 points. The axes are all orthogonal and explain my data very well.

But I want to know if there is any way to guarantee that one of the 15 axes (which I get after running the PCA) is also one of the original 500. That is, can I keep one of the original axes and use PCA (or some other method) to find the remaining 14?

My code is below:

from sklearn.decomposition import PCA
# data is some 500x500 numpy array
pca = PCA(n_components=15)
pca_result = pca.fit_transform(data)
# pca_result is a 500x15 numpy array

python python-2.7 numpy scikit-learn pca

Abhinav Ramakrishnan, June 21, 2015 at 2:29

2 answers

I think what you want is to first do a linear least-squares fit of the data to the axis you want to keep:

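import numpy as np
# column_number is the index of the column (original axis) to keep
axis_to_keep = data[:, column_number][:, np.newaxis]
# next line solves axis_to_keep * x = data in the least-squares sense
x = np.linalg.lstsq(axis_to_keep, data, rcond=None)[0]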

Then subtract the fit generated by that model from data:

data_2 = data - np.dot(axis_to_keep,x)

At this point you can run PCA on data_2 with 14 components. Your forced axis will (almost certainly) not be orthogonal to the others.
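Putting the steps above together, a minimal end-to-end sketch might look like the following. The random `data` array and the value of `column_number` are placeholders I made up for illustration, not something from the question. The last lines compute the dot products between the kept column and each PCA score column, so you can check numerically how close to orthogonal they are for your own data.

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder data: substitute your real 500x500 array here.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 500))
column_number = 3  # hypothetical index of the column to keep

# Column to keep, as a (500, 1) array.
axis_to_keep = data[:, column_number][:, np.newaxis]

# Least-squares fit of every column of data onto the kept column.
x = np.linalg.lstsq(axis_to_keep, data, rcond=None)[0]  # shape (1, 500)

# Residual after removing the part explained by the kept column.
data_2 = data - np.dot(axis_to_keep, x)

# PCA on the residual gives the remaining 14 components.
scores = PCA(n_components=14).fit_transform(data_2)  # shape (500, 14)

# Dot products between the kept column and each score column.
# PCA centers data_2 internally, so these need not be exactly zero.
dots = np.dot(axis_to_keep.T, scores)  # shape (1, 14)
print(dots)
```

Note that the kept column itself becomes (numerically) zero in `data_2`, since it is fitted perfectly by itself.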

DavidW, June 22, 2015 at 7:39

Andreas Mueller · Accepted Answer · 2015-06-22T15:05:51+0000

You can simply omit the axis you want to keep from the data:

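import numpy as np
from sklearn.decomposition import PCA

# special_axis is the index of the column to keep
mask = np.ones(data.shape[1], dtype=bool)
mask[special_axis] = False
data_new = data[:, mask]

pca_transformed = PCA(n_components=14).fit_transform(data_new)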

This is the same as removing the projection along that feature. Then you can stack the original axis back onto the PCA result if you like:

stacked_result = np.hstack([pca_transformed, data[:, [special_axis]]])
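For reference, here is the whole approach as one self-contained sketch. The random `data` array and the value of `special_axis` are assumptions for illustration only; swap in your own array and column index.

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder data: substitute your real 500x500 array here.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 500))
special_axis = 7  # hypothetical index of the column to keep

# Boolean mask that selects every column except the one to keep.
mask = np.ones(data.shape[1], dtype=bool)
mask[special_axis] = False
data_new = data[:, mask]  # shape (500, 499)

# PCA on the remaining columns gives 14 components.
pca = PCA(n_components=14)
pca_transformed = pca.fit_transform(data_new)

# Append the untouched original column as the 15th axis.
stacked_result = np.hstack([pca_transformed, data[:, [special_axis]]])
print(stacked_result.shape)  # (500, 15)

# Fraction of the variance of data_new captured by the 14 components.
print(pca.explained_variance_ratio_.sum())
```

The last column of `stacked_result` is exactly the original column, while the first 14 columns are ordinary PCA scores of the remaining 499 features.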
