What is the difference between PCA, TruncatedSVD and ICA?
1 answer
Covering this in full detail would take many pages of a PDF document :-).
But the idea is simple:
- Principal Component Analysis (PCA): analysis of the native coordinates of the data, namely the coordinates along which the data has the most energy (variance). For $n$ samples of dimension $d$ there are $d$ orthogonal directions, and the data projected onto them is uncorrelated. If we treat the data as random variables, this means we have found a coordinate system in which the cross-correlation (a second-order moment) of any pair of projected coordinates vanishes. It is a very efficient way to approximate data in a lower dimension while preserving most of its energy.
- Truncated SVD: it can be shown that one way of calculating this coordinate system is to use the SVD. Hence it is a method for applying the ideas behind PCA; keeping only the largest singular values and their directions gives the lower-dimensional approximation. (Note that scikit-learn's `TruncatedSVD` does not center the data first, whereas `PCA` does.)
- Independent Component Analysis (ICA) goes a step beyond PCA. While PCA only looks at the second-order moments of the data (correlation), ICA looks at higher-order moments and tries to find a projection of the data in which the higher-order cross-moments vanish as well (think of lack of correlation versus statistical independence).
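To make the three points a bit more concrete, here is a minimal NumPy sketch. It is only an illustration under assumptions of my own: the toy data, the sizes `n`, `d`, `k` and all variable names are made up. It computes the PCA directions through the SVD of the centered data, checks that the projected coordinates are uncorrelated, truncates to `k` directions, and then shows a pair of variables that is uncorrelated but clearly not independent, which is the gap ICA is concerned with. It does not run an actual ICA algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (made up for illustration): n samples of dimension d with correlated columns.
n, d, k = 500, 3, 2
X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))

# PCA via SVD: center the data; the rows of Vt are then the d orthogonal directions.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Coordinates of the data in the new system; their covariance matrix is diagonal.
Z = Xc @ Vt.T
cov_Z = np.cov(Z, rowvar=False)

# Truncation: keep only the k directions carrying the most energy (variance).
Z_k = Z[:, :k]
energy_kept = (S[:k] ** 2).sum() / (S ** 2).sum()

# Uncorrelated is weaker than independent: b is a function of a, yet cov(a, b) is ~0.
a = rng.uniform(-1.0, 1.0, size=100_000)
b = a ** 2
second_order = np.mean(a * b) - a.mean() * b.mean()            # close to zero
higher_order = np.mean(a ** 2 * b) - np.mean(a ** 2) * b.mean()  # clearly non-zero
```

Here `second_order` is the ordinary covariance, which is all PCA can see, while `higher_order` is one example of a higher-order cross-moment that stays non-zero because `b` depends on `a`.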