Differences in unique meaning: different results with Jama, PColt and NumPy
I want to perform singular value decomposition on a large (sparse) matrix. To select the best (most accurate) library, I tried to reproduce the SVD example provided here using different Java and Python libraries. Strangely, I get different results with each library.
Here's the original example matrix and its split (US and VT) matrices:
A =2.0 0.0 8.0 6.0 0.0 1.0 6.0 0.0 1.0 7.0 5.0 0.0 7.0 4.0 0.0 7.0 0.0 8.0 5.0 0.0 0.0 10.0 0.0 0.0 7.0 U =-0.54 0.07 0.82 -0.11 0.12 -0.10 -0.59 -0.11 -0.79 -0.06 -0.53 0.06 -0.21 0.12 -0.81 -0.65 0.07 -0.51 0.06 0.56 -0.06 -0.80 0.09 0.59 0.04 VT =-0.46 0.02 -0.87 -0.00 0.17 -0.07 -0.76 0.06 0.60 0.23 -0.74 0.10 0.28 0.22 -0.56 -0.48 0.03 0.40 -0.33 0.70 -0.07 -0.64 -0.04 -0.69 -0.32 S (with the top three singular values) = 17.92 0 0 0 15.17 0 0 0 3.56
I've tried using the following Java and Python libraries: Java: PColt, Jama Python: NumPy
Here are the results from each:
Jama:
U = 0.5423 -0.065 -0.8216 0.1057 -0.1245
0.1018 0.5935 0.1126 0.7881 0.0603
0.525 -0.0594 0.213 -0.1157 0.8137
0.6449 -0.0704 0.5087 -0.0599 -0.5628
0.0645 0.7969 -0.09 -0.5922 -0.0441
VT =0.4646 -0.0215 0.8685 8.0E-4 -0.1713
0.0701 0.76 -0.0631 -0.6013 -0.2278
0.7351 -0.0988 -0.284 -0.2235 0.565
0.4844 -0.0254 -0.3989 0.3327 -0.7035
0.065 0.6415 0.0443 0.6912 0.3233
S = 17.9184 0.0 0.0 0.0 0.0
0.0 15.1714 0.0 0.0 0.0
0.0 0.0 3.564 0.0 0.0
0.0 0.0 0.0 1.9842 0.0
0.0 0.0 0.0 0.0 0.3496
PColt:
U = -0.542255 0.0649957 0.821617 0.105747 -0.124490
-0.101812 -0.593461 -0.112552 0.788123 0.0602700
-0.524953 0.0593817 -0.212969 -0.115742 0.813724
-0.644870 0.0704063 -0.508744 -0.0599027 -0.562829
-0.0644952 -0.796930 0.0900097 -0.592195 -0.0441263
VT =-0.464617 0.0215065 -0.868509 0.000799554 -0.171349
-0.0700860 -0.759988 0.0630715 -0.601346 -0.227841
-0.735094 0.0987971 0.284009 -0.223485 0.565040
-0.484392 0.0254474 0.398866 0.332684 -0.703523
-0.0649698 -0.641520 -0.0442743 0.691201 0.323284
S =
(00) 17.91837085874625
(11) 15.17137188041607
(22) 3.5640020352605677
(33) 1.9842281528992616
(44) 0.3495556671751232
Numpy
U = -0.54225536 0.06499573 0.82161708 0.10574661 -0.12448979
-0.10181247 -0.59346055 -0.11255162 0.78812338 0.06026999
-0.52495325 0.05938171 -0.21296861 -0.11574223 0.81372354
-0.64487038 0.07040626 -0.50874368 -0.05990271 -0.56282918
-0.06449519 -0.79692967 0.09000966 -0.59219473 -0.04412631
VT =-4.64617e-01 2.15065e-02 -8.68508e-01 7.99553e-04 -1.71349e-01
-7.00859e-02 -7.59987e-01 6.30714e-02 -6.01345e-01 -2.27841e-01
-7.35093e-01 9.87971e-02 2.84008e-01 -2.23484e-01 5.65040e-01
-4.84391e-01 2.54473e-02 3.98865e-01 3.32683e-01 -7.03523e-01
-6.49698e-02 -6.41519e-01 -4.42743e-02 6.91201e-01 3.23283e-01
S = 17.91837086 15.17137188 3.56400204 1.98422815 0.34955567
As you can see, the sign of each element in the expanded Jama matrices (u and VT) is opposite to the sign in the original example. Interestingly, for PColt and Numpy, only the inverted signs of the elements in the last two columns are inverted. Is there any specific reason for inverted signs? Has anyone encountered similar discrepancies?
Here are the code snippets I used: Java
import java.text.DecimalFormat;
import cern.colt.matrix.tdouble.DoubleMatrix2D;
import cern.colt.matrix.tdouble.algo.DenseDoubleAlgebra;
import cern.colt.matrix.tdouble.algo.decomposition.DenseDoubleSingularValueDecomposition;
import cern.colt.matrix.tdouble.impl.DenseDoubleMatrix2D;
import Jama.Matrix;
import Jama.SingularValueDecomposition;
public class SVD_Test implements java.io.Serializable{
public static void main(String[] args)
{
double[][] data2 = new double[][]
{{ 2.0, 0.0, 8.0, 6.0, 0.0},
{ 1.0, 6.0, 0.0, 1.0, 7.0},
{ 5.0, 0.0, 7.0, 4.0, 0.0},
{ 7.0, 0.0, 8.0, 5.0, 0.0},
{ 0.0, 10.0, 0.0, 0.0, 7.0}};
DoubleMatrix2D pColt_matrix = new DenseDoubleMatrix2D(5,5);
pColt_matrix.assign(data2);
Matrix j = new Matrix(data2);
SingularValueDecomposition svd_jama = j.svd();
DenseDoubleSingularValueDecomposition svd_pColt = new DenseDoubleSingularValueDecomposition(pColt_matrix, true, true);
System.out.println("U:");
System.out.println("pColt:");
System.out.println(svd_pColt.getU());
printJamaMatrix(svd_jama.getU());
System.out.println("S:");
System.out.println("pColt:");
System.out.println(svd_pColt.getS());
printJamaMatrix(svd_jama.getS());
System.out.println("V:");
System.out.println("pColt:");
System.out.println(svd_pColt.getV());
printJamaMatrix(svd_jama.getV());
}
public static void printJamaMatrix(Matrix inp){
System.out.println("Jama: ");
System.out.println(String.valueOf(inp.getRowDimension())+" X "+String.valueOf(inp.getColumnDimension()));
DecimalFormat twoDForm = new DecimalFormat("#.####");
StringBuffer sb = new StringBuffer();
for (int r = 0; r < inp.getRowDimension(); r++) {
for (int c = 0; c < inp.getColumnDimension(); c++)
sb.append(Double.valueOf(twoDForm.format(inp.get(r, c)))).append("\t");
sb.append("\n");
}
System.out.println(sb.toString());
}
}
Python:
>>> import numpy
>>> numpy_matrix = numpy.array([[ 2.0, 0.0, 8.0, 6.0, 0.0],
[1.0, 6.0, 0.0, 1.0, 7.0],
[5.0, 0.0, 7.0, 4.0, 0.0],
[7.0, 0.0, 8.0, 5.0, 0.0],
[0.0, 10.0, 0.0, 0.0, 7.0]])
>>> u,s,v = numpy.linalg.svd(numpy_matrix, full_matrices=True)
Is there something wrong with the code? ...
source to share
Nothing bad: svd is not unambiguous until the sign of the U and V columns is changed (i.e. if you change the sign of the i-th column of U and the i-th column of V, you still have a valid svd: A = U * S * V ^ T). Different svd implementations will give slightly different results: in order to validate svd, you need to compute the norm (AU * S * V ^ T) / norm (A) and make sure it is a small number.
source to share
There is nothing bad. SVD resolves the column space and row space of the target matrix to orthonormal bases in such a way as to align the two spaces and allow for expansions along the eigenvectors. The alignment angles can be unique, discrete, or continuous, as shown below.
For example, given two angles t and p and a target matrix (see footnote)
A = ((1, -1), (2, 2))
General decomposition
U = ((0, exp [ip]), (-exp [it], 0))
S = sqrt (2) ((2,0), (0,1))
V * = (1 / sqrt (2)) ((exp [it], exp [it]), (exp [ip], -exp [ip]))
To recover the target matrix A = U S V *
Quick response quality test - block length checks each column in both U and V .
Footnote: Matrices in basic format. That is, the first row vector in matrix A is (1, -1).
- Finally, I have enough points to post the image file.
source to share