How to do in-place Cholesky factorization in Python

Question

How to do in-place Cholesky factorization in Python

I'm trying to do Cholesky factorization in place, but either this feature isn't actually implemented in scipy, or there is something I don't understand. I am posting here if this is the last one. Here's a simplified example of what I am doing:

import numpy
import scipy.linalg

numpy.random.seed(0)
X = numpy.random.normal(size=(10,4))
V = numpy.dot(X.transpose(),X)
R = V.copy()
scipy.linalg.cholesky(R,False,overwrite_a=True)
print V
print R

I think what must happen for R to be overwritten by the upper triangular matrix. However, when I run this code my V and R come out the same (choleski don't overwrite R). Am I not understanding the purpose of the rewriting, or am I making some other mistake? If this is just a bug in scipy, is there a workaround or other way I can do in-place Cholesky factorization in Python?

+3

python numpy scipy

jcrudy 18 jan. 13 at 22:46

source to share

3 answers

Try again,

>>> scipy.__version__
'0.11.0'
>>> np.__version__
'1.6.2'

it works fine:

X = np.random.normal(size=(10,4))
V = np.dot(X.transpose(),X)
print V
R = V.copy()
print R
C = scipy.linalg.cholesky(R,False,overwrite_a=True)
print V
print R

Output:

[[ 11.22274635   5.10611692   0.70263037   3.14603088] # V before
 [  5.10611692   8.94518939  -3.17865941   1.64689675]
 [  0.70263037  -3.17865941   7.35385131  -2.23948391]
 [  3.14603088   1.64689675  -2.23948391   8.25112653]]
[[ 11.22274635   5.10611692   0.70263037   3.14603088] # R before
 [  5.10611692   8.94518939  -3.17865941   1.64689675]
 [  0.70263037  -3.17865941   7.35385131  -2.23948391]
 [  3.14603088   1.64689675  -2.23948391   8.25112653]]
[[ 11.22274635   5.10611692   0.70263037   3.14603088] # V after
 [  5.10611692   8.94518939  -3.17865941   1.64689675]
 [  0.70263037  -3.17865941   7.35385131  -2.23948391]
 [  3.14603088   1.64689675  -2.23948391   8.25112653]]
[[ 3.35003677  1.52419728  0.20973811  0.93910339] # R after
 [ 0.          2.57332704 -1.35946252  0.08375069]
 [ 0.          0.          2.33703292 -0.99382158]
 [ 0.          0.          0.          2.52478036]]

+3

Jaime 18 jan. At 22:56

source to share

For completeness, the following code, based on Warren's comment, solves the problem:

import numpy
import scipy.linalg

numpy.random.seed(0)
X = numpy.random.normal(size=(10,4))
V = numpy.dot(X.transpose(),X)
R = V.copy('F')
print V.flags['C_CONTIGUOUS']
print R.flags['F_CONTIGUOUS']
scipy.linalg.cholesky(R,False,overwrite_a=True)
print V
print R

All I did was change the underlying R storage format from C order to Fortran order, which results in overwrite_a

behaving as expected.

If your matrix is really appreciated, you can do it without making a copy, by simply passing the transpose to cholesky. I don't think fixing the transposition will work if your matrix is complex valued, since the transposition of a complex Hermitian matrix is usually not equal to its conjugate transposition. However, if you make sure your matrix uses fortran order ( R.flags['F_CONTIGUOUS'] == True

) in the first place, then you shouldn't have a problem. But see Setting the default * default * data order (C vs. Fortran) in Numpy for a fix for this strategy.

0

jcrudy 22 jan. 13 at 22:03

source to share

Stefano M · Accepted Answer · 2013-01-19T12:02:18+0000

If you are brave enough, you can avoid scipy

and go for a low-level challenge.linalg.lapack_lite.dpotrf

import numpy as np

# prepare test data
A = np.random.normal(size=(10,10))
A = np.dot(A,A.T)
L = np.tril(A)

# actual in-place cholesky
assert L.dtype is np.dtype(np.float64)
assert L.flags['C_CONTIGUOUS']
n, m = L.shape
assert n == m
result = np.linalg.lapack_lite.dpotrf('U', n, L, n, 0)
assert result['info'] is 0

# check if L is the desired L cholesky factor
assert np.allclose(np.dot(L,L.T), A)
assert np.allclose(L, np.linalg.cholesky(A))

You need to understand lapack DPOTRF , fortran calling conventions, memory layout. Do not forget to check after leaving result['info'] == 0

. However, you are seeing it as just a line of code, and by discarding all health checks and copying done with linalg.cholesky

it, it might also be more efficient.

How to do in-place Cholesky factorization in Python

More articles: