How to speed up slow matrix multiplication in SymPy?
I was writing a tool to solve specific recurrence equations with SymPy and found that one of the steps, a matrix multiplication, was extremely time-consuming. For example, if I try the following in the IPython console,
In [1]: from sympy import *
In [2]: A = Matrix(500, 500, lambda i, j: 2 + abs(i - j) if i - j in [-1, 0, 1] else 0)
In [3]: A[0:7, 0:7]
Out[3]:
Matrix([
[2, 3, 0, 0, 0, 0, 0],
[3, 2, 3, 0, 0, 0, 0],
[0, 3, 2, 3, 0, 0, 0],
[0, 0, 3, 2, 3, 0, 0],
[0, 0, 0, 3, 2, 3, 0],
[0, 0, 0, 0, 3, 2, 3],
[0, 0, 0, 0, 0, 3, 2]])
In [4]: A * A
...
I never waited for it to finish.
I appreciate that symbolic manipulation is significantly slower than numerical evaluation, but this seems pretty extreme: Octave and other linear algebra packages finish this calculation in a split second.
Does anyone have experience using SymPy's matrix classes at ~1000 rows and columns with Rational entries?
Arbitrary precision is mandatory; speed is nice to have
There are a couple of Python classes built for exactly this purpose that may interest you for arbitrary-precision calculations:

decimal.Decimal()

fractions.Fraction( numerator = 0, denominator = 1 )
These are required for exact computation, whether in large-scale astronomy, DEP modeling, or other fields where precision must not degrade over long-running time / event / recursion computation strategies and similar threats to standard number representations.
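A quick illustration of why these two classes matter for exactness, as a minimal sketch using only the standard library:

```python
from decimal import Decimal, getcontext
from fractions import Fraction

# Binary floats accumulate representation error: 0.1 has no exact binary form.
assert 0.1 + 0.1 + 0.1 != 0.3

# Fraction keeps an exact numerator/denominator pair, so arithmetic is lossless.
assert Fraction(1, 10) * 3 == Fraction(3, 10)

# Decimal is exact for decimal literals, and the working precision is tunable.
getcontext().prec = 50
assert Decimal("0.1") + Decimal("0.1") + Decimal("0.1") == Decimal("0.3")
```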
Personally, I have worked with decimal.Decimal() (up to 5,000,000 digits in the randomness analysis of a crypto sequence), but I had not focused on getting them inside numpy.
numpy can wrap these types in its matrix data structures (as an object dtype holding decimal.Decimal instances, and the like); however, testing is required to check its vectorized operations and overall speed once these fine Python instances are involved.
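A minimal sketch of the wrapping mentioned above. NumPy stores such instances in an object-dtype array and dispatches each element operation back to the Python class, which is exactly why the speed tests below matter:

```python
import numpy as np
from fractions import Fraction

# Object-dtype array: NumPy holds references to the Python Fraction instances.
M = np.array([[Fraction(1, 3), Fraction(1, 2)],
              [Fraction(2, 3), Fraction(1, 4)]], dtype=object)

# Each product and sum is evaluated by Fraction's own __mul__ / __add__,
# so the result stays exact -- at Python-object speed, not BLAS speed.
P = M.dot(M)
print(P[0, 0])  # (1/3)*(1/3) + (1/2)*(2/3) = 4/9
```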
Performance
As an initial observation of numpy speed on a small ("dense" sounds funny at this scale) 2x2 matrix of decimal.Decimals:
>>> aClk.start();m*m;aClk.stop()
array([[Decimal('0.01524157875323883675019051999'),
Decimal('5.502209507697009702374335655')],
[Decimal('1.524157875323883675019051999'),
Decimal('11.94939027587381419881628113')]], dtype=object)
5732L # 5.7 msec_______________________________________________________________
While the same multiplication with dtype = numpy.float96 took
>>> aClk.start();f*f;aClk.stop()
array([[ 0.042788046, 0.74206772],
[ 0.10081096, 0.46544855]], dtype=float96)
2979L # 2.9 msec_______________________________________________________________
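The `aClk` object in these snippets is the answerer's own stopwatch and is never defined; a minimal stand-in with the same start/stop interface, built on `time.perf_counter`, might look like:

```python
import time

class Clk:
    """Minimal stopwatch mimicking the aClk.start()/aClk.stop() usage above."""
    def start(self):
        self._t0 = time.perf_counter()
    def stop(self):
        # Return elapsed microseconds, matching the integer readouts shown.
        return int((time.perf_counter() - self._t0) * 1_000_000)

aClk = Clk()
aClk.start()
sum(i * i for i in range(100_000))  # some work to time
elapsed_us = aClk.stop()
```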
For a 500 x 500 fully populated dtype = fractions.Fraction matrix:
>>> aClk.start();M*M;aClk.stop()
array([[Fraction(9, 64), Fraction(1, 4), Fraction(64, 25), ...,
Fraction(64, 81), Fraction(16, 81), Fraction(36, 1)],
...,
[Fraction(1, 1), Fraction(9, 4), Fraction(4, 1), ...,
Fraction(1, 4), Fraction(25, 36), Fraction(1, 1)]], dtype=object)
2692088L # 2.7 sec_<<<_Fraction_______________________________vs. 19 msec float96
For a 500 x 500 fully populated dense dtype = decimal.Decimal matrix:
>>> aClk.start();D*D;aClk.stop()
array([[Decimal('0.140625'), Decimal('0.25'), Decimal('2.56'), ...,
Decimal('0.7901234567901234567901234568'),
Decimal('0.1975308641975308641975308642'), Decimal('36')],
[Decimal('3.24'), Decimal('0.25'), Decimal('0.25'), ...,
Decimal('0.02040816326530612244897959185'), Decimal('0.04'),
Decimal('0.1111111111111111111111111111')],
[Decimal('0.1111111111111111111111111111'), Decimal('0.25'),
Decimal('2.25'), ..., Decimal('0.5102040816326530612244897959'),
Decimal('0.25'), Decimal('0.0625')],
...,
[Decimal('0'), Decimal('5.444444444444444444444444443'),
Decimal('16'), ..., Decimal('25'), Decimal('0.81'), Decimal('0.04')],
[Decimal('1'), Decimal('7.111111111111111111111111113'),
Decimal('1'), ..., Decimal('0'), Decimal('81'), Decimal('2.25')],
[Decimal('1'), Decimal('2.25'), Decimal('4'), ..., Decimal('0.25'),
Decimal('0.6944444444444444444444444444'), Decimal('1')]], dtype=object)
4789338L # 4.8 sec_<<<_Decimal_______________________________vs. 19 msec float96
2692088L # 2.7 sec_<<<_Fraction______________________________vs. 19 msec float96
Will a 1000x1000 tridiagonal matrix multiply in under 50 ms?
Since there are only about 3000 nonzero (sparse) elements, as opposed to the 250,000 cells of the fully populated 500x500 matrices tested above, there is a huge performance win in using these Python classes for arbitrary-precision calculations. Fraction has more headroom, since the engine can exploit the numerator / denominator design for MUL / DIV operations, compared with Decimal; however, exact performance envelopes must be tested in vivo, on the real-life cases where your computational approach is used.
SymPy syntax for SparseMatrix and 1000x1000 tridiagonal tests
The real 1000x1000 test with SparseMatrix, as suggested by @tmyklebu, took a little longer to set up due to installation issues, but may give you some insight into a real-world implementation:
>>> F = sympy.SparseMatrix( 1000, 1000, { (0,0): 1 } )  # will hold fractions.Fraction entries
>>> D = sympy.SparseMatrix( 1000, 1000, { (0,0): 1 } )  # will hold decimal.Decimal entries
>>> for i in range( 1000 ):                          # GEN so that F & D hold
...     for j in range( 1000 ):                      #     the SAME values,
...         if i - j in [-1, 0, 1]:                  #     but DIFF representations
...             num = int( 100 * numpy.random.random() )
...             den = int( 100 * numpy.random.random() ) + 1   # + 1 to avoid DIV!0
...             F[i,j] = fractions.Fraction( numerator = num, denominator = den )
...             D[i,j] = decimal.Decimal( str( num ) ) / decimal.Decimal( str( den ) )
# called in zigzag ( F*F / D*D / F*F / D*D / ... ) order to avoid memory-access cache artifacts
>>> aClk.start();VOID=F*F;aClk.stop()
770353L # notice the 1st eval took about 3x longer
205585L # notice the 2nd+
205364L # 0.205 sec_<<<_Fraction()____________________________vs. 0.331 sec Decimal()
>>> aClk.start();VOID=D*D;aClk.stop()
383137L # 0.383 sec_<<<_Decimal()____________________________vs. 0.770 sec 1st Fraction()
390164L # 0.390 sec_<<<_Decimal()____________________________vs. 0.205 sec 2nd Fraction()
331291L # 0.331 sec_<<<_Decimal()____________________________vs. 0.205 sec 3rd Fraction()
>>> F[0:4,0:4]
Matrix([
[ 1/52, 6/23, 0, 0],
[42/29, 29/12, 1, 0],
[ 0, 57/88, 39/62, 13/57],
[ 0, 0, 34/83, 26/95]])
>>> D[0:4,0:4]
Matrix([
[0.0192307692307692, 0.260869565217391, 0, 0],
[ 1.44827586206897, 2.41666666666667, 1.0, 0],
[ 0, 0.647727272727273, 0.629032258064516, 0.228070175438596],
[ 0, 0, 0.409638554216867, 0.273684210526316]])
Why not use sparse matrices instead of dense ones? The matrices that arise when solving (linear) recurrences are usually sparse; one common formulation gives you a companion matrix with ones on the first superdiagonal and zeros everywhere else except the bottom row, where the recurrence coefficients go.
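As a concrete instance of the structure described: for the recurrence a(n) = 3*a(n-1) + 2*a(n-2) (my example, not from the answer), the companion matrix has a one on the superdiagonal and the coefficients in the bottom row, and SymPy's SparseMatrix stores only those nonzeros:

```python
from sympy import SparseMatrix

# Companion matrix for a(n) = 3*a(n-1) + 2*a(n-2): one on the superdiagonal,
# recurrence coefficients in the bottom row, zeros everywhere else.
C = SparseMatrix(2, 2, {(0, 1): 1, (1, 0): 2, (1, 1): 3})

# State vector [a(n-1), a(n)], starting from a(0) = 0, a(1) = 1.
state = SparseMatrix(2, 1, {(1, 0): 1})
for _ in range(3):          # advance the recurrence three steps
    state = C * state
print(state[1])             # a(4): 3*a(3) + 2*a(2) = 3*11 + 2*3 = 39
```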
I have not used sympy for matrix operations, but I can reproduce the slowness you encountered with this code. It seems that matrix operations in sympy are not that fast.
I would recommend using numpy, which has great matrix operations and is very fast. Here is a port of your code to numpy that does the multiplication in less than 1 second on my laptop:
In [1]: import numpy as np
In [2]: A = np.matrix([[2 + abs(i - j) if i - j in [-1, 0, 1] else 0 for i in range(0, 500)] for j in range(0, 500)])
In [3]: A[0:7,0:7]
Out[3]:
matrix([[2, 3, 0, 0, 0, 0, 0],
[3, 2, 3, 0, 0, 0, 0],
[0, 3, 2, 3, 0, 0, 0],
[0, 0, 3, 2, 3, 0, 0],
[0, 0, 0, 3, 2, 3, 0],
[0, 0, 0, 0, 3, 2, 3],
[0, 0, 0, 0, 0, 3, 2]])
In [4]: A * A
Out[4]:
matrix([[13, 12, 9, ..., 0, 0, 0],
[12, 22, 12, ..., 0, 0, 0],
[ 9, 12, 22, ..., 0, 0, 0],
...,
[ 0, 0, 0, ..., 22, 12, 9],
[ 0, 0, 0, ..., 12, 22, 12],
[ 0, 0, 0, ..., 9, 12, 13]])
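Since the matrix in question is tridiagonal, a sparse representation can be faster still. A sketch with scipy.sparse (my addition, not part of the original answer; exact arithmetic is lost here, as with the dense numpy version):

```python
from scipy.sparse import diags

n = 500
# Same tridiagonal matrix: 2 on the main diagonal, 3 on the first off-diagonals;
# scalar diagonals are broadcast to the full length by diags().
A = diags([3, 2, 3], offsets=[-1, 0, 1], shape=(n, n), format="csr")

# Sparse product: the result has only ~5 nonzero diagonals, so this is
# far less work than the dense 500x500 multiplication.
P = A @ A
print(P[0, 0], P[0, 1])  # 13 and 12, matching the dense output above
```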