Element operations in mpmath are slower compared to numpy and its solution
I have some calculations that involve factorials that explode quickly, so I decided to use an arbitrary precision library mpmath
.
The code I have is as follows:
import numpy as np import mpmath as mp import time a = np.linspace( 0, 100e-2, 100 ) b = np.linspace( 0, np.pi ) c = np.arange( 30 ) t = time.time() M = np.ones( [ len(a), len(b), len(c) ] ) A, B = np.meshgrid( a, b, indexing = 'ij' ) temp = A**2 + B temp = np.reshape( temp, [ len(a), len(b), 1 ] ) temp = np.repeat( temp, len(c), axis = 2 ) M *= temp print 'part1: ', time.time() - t t = time.time() temp = np.array( [ mp.fac(x) for x in c ] ) temp = np.reshape( temp, [ 1, 1, len(c) ] ) temp = np.repeat( temp, len(a), axis = 0 ) temp = np.repeat( temp, len(b), axis = 1 ) print 'part2 so far:', time.time() - t M *= temp print 'part2 finally', time.time() - t t = time.time()
What seems to take the longest is the last line, and I suspect it is because it M
has a bunch float
and temp
has a bunch of mp.mpf
s. I tried to initialize M
with mp.mpf
, but then things got slower.
This is the result I am getting:
part1: 0.00429606437683
part2 so far: 0.00184297561646
part2 finally 1.9477159977
Any ideas how I can speed this up?
source to share
gmpy2
much faster than mpmath
this type of computation. The following code runs about 12x faster on my machine.
import numpy as np import gmpy2 as mp import time a = np.linspace(0, 100e-2, 100) b = np.linspace(0, np.pi) c = np.arange(30) t = time.time() M = np.ones([len(a), len(b), len(c)]) A, B = np.meshgrid( a, b, indexing = 'ij' ) temp = A**2+B temp = np.reshape(temp, [len(a), len(b), 1]) temp = np.repeat(temp, len(c), axis=2) M *= temp print 'part1:', time.time() - t t = time.time() temp = np.array([mp.factorial(x) for x in c]) temp = np.reshape(temp, [1, 1, len(c)]) temp = np.repeat(temp, len(a), axis=0) temp = np.repeat(temp, len(b), axis=1) print 'part2 so far:', time.time() - t M *= temp print 'part2:', time.time() - t t = time.time()
mpmath
is written in Python and typically uses Python's own numbers to compute it. If gmpy2
available, it will use the faster integer type provided gmpy2
. If you only need one of the functions provided directly by using gmpy2
, using gmpy2
directly is faster.
Update
I've done some experiments. What is actually going on may not be what you expect. When you calculate temp
values can either be an integer ( math.factorial
, gmpy.fac
, or gmpy2.fac
), or the value of a floating point ( gmpy2.factorial
, mpmath.fac
). When numpy
calculating M *= temp
, all values ββin are temp
converted to 64-bit float. If the value is an integer, the conversion raises an OverflowError. If the value is a floating point number, the conversion returns infinity. You can see this by changing c
to np.arange(300)
and typing M
at the end. If you use gmpy.fac
or math.factorial
, you get and OverflowError
. If you use mpmath.factorial
or gmpy2.factorial
, you will not receive OverflowError
, but the receivedM
will contain infinity.
If you are trying to avoid OverflowError
, you will need to compute temp
in floating point, so converting to a 64 bit float will result in infinity.
If you don't come across OverflowError
, then math.factorial
is the fastest option.
If you are trying to avoid both OverflowError
and infinity, you will need to use floating point types mpmath.mpf
or gmpy2.mpfr
. (Don't try to use gmpy.mpf
.)
Update # 2
Here's an example that uses gmpy2.mpfr
200 bit precision. With c=np.arange(30)
it ~ 5x faster than your original example. I am showing it using c = np.arange(300)
as it either generates OverflowError
or infinity. The total running time for the larger range is roughly the same as your original code.
import numpy as np import gmpy2 import time from gmpy2 import mpfr gmpy2.get_context().precision = 200 a = np.linspace(mpfr(0), mpfr(1), 100) b = np.linspace(mpfr(0), gmpy2.const_pi()) c = np.arange(300) t = time.time() M = np.ones([len(a), len(b), len(c)], dtype=object) A, B = np.meshgrid( a, b, indexing = 'ij' ) temp = A**2+B temp = np.reshape(temp, [len(a), len(b), 1]) temp = np.repeat(temp, len(c), axis=2) M *= temp print 'part1:', time.time() - t t = time.time() temp = np.array([gmpy2.factorial(x) for x in c], dtype=object) temp = np.reshape(temp, [1, 1, len(c)]) temp = np.repeat(temp, len(a), axis=0) temp = np.repeat(temp, len(b), axis=1) print 'part2 so far:', time.time() - t M *= temp print 'part2:', time.time() - t t = time.time()
Disclaimer: I support gmpy2
.
source to share