Computing dot products in chunks without intermediates
I have a large set of 3 x 3 matrices (n of them, say) and corresponding 3 x 1 vectors, and would like to multiply each vector by its corresponding matrix. If I stack the matrices in an n x 3 x 3 ndarray called R and the vectors in a 3 x n ndarray called v, I can get a stack of multiplied vectors via
import numpy as np
intermediate = np.dot(R, v)
out = np.diagonal(intermediate, axis1=0, axis2=2)
But this is very inefficient: np.dot creates an n x 3 x n intermediate array, from which I then manually select a 3 x n slice. Short of looping over n, can I somehow create the 3 x n array without the intermediate n x 3 x n array?
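To make the cost concrete, here is a minimal sketch of the approach above. The size n = 1000 and the random test arrays are assumptions chosen for illustration. It only shows the shape of the intermediate that np.dot builds compared with the result that is actually wanted.

```python
import numpy as np

n = 1000  # illustrative size (assumption)
rng = np.random.default_rng(0)
R = rng.standard_normal((n, 3, 3))  # stack of n 3x3 matrices
v = rng.standard_normal((3, n))     # n vectors, one per column

# np.dot multiplies every matrix with every vector, not just matching pairs.
intermediate = np.dot(R, v)

# Keep only the entries where the matrix index equals the vector index.
out = np.diagonal(intermediate, axis1=0, axis2=2)

print(intermediate.shape)             # (n, 3, n)
print(out.shape)                      # (3, n)
print(intermediate.size // out.size)  # the intermediate holds n times more elements
```

Only one in every n computed values survives the diagonal selection, so both the work and the memory grow with n squared.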
Expanding on the tip provided by @hpaulj: the described multiplication can be done with
out = np.einsum('ijk,ki->ji', R, v)
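For readers unfamiliar with the subscript string, the following is a small sketch of what 'ijk,ki->ji' means, checked against an explicit loop. The array size and random data are assumptions for illustration only.

```python
import numpy as np

n = 5  # small illustrative size (assumption)
rng = np.random.default_rng(0)
R = rng.standard_normal((n, 3, 3))
v = rng.standard_normal((3, n))

# 'ijk,ki->ji': i selects the matrix/vector pair, j is the output row,
# and k is summed over because it does not appear after '->'.
out = np.einsum('ijk,ki->ji', R, v)

# Explicit-loop reference: column i of the result is R[i] @ v[:, i].
expected = np.empty((3, n))
for i in range(n):
    expected[:, i] = R[i] @ v[:, i]

print(np.allclose(out, expected))
```

Because i appears in both inputs and in the output, each matrix is paired only with its own vector, and no n x 3 x n array is requested.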
The einsum call is faster than the approach in my question by about 3 orders of magnitude (!) for n = 1000:
%timeit d = np.diagonal(np.dot(R, v), axis1=0, axis2=2)
10 loops, best of 3: 27.8 ms per loop
%timeit o = np.einsum('ijk,ki->ji', R, v)
10000 loops, best of 3: 21.9 µs per loop
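As a sanity check, the two approaches can be compared on random data. The sketch below assumes random test arrays and also shows a batched np.matmul formulation, which is an alternative not taken from the answer above. No timing is claimed for it here.

```python
import numpy as np

n = 1000
rng = np.random.default_rng(0)
R = rng.standard_normal((n, 3, 3))
v = rng.standard_normal((3, n))

# Approach from the question (builds the n x 3 x n intermediate).
reference = np.diagonal(np.dot(R, v), axis1=0, axis2=2)

# Approach from this answer.
via_einsum = np.einsum('ijk,ki->ji', R, v)

# Alternative sketch: reshape the vectors to n x 3 x 1 columns, multiply
# pairwise with matmul, then drop the trailing axis and transpose to 3 x n.
via_matmul = np.matmul(R, v.T[:, :, None])[:, :, 0].T

print(np.allclose(reference, via_einsum))
print(np.allclose(reference, via_matmul))
```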