Apply function along axis over two numpy arrays - shapes not aligned
I probably don't see anything obvious here, but don't believe that np.apply_along_axis
or np.apply_over_axes
is what I'm looking for. Let's say I have the following two arrays:
arr1 = np.random.randn(10, 5)
arr2 = np.random.randn(10, )
And the next function:
def coefs(x, y):
return np.dot(np.linalg.inv(np.dot(x.T, x)), np.dot(x.T, y))
# the vector of coefficients in a multiple linear regression
Calling this on arr1
and arr2
runs smoothly as it should:
coefs(arr1, arr2)
Out[111]: array([-0.19474836, -0.50797551, 0.82903805, 0.06332607, -0.26985597])
However, let's say that instead of 1 or 2d arrays, I have two 3D arrays:
arr3 = np.array([arr1[:-1], arr1[1:]])
arr4 = np.array([arr2[:-1], arr2[1:]])
As expected, if I apply this function, I get
coefs(arr3, arr4)
Traceback (most recent call last):
File "<ipython-input-127-4a3e7df02cda>", line 1, in <module>
coefs(arr3, arr4)
File "<ipython-input-124-7532b8516784>", line 2, in coefs
return np.dot(np.linalg.inv(np.dot(x.T, x)), np.dot(x.T, y))
ValueError: shapes (5,9,2) and (2,9,5) not aligned: 2 (dim 2) != 9 (dim 1)
... because NumPy treats each array as an object, as it should. Instead, I want to apply a function coefs()
to each of the two elements along the 0 axis of the arrays, element by element. Here's a rough way to do it:
tgt = []
for i, j in zip(arr3, arr4):
tgt.append(coefs(i, j))
np.array(tgt)
Out[136]:
array([[-0.34328006, -0.99116672, 1.42757897, -0.06687851, -0.44669182],
[ 0.44494495, -0.58017705, 0.75825944, 0.18795889, 0.4560851 ]])
My question is, is there a more efficient and pythonic way to do this than using zip and iteration as above? Basically, given two input arrays of shape (2, n, k) and (2, n), I want the array to fall back to shape (2, k). Thank.
source to share
For generic arrays 3D
and 2D
- arr3
and arr4
we can use some np.einsum
magic to have a vectorized solution, for example:
dot1 = np.einsum('ijk,ijl->ikl',arr3,arr3)
dot2 = np.einsum('ijk,ij->ik',arr3,arr4)
inv1 = np.linalg.inv(dot1)
tgt_out = np.einsum('ijk,ij->ik',inv1, dot2)
Runtime test
Approaches -
def org_app(arr3, arr4):
tgt = []
for i, j in zip(arr3, arr4):
tgt.append(coefs(i, j))
return np.array(tgt)
def einsum_app(arr3, arr4):
dot1 = np.einsum('ijk,ijl->ikl',arr3,arr3)
dot2 = np.einsum('ijk,ij->ik',arr3,arr4)
inv1 = np.linalg.inv(dot1)
return np.einsum('ijk,ij->ik',inv1, dot2)
Timing and verification -
In [215]: arr3 = np.random.rand(50,50,50)
...: arr4 = np.random.rand(50,50)
...:
In [216]: np.allclose(org_app(arr3, arr4), einsum_app(arr3, arr4))
Out[216]: True
In [217]: %timeit org_app(arr3, arr4)
100 loops, best of 3: 4.82 ms per loop
In [218]: %timeit einsum_app(arr3, arr4)
100 loops, best of 3: 19.7 ms per loop
Doesn't seem to einsum
give us any advantage here. This is expected because it mostly einsum
fights against np.dot
, which is much better when sum-reduction
and even if we use it in a loop. The only situation / case in which we can fight np.dot
is when we are sufficiently closed, and this should make us einsum
competitive. We loop for times equal to the length of the first axis of the input arrays. Let's increase it and check again -
In [219]: arr3 = np.random.rand(1000,10,10)
...: arr4 = np.random.rand(1000,10)
...:
In [220]: %timeit org_app(arr3, arr4)
10 loops, best of 3: 23 ms per loop
In [221]: %timeit einsum_app(arr3, arr4)
100 loops, best of 3: 9.1 ms per loop
einsum
definitely wins on this one!
This related post
in the fight between np.einsum
and is np.dot
worth seeing.
Also note that if we need to use a loop based approach, we have to look at initializing the output array and then assign the output values ββto it coefs
, not add as the latter is a slow process.
source to share