NumPy multiplication of vectors of different sizes, avoiding for-loops

I have a matrix P of size (X, Y). I also have two matrices Kx and Ky of size (M, N), a matrix pk of size (M, N), and two vectors u and v of lengths X and Y respectively. For example, they can be defined as follows:

import numpy as np

# X, Y, M, N are given positive integers
P = np.zeros((X, Y))
pk = np.random.rand(M, N)
Kx = np.random.rand(M, N)
Ky = np.random.rand(M, N)
u = np.random.rand(X)
v = np.random.rand(Y)

In real code they are certainly not random, but that doesn't matter for this example. The question is whether there is a pure NumPy equivalent of the following:

for m in range(0, M):
    for n in range(0, N):
        for i in range(0, X):
            for j in range(0, Y):
                Arg = Kx[m,n]*u[i] + Ky[m,n]*v[j]
                P[i,j] += pk[m,n]*np.cos(Arg)

M, N, X and Y are all different, but X and Y may be taken equal if there is no other solution.


1 answer


A common strategy for eliminating for-loops in NumPy computations is to work with multidimensional arrays.

Consider, for example, the line

Arg = Kx[m,n]*u[i] + Ky[m,n]*v[j]

This line depends on the indices m, n, i and j. Therefore Arg also depends on m, n, i and j, which means Arg can be thought of as a 4-dimensional array indexed by m, n, i and j. So we can eliminate all four for-loops, as far as Arg is concerned, by computing

Kxu = Kx[:,:,np.newaxis]*u
Kyv = Ky[:,:,np.newaxis]*v
Arg = Kxu[:,:,:,np.newaxis] + Kyv[:,:,np.newaxis,:]

Kx[:,:,np.newaxis] has shape (M, N, 1), while u has shape (X,). Multiplying them together, NumPy broadcasting creates an array of shape (M, N, X). The new axes are thus used as placeholders, so that Arg ends up with four axes, indexed by m, n, i, j in that order.
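
A minimal sketch checking these shapes (the small values chosen for M, N, X and Y below are arbitrary, used only for illustration):

import numpy as np

M, N, X, Y = 2, 3, 4, 5          # arbitrary small sizes for illustration
Kx = np.random.rand(M, N)
Ky = np.random.rand(M, N)
u = np.random.rand(X)
v = np.random.rand(Y)

Kxu = Kx[:,:,np.newaxis]*u       # (M, N, 1) * (X,) broadcasts to (M, N, X)
Kyv = Ky[:,:,np.newaxis]*v       # (M, N, 1) * (Y,) broadcasts to (M, N, Y)
Arg = Kxu[:,:,:,np.newaxis] + Kyv[:,:,np.newaxis,:]
print(Arg.shape)                 # (2, 3, 4, 5), i.e. indexed by (m, n, i, j)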

Similarly, P can be defined as

P = (pk[:,:,np.newaxis,np.newaxis]*np.cos(Arg)).sum(axis=0).sum(axis=0)

sum(axis=0) (called twice) sums along the m and n axes, so P ends up as a two-dimensional array indexed only by i and j.
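
As a side note (not part of the original answer), NumPy's sum also accepts a tuple of axes, so the two calls can be combined into one:

P = (pk[:,:,np.newaxis,np.newaxis]*np.cos(Arg)).sum(axis=(0, 1))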



By working with these 4-dimensional arrays, we get to apply NumPy operations to whole arrays at once. In contrast, when using the four for-loops, we had to perform the calculations one scalar at a time. Consider, for example, what np.cos(Arg) does when Arg is a 4-dimensional array: it offloads the computation of all the cosines to a single call to the NumPy function, which executes its underlying loop in compiled C code. This is much faster than calling np.cos once for each scalar, and it is for this reason that working with whole arrays ends up so much faster than the for-loop code.
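
A small sketch of that claim (the array size here is an arbitrary assumption): both computations below produce the same cosines, but the first runs its loop in C while the second runs it in Python:

import numpy as np

a = np.random.rand(10, 20, 5, 15)   # arbitrary 4-dimensional array
c1 = np.cos(a)                      # one vectorized call; loop runs in compiled C
c2 = np.empty_like(a)
for idx in np.ndindex(a.shape):     # scalar-at-a-time equivalent in pure Python
    c2[idx] = np.cos(a[idx])
assert np.allclose(c1, c2)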


Here is the complete code comparing the two approaches:

import numpy as np

def orig(Kx, Ky, u, v, pk):
    M, N = Kx.shape
    X = u.size
    Y = v.size
    P = np.zeros((X, Y), dtype=pk.dtype)  # must start at zero, since it accumulates with +=
    for m in range(0, M):
        for n in range(0, N):
            for i in range(0, X):
                for j in range(0, Y):
                    Arg = Kx[m,n]*u[i] + Ky[m,n]*v[j]
                    P[i,j] += pk[m,n]*np.cos(Arg)
    return P

def alt(Kx, Ky, u, v, pk):
    Kxu = Kx[:,:,np.newaxis]*u
    Kyv = Ky[:,:,np.newaxis]*v
    Arg = Kxu[:,:,:,np.newaxis] + Kyv[:,:,np.newaxis,:]
    P = (pk[:,:,np.newaxis,np.newaxis]*np.cos(Arg)).sum(axis=0).sum(axis=0)
    return P

M, N = 10, 20
X, Y = 5, 15
Kx = np.random.random((M, N))
Ky = np.random.random((M, N))
u = np.random.random(X)
v = np.random.random(Y)
pk = np.random.random((M, N))


Sanity check (showing that alt and orig return the same result):

In [57]: P2 = alt(Kx, Ky, u, v, pk)

In [58]: P1 = orig(Kx, Ky, u, v, pk)

In [59]: np.allclose(P1, P2)
Out[59]: True


Benchmark showing that alt is significantly faster than orig:

In [60]: %timeit orig(Kx, Ky, u, v, pk)
10 loops, best of 3: 33.6 ms per loop

In [61]: %timeit alt(Kx, Ky, u, v, pk)
1000 loops, best of 3: 349 µs per loop