NumPy tensordot MemoryError

I have two matrices: A - 3033x3033, X - 3033x20. I am using the following lines (as suggested in the answer to another question I asked):

n, d = X.shape
c = X.reshape(n, -1, d) - X.reshape(-1, n, d)
return np.tensordot(A.reshape(n, n, -1) * c, c, axes=[(0,1),(0,1)])


On the final line, Python just stops and says "MemoryError". How can I get around this, either by changing some settings in Python, or by making it more memory efficient?
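For scale, a quick back-of-envelope check (my own arithmetic, not part of the original question) shows why this fails: the intermediate A.reshape(n, n, -1) * c alone needs about 1.5 GB as float64, before tensordot makes any further copies.

```python
import numpy as np

# Rough size of the temporary A.reshape(n, n, -1) * c for the shapes in the question
n, d = 3033, 20
bytes_per_float = np.dtype(np.float64).itemsize  # 8
temp_bytes = n * n * d * bytes_per_float
print(temp_bytes / 1e9)  # ~1.47 GB for this one temporary alone
```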

+3




3 answers


Here is a function that computes the result without any Python loops and without any large temporary array. See the linked question for a longer answer, complete with a test script.

def fbest(A, X):
    """Loop-free equivalent of the tensordot expression in the question."""
    KA_best = np.tensordot(A.sum(1)[:, None] * X, X, axes=[(0,), (0,)])
    KA_best += np.tensordot(A.sum(0)[:, None] * X, X, axes=[(0,), (0,)])
    KA_best -= np.tensordot(np.dot(A, X), X, axes=[(0,), (0,)])
    KA_best -= np.tensordot(X, np.dot(A, X), axes=[(0,), (0,)])
    return KA_best
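
As a sanity check (my own small test, with made-up sizes 50x50 and 50x4, not from the original answer), fbest agrees with the brute-force tensordot expression from the question on inputs small enough to materialise the temporary; the definition is repeated so the snippet runs on its own:

```python
import numpy as np

def fbest(A, X):
    # Loop-free equivalent of the tensordot expression in the question
    KA_best = np.tensordot(A.sum(1)[:, None] * X, X, axes=[(0,), (0,)])
    KA_best += np.tensordot(A.sum(0)[:, None] * X, X, axes=[(0,), (0,)])
    KA_best -= np.tensordot(np.dot(A, X), X, axes=[(0,), (0,)])
    KA_best -= np.tensordot(X, np.dot(A, X), axes=[(0,), (0,)])
    return KA_best

rng = np.random.default_rng(0)
n, d = 50, 4  # small stand-ins for 3033 and 20
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, d))

# Brute-force reference from the question (fine at this small size)
c = X.reshape(n, -1, d) - X.reshape(-1, n, d)
ref = np.tensordot(A.reshape(n, n, -1) * c, c, axes=[(0, 1), (0, 1)])

print(np.allclose(fbest(A, X), ref))  # True
```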


I profiled the code using arrays of the sizes in the question (the benchmark image from the original post is omitted here).



I love np.einsum, by the way. It is a great place to start when speeding up array operations by removing for loops. You can do SO much with a single call to np.einsum.

The advantage of np.tensordot is that it links against whatever fast BLAS library you have installed (e.g. MKL). That way tensordot will run faster, and in parallel, when you have the required libraries installed.

+3




If you replace the final line with

return np.einsum('ij,ijk,ijl->kl',A,c,c)


you avoid creating the intermediate A.reshape(n, n, -1) * c

(3033 by 3033 by 20), which is your main problem in my opinion.
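A quick check that the einsum form matches the original expression (my own sketch, with small made-up shapes):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 40, 3  # small stand-ins for 3033 and 20
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, d))
c = X.reshape(n, -1, d) - X.reshape(-1, n, d)

# einsum contracts over i and j directly, without materialising A.reshape(n, n, -1) * c
out_einsum = np.einsum('ij,ijk,ijl->kl', A, c, c)
ref = np.tensordot(A.reshape(n, n, -1) * c, c, axes=[(0, 1), (0, 1)])
print(np.allclose(out_einsum, ref))  # True
```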



My impression is that the version I give is probably slower (in the cases where memory does not run out), but I have not benchmarked it.

Perhaps you could go further and avoid creating c as well, but I can't immediately see how to do that. It would be a matter of writing it all out in terms of sums over matrix indices and seeing what simplifies.

+2




You could use two nested loops that iterate over the last dimension of X. That last dimension is only 20 here, so hopefully this will still be efficient enough and, more importantly, keep a minimal memory footprint. Here's the implementation -

n, d = X.shape
c = X.reshape(n, -1, d) - X.reshape(-1, n, d)
out = np.empty((d, d))  # d is a small number: 20
for i in range(d):
    for j in range(d):
        out[i, j] = (A * c[:, :, i] * c[:, :, j]).sum()
return out


You can replace the innermost line with np.einsum -

out[i, j] = np.einsum('ij->', A * c[:, :, i] * c[:, :, j])
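As a sanity check (my own sketch, with small made-up shapes), the loop version matches the direct tensordot expression from the question:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 30, 5  # small stand-ins for 3033 and 20
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, d))
c = X.reshape(n, -1, d) - X.reshape(-1, n, d)

out = np.empty((d, d))
for i in range(d):
    for j in range(d):
        # only two n-by-n temporaries are alive at any one time
        out[i, j] = np.einsum('ij->', A * c[:, :, i] * c[:, :, j])

ref = np.tensordot(A.reshape(n, n, -1) * c, c, axes=[(0, 1), (0, 1)])
print(np.allclose(out, ref))  # True
```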


0








