Efficient way to construct this array in numpy?
I need to build an N x M x N array A such that A[i, j, k] == (0 if i != k else x[j]). I could write:
A = np.zeros((N, M, N))
for i in range(N):
    for j in range(M):
        A[i,j,i] = x[j]
Or alternatively:
A = np.zeros((N, M, N))
for i in range(N):
    A[i,:,i] = x
But for my purposes they are most likely too slow. Is there a faster way?
Approach # 1
Use broadcasting to compute the linear indices of all the positions where x needs to be assigned, then assign x into a flattened view of the array -
# Initialize
Aout = np.zeros((N, M, N))
# Compute all linear indices
idx = (np.arange(N)*(N*M+1))[:,None] + N*np.arange(M)
# In a flattened view with `.ravel()` assign from x
Aout.ravel()[idx] = x
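To see where the index formula comes from, here is a minimal sketch on a tiny array. The sizes and the values in x are made up for illustration, and np.ravel_multi_index is used only as an independent cross-check; it is not part of the original answer.

```python
import numpy as np

N, M = 3, 2                      # small illustrative sizes (assumed)
x = np.array([10.0, 20.0])       # illustrative values (assumed)

# In a C-ordered (N, M, N) array, element (i, j, k) sits at flat offset
# i*M*N + j*N + k. Setting k = i gives i*(M*N + 1) + j*N.
idx = (np.arange(N) * (N * M + 1))[:, None] + N * np.arange(M)

# Cross-check the formula against NumPy's own index flattening.
i, j = np.meshgrid(np.arange(N), np.arange(M), indexing="ij")
expected = np.ravel_multi_index((i, j, i), (N, M, N))
assert np.array_equal(idx, expected)

A = np.zeros((N, M, N))
# reshape(-1) returns a view for a contiguous array, so this writes into A.
A.reshape(-1)[idx] = x
```

Note that the flat assignment only reaches the original array when the flattened object is a view, which holds here because a freshly created np.zeros array is C-contiguous.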
Approach # 2
Use a strided view of the array, created with np.lib.stride_tricks.as_strided, and assign through it -
Aout = np.zeros((N, M, N))
s0,s1,s2 = Aout.strides
Aoutview = np.lib.stride_tricks.as_strided(Aout,shape=(N,M),strides=(s0+s2,s1))
Aoutview[:] = x
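The stride arithmetic may be easier to follow on a small case. The sketch below uses made-up sizes and values, and the np.shares_memory check is added only to show that the view aliases the original buffer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

N, M = 3, 2                      # small illustrative sizes (assumed)
x = np.array([10.0, 20.0])       # illustrative values (assumed)

A = np.zeros((N, M, N))
s0, s1, s2 = A.strides           # bytes to step along axes 0, 1 and 2

# Stepping s0 + s2 bytes moves from A[i, j, i] to A[i+1, j, i+1],
# so view[i, j] refers to the same memory as A[i, j, i].
view = as_strided(A, shape=(N, M), strides=(s0 + s2, s1))
view[:] = x                      # x is broadcast across the N rows

assert np.shares_memory(view, A)
```

as_strided performs no bounds checking, so the shape and strides have to be chosen so that every element of the view stays inside the original buffer, as they do here.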
Approach # 3
Another approach is to use integer array indexing along the first and third axes, closely mirroring the second approach from the question, but in vectorized form -
Aout = np.zeros((N, M, N))
Aout[np.arange(N),:,np.arange(N)] = x
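As a small illustration of how that index expression behaves, the sketch below (with made-up sizes and values) shows that the two integer arrays are paired elementwise rather than combined as an outer product, so the selection has shape (N, M) and x broadcasts across it.

```python
import numpy as np

N, M = 3, 2                      # small illustrative sizes (assumed)
x = np.array([10.0, 20.0])       # illustrative values (assumed)
r = np.arange(N)

A = np.zeros((N, M, N))
# r on axis 0 and r on axis 2 are paired elementwise, so this targets
# A[i, :, i] for each i; x (shape (M,)) broadcasts over the N rows.
A[r, :, r] = x

# Reading with the same index returns a copy of shape (N, M).
picked = A[r, :, r]

# Reference result built with the loop from the question.
ref = np.zeros((N, M, N))
for i in range(N):
    ref[i, :, i] = x
```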
Runtime test
Approaches -
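# Note: all functions read N and M from the enclosing (global) scope.

def app0(x,A):
    # Double loop from the question
    for i in range(N):
        for j in range(M):
            A[i,j,i] = x[j]
    return A

def app1(x,A):
    # Single loop from the question
    for i in range(N):
        A[i,:,i] = x
    return A

def app2(x,Aout):
    # Approach # 1: linear indices into the flattened view
    idx = (np.arange(N)*(N*M+1))[:,None] + N*np.arange(M)
    Aout.ravel()[idx] = x
    return Aout

def app3(x,Aout):
    # Approach # 2: strided view
    s0,s1,s2 = Aout.strides
    Aoutview = np.lib.stride_tricks.as_strided(Aout,shape=(N,M),strides=(s0+s2,s1))
    Aoutview[:] = x
    return Aout

def app4(x,Aout):
    # Approach # 3: integer array indexing
    r = np.arange(N)
    Aout[r,:,r] = x
    return Aout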
Checking -
In [125]: # Params
...: N, M = 100,100
...: x = np.random.rand(M)
...:
...: # Make copies of arrays to be assigned into
...: A0 = np.zeros((N, M, N))
...: A1 = np.zeros((N, M, N))
...: A2 = np.zeros((N, M, N))
...: A3 = np.zeros((N, M, N))
...: A4 = np.zeros((N, M, N))
...:
In [126]: print(np.allclose(app0(x,A0), app1(x,A1)))
     ...: print(np.allclose(app0(x,A0), app2(x,A2)))
     ...: print(np.allclose(app0(x,A0), app3(x,A3)))
     ...: print(np.allclose(app0(x,A0), app4(x,A4)))
...:
True
True
True
True
Timing -
In [127]: # Make copies of arrays to be assigned into
...: A0 = np.zeros((N, M, N))
...: A1 = np.zeros((N, M, N))
...: A2 = np.zeros((N, M, N))
...: A3 = np.zeros((N, M, N))
...: A4 = np.zeros((N, M, N))
In [128]: %timeit app0(x,A0)
...: %timeit app1(x,A1)
...: %timeit app2(x,A2)
...: %timeit app3(x,A3)
...: %timeit app4(x,A4)
...:
1000 loops, best of 3: 1.49 ms per loop
10000 loops, best of 3: 53.6 µs per loop
10000 loops, best of 3: 150 µs per loop
10000 loops, best of 3: 28.6 µs per loop
10000 loops, best of 3: 25.2 µs per loop