What's the best way to shrink a numpy array?
I have a 3D numpy array with shape Nx64x64. I would like to shrink it along axes 1 and 2 by averaging, resulting in a new array of shape Nx8x8.
I have several working implementations, but I feel like there should be a neater way to do this.
I first tried using np.split:
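import numpy as np

def subsample(inparray, n):
    inp = inparray.copy()
    # n is the block size; the split functions need an integer section count
    res = np.moveaxis(np.array(np.hsplit(inp, inp.shape[1] // n)), 1, 0)
    # Move the new column-block axis to position 2 so the shape is (N, rows, cols, n, n)
    res = np.moveaxis(np.array(np.split(res, inp.shape[2] // n, axis=3)), 0, 2)
    res = np.mean(res, axis=(3, 4))
    return res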
I also tried using regular indexing:
def subsample2(inparray, n):
    # Here n is the output size along each of the last two axes
    res = np.zeros((inparray.shape[0], n, n))
    lin = np.linspace(0, inparray.shape[1], n+1).astype(int)
    bounds = np.stack((lin[:-1], lin[1:]), axis=-1)
    for i, b in enumerate(bounds):
        for j, b2 in enumerate(bounds):
            res[:, i, j] = np.mean(inparray[:, b[0]:b[1], b2[0]:b2[1]], axis=(1,2))
    return res
I wondered about using itertools.groupby, but it also looked pretty involved.
Does anyone know of a clean solution?
Reshape the array so that each of the last two axes is split into two, with the second axis of each pair having a length equal to the block size. This gives a 5D array. Then take the mean along the third and fifth axes:
BSZ = (8,8)
m,n = a.shape[1:]
out = a.reshape(N,m//BSZ[0],BSZ[0],n//BSZ[1],BSZ[1]).mean(axis=(2,4))
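For reuse, the same reshape-and-mean idea can be wrapped in a small function. This is only a sketch: the name `block_mean`, its arguments, and the divisibility check are illustrative assumptions, not part of the answer above.

```python
import numpy as np

def block_mean(a, bsz):
    """Average non-overlapping blocks over the last two axes of a 3D array.

    `block_mean` and `bsz` are hypothetical names used for illustration.
    """
    N, m, n = a.shape
    bh, bw = bsz
    # The reshape only works when the blocks tile the array exactly.
    if m % bh or n % bw:
        raise ValueError("block size must divide the last two axes evenly")
    # Split axis 1 into (m//bh, bh) and axis 2 into (n//bw, bw),
    # then average over the two within-block axes.
    return a.reshape(N, m // bh, bh, n // bw, bw).mean(axis=(2, 4))

a = np.arange(2 * 64 * 64, dtype=float).reshape(2, 64, 64)
out = block_mean(a, (8, 8))
print(out.shape)  # (2, 8, 8)
```

The explicit check is a design choice: without it, a block size that does not divide the array makes `reshape` fail with a less descriptive error.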
A sample run on a smaller array with a smaller block size of (2,2):
1) Inputs:
In [271]: N = 2
In [272]: a = np.random.randint(0,9,(N,6,6))
In [273]: a
Out[273]:
array([[[3, 1, 8, 7, 8, 2],
        [0, 6, 2, 6, 8, 2],
        [2, 1, 1, 0, 0, 1],
        [8, 3, 0, 2, 8, 0],
        [4, 7, 2, 6, 6, 7],
        [5, 5, 1, 7, 2, 7]],

       [[0, 0, 8, 1, 7, 6],
        [8, 6, 5, 8, 4, 0],
        [0, 3, 7, 7, 6, 1],
        [7, 1, 7, 6, 3, 6],
        [7, 6, 4, 6, 4, 5],
        [4, 2, 0, 2, 6, 2]]])
2) Compute a few block means by hand for verification:
In [274]: a[0,:2,:2].mean()
Out[274]: 2.5
In [275]: a[0,:2,2:4].mean()
Out[275]: 5.75
In [276]: a[0,:2,4:6].mean()
Out[276]: 5.0
In [277]: a[0,2:4,:2].mean()
Out[277]: 3.5
3) Use the proposed approach and compare against the manual values:
In [278]: BSZ = (2,2)
In [279]: m,n = a.shape[1:]
In [280]: a.reshape(N,m//BSZ[0],BSZ[0],n//BSZ[1],BSZ[1]).mean(axis=(2,4))
Out[280]:
array([[[ 2.5 ,  5.75,  5.  ],
        [ 3.5 ,  0.75,  2.25],
        [ 5.25,  4.  ,  5.5 ]],

       [[ 3.5 ,  5.5 ,  4.25],
        [ 2.75,  6.75,  4.  ],
        [ 4.75,  3.  ,  4.25]]])
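The same check can be automated at the size from the question (Nx64x64 with 8x8 blocks). The sketch below compares the reshape result against an explicit loop over blocks. The value of `N`, the seed, and the variable names `fast` and `slow` are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
N, b = 5, 8                     # N is arbitrary here; b is the block size
a = rng.integers(0, 9, size=(N, 64, 64))

# Reshape-based block mean, as in the answer.
m, n = a.shape[1:]
fast = a.reshape(N, m // b, b, n // b, b).mean(axis=(2, 4))

# Reference: average each block with plain slicing.
slow = np.empty((N, m // b, n // b))
for i in range(m // b):
    for j in range(n // b):
        slow[:, i, j] = a[:, i*b:(i+1)*b, j*b:(j+1)*b].mean(axis=(1, 2))

print(fast.shape)               # (5, 8, 8)
print(np.allclose(fast, slow))  # True
```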