Pythonic way to extract 2D array from list

Let's say I have a list that contains 16 items:

lst=['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P']

      

This list represents a 4 x 4 array stored row by row in a 1D list. Written as an array, it looks like this:

'A', 'B', 'C', 'D'
'E', 'F', 'G', 'H'
'I', 'J', 'K', 'L'
'M', 'N', 'O', 'P' 

      

I want to extract a submatrix from this 1D list as another 1D list that always starts with the first element.

E.g., extracting a 2 x 2 matrix from lst gives:

'A', 'B', 'E', 'F'

      

Or extracting a 3 x 3 matrix from lst:

'A', 'B', 'C', 'E', 'F', 'G', 'I', 'J', 'K'

      

To do this, I use numpy to resize the list into an array, extract the submatrix, and then flatten it again:

import numpy as np

# The size of the matrix represented by lst
init_mat = 4
# Desired matrix size to extract
mat_size = 2
A = np.resize(lst,(init_mat,init_mat))
B = A[0:mat_size, 0:mat_size].flatten()
C = map(str,B)

      

This works, but I was wondering if there is a more pythonic way to do this, since I don't think this method will scale well with the size of the matrix.



3 answers


One array based approach would be -

size = 2 # or 3 or any number <= 4
np.asarray(lst).reshape(4,4)[:size,:size].ravel()

      

Example run -

In [55]: lst=['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P']

In [56]: size=2

In [57]: np.asarray(lst).reshape(4,4)[:size,:size].ravel()
Out[57]: 
array(['A', 'B', 'E', 'F'],
      dtype='|S1')

In [58]: size=3

In [59]: np.asarray(lst).reshape(4,4)[:size,:size].ravel()
Out[59]: 
array(['A', 'B', 'C', 'E', 'F', 'G', 'I', 'J', 'K'],
      dtype='|S1')

      

If you want a 2D array, skip the ravel() part.



If you want a list as output, add an extra .tolist() step at the end.
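To make the three output forms concrete, here is a small sketch using the same lst with a size of 2. The names block, flat and as_list are only illustrative and do not appear in the answer above.

```python
import numpy as np

lst = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P']
size = 2

# 2D result: slice the reshaped array and stop there
block = np.asarray(lst).reshape(4, 4)[:size, :size]

# 1D array result: flatten the block
flat = block.ravel()

# Plain Python list result: convert the flattened array
as_list = flat.tolist()

print(block.shape)
print(as_list)
```

Which form to stop at depends on what the calling code expects; tolist() is only needed if a plain list is required.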


If you want to avoid converting the entire list to an array, perhaps because the number of elements is large and the window to extract is comparatively small, we can simply generate the valid indices for the block with some help from NumPy broadcasting. Then index into the input list with them to get the final list output. Thus, we get something like this -

idx = (np.arange(size)[:,None]*4 + np.arange(size)).ravel()
out = [lst[i] for i in idx]
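As a rough illustration of what the broadcasting step produces, the sketch below splits the index expression into its two parts for size = 3. The names width, row_offsets and col_offsets are introduced here for readability only; width stands in for the hard-coded 4.

```python
import numpy as np

lst = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P']
size = 3
width = 4  # row length of the flattened 4 x 4 matrix (the literal 4 above)

# Column vector holding the flat index at which each wanted row starts
row_offsets = np.arange(size)[:, None] * width
# Row vector holding the column positions wanted inside each row
col_offsets = np.arange(size)

# Adding a (size, 1) array to a (size,) array broadcasts to (size, size)
idx = (row_offsets + col_offsets).ravel()
print(idx.tolist())

out = [lst[i] for i in idx]
print(out)
```

Only size * size indices are created, so the full list is never copied into an array.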

      



Calling flatten() and then map() is less efficient than:

B = A[:mat_size, :mat_size].reshape(-1)
C = B.tolist()

      

This avoids some copies and unnecessary function calls.



For more details on reshape() vs flatten(), see: What is the difference between flatten and ravel functions in numpy?
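A quick way to check when reshape(-1) actually avoids a copy is np.shares_memory. The sketch below uses an integer array in place of the letters, purely for illustration. It suggests that for a non-contiguous slice such as A[:2, :2], reshape(-1) has to copy as well, so in that case the saving would mostly come from tolist() replacing the per-element map() calls.

```python
import numpy as np

# Stand-in for the 4 x 4 matrix; integers are used only for illustration
A = np.arange(16).reshape(4, 4)

# On a contiguous array, reshape(-1) can return a view; flatten() always copies
view_on_full = np.shares_memory(A, A.reshape(-1))
copy_on_full = np.shares_memory(A, A.flatten())

# The top-left block is not contiguous in memory, so reshape(-1) must copy here
sub = A[:2, :2]
view_on_sub = np.shares_memory(A, sub.reshape(-1))

print(view_on_full, copy_on_full, view_on_sub)
```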

You can also do this without NumPy, which is arguably simpler. You will need to test with your specific input to see which is faster.

[lst[i*init_mat + j] for i in range(mat_size) for j in range(mat_size)]
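The same pure-Python idea can also be written with one slice per row instead of one index per element. This is only a sketch; extract_block is a hypothetical helper name, and it assumes the list is a row-major square matrix of side width.

```python
def extract_block(flat, width, size):
    """Return the top-left size x size block of a row-major width x width list."""
    out = []
    for row in range(size):
        start = row * width              # flat index where this row begins
        out.extend(flat[start:start + size])  # first `size` items of the row
    return out


lst = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P']
print(extract_block(lst, 4, 2))
print(extract_block(lst, 4, 3))
```

Whether this is faster than the nested comprehension depends on the sizes involved, so it is worth timing both on real input.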

      



To do this without numpy while allowing for a large matrix, I would use an iterator to walk through the list, so that no extra lists are generated during extraction. islice extracts the required elements, consuming them from the shared iterator on each slicing operation. When retrieving a 3x3 matrix, the first slice starts at index 0 and stops before index 3, consuming the first three elements from the iterator. Each subsequent slice starts at index 1 (because 4 - 3 = 1) and stops before index 4, so it skips the one leftover element of the previous row and takes the next three.

from itertools import chain, islice, repeat

lst=['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P']

width = 4
extract = [3, 3]

slice_starts = chain([0], repeat(width - extract[0]))
slice_stops = chain([extract[0]], repeat(width))
rows = map(islice, repeat(iter(lst), extract[1]), slice_starts, slice_stops)
print(list(chain.from_iterable(rows)))
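The behaviour relies on every islice call reading from the same iterator, so each slice's indices are relative to wherever the previous slice stopped. A minimal hand-unrolled sketch of the 3x3 case (the names it, first, second and third are illustrative):

```python
from itertools import islice

lst = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P']
it = iter(lst)  # one shared iterator; every islice below advances it

first = list(islice(it, 0, 3))   # takes A, B, C
second = list(islice(it, 1, 4))  # skips D, then takes E, F, G
third = list(islice(it, 1, 4))   # skips H, then takes I, J, K

print(first + second + third)
```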

      

Or you can take the first three items from every 4 items using compress:

from itertools import chain, compress, repeat

lst=['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P']

width = 4
extract = [3, 3]
selectors = repeat([i < extract[0] for i in range(width)], extract[1])
print(list(compress(lst, chain.from_iterable(selectors))))
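To see what compress is being fed, the sketch below builds the same selector pattern as a plain list rather than with repeat and chain. The names row_mask and mask are illustrative only.

```python
from itertools import compress

lst = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P']
width = 4
extract = [3, 3]

# One row's pattern: keep the first extract[0] columns, drop the rest
row_mask = [i < extract[0] for i in range(width)]
print(row_mask)

# Repeat the pattern once per wanted row; compress stops when the mask runs out
mask = row_mask * extract[1]
result = list(compress(lst, mask))
print(result)
```

Building the mask as a list costs width * extract[1] booleans, whereas the repeat and chain version above generates the selectors lazily.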

      
