Find n smallest elements in an array of numpy arrays

There are many questions here about finding the nth smallest element in a numpy array. However, what if you have an array of arrays? For example:

>>> print matrix
[[ 1.          0.28958002  0.09972488 ...,  0.46999924  0.64723113
   0.60217694]
 [ 0.28958002  1.          0.58005657 ...,  0.37668355  0.48852272
   0.3860152 ]
 [ 0.09972488  0.58005657  1.         ...,  0.13151364  0.29539992
   0.03686381]
 ..., 
 [ 0.46999924  0.37668355  0.13151364 ...,  1.          0.50250212
   0.73128971]
 [ 0.64723113  0.48852272  0.29539992 ...,  0.50250212  1.          0.71249226]
 [ 0.60217694  0.3860152   0.03686381 ...,  0.73128971  0.71249226  1.        ]]

      

How can I get the n smallest elements from this array of arrays?

>>> print type(matrix)
<type 'numpy.ndarray'>

      

This is how I did it to find the coordinates of the smallest element:

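import numpy

min_coordinates = []
smallest = numpy.amin(matrix)  # compute the global minimum once
for row in matrix:
    cols = numpy.where(row == smallest)[0]
    if cols.size:  # testing the size avoids skipping a match in column 0
        min_coordinates.append(int(cols[0]) + 1)  # 1-based column index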

      

Now I would like to find, for example, the 10 smallest items.



3 answers


Flatten the matrix, sort and then select the first 10.



print(numpy.sort(matrix.flatten())[:10])
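As a small worked example of the same idea, here is a minimal sketch on a made-up 3x3 array. The sample values and the helper name `n_smallest_sorted` are illustrative assumptions, not part of the question.

```python
import numpy

def n_smallest_sorted(matrix, n):
    """Return the n smallest values of a 2-D array in ascending order."""
    # flatten() returns a 1-D copy, so sorting does not modify `matrix`.
    return numpy.sort(matrix.flatten())[:n]

# Hypothetical sample data, chosen only for illustration.
sample = numpy.array([[1.0, 0.3, 0.1],
                      [0.3, 1.0, 0.6],
                      [0.1, 0.6, 1.0]])
result = n_smallest_sorted(sample, 4)
print(result)
```

Because the sample is symmetric, each off-diagonal value appears twice in the result; duplicates are not removed.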

      



If your array is small, the accepted answer is fine. For large arrays, np.partition will be much more efficient. Here's an example where an array contains 10,000 elements and you want the smallest 10 values:

In [56]: np.random.seed(123)

In [57]: a = 10*np.random.rand(100, 100)

      

Use np.partition to get the 10 smallest values:

In [58]: np.partition(a, 10, axis=None)[:10]
Out[58]: 
array([ 0.00067838,  0.00081888,  0.00124711,  0.00120101,  0.00135942,
        0.00271129,  0.00297489,  0.00489126,  0.00556923,  0.00594738])

      

Note that the values are not in ascending order. np.partition does not guarantee that the first 10 values will be sorted. If you need them in ascending order, you can sort the selected values later. It will still be faster than sorting the entire array.
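Sorting only the selected values might look like the sketch below. The helper name `n_smallest` and the seeded random data are assumptions made for illustration. The sketch partitions around index `n - 1`, which is enough to place the n smallest values in the first n slots.

```python
import numpy as np

def n_smallest(a, n):
    """Return the n smallest values of `a` in ascending order."""
    flat = np.asarray(a).ravel()
    if n >= flat.size:
        # Nothing to select; a full sort is all that is left to do.
        return np.sort(flat)
    # After partitioning around index n - 1, the first n entries are the
    # n smallest values, in no particular order.
    smallest = np.partition(flat, n - 1)[:n]
    # Sort only those n values instead of the whole array.
    return np.sort(smallest)

# Hypothetical test data, similar in shape to the example above.
rng = np.random.RandomState(0)
a = 10 * rng.rand(100, 100)
top10 = n_smallest(a, 10)
print(top10)
```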



Here's the result using np.sort:

In [59]: np.sort(a, axis=None)[:10]
Out[59]: 
array([ 0.00067838,  0.00081888,  0.00120101,  0.00124711,  0.00135942,
        0.00271129,  0.00297489,  0.00489126,  0.00556923,  0.00594738])

      

Now compare the times:

In [60]: %timeit np.partition(a, 10, axis=None)[:10]
10000 loops, best of 3: 75.1 µs per loop

In [61]: %timeit np.sort(a, axis=None)[:10]
1000 loops, best of 3: 465 µs per loop

      

In this case, using np.partition is more than six times faster.
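If the positions of the smallest values are needed rather than the values themselves, as in the coordinate loop from the question, np.argpartition combined with np.unravel_index is one possible approach. This is only a sketch: the helper name `n_smallest_coords` and the sample array are made up, it assumes 1 <= n <= a.size, and it returns 0-based (row, column) pairs.

```python
import numpy as np

def n_smallest_coords(a, n):
    """Return (row, col) pairs of the n smallest entries, smallest first."""
    a = np.asarray(a)
    flat = a.ravel()
    # argpartition returns flat indices; the first n of them point at the
    # n smallest values, in no particular order.
    idx = np.argpartition(flat, n - 1)[:n]
    # Order just those n indices by the values they point at.
    idx = idx[np.argsort(flat[idx])]
    # Convert flat indices back to 2-D coordinates.
    rows, cols = np.unravel_index(idx, a.shape)
    return list(zip(rows.tolist(), cols.tolist()))

# Hypothetical sample data with distinct values.
sample = np.array([[0.9, 0.2, 0.7],
                   [0.4, 0.8, 0.1],
                   [0.6, 0.3, 0.5]])
coords = n_smallest_coords(sample, 3)
print(coords)
```

When values are tied, the order among the tied coordinates is not specified.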



You can use the heapq.nsmallest function to return a list of the 10 smallest items.

In [84]: import heapq

In [85]: heapq.nsmallest(10, matrix.flatten())
Out[85]: 
[-1.7009047695355393,
 -1.4737632239971061,
 -1.1246243781838825,
 -0.7862983016935523,
 -0.5080863016259798,
 -0.43802651199959347,
 -0.22125698200832566,
 0.034938408281615596,
 0.13610084041121048,
 0.15876389111565958]
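A self-contained version of the same call, using a small made-up array (the values are illustrative assumptions, not the data above):

```python
import heapq
import numpy as np

# Hypothetical sample data, chosen only for illustration.
sample = np.array([[0.9, -0.2, 0.7],
                   [0.4, 0.8, -1.5],
                   [0.6, 0.3, 0.5]])

# nsmallest accepts any iterable; tolist() hands it plain Python floats.
smallest = heapq.nsmallest(3, sample.ravel().tolist())
print(smallest)
```

heapq.nsmallest returns a plain Python list sorted in ascending order, so no further sorting is needed.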

      
