How do I find indices in a numpy array that satisfy multiple conditions?

I have an array in Python, for example:

>>> scores = numpy.asarray([[8,5,6,2], [9,4,1,4], [2,5,3,8]])
>>> scores
array([[8, 5, 6, 2],
       [9, 4, 1, 4],
       [2, 5, 3, 8]])

      

I want to find all indices [row, col] in scores where the value is:

1) the minimum in its row

2) greater than a threshold

3) no more than 0.8 times the next-smallest value in the row

I would like to do this as efficiently as possible, preferably without any loops. I've been struggling with this for a while, so any help you can provide would be greatly appreciated!



2 answers


It should go something along these lines:

In [1]: scores = np.array([[8,5,6,2], [9,4,1,4], [2,5,3,8]]); threshold = 1.1; scores
Out[1]: 
array([[8, 5, 6, 2],
       [9, 4, 1, 4],
       [2, 5, 3, 8]])

In [2]: part = np.partition(scores, 2, axis=1); part
Out[2]: 
array([[2, 5, 6, 8],
       [1, 4, 4, 9],
       [2, 3, 5, 8]])

In [3]: row_mask = (part[:,0] > threshold) & (part[:,0] <= 0.8 * part[:,1]); row_mask
Out[3]: array([ True, False,  True], dtype=bool)

In [4]: rows = row_mask.nonzero()[0]; rows
Out[4]: array([0, 2])

In [5]: cols = np.argmin(scores[row_mask], axis=1); cols
Out[5]: array([3, 0])

      

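At that point, if you are looking for the actual coordinate pairs, you can simply zip them (wrapping the call in list so that it also works on Python 3, where zip returns an iterator):

In [6]: coords = list(zip(rows, cols)); coords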
Out[6]: [(0, 3), (2, 0)]
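
If an array of [row, col] pairs is more convenient than a list of tuples, one option is np.column_stack. This is a minimal sketch rather than part of the original answer; the rows and cols values are simply copied from the session above:

```python
import numpy as np

# Values copied from the session above
rows = np.array([0, 2])
cols = np.array([3, 0])

# Stack the two 1-D arrays as columns, giving one [row, col] pair per line
coords = np.column_stack((rows, cols))
print(coords.tolist())  # [[0, 3], [2, 0]]
```

The result has shape (number of matches, 2), so it can be passed on or sliced like any other array.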

      

Or, if you only want to look up the matching elements, you can use rows and cols directly as indices:

In [7]: scores[rows, cols]
Out[7]: array([2, 2])
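
The steps above can be collected into one function. The sketch below is an assumption-laden variant, not the answer's exact code: the name find_min_indices and its ratio parameter are hypothetical, and it partitions with kth=1 instead of 2. NumPy only guarantees the position of the kth element and that smaller values come before it, so kth=1 is the choice that guarantees column 0 holds the row minimum and column 1 the next-smallest value. It assumes a 2-D array with at least two columns.

```python
import numpy as np

def find_min_indices(scores, threshold, ratio=0.8):
    """Hypothetical helper: (rows, cols) of row minima that pass both tests."""
    scores = np.asarray(scores)
    # kth=1 places the second-smallest value of each row at index 1;
    # everything before it is no larger, so index 0 is the row minimum.
    part = np.partition(scores, 1, axis=1)
    lowest, second_lowest = part[:, 0], part[:, 1]
    # Condition 2 (above threshold) and condition 3 (at most ratio * next value)
    row_mask = (lowest > threshold) & (lowest <= ratio * second_lowest)
    rows = np.flatnonzero(row_mask)
    # Column of the first occurrence of the minimum in each selected row
    cols = np.argmin(scores[rows], axis=1)
    return rows, cols

scores = np.array([[8, 5, 6, 2], [9, 4, 1, 4], [2, 5, 3, 8]])
rows, cols = find_min_indices(scores, threshold=1.1)
print(rows.tolist(), cols.tolist())  # [0, 2] [3, 0]
```

Note that if a row's minimum occurs more than once, the two smallest values are equal, so condition 3 fails for any ratio below 1 and the row is skipped.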

      



I think you will find it difficult to do this without any loops (or at least without something that loops internally but disguises it as something else), since the operation depends only on the values within a row and you want to apply it to every row. This is not the most efficient approach (and how efficient it is may depend on how often conditions 2 and 3 hold), but it will work:

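import heapq
import numpy

threshold = 1.5
ratio = .8
scores = numpy.asarray([[8,5,6,2], [9,4,1,4], [2,5,3,8]])

found_points = []
for i,row in enumerate(scores):
    # the two smallest values in this row, in ascending order
    lowest,second_lowest = heapq.nsmallest(2,row)
    # conditions 2 and 3: above the threshold, and at most ratio * the next-smallest value
    if lowest > threshold and lowest <= ratio*second_lowest:
        # record [row, column of the first occurrence of the minimum]
        found_points.append([i,numpy.where(row == lowest)[0][0]])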

      



For the example data, this gives:

found_points = [[0, 3], [2, 0]]
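
If the loop is going to be reused, it may help to wrap it in a function. The following is only a sketch under that assumption: find_points_loop is a hypothetical name, and it swaps numpy.where(row == lowest)[0][0] for row.argmin(), which also returns the first position of the minimum. Converting with int() yields plain Python integers rather than NumPy scalars.

```python
import heapq
import numpy as np

def find_points_loop(scores, threshold, ratio):
    """Hypothetical wrapper around the per-row loop; returns [row, col] pairs."""
    found = []
    for i, row in enumerate(np.asarray(scores)):
        # Two smallest values of the row, in ascending order
        lowest, second_lowest = heapq.nsmallest(2, row)
        if lowest > threshold and lowest <= ratio * second_lowest:
            # argmin gives the first index at which the minimum occurs
            found.append([i, int(row.argmin())])
    return found

scores = np.array([[8, 5, 6, 2], [9, 4, 1, 4], [2, 5, 3, 8]])
points = find_points_loop(scores, threshold=1.5, ratio=0.8)
print(points)  # [[0, 3], [2, 0]]
```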

      
