Can numpy argsort handle ties?

I have a numpy array:

foo = array([3, 1, 4, 0, 1, 0])

      

I want the indices of the top 3 elements. Calling

foo.argsort()[::-1][:3]

      

returns

array([2, 0, 4])

      

The values foo[1] and foo[4] are equal, so numpy.argsort() resolves the tie here by returning the index of the element that appears last in the array; that is, index 4.

For my application, I cannot have the tie-breaking always favor the end of the array, so how can I implement a random tie-break? That is, half of the time I would get array([2, 0, 4]), and the other half I would get array([2, 0, 1]).



1 answer


Here's one approach:

Use numpy.unique to sort the array and remove duplicate items. Pass the argument return_inverse=True to also get, for each element of the original array, its index into the sorted unique array. You can then find all the indices of a group of tied elements by looking for the positions in the inverse array whose value equals that element's index in the unique array.

For example:



import numpy as np

foo = np.array([3, 1, 4, 0, 1, 0])
foo_unique, foo_inverse = np.unique(foo, return_inverse=True)

# Put largest items first
foo_unique = foo_unique[::-1]
foo_inverse = -foo_inverse + len(foo_unique) - 1

# The three largest distinct values
foo_top3 = foo_unique[:3]

# Get the indices into foo of the top item
# (flatnonzero gives a 1-D index array; nonzero() would give a tuple)
first_indices = np.flatnonzero(foo_inverse == 0)

# Choose one at random
first_random_idx = np.random.choice(first_indices)

second_indices = np.flatnonzero(foo_inverse == 1)
second_random_idx = np.random.choice(second_indices)

# And so on...
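
The "and so on" step can be wrapped in a loop. Below is a minimal sketch of that idea, not a tested library routine: the function name top_k_random_ties and its rng parameter are made up for illustration, and it assumes a 1-D input. It walks the distinct values from largest to smallest and shuffles each group of tied indices before taking as many as are still needed.

```python
import numpy as np

def top_k_random_ties(values, k, rng=None):
    """Return indices of the k largest values, choosing randomly among ties.

    Hypothetical helper for illustration; assumes `values` is 1-D.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values).ravel()
    uniq, inverse = np.unique(values, return_inverse=True)
    picked = []
    # uniq is sorted ascending, so walk the ranks from the largest value down.
    for rank in range(len(uniq) - 1, -1, -1):
        tied = np.flatnonzero(inverse == rank)
        # Shuffle the tied indices so that which of them make the cut is random.
        picked.extend(rng.permutation(tied)[: k - len(picked)])
        if len(picked) >= k:
            break
    return np.array(picked, dtype=int)

foo = np.array([3, 1, 4, 0, 1, 0])
top3 = top_k_random_ties(foo, 3)
```

For the array in the question, the first two indices are always 2 and 0, and the third is either 1 or 4.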

      

numpy.unique is implemented with argsort, so looking at its implementation might suggest a simpler approach.
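
One possible simpler route, which is not taken from the numpy.unique source and is only a sketch, is to hand the sort a random secondary key. numpy.lexsort sorts by the last key in the tuple first, so the values stay the primary key and a random permutation only decides the order within each group of equal values. The name argsort_random_ties and the rng parameter are assumptions for illustration.

```python
import numpy as np

def argsort_random_ties(values, rng=None):
    """Ascending argsort of a 1-D array with ties broken at random.

    Hypothetical helper for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values).ravel()
    # A random permutation gives every element a distinct secondary key.
    tie_breaker = rng.permutation(values.size)
    # lexsort uses the LAST key as the primary sort key.
    return np.lexsort((tie_breaker, values))

foo = np.array([3, 1, 4, 0, 1, 0])
top3 = argsort_random_ties(foo)[::-1][:3]
```

This keeps the same foo.argsort()[::-1][:3] shape as the original call, so it should drop in without further changes; the cost is one extra random array of the same length as the input.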
