Can numpy argsort handle ties?
I have a numpy array:
foo = array([3, 1, 4, 0, 1, 0])
I want the indices of the top 3 elements. Calling
foo.argsort()[::-1][:3]
returns
array([2, 0, 4])
The values foo[1] and foo[4] are equal, so numpy.argsort() breaks the tie here by returning the index of the element that appears last in the array; that is, index 4.
For my application, I cannot have the tie-break always favor the end of the array, so how can I implement a random tie-break? That is, half of the time I would get array([2, 0, 4]), and the other half I would get array([2, 0, 1]).
Here's one approach:
Use numpy.unique to sort the array and remove duplicate items. Pass the argument return_inverse=True to also get, for each element of the original array, its index into the sorted unique array. You can then find every position of a tied value by looking for the entries of the inverse array that equal that value's index in the unique array.
For example:
from numpy import array, unique, random
foo = array([3, 1, 4, 0, 1, 0])
foo_unique, foo_inverse = unique(foo, return_inverse=True)
# Put largest items first
foo_unique = foo_unique[::-1]
# Remap the inverse indices to match the reversed unique array
foo_inverse = -foo_inverse + len(foo_unique) - 1
foo_top3 = foo_unique[:3]
# Get the indices into foo of the top item
# (nonzero() returns a tuple of arrays, so take its first entry)
first_indices = (foo_inverse == 0).nonzero()[0]
# Choose one at random
first_random_idx = random.choice(first_indices)
second_indices = (foo_inverse == 1).nonzero()[0]
second_random_idx = random.choice(second_indices)
# And so on...
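The "and so on" step can be written as a loop. Below is a minimal sketch of that idea, not a definitive implementation: the function name top_k_distinct_random_ties and its rng parameter are made up for illustration. Note that, like the example above, it returns one index per distinct value, so a tied value occupies only one slot in the result.

```python
import numpy as np

def top_k_distinct_random_ties(values, k, rng=None):
    """Return one index for each of the k largest distinct values,
    picking uniformly at random among tied positions.
    (Hypothetical helper, named for illustration only.)"""
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values)
    uniq, inverse = np.unique(values, return_inverse=True)
    # Rank 0 is the largest distinct value, rank 1 the next largest, etc.
    ranks = len(uniq) - 1 - inverse
    picks = []
    for rank in range(min(k, len(uniq))):
        # All positions in `values` holding the value with this rank
        candidates = np.flatnonzero(ranks == rank)
        picks.append(int(rng.choice(candidates)))
    return np.array(picks, dtype=int)

foo = np.array([3, 1, 4, 0, 1, 0])
top3 = top_k_distinct_random_ties(foo, 3)
```

For this foo, the first two entries of top3 are always 2 and 0, and the third is either 1 or 4.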
numpy.unique is implemented with argsort, so looking at its implementation might suggest a simpler approach.
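One possible simpler approach, which may or may not be what was meant here, is to sort on two keys: the values themselves, plus a random permutation that only matters when values are equal. This sketch uses numpy.lexsort for that; the function name argsort_random_ties and its rng parameter are assumptions introduced for illustration.

```python
import numpy as np

def argsort_random_ties(values, rng=None):
    """Ascending argsort in which equal values appear in random order.
    (Hypothetical helper, named for illustration only.)"""
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values)
    # A random permutation gives every element a distinct secondary key
    tie_breaker = rng.permutation(values.size)
    # lexsort uses the LAST key as the primary sort key
    return np.lexsort((tie_breaker, values))

foo = np.array([3, 1, 4, 0, 1, 0])
top3 = argsort_random_ties(foo)[::-1][:3]
```

Unlike the numpy.unique version, this keeps every element in the ordering, so asking for the top 4 of this foo would return both index 1 and index 4, in random order.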