Is there a way to convert values when using operator.itemgetter() as the sort key?

I have a list of lists containing numbers represented as strings:

nums = [['1','3'],['2','2'],['1','2'],['0','2'],['11','2']]

      

I need to sort them in ascending order by the first and then the second element, without changing the string representation of the numbers in the original list. I also want to avoid creating a second copy of the list with everything mapped to integers; imagine it's a huge list.

Both sort() and sorted() work fine with tuples and lists, so with a lambda key I can do the following:

>>> sorted(nums, key=lambda n: (int(n[0]), int(n[1])))
[['0', '2'], ['1', '2'], ['1', '3'], ['2', '2'], ['11', '2']]

      

Happy Days...

However, I've seen several posts suggesting that sorting is faster with operator.itemgetter() as the key function than with a lambda. Without getting into a discussion of the validity of these claims, could anyone tell me whether it is possible to apply a string-to-integer conversion for the comparison when using operator.itemgetter()?

The following obviously fails, since the strings are compared as strings, not numbers:

>>> sorted(nums, key=operator.itemgetter(0,1)) 
[['0', '2'], ['1', '2'], ['1', '3'], ['11', '2'], ['2', '2']]

      

+3




2 answers


There are ways, for example using iteration_utilities.chained (see note 1) and functools.partial:

>>> import operator
>>> from iteration_utilities import chained
>>> from functools import partial

>>> itemgetter_int = chained(operator.itemgetter(0, 1), partial(map, int), tuple)
>>> sorted(nums, key=itemgetter_int)
[['0', '2'], ['1', '2'], ['1', '3'], ['2', '2'], ['11', '2']]
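
To make explicit what the composed key does, here is a minimal standard-library sketch. The helper name chain_calls is hypothetical and only approximates chained as it is used above (each function is applied to the result of the previous one, left to right); it is not the library's implementation.

```python
from functools import partial, reduce
from operator import itemgetter


def chain_calls(*funcs):
    """Hypothetical stand-in for chained: pass a value through funcs, left to right."""
    def call(value):
        return reduce(lambda acc, func: func(acc), funcs, value)
    return call


nums = [['1', '3'], ['2', '2'], ['1', '2'], ['0', '2'], ['11', '2']]

# itemgetter(0, 1) picks the two strings, map(int, ...) converts them lazily,
# and tuple(...) materialises a comparable key such as (1, 3).
itemgetter_int = chain_calls(itemgetter(0, 1), partial(map, int), tuple)

result = sorted(nums, key=itemgetter_int)
print(result)
```

The inner lists are reordered but never rewritten, so the strings in nums keep their original representation.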

      

It works, but it is definitely slower than using a lambda or a custom function.



If you really want speed, you can cythonize the lambda function (or write it in C by hand), but if you only need it in one place, just use a throw-away lambda. This is especially true inside sorted, because sorting performs O(n log(n)) comparisons, so the O(n) key-function calls probably have little impact on the overall execution time.
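
The point about O(n) key calls can be checked with a small sketch. The counter and the int_key name are illustrative assumptions, not part of the question; the sketch relies on the documented behaviour that sorted calls the key function once per element.

```python
nums = [['1', '3'], ['2', '2'], ['1', '2'], ['0', '2'], ['11', '2']]

calls = 0  # how many times the key function has been invoked


def int_key(pair):
    """Illustrative key: convert both strings to ints and count the call."""
    global calls
    calls += 1
    return (int(pair[0]), int(pair[1]))


result = sorted(nums, key=int_key)

# The key is computed once per element, not once per comparison.
print(calls, len(nums))
```

Because the keys are computed up front, the cost of the Python-level call grows linearly with the list, while the comparisons on the resulting tuples of ints happen in C.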


1: This is a function in a third-party package I wrote. It must be installed separately, for example via conda or pip.

+3




operator.itemgetter is fast not because it does anything special in sort, but because it is written entirely in C and does not require a Python function call. So what you're looking for is a C function that does what you want; itemgetter is a red herring.



In Python 2, you can avoid calling pure-Python functions with key=functools.partial(map, int). This won't work in Python 3, because map no longer returns a list or tuple there; it returns an iterator, which cannot be compared. It also might not be faster than your solution.
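
A short sketch of the Python 3 behaviour, under the assumption that wrapping the map in tuple is acceptable. Note that the wrapper is a lambda again, so it gives up the "no Python-level call" property this answer is about.

```python
from functools import partial

nums = [['1', '3'], ['2', '2'], ['1', '2'], ['0', '2'], ['11', '2']]

# In Python 3 the key returns map objects, which do not support '<',
# so the sort raises TypeError on the first comparison.
try:
    sorted(nums, key=partial(map, int))
    error = None
except TypeError as exc:
    error = exc

# Materialising the map as a tuple makes the keys comparable again.
result = sorted(nums, key=lambda n: tuple(map(int, n)))
print(type(error).__name__, result)
```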

+4








