Python descending in text file

At the moment I am sorting my text file in descending order like:

import operator
fo = open('3a.txt','r')
x = fo.readlines()
sorted_x = sorted(x, key=operator.itemgetter(0),reverse = True)
print(sorted_x)

      

My text files look like this:

5 Helen 
4 judy
8 Rachel

      

I was wondering how I would be able to use operator.itemgetter(0) with double digits?

When I use this with:

5 Helen 
4 judy
25 Hanna
11 Elsa
8 Rachel

      

All results are wrong:

['8 Rachel', '5 Helen', '4 judy', '25 Hanna', '11 Elsa']

      

even if I use operator.itemgetter(0, 1).



1 answer


Your operator.itemgetter(0) approach can only get you so far; it always selects just the first character of each line.

To sort the file correctly, your key function will need to do a few things:

  • Separate the number at the beginning of the line from the rest.
  • Turn the string representing a number into a valid integer value.

The latter is important because strings are sorted lexicographically; for example, '10' is sorted before '9' because '1' precedes '9'.
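To make the difference between lexicographic and numeric ordering concrete, here is a small illustration. The sample values are made up for the demonstration and are not taken from the file in the question:

```python
values = ['9', '10', '25', '4']

# Strings compare character by character, so '10' and '25' land before '4' and '9'.
as_text = sorted(values)

# Converting each item with int() for the comparison gives numeric order.
as_numbers = sorted(values, key=int)

print(as_text)
print(as_numbers)
```

The items themselves stay strings in both cases; only the value used for comparison changes.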

To support an arbitrary number of digits at the beginning of a line, separated from the rest by whitespace, you need to split on whitespace. The str.split() method can do this for you if you don't give it any arguments or use None for the first argument. To keep this efficient, limit the number of splits to 1 and turn the first element of the result into an integer:

fo = open('3a.txt','r')
x = fo.readlines()
sort_key = lambda line: int(line.split(None, 1)[0])
sorted_x = sorted(x, key=sort_key, reverse=True)
print(sorted_x)

      

So, the key argument is given an anonymous function (a lambda), and this function takes one argument, the line to sort. This line is split on whitespace only once (line.split(None, 1)) and the first element of that split is converted to an integer:

>>> '11 Hello World'.split(None, 1)
['11', 'Hello World']
>>> '11 Hello World'.split(None, 1)[0]
'11'
>>> int('11 Hello World'.split(None, 1)[0])
11

      

You can also improve the rest of your implementation; there is no need to call file.readlines(), as files are iterable. sorted() accepts any iterable and returns a sorted list, so you can just pass the file object directly to the function.



You also want to make sure files are closed automatically when you are done with them. Take advantage of the fact that they are context managers; the with statement signals them when the context is exited (completed), and file objects then close themselves automatically:

sort_key = lambda line: int(line.split(None, 1)[0])

with open('3a.txt','r') as fo:
    sorted_x = sorted(fo, key=sort_key, reverse=True)

print(sorted_x)

      

Demo:

>>> from io import StringIO
>>> fo = StringIO('''\
... 5 Helen 
... 4 judy
... 25 Hanna
... 11 Elsa
... 8 Rachel
... '''
... )
>>> sort_key = lambda line: int(line.split(None, 1)[0])
>>> sorted(fo, key=sort_key, reverse=True)
['25 Hanna\n', '11 Elsa\n', '8 Rachel\n', '5 Helen \n', '4 judy\n']
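One caveat: a completely blank line yields an empty list from split(), so indexing [0] would raise an IndexError. If your file might contain such lines, one possible approach is to filter them out before sorting. This is only a sketch; the input with an empty line is hypothetical, and StringIO stands in for the real file:

```python
from io import StringIO

def sort_key(line):
    # Leading number as an int; assumes the line starts with digits.
    return int(line.split(None, 1)[0])

# Stand-in for the real file; the empty second line is a made-up example.
fo = StringIO('5 Helen\n\n25 Hanna\n8 Rachel\n')

# line.strip() is empty (falsy) for blank lines, so they are dropped here.
non_blank = (line for line in fo if line.strip())
sorted_x = sorted(non_blank, key=sort_key, reverse=True)

print(sorted_x)
```

Lines that start with something other than a number would still raise a ValueError in int(); whether to skip or report those depends on your data.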

      

You can still accomplish this using only operator.itemgetter(0) if you first split your lines into sequences where element 0 is an integer:

import operator

sort_key = operator.itemgetter(0)

with open('3a.txt','r') as fo:
    split_lines = (line.split(None, 1) for line in fo)
    numeric_lines = ((int(line[0]), line[1]) for line in split_lines)
    sorted_x = sorted(numeric_lines, key=sort_key, reverse=True)

print(sorted_x)

      

This uses generator expressions to process the lines as they are read. However, you now have a list in which each element is a tuple holding an integer and the rest of the line:

>>> import operator
>>> fo = StringIO('''\
... 5 Helen 
... 4 judy
... 25 Hanna
... 11 Elsa
... 8 Rachel
... ''')
>>> sort_key = operator.itemgetter(0)
>>> split_lines = (line.split(None, 1) for line in fo)
>>> numeric_lines = ((int(line[0]), line[1]) for line in split_lines)
>>> sorted(numeric_lines, key=sort_key, reverse=True)
[(25, 'Hanna\n'), (11, 'Elsa\n'), (8, 'Rachel\n'), (5, 'Helen \n'), (4, 'judy\n')]
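If you need text lines again afterwards, the tuples can be joined back together. The following is a minimal sketch that starts from the result shown above; the names sorted_pairs and lines are chosen just for this illustration:

```python
# Result of the sort above: (number, rest-of-line) tuples.
sorted_pairs = [(25, 'Hanna\n'), (11, 'Elsa\n'), (8, 'Rachel\n'),
                (5, 'Helen \n'), (4, 'judy\n')]

# Rebuild each line as "<number> <rest>"; the rest already ends in a newline.
lines = ['{} {}'.format(number, rest) for number, rest in sorted_pairs]

print(''.join(lines), end='')
```

Note that this normalises the separator to a single space, so lines that originally used tabs or several spaces would not be reproduced exactly.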

      
