How to sort specific information in a file

I have an existing text file containing people's names and scores. Each person has three scores, and the fields are separated by tabs.

John    12    13    21
Zack    14    19    12
Tim     18    22    8
Jill    13    3     22

      

My goal is to sort the names alphabetically and display only each person's highest score, like this:

Jill   22
John   21
Tim    22
Zack   19

      

Once the file has been sorted, I want to print it in the Python shell. I have wrapped the code in functions because I am going to embed it in other code that I have written.

from operator import itemgetter

def highscore():
    file1 = open("file.txt","r")
    file1.readlines()
    score1 = file1(key=itemgetter(1))
    score2 = file1(key=itemgetter(2))
    score3 = file1(key=itemgetter(3))


def class1alphabetical():
    with open('file.txt') as file1in:
        lines = [line.split('/t') for line in file1in]
        lines.sort()
    with open('file.txt', 'w') as file1out:
        for el in lines:
            file1out.write('{0}\n'.format(' '.join(el)))
    with open('file.txt','r') as fileqsort:
        for line in file1sort:
            print(line[:-1])
        file1sort.close

classfilealphabetical()

      

I used information from other questions, such as "Sorting information from a file in python" and "Python: Sorting a file by an arbitrary column where the column contains time values".

However, I am still stuck on what to do next.



4 answers


You seem to be making this harder than it needs to be.

Here is a rough idea.

# read the lines and sort them; each line starts with the name,
# so this puts your folks in alphabetical order
with open("file.txt") as f:
    lines = f.readlines()
lines.sort()

# now, on each line, you want to split (that itemgetter approach is too
# complicated and blows up if there are not exactly 3 grades)
for line in lines:
    # split() with no argument splits on any run of spaces and \t characters
    fields = line.split()
    if len(fields) < 2:
        continue  # skip blank lines and lines without grades
    name, grades = fields[0], fields[1:]

    # cast your grades to integers
    grades = [int(grade) for grade in grades]

    # sort and pick the last one
    grades.sort()
    highest = grades[-1]

    # or... use max as suggested
    highest = max(grades)

    # write to output file....

      



Another tip: open your files with context managers (with statements); they can be nested. Closing resources is a core part of well-managed programs.

with open("/temp/myinput.txt","r") as fi:
    ....
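
As a sketch of how two files can share one with statement (equivalent to nesting two of them), the function below reads the scores and writes each name with its highest score to a second file. The name write_top_scores and both paths are hypothetical, and it assumes every non-blank line holds a name followed by at least one integer score.

```python
def write_top_scores(in_path, out_path):
    """Write each name and its highest score to out_path, sorted by name."""
    # Both files are closed automatically when the with block ends,
    # even if an exception is raised part-way through.
    with open(in_path, "r") as fi, open(out_path, "w") as fo:
        # split() with no argument splits on tabs and spaces alike
        rows = [line.split() for line in fi if line.strip()]
        rows.sort(key=lambda row: row[0])  # sort by the name field
        for name, *grades in rows:
            highest = max(int(grade) for grade in grades)
            fo.write("{}\t{}\n".format(name, highest))
```

Writing to a separate output file keeps the original scores intact, which is safer while you are still testing.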

      



Once you have the lines split into fields and stored in a sorted list, try this:

output = ["{} {}".format(i[0], max(i[1:], key=int)) for i in lines]

for i in output:
    print(i)

Output:

Jill 22
John 21
Tim 22
Zack 19

      

output is a list created using a list comprehension.

Curly braces ('{}') are replaced by the arguments passed to str.format(). The str in this case is "{} {}".

The function max takes a keyword argument, key, as shown above, which lets you specify a function to apply to each item in the iterable passed to max (the iterable in this case is i[1:]). I used int because all the elements in the list were strings (containing numbers) and had to be converted to ints for the comparison.
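
To see why key=int matters, here is a small sketch with made-up values (the variable row is hypothetical). Without a key, max compares the strings character by character, so "8" beats "22"; with key=int the comparison is numeric, but the value returned is still the original string.

```python
row = ["Tim", "18", "22", "8"]  # hypothetical row: a name followed by score strings

as_text = max(row[1:])             # string comparison: "8" wins because "8" > "2"
as_number = max(row[1:], key=int)  # numeric comparison; the result is still a str

print(as_text)    # 8
print(as_number)  # 22
print("{} {}".format(row[0], as_number))  # Tim 22
```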



This is pretty easy to do with some built-in functions and iteration:

Code:

#!/usr/bin/env python

from operator import itemgetter

# sample data: a name followed by three tab-separated scores per line
scores = """\
John\t12\t13\t21
Zack\t14\t19\t12
Tim\t18\t22\t8
Jill\t13\t3\t22"""

# split into rows of fields, skipping any empty lines
rows = [x.split("\t") for x in filter(None, scores.split("\n"))]
for row in sorted(rows, key=itemgetter(0)):  # sort by name
    name, marks = row[0], map(int, row[1:])
    max_score = max(marks)
    print("{0:s} {1:d}".format(name, max_score))

      

Output:

$ python -i scores.py 
Jill 22
John 21
Tim 22
Zack 19
>>> 
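
To run the same loop against the real file rather than an inline string, one option is the standard csv module with a tab delimiter. This is a swapped-in technique, not part of the answer above; the function name top_scores is hypothetical, and the sketch assumes Python 3 and a file whose fields are separated by exactly one tab.

```python
import csv


def top_scores(path):
    """Return (name, highest score) pairs sorted by name."""
    with open(path, newline="") as f:
        # csv.reader splits each line on the tab delimiter;
        # empty lines come back as empty rows and are skipped here
        rows = [row for row in csv.reader(f, delimiter="\t") if row]
    rows.sort(key=lambda row: row[0])  # sort by name
    return [(row[0], max(map(int, row[1:]))) for row in rows]
```

Calling top_scores("file.txt") on the data from the question would give the pairs for Jill, John, Tim and Zack in that order, ready to be printed with the same format string as above.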

      



There are two tasks:

  • keep only the top mark
  • sort lines by name alphabetically

Here's a separate script that removes all but the highest score from each row:

#!/usr/bin/env python3
import sys
import fileinput

try:
    sys.argv.remove('--inplace') # don't modify file(s) unless asked
except ValueError:
    inplace = False
else:
    inplace = True # modify the files given on the command line

if len(sys.argv) < 2:
    sys.exit('Usage: keep-top-score [--inplace] <file>')

for line in fileinput.input(inplace=inplace):
    name, *scores = line.split() # split on whitespace (not only tab)
    if scores:
        # keep only the top score
        top_score = max(scores, key=int)
        print(name, top_score, sep='\t')
    else:
        print(line, end='') # print as is

      

Example:

$ python3 keep_top_score.py class6Afile.txt

      


To print lines sorted by name:

$ sort -k1 class6Afile.txt

      

The result of the sort command depends on your current locale; for example, you can use LC_ALL=C to sort by byte values.

Or, if you want a Python solution:

#!/usr/bin/env python
import sys
from io import open

filename = sys.argv[1] 
with open(filename) as file:
    lines = file.readlines() # read lines

# sort by name
lines.sort(key=lambda line: line.partition('\t')[0])

with open(filename, 'w') as file:
    file.writelines(lines) # write the sorted lines

      

The names are sorted as Unicode text here. You can provide an explicit character encoding used in the file, otherwise the default (based on your locale) encoding is used.

Example:

$ python sort_inplace_by_name.py class6Afile.txt

      

Result

Jill    22
John    21
Tim     22
Zack    19
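
The explicit encoding mentioned above can be passed through the encoding argument of open. A minimal sketch, assuming the file is UTF-8; the encoding and the function name sort_file_by_name are assumptions, not something the question states.

```python
from io import open  # on Python 3 this is the same as the built-in open


def sort_file_by_name(filename, encoding="utf-8"):
    """Sort the lines of a tab-separated file in place by the first field."""
    with open(filename, encoding=encoding) as file:
        lines = file.readlines()
    # Plain str comparison orders by Unicode code point, so uppercase
    # letters sort before lowercase ones.
    lines.sort(key=lambda line: line.partition("\t")[0])
    with open(filename, "w", encoding=encoding) as file:
        file.writelines(lines)
```

Passing the encoding explicitly makes the script behave the same way regardless of the locale it runs under.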

      
