sorted() sorts only by the first digit
I need to sort the first column of a table. It looks something like this:
6000 799
7000 352
8000 345
9000 234
10000 45536
11000 3436
1000 342
2000 123
3000 1235
4000 234
5000 233
I want the first column in ascending order, but it is sorted only by the first digit rather than by the whole number, i.e.
1000 342
10000 45536
11000 3436
2000 123
But I want
1000 342
2000 123
3000 1235
etc
Currently trying:
SortInputfile=open("InterpBerg1","r")
line=SortInputfile.readlines()
line.sort()
map(SortOutputfile.write, line)
The list method sort() and the built-in function sorted() both support a key argument that lets you specify the key used for sorting. Since you want a numeric sort order rather than an alphabetical one, you need to extract the first column and convert it to int:
SortInputfile=open("InterpBerg1","r")
line=SortInputfile.readlines()
line.sort(key=lambda line: int(line.split()[0]))
# writelines() writes every line; a bare map() is lazy in Python 3 and would write nothing
SortOutputfile.writelines(line)
A cleaner version of this could be:
# read input file
with open(input_filename) as fh:
    lines = fh.readlines()
# sort lines numerically by the first column
lines.sort(key=lambda line: int(line.split()[0]))
# write output file
with open(output_filename, 'w') as fh:
    fh.writelines(lines)
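To check the key function without touching any files, here is a minimal sketch that uses a few rows from the question as an in-memory list. The names `rows` and `first_column` are made up for illustration.

```python
# A few rows from the question; a hypothetical in-memory stand-in for the file.
rows = ["6000 799\n", "10000 45536\n", "1000 342\n", "2000 123\n"]

def first_column(line):
    # Split on whitespace and compare the first field as an integer.
    return int(line.split()[0])

# Sort in place, numerically by the first column.
rows.sort(key=first_column)
print("".join(rows), end="")
```

Note that a blank line in the input would make `line.split()[0]` raise an IndexError, so filter such lines out first if the file may contain them.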
For numeric ordering, you must convert the strings to numbers. To do this on the fly, use the key parameter:
outfile.writelines(sorted(
    open('InterpBerg1'),
    key=lambda l: int(l.split(maxsplit=1)[0])))
Edit: I agree with the others who suggest using the with statement when working with files, so:
with open('Output', 'w') as outfile, open('InterpBerg1') as infile:
    outfile.writelines(sorted(infile,
        key=lambda l: int(l.split(maxsplit=1)[0])))
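This works because sorted() accepts any iterable, and iterating over an open file yields its lines. The sketch below shows the same call with `io.StringIO` objects standing in for the two files, which is an assumption made only so the example runs without files on disk.

```python
import io

# io.StringIO stands in for the opened input file; iterating it yields lines.
infile = io.StringIO("9000 234\n10000 45536\n1000 342\n")
# A second in-memory buffer stands in for the output file.
outfile = io.StringIO()

# Same call as above: sort the lines by the integer value of the first field.
outfile.writelines(sorted(infile, key=lambda l: int(l.split(maxsplit=1)[0])))
result = outfile.getvalue()
print(result, end="")
```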
First, you should know that there are two standard ways to sort a list in Python. The first is sorted(), a generic built-in function that takes any iterable and returns a new sorted list; the second is .sort(), a built-in method of lists that sorts the list in place (and returns None). You are using .sort(), not sorted().
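The difference between the two is easy to see on a small list. This is only an illustrative sketch; the variable names are arbitrary.

```python
values = [3000, 1000, 2000]

# sorted() builds a new list and leaves the original untouched.
sorted_copy = sorted(values)
unchanged = list(values)

# .sort() reorders the list itself and returns None.
result = values.sort()

print(sorted_copy, unchanged, values, result)
```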
Second, the items in your list are not integers; they are strings. You can tell because you created the list with readlines(), which returns a list of strings. When you sort strings, they are sorted alphabetically (character by character) by default. This is why they appear to be sorted by "first digit only" in your example.
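A short sketch of the difference, using the first-column values from the question as plain strings (the variable names are illustrative):

```python
as_strings = ["1000", "10000", "11000", "2000"]

# Strings compare character by character, so "10000" sorts before "2000".
alphabetical = sorted(as_strings)

# Converting each item with int() for the comparison gives numeric order.
numeric = sorted(as_strings, key=int)

print(alphabetical)
print(numeric)
print("10000" < "2000", 10000 < 2000)
```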
To sort by something else, you have two options, both of which are passed as keyword parameters to sorted() and .sort(). The first, as mentioned in a couple of other answers, is the key parameter, which determines, roughly speaking, what quality or attribute of each list item is used for sorting; in your case you want the value of the first number. You can get it by splitting the string on whitespace, taking the first token and converting it to int. (Lev Levitsky's and bikeshedder's answers show appropriate ways to do this.) The value passed to key must be a function (either a regular function or a lambda) that takes a list item as input and returns the desired value. The other option is the cmp parameter (Python 2 only), a function that takes two list items (or their keys, if you also pass key) as input and returns a value indicating which item is "greater". It is slightly more complex to use, but it adds a little more flexibility to sorting.
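In Python 3, where sort() and sorted() no longer accept cmp, a comparison function can still be used by wrapping it with `functools.cmp_to_key`. The following is a minimal sketch under that assumption; `compare_first_column` is a hypothetical helper, not something from the answers above.

```python
from functools import cmp_to_key

def compare_first_column(a, b):
    # Return a negative, zero, or positive number, as a cmp-style function must.
    x, y = int(a.split()[0]), int(b.split()[0])
    return (x > y) - (x < y)

lines = ["10000 45536\n", "2000 123\n", "1000 342\n"]
# cmp_to_key turns the two-argument comparison into a key function.
lines.sort(key=cmp_to_key(compare_first_column))
print("".join(lines), end="")
```

For a simple numeric sort like this one, the key approach is shorter; a comparison function is mainly useful when the ordering rule is hard to express as a single key.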