Sorting two lists in Python?
I am counting some occurrences of words in a text and I have two lists: the first contains words, the second contains occurrences.
So, at the end of the analysis, I have something like
listWords : ["go", "make", "do", "some", "lot"]
listOccurrences: [2, 4, 8, 1, 5]
I want to sort both lists by listOccurrences in descending order, so that I end up with:
listWords : ["do", "lot", "make", "go", "some"]
listOccurrences: [8, 5, 4, 2, 1]
Is there a way to do this? Or do you know a more "natural" structure than two lists? (Like a single "list" where each count is tied to its word)
>>> listWords = ["go", "make", "do", "some", "lot"]
>>> listOccurrences = [2, 4, 8, 1, 5]
>>> listTmp = list(zip(listOccurrences, listWords))  # list() is needed on Python 3, where zip() returns an iterator
>>> listTmp
[(2, 'go'), (4, 'make'), (8, 'do'), (1, 'some'), (5, 'lot')]
>>> listTmp.sort(reverse=True)
>>> listTmp
[(8, 'do'), (5, 'lot'), (4, 'make'), (2, 'go'), (1, 'some')]
>>> list(zip(*listTmp))
[(8, 5, 4, 2, 1), ('do', 'lot', 'make', 'go', 'some')]
>>> listOccurrences, listWords = zip(*listTmp)
Note that the obvious data type for key: value pairs (here, word: count) is a dict. FWIW, you can also take a look at collections.Counter.
Edit: for the sake of completeness, you can also use the built-in function sorted() instead of list.sort() if you want to cram it all into a single statement (which might not be such a good idea for readability, but that's a different story):
>>> listWords = ["go", "make", "do", "some", "lot"]
>>> listOccurrences = [2, 4, 8, 1, 5]
>>> listOccurrences, listWords = zip(*sorted(zip(listOccurrences, listWords), reverse=True))
>>> listWords
('do', 'lot', 'make', 'go', 'some')
>>> listOccurrences
(8, 5, 4, 2, 1)
Another way to do this is to keep your data in a dictionary. Since you are counting word occurrences, listWords contains unique words, so you can use them as dictionary keys. You can then use Python's built-in sorted() function to sort the keys and the values in the same order.
listWords = ["go", "make", "do", "some", "lot"]
listOccurrences = [2, 4, 8, 1, 5]
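# Build a word -> count mapping (named wordCounts to avoid shadowing the built-in dict)
wordCounts = dict(zip(listWords, listOccurrences))
# Words ordered by their count, highest first
print(sorted(wordCounts, key=wordCounts.get, reverse=True))
# Counts ordered highest first
print(sorted(wordCounts.values(), reverse=True))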
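The two sorted() calls above order the keys and the values separately. A small sketch that keeps each word paired with its count in a single pass, assuming a dictionary built from the same two lists (the names wordCounts and ranked are illustrative):

```python
listWords = ["go", "make", "do", "some", "lot"]
listOccurrences = [2, 4, 8, 1, 5]

wordCounts = dict(zip(listWords, listOccurrences))

# items() yields (word, count) pairs; sort them by count, highest first
ranked = sorted(wordCounts.items(), key=lambda item: item[1], reverse=True)

for word, count in ranked:
    print(word, count)
```

This gives one list of pairs, which is close to the single "natural" structure the question asks about.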
I would use Counter. Here's a pointless one-liner :)
from collections import Counter
listWords, listOccurrences = map(list, zip(*Counter(dict(zip(listWords, listOccurrences))).most_common()))
And as readable code, you could use:
from collections import Counter
listWords = ["go", "make", "do", "some", "lot"]
listOccurrences = [2, 4, 8, 1, 5]
counter = Counter(dict(zip(listWords, listOccurrences)))
print(str(counter))
# Counter({'do': 8, 'lot': 5, 'make': 4, 'go': 2, 'some': 1})
# Want lists again?
listWords, listOccurrences = map(list, zip(*counter.most_common()))
print(listWords)
# ['do', 'lot', 'make', 'go', 'some']
print(listOccurrences)
# [8, 5, 4, 2, 1]
The neat conversion back to lists is thanks to Jon Clements.
Alternatively, you can use Counter to collect the frequency data in the first place (from here):
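import collections
c = collections.Counter()
with open('/home/me/my_big_file_o_words') as f:
    for line in f:
        # split() yields words; passing the bare string would count single characters
        c.update(line.lower().split())
print('Words ordered by most common:')
for word, count in c.most_common():
    # count is an int, so convert it before concatenating
    print(word + ": " + str(count))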
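For trying this out without a file on disk, here is a minimal self-contained sketch of word counting with Counter. The in-memory string text and the names wordCounter and ranked are made up for illustration. Note that update() is given the list of words from split(); handing it a bare string would count individual characters instead.

```python
from collections import Counter

# Hypothetical sample text standing in for the contents of a file
text = "Do make do\nlot do LOT some make lot do"

wordCounter = Counter()
for line in text.splitlines():
    # Lowercase so "Do" and "do" are counted as the same word
    wordCounter.update(line.lower().split())

# most_common() returns (word, count) pairs, highest count first
ranked = wordCounter.most_common()
for word, count in ranked:
    print(word + ": " + str(count))
```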
Finally: I think it's classier to use underscores in variable names in Python rather than camelCase. Maybe change to list_words and list_occurrences? :)
One-liner:
[listWords[i] for i, k in sorted(enumerate(listOccurrences), key=itemgetter(1), reverse=True)]
i.e.:
In [62]: from operator import itemgetter
In [63]: listWords = ["go", "make", "do", "some", "lot"]
In [64]: listOccurrences = [2, 4, 8, 1, 5]
In [65]: [listWords[i] for i, k in sorted(enumerate(listOccurrences), key=itemgetter(1), reverse=True)]
Out[65]: ['do', 'lot', 'make', 'go', 'some']
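The one-liner above returns only the reordered words. A sketch that reuses the same index ordering to rearrange both lists, under the assumption that you want two parallel lists back (order, sortedWords, and sortedOccurrences are illustrative names):

```python
from operator import itemgetter

listWords = ["go", "make", "do", "some", "lot"]
listOccurrences = [2, 4, 8, 1, 5]

# Indices of listOccurrences, ordered by descending count
order = [i for i, _ in sorted(enumerate(listOccurrences), key=itemgetter(1), reverse=True)]

# Apply the same index order to both lists so they stay aligned
sortedWords = [listWords[i] for i in order]
sortedOccurrences = [listOccurrences[i] for i in order]

print(sortedWords)
print(sortedOccurrences)
```

Computing the index order once and applying it to each list also extends to three or more parallel lists.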