How do I initialize a counter from a list of key-value pairs?
If I have a sequence of (key, value) pairs, I can quickly initialize a dictionary like this:
>>> data = [ ('a', 1), ('b', 2) ]
>>> dict(data)
{'a': 1, 'b': 2}
I would like to do the same with a Counter, but how? Both the constructor and the update() method treat the pairs themselves as keys, not as key-value pairs:
>>> from collections import Counter
>>> Counter(data)
Counter({('a', 1): 1, ('b', 2): 1})
The best I could do is use a temporary dictionary, which is an ugly and unnecessary workaround:
>>> Counter(dict(data))
Counter({'b': 2, 'a': 1})
Is there a way to properly initialize a Counter from a list of (key, count) pairs? My use case involves reading a large number of stored samples from files (with unique keys).
I would just do a loop:
from collections import Counter
counter = Counter()
for obj, cnt in [ ('a', 1), ('b', 2) ]:
    counter[obj] = cnt
You can also just call the parent class's method, dict.update:
>>> from collections import Counter
>>> data = [ ('a', 1), ('b', 2) ]
>>> c = Counter()
>>> dict.update(c, data)
>>> c
Counter({'b': 2, 'a': 1})
Finally, there is nothing wrong with your original solution:
Counter(dict(list_of_pairs))
The expensive part of creating dictionaries or counters is hashing all the keys and resizing periodically. Once the dictionary is built, converting it to a Counter is very cheap, about as fast as dict.copy(). The hash values are reused and the final hash table is pre-sized (no need to resize).
From the docs:
Elements are counted from an iterable or initialized from another mapping (or counter)
So no, there is no direct way: you need to convert the pairs to a mapping and then initialize the Counter from it. And yes, initializing it via dict was the right move.
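To make that distinction concrete, here is a small sketch of the two behaviours the docs describe. The variable names are only for illustration.

```python
from collections import Counter

pairs = [('a', 1), ('b', 2)]

# A plain iterable: every element (here, each whole tuple) is counted as a key.
counted = Counter(pairs)

# A mapping: keys and their counts are taken as given.
initialized = Counter(dict(pairs))

print(counted)
print(initialized)
```

The first Counter has the tuples as keys, each with a count of 1; the second has 'a' and 'b' as keys with counts 1 and 2.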
UPDATE
I agree that @RaymondHettinger's code looks good, and it is actually faster:
from collections import Counter
from random import choice
from string import ascii_letters
a=[(choice(ascii_letters), i) for i in range(100)]
Tested with Python 3.6.1 and IPython 6.
Initialization with dict:
%%timeit
c1=Counter(dict(a))
Output:
12.1 µs ± 342 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Updating with dict.update():
%%timeit
c2=Counter()
dict.update(c2, a)
Output:
7.21 µs ± 236 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
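If you are not using IPython, a rough equivalent of this comparison can be run with the standard timeit module. This is only a sketch: the function names via_dict and via_dict_update and the loop count are my own choices, and absolute timings will differ between machines and Python versions.

```python
from collections import Counter
from random import choice
from string import ascii_letters
from timeit import timeit

a = [(choice(ascii_letters), i) for i in range(100)]

def via_dict():
    # Build a temporary dict first, then convert it to a Counter.
    return Counter(dict(a))

def via_dict_update():
    # Fill an empty Counter directly through the parent class's update().
    c = Counter()
    dict.update(c, a)
    return c

# Sanity check: both approaches produce the same Counter.
assert via_dict() == via_dict_update()

n = 10000  # number of loops; an arbitrary choice for this sketch
print("Counter(dict(a)):", timeit(via_dict, number=n) / n, "s per loop")
print("dict.update(c, a):", timeit(via_dict_update, number=n) / n, "s per loop")
```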
If the keys in your list of (key, value) pairs are already unique, with no duplicates, you can use Raymond Hettinger's excellent solution.
Beware, though: if there are duplicate keys, you only get the last value for any given key:
>>> data=[ ('a', 1), ('b', 2), ('a', 3), ('b', 4) ]
>>> c=Counter()
>>> dict.update(c, data)
>>> c
Counter({'b': 4, 'a': 3}) # note 'a' and 'b' are only the last value...
Same with dict:
>>> Counter(dict(data))
Counter({'b': 4, 'a': 3})
But Counters are most often used to count totals, including duplicates. If you want the sum of the values for 'a' and 'b', you need to iterate over all the pairs:
>>> c=Counter()
>>> for k, v in data:
... c[k]+=v
...
>>> c
Counter({'b': 6, 'a': 4}) # the sum of the 'k' entries given 'v'
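If you need both behaviours, one option is to wrap them in a small helper. This is just a sketch: counter_from_pairs and its sum_duplicates flag are hypothetical names of mine, not part of collections.

```python
from collections import Counter

def counter_from_pairs(pairs, sum_duplicates=False):
    """Build a Counter from (key, count) pairs (hypothetical helper)."""
    c = Counter()
    if sum_duplicates:
        # Add up the counts of repeated keys.
        for key, count in pairs:
            c[key] += count
    else:
        # Last value wins for repeated keys, same as dict(pairs).
        dict.update(c, pairs)
    return c

data = [('a', 1), ('b', 2), ('a', 3), ('b', 4)]
print(counter_from_pairs(data))                       # Counter({'b': 4, 'a': 3})
print(counter_from_pairs(data, sum_duplicates=True))  # Counter({'b': 6, 'a': 4})
```

With unique keys the two modes give the same result, so the default stays on the faster dict.update path.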