How do I initialize a counter from a list of key-value pairs?

Question

If I have a sequence of (key, value) pairs, I can quickly initialize a dictionary like this:

>>> data = [ ('a', 1), ('b', 2) ]
>>> dict(data) 
{'a': 1, 'b': 2}

I would like to do the same with a Counter, but how? Both the constructor and the update() method treat the ordered pairs as keys, not as key-value pairs:

>>> from collections import Counter
>>> Counter(data)
Counter({('a', 1): 1, ('b', 2): 1})

The best I could do is use a temporary dictionary, which is an ugly and unnecessary workaround:

>>> Counter(dict(data))
Counter({'b': 2, 'a': 1})

Is there a way to properly initialize a Counter from a list of (key, count) pairs? My use case involves reading a large number of stored samples from files (with unique keys).

+3

python python-3.x data-structures counter python-internals

alexis 06 May '17 at 20:16

3 answers

From the docs:

Elements are counted from an iterable or initialized from another mapping (or counter)

So no, there is no direct way: you need to convert the pairs to a mapping first and then initialize the Counter from it. And yes, initializing it with dict was the right move.
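The two behaviours described in that sentence of the docs can be seen side by side. This is a minimal sketch that reuses the `data` list from the question; the variable names `counted` and `initialized` are only illustrative.

```python
from collections import Counter

data = [('a', 1), ('b', 2)]

# A plain iterable: every element (here, every tuple) is counted once,
# so the tuples themselves become the keys.
counted = Counter(data)

# A mapping: keys and counts are taken over unchanged.
initialized = Counter(dict(data))

print(counted[('a', 1)])   # 1, the tuple is the key
print(initialized['b'])    # 2, 'b' is the key and 2 is its count
```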

UPDATE

I agree that @RaymondHettinger's code looks good, and it is actually faster. Test data:

from collections import Counter
from random import choice
from string import ascii_letters
a=[(choice(ascii_letters), i) for i in range(100)]

Tested with Python 3.6.1 and IPython 6

Initialization with dict:

%%timeit
c1=Counter(dict(a))

Output:

12.1 µs ± 342 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Updating with dict.update():

%%timeit    
c2=Counter()
dict.update(c2, a)

Output:

7.21 µs ± 236 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
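Outside IPython, a similar comparison can be run with the standard `timeit` module. The following is a rough sketch rather than a rigorous benchmark: the helper names `via_dict` and `via_update` and the repetition count `n` are assumptions made for illustration, and absolute timings will vary with the machine and Python version.

```python
import timeit
from collections import Counter
from random import choice
from string import ascii_letters

# Same kind of test data as above: 100 pairs with random letter keys.
a = [(choice(ascii_letters), i) for i in range(100)]

def via_dict():
    # Build a temporary dict, then convert it to a Counter.
    return Counter(dict(a))

def via_update():
    # Fill an empty Counter through the parent-class method.
    c = Counter()
    dict.update(c, a)
    return c

# Both approaches keep the last value seen for a repeated key.
assert via_dict() == via_update()

n = 10_000  # number of calls per measurement (arbitrary choice)
t_dict = timeit.timeit(via_dict, number=n)
t_update = timeit.timeit(via_update, number=n)
print(f"Counter(dict(a)):  {t_dict / n * 1e6:.2f} µs per call")
print(f"dict.update(c, a): {t_update / n * 1e6:.2f} µs per call")
```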

+1

vishes_shell 06 May '17 at 20:23

If the keys in your list of (key, value) pairs are already unique (no duplicates), you can use Raymond Hettinger's excellent solution.

Beware, though, that if there are duplicate keys you only get the last value for any given key:

>>> data=[ ('a', 1), ('b', 2), ('a', 3), ('b', 4) ]
>>> c=Counter()
>>> dict.update(c, data)
>>> c
Counter({'b': 4, 'a': 3})      # note: 'a' and 'b' keep only their last values

The same happens with dict:

>>> Counter(dict(data))
Counter({'b': 4, 'a': 3})

But Counters are most often used to count totals, including duplicates. If you want the summed values for 'a' and 'b', you need to iterate over all the pairs:

>>> c=Counter()
>>> for k, v in data:
...    c[k]+=v
... 
>>> c
Counter({'b': 6, 'a': 4})        # the sum of the values 'v' for each key 'k'
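If this summing behaviour is needed in several places, the loop can be wrapped in a small helper. This is only a sketch; the function name `counter_from_pairs` is hypothetical and not part of `collections`.

```python
from collections import Counter

def counter_from_pairs(pairs):
    """Build a Counter from (key, count) pairs, summing counts of repeated keys."""
    c = Counter()
    for key, count in pairs:
        # Missing keys read as 0 in a Counter, so += works for first and later hits.
        c[key] += count
    return c

data = [('a', 1), ('b', 2), ('a', 3), ('b', 4)]
totals = counter_from_pairs(data)
print(totals['a'], totals['b'])  # 4 6
```

With unique keys the helper gives the same result as `Counter(dict(pairs))`; the two only differ once a key repeats.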

0

dawg May 07 '17 at 20:32

Raymond Hettinger · Accepted Answer · 2017-05-06T20:23:20+0000

I would just do a loop:

from collections import Counter
counter = Counter()  # start empty, then set each count directly
for obj, cnt in [ ('a', 1), ('b', 2) ]:
    counter[obj] = cnt

You can also just call the parent method dict.update:

>>> from collections import Counter
>>> data = [ ('a', 1), ('b', 2) ]
>>> c = Counter()
>>> dict.update(c, data)
>>> c
Counter({'b': 2, 'a': 1})

Finally, there is nothing wrong with your original solution:

Counter(dict(list_of_pairs))

The expensive part of creating dictionaries or counters is hashing all the keys and periodically resizing the table. Once the dictionary is built, converting it to a counter is very cheap, about as fast as dict.copy(). The hash values are reused and the final hash table is presized (no need to resize).
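For unique keys, the three approaches in this answer should yield the same Counter. The check below is a small sketch; the `pairs` list and the `by_*` variable names are made up for illustration.

```python
from collections import Counter

pairs = [('a', 1), ('b', 2), ('c', 3)]

# 1. Explicit loop that assigns each count.
by_loop = Counter()
for obj, cnt in pairs:
    by_loop[obj] = cnt

# 2. Parent-class dict.update, which accepts an iterable of pairs.
by_update = Counter()
dict.update(by_update, pairs)

# 3. Temporary dict converted to a Counter.
by_dict = Counter(dict(pairs))

assert by_loop == by_update == by_dict
print(by_dict.most_common(1))  # [('c', 3)]
```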
