Python generator confusion

I'm confused about what is wrong with my code:

users = [{'id': 1, 'name': 'Number1', 'age': 11},
         {'id': 2, 'name': 'Number2', 'age': 12},
         {'id': 3, 'name': 'Number3', 'age': 13},
         {'id': 4, 'name': 'Number4', 'age': 14}]

_keys = ('name', 'age')

data_by_user_id = {u.get('id'): (u.get(k) for k in _keys) for u in users}

      

data_by_user_id looks like this:

{1: <generator object <genexpr> at 0x7f3c12c31050>, 2: <generator object <genexpr> at 0x7f3c12c310a0>, 3: <generator object <genexpr> at 0x7f3c12c310f0>, 4: <generator object <genexpr> at 0x7f3c12c31140>}

      

but after iterating:

for user_id, data in data_by_user_id.iteritems():
    name, age = data
    print user_id, name, age

      

The result differs from what I expected:

1 Number4 14
2 Number4 14
3 Number4 14
4 Number4 14

      

Can someone explain to me what I am doing wrong here? I know that I can use a list comprehension instead of a generator, but I am trying to figure out what the problem is with my code.

Thanks!



2 answers


As you probably already know, generator expressions are evaluated lazily. The call to u.get is deferred until the generator expression is consumed, and by that time u refers to the last dictionary in your list:

>>> u = {'id': 1, 'name': 'Number1', 'age': 11}
>>> _keys = ('name', 'age')
>>> gen = (u.get(k) for k in _keys)
>>> # update u
>>> u = {'id': 4, 'name': 'Number4', 'age': 14}
>>> list(gen)
['Number4', 14]

      

One obvious way to fix this is to use a list comprehension. Another way, not as good as the first, is to wrap the generator expression in a function and bind the current value of u to that function through a default argument:



data_by_user_id = {u.get('id'): lambda x=u: (x.get(k) for k in _keys) for u in users}

for user_id, data in data_by_user_id.iteritems():
    name, age = data()
    print name, age

      


Number1 11
Number2 12
Number3 13
Number4 14
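The list-based fix mentioned above could look roughly like the sketch below. It is written for Python 3 (print() and items()), unlike the Python 2 code in the question, and it assumes every user dictionary has an 'id' key. The inner list comprehension runs immediately, so nothing is left to evaluate after u has moved on.

```python
users = [
    {'id': 1, 'name': 'Number1', 'age': 11},
    {'id': 2, 'name': 'Number2', 'age': 12},
    {'id': 3, 'name': 'Number3', 'age': 13},
    {'id': 4, 'name': 'Number4', 'age': 14},
]
_keys = ('name', 'age')

# The inner list comprehension is evaluated right away, while u still
# refers to the current user, so each value is a finished list.
data_by_user_id = {u['id']: [u.get(k) for k in _keys] for u in users}

for user_id, (name, age) in sorted(data_by_user_id.items()):
    print(user_id, name, age)
```

Because the values are plain lists, they can also be read as many times as needed.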

      



The expression in your dictionary comprehension:

(u.get(k) for k in _keys)

      

is a generator expression. This means that you are creating a generator. A generator is an iterable object that evaluates its elements lazily: it does not read the elements from u right away, but defers this until you, for example, call next(..) on it to get the next element. So you are building a dictionary whose values are such generators.
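To make the laziness visible, here is a small sketch (Python 3). The helper loud and the list calls are hypothetical names introduced only for this illustration; they record when each element is actually computed.

```python
calls = []

def loud(value):
    # Hypothetical helper: remembers every value it was asked to produce.
    calls.append(value)
    return value

gen = (loud(n) for n in (1, 2, 3))
calls_before = list(calls)   # still empty: nothing has been evaluated

first = next(gen)            # evaluates exactly one element
calls_after_first = list(calls)

rest = list(gen)             # evaluates the remaining elements
```

Creating the generator does not call loud at all; each call happens only when an element is requested.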

In the body of the for loop, you write:

name, age = data

      

Here data is the value of the dictionary item. This means that you are asking Python to unpack the iterable. That works only if the iterable yields exactly as many elements as there are variables on the left, so two in this case. As a result, you exhaust the generator and obtain the elements it yields. Then you print those elements.
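A minimal sketch of that unpacking rule (Python 3). The function make_pair is a hypothetical stand-in for one of the dictionary values; it returns a fresh two-element generator each time it is called.

```python
def make_pair():
    # Hypothetical stand-in for one dictionary value: yields two items.
    return (x for x in ('Number1', 11))

# Two targets, two yielded elements: unpacking succeeds.
name, age = make_pair()

# Three targets, two yielded elements: unpacking raises ValueError.
try:
    a, b, c = make_pair()
    unpack_failed = False
except ValueError:
    unpack_failed = True
```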



Note that after the for loop, all the generators stored as dictionary values will be exhausted, so your for loop has side effects. To prevent this, you are better off materializing the generators.
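The side effect can be seen in a short sketch (Python 3). Materializing here simply means turning the generator into a tuple or list once, so that it can be read repeatedly; the variable names are illustrative only.

```python
# A generator can be consumed only once.
data = (x for x in ('Number1', 11))
name, age = data
leftover = list(data)        # the generator is already exhausted

# A materialized tuple can be unpacked as often as needed.
stored = tuple(x for x in ('Number1', 11))
name1, age1 = stored
name2, age2 = stored
```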

EDIT: Another problem is that you use u inside the dictionary comprehension, where it is not bound to the generator at creation time. As a result, if the variable changes, the results of the generators change as well. This is problematic, since by the end of the dictionary comprehension all generators work with the last dictionary u.

You can fix the problem by creating a local scope:

{u.get('id'): (lambda u=u: (u.get(k) for k in _keys))() for u in users}
      

It now generates the expected output:

>>> users = [{'id': 1, 'name': 'Number1', 'age': 11},
...          {'id': 2, 'name': 'Number2', 'age': 12},
...          {'id': 3, 'name': 'Number3', 'age': 13},
...          {'id': 4, 'name': 'Number4', 'age': 14}]
>>> 
>>> _keys = ('name', 'age')
>>> data_by_user_id = {u.get('id'): (lambda u=u: (u.get(k) for k in _keys))() for u in users}
>>> for user_id, data in data_by_user_id.iteritems():
...     name, age = data
...     print user_id, name, age
... 
1 Number1 11
2 Number2 12
3 Number3 13
4 Number4 14
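The same local scope can also be created with a named function instead of an immediately called lambda. This is a sketch of that variation rather than the answer's own code, written for Python 3; fields_of is a hypothetical helper name.

```python
users = [
    {'id': 1, 'name': 'Number1', 'age': 11},
    {'id': 2, 'name': 'Number2', 'age': 12},
    {'id': 3, 'name': 'Number3', 'age': 13},
    {'id': 4, 'name': 'Number4', 'age': 14},
]
_keys = ('name', 'age')

def fields_of(user, keys):
    # Hypothetical helper: user is a parameter, so every call has its
    # own binding and the generator keeps seeing the right dictionary.
    return (user.get(k) for k in keys)

data_by_user_id = {u['id']: fields_of(u, _keys) for u in users}

# Consume each generator once to check what it yields.
result = {user_id: tuple(gen) for user_id, gen in data_by_user_id.items()}
```

The values are still lazy generators, so the note about exhaustion applies to them just as it does to the lambda version.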

      
