How can I look up keys by a nested value in a dictionary in Python?

If I have a dictionary nested inside a dictionary, how can I look up the matching keys in constant time? For example:

def get_hobby(hobby):
    d = {'An' : {'Hobby': "Paintball", 'Age' : 22}, 'Jef' : {'Hobby' : "Football", 'Age': 24}, 'Jos' : {'Hobby': "Paintball", 'Age' : 46}}
assert get_hobby("Paintball") == ['An', 'Jos']

      

This does not work:

return d.keys[hobby]

      



3 answers


Use a list comprehension:

return [name for name, props in d.items() if props['Hobby'] == hobby]

      

d.items() gives you a sequence of (key, value) pairs, where the value is the nested dictionary. The list comprehension filters these by matching the variable hobby against the nested key 'Hobby', creating a list of the names for which the filter test returns True.

You cannot query for the matching keys in constant time, because the number of matches is variable.

Demo:

>>> def get_hobby(hobby):
...     d = {'An' : {'Hobby': "Paintball", 'Age' : 22}, 'Jef' : {'Hobby' : "Football", 'Age': 24}, 'Jos' : {'Hobby': "Paintball", 'Age' : 46}}
...     return [name for name, props in d.items() if props['Hobby'] == hobby]
... 
>>> get_hobby("Paintball")
['Jos', 'An']

      



Note that the returned list of keys is in arbitrary order, because dictionaries have no fixed ordering. You can't just test this list against another list and expect them to be equal every time, because lists do have an order. The exact order depends on the Python hash seed and on the dictionary's insertion and deletion history.

You can return a set instead; sets are unordered too and better reflect the nature of the matched keys returned:

return {name for name, props in d.items() if props['Hobby'] == hobby}

      

after which your assertion becomes:

assert get_hobby("Paintball") == {'An', 'Jos'}
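
If the caller really does need a list, one possible alternative (a sketch, not part of the answer above) is to sort the matches before returning them, so the result no longer depends on dictionary iteration order. The data is the same as in the question; the sorting step is the only addition.

```python
def get_hobby(hobby):
    d = {
        'An': {'Hobby': "Paintball", 'Age': 22},
        'Jef': {'Hobby': "Football", 'Age': 24},
        'Jos': {'Hobby': "Paintball", 'Age': 46},
    }
    # sorted() returns a new list in ascending order, so the comparison
    # below is stable regardless of how the dict happens to be ordered.
    return sorted(name for name, props in d.items() if props['Hobby'] == hobby)

# The original list-based assertion from the question now holds.
assert get_hobby("Paintball") == ['An', 'Jos']
```

Sorting costs O(k log k) for k matches, so this only makes sense when a deterministic list is actually required; otherwise the set is the simpler choice.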

      



This should work:

return [key for key, val in d.items() if val['Hobby'] == hobby]

      

For example:



def get_hobby(hobby):
    d = {
        'An': {'Hobby': "Paintball", 'Age' : 22},
        'Jef': {'Hobby' : "Football", 'Age': 24},
        'Jos' : {'Hobby': "Paintball", 'Age' : 46}
    }
    return [key for key, val in d.items() if val['Hobby'] == hobby]

print(get_hobby("Paintball"))

      

Result:

['Jos', 'An']

      



If you need to make a lot of these queries in constant time, you should switch to an appropriate data structure. For example:

d2 = {}
for name, subdict in d.items():
    for key, value in subdict.items():
        d2.setdefault((key, value), set()).add(name)

      

(Note that I used a set, not a list; Martijn Pieters' answer explains why.)

Now:

d2['Hobby', 'Paintball']

      

Simple and effective.

Of course, building the data structure doesn't take constant time; it has to iterate over every item of every sub-dictionary in your whole dict. But you only do that once, and then all your zillion queries are constant time. So, as long as you can afford the space, and "zillion" is actually a lot, this is the optimization you want.

You will need to restructure your code so that the index dict is built once, not every time get_hobby is called. Whether that means wrapping it in a class, using a closure, explicitly memoizing it in an attribute stored on the function, or just using a global that is built at the top level is up to you. I'll take the last option, just because it's the shortest (it's probably not the best):

d = {'An' : {'Hobby': "Paintball", 'Age' : 22}, 'Jef' : {'Hobby' : "Football", 'Age': 24}, 'Jos' : {'Hobby': "Paintball", 'Age' : 46}}
d2 = {}
for name, subdict in d.items():
    for key, value in subdict.items():
        d2.setdefault((key, value), set()).add(name)

def get_hobby(hobby):
    return d2['Hobby', hobby]

assert get_hobby("Paintball") == {'An', 'Jos'}
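
For comparison, here is a rough sketch of the closure option mentioned above, which avoids the global. The helper name make_hobby_lookup is hypothetical, and the use of .get() with an empty-set default is an added assumption about how unknown hobbies should be handled (the plain d2[...] lookup raises KeyError instead). It also assumes every nested value is hashable, since values become part of the index keys.

```python
def make_hobby_lookup(d):
    """Build the (key, value) -> names index once and return a lookup function."""
    index = {}
    for name, subdict in d.items():
        for key, value in subdict.items():
            index.setdefault((key, value), set()).add(name)

    def get_hobby(hobby):
        # Average-case constant-time dict lookup; an unknown hobby
        # yields an empty set rather than raising KeyError.
        return index.get(('Hobby', hobby), set())

    return get_hobby

d = {
    'An': {'Hobby': "Paintball", 'Age': 22},
    'Jef': {'Hobby': "Football", 'Age': 24},
    'Jos': {'Hobby': "Paintball", 'Age': 46},
}
get_hobby = make_hobby_lookup(d)  # the index is built here, exactly once

assert get_hobby("Paintball") == {'An', 'Jos'}
assert get_hobby("Chess") == set()
```

The index is a snapshot: if d changes afterwards, the lookup function has to be rebuilt for the results to stay accurate.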

      
