Nested dictionary comprehension

For the following nested dictionaries, I would like to sum up the values for each of the keys 'ab', 'bc', 'cd' and 'de' respectively. Basically, collapse the dictionaries. Preferably using a comprehension with sum, but I cannot figure out the correct syntax:

{'hot': {'111': {'ab': 1, 'bc': 3, 'cd': 5, 'de': 7}}}
{'hot': {'111': {'ab': 12.5, 'bc': -31, 'cd': 2.5, 'de': 13}}}
{'hot': {'111': {'ab': 10, 'bc': 3, 'cd': 0, 'de': -2}}}

{'hot': {'110': {'ab': -1, 'bc': 0, 'cd': 1, 'de': 1}}}
{'hot': {'110': {'ab': 8, 'bc': 20, 'cd': 41, 'de': 13}}}
{'hot': {'110': {'ab': 1.75, 'bc': 2.3, 'cd': 6, 'de': 0}}}

{'hot': {'109': {'ab': 2.7, 'bc': 24, 'cd': 4, 'de': 5}}}
{'hot': {'109': {'ab': 41, 'bc': 6, 'cd': 12, 'de': 33}}}
{'hot': {'109': {'ab': 32, 'bc': 7, 'cd': 18, 'de': 3.75}}}

{'cold': {'111': {'ab': 25, 'bc': 2, 'cd': 3, 'de': 2.1}}}
{'cold': {'111': {'ab': 5, 'bc': 8, 'cd': 5, 'de': 17}}}
{'cold': {'111': {'ab': -71, 'bc': 42, 'cd': 5, 'de': 16}}}

{'cold': {'110': {'ab': 23, 'bc': 2.4, 'cd': 2.1, 'de': 4.3}}}
{'cold': {'110': {'ab': 11, 'bc': 2.8, 'cd': 4.5, 'de': 2.4}}}
{'cold': {'110': {'ab': 4, 'bc': 5.7, 'cd': 8.7, 'de': 1}}}        

      

Desired output:

dict['hot']['111']['ab'] = 1 + 12.5 + 10 = 23.5
dict['hot']['111']['bc'] = 3 - 31 + 3 = -25

      

etc.


2 answers


I am assuming that your data is in a list, because doing so gives the answers you expect.

data = [{'hot': {'111': {'ab': 1, 'bc': 3, 'cd': 5, 'de': 7}}},
{'hot': {'111': {'ab': 12.5, 'bc': -31, 'cd': 2.5, 'de': 13}}},
{'hot': {'111': {'ab': 10, 'bc': 3, 'cd': 0, 'de': -2}}},

{'hot': {'110': {'ab': -1, 'bc': 0, 'cd': 1, 'de': 1}}},
{'hot': {'110': {'ab': 8, 'bc': 20, 'cd': 41, 'de': 13}}},
{'hot': {'110': {'ab': 1.75, 'bc': 2.3, 'cd': 6, 'de': 0}}},

{'hot': {'109': {'ab': 2.7, 'bc': 24, 'cd': 4, 'de': 5}}},
{'hot': {'109': {'ab': 41, 'bc': 6, 'cd': 12, 'de': 33}}},
{'hot': {'109': {'ab': 32, 'bc': 7, 'cd': 18, 'de': 3.75}}},

{'cold': {'111': {'ab': 25, 'bc': 2, 'cd': 3, 'de': 2.1}}},
{'cold': {'111': {'ab': 5, 'bc': 8, 'cd': 5, 'de': 17}}},
{'cold': {'111': {'ab': -71, 'bc': 42, 'cd': 5, 'de': 16}}},

{'cold': {'110': {'ab': 23, 'bc': 2.4, 'cd': 2.1, 'de': 4.3}}},
{'cold': {'110': {'ab': 11, 'bc': 2.8, 'cd': 4.5, 'de': 2.4}}},
{'cold': {'110': {'ab': 4, 'bc': 5.7, 'cd': 8.7, 'de': 1}}}  ]

      

And the code is like this:

from collections import defaultdict
counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

for d in data:                   # for the list
    for k1 in d:                 # for the hot-cold level
        for k2 in d[k1]:         # for the 1[0-9]{2} level
            for k3 in d[k1][k2]: # for the [a-z]{2} level
                counts[k1][k2][k3] += d[k1][k2][k3]

print(counts['hot']['111']['ab'])
print(counts['hot']['111']['bc'])

      



The defaultdict is nested two levels deep, so a missing key at any level is created automatically the first time it is accessed.

Output:

23.5
-25
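
Since the question asked for a comprehension, here is a minimal sketch of how the same totals could be built with nested dict comprehensions and sum. It uses a shortened copy of the data, and the name totals is only illustrative. It rescans the whole list for every key, so the loop above is probably the better choice for large inputs.

```python
# Shortened sample of the data from the question (assumed to be a list).
data = [
    {'hot': {'111': {'ab': 1, 'bc': 3, 'cd': 5, 'de': 7}}},
    {'hot': {'111': {'ab': 12.5, 'bc': -31, 'cd': 2.5, 'de': 13}}},
    {'hot': {'111': {'ab': 10, 'bc': 3, 'cd': 0, 'de': -2}}},
    {'cold': {'111': {'ab': 25, 'bc': 2, 'cd': 3, 'de': 2.1}}},
    {'cold': {'111': {'ab': 5, 'bc': 8, 'cd': 5, 'de': 17}}},
]

totals = {
    k1: {
        k2: {
            # Dicts that lack k1/k2/k3 contribute 0 to the sum.
            k3: sum(d.get(k1, {}).get(k2, {}).get(k3, 0) for d in data)
            # Every innermost key seen under (k1, k2).
            for k3 in {k for d in data for k in d.get(k1, {}).get(k2, {})}
        }
        # Every second-level key seen under k1.
        for k2 in {k for d in data for k in d.get(k1, {})}
    }
    # Every top-level key seen in the list.
    for k1 in {k for d in data for k in d}
}

print(totals['hot']['111']['ab'])  # 23.5
print(totals['hot']['111']['bc'])  # -25
```

The result is a plain nested dict, so looking up a key that never occurred raises KeyError instead of silently creating an entry as defaultdict does.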

      


This example creates a "getter" function. Instead of aggregating the entire list of dicts at once, it computes only the one sum you ask for, which can save a bit of overhead.

The two loops here could be reduced to one by adding up the matching dictionaries in the first loop; the work is split into the accepted list and a second loop for demonstration purposes.

Here is the complete sample code that prints the desired result, 23.5. First, create a list of the dictionaries you want to read from:

dictionaries = [
    {'hot': {'111': {'ab': 1, 'bc': 3, 'cd': 5, 'de': 7}}},
    {'hot': {'111': {'ab': 12.5, 'bc': -31, 'cd': 2.5, 'de': 13}}},
    {'hot': {'111': {'ab': 10, 'bc': 3, 'cd': 0, 'de': -2}}},

    {'hot': {'110': {'ab': -1, 'bc': 0, 'cd': 1, 'de': 1}}},
    {'hot': {'110': {'ab': 8, 'bc': 20, 'cd': 41, 'de': 13}}},
    {'hot': {'110': {'ab': 1.75, 'bc': 2.3, 'cd': 6, 'de': 0}}},

    {'hot': {'109': {'ab': 2.7, 'bc': 24, 'cd': 4, 'de': 5}}},
    {'hot': {'109': {'ab': 41, 'bc': 6, 'cd': 12, 'de': 33}}},
    {'hot': {'109': {'ab': 32, 'bc': 7, 'cd': 18, 'de': 3.75}}},

    {'cold': {'111': {'ab': 25, 'bc': 2, 'cd': 3, 'de': 2.1}}},
    {'cold': {'111': {'ab': 5, 'bc': 8, 'cd': 5, 'de': 17}}},
    {'cold': {'111': {'ab': -71, 'bc': 42, 'cd': 5, 'de': 16}}},

    {'cold': {'110': {'ab': 23, 'bc': 2.4, 'cd': 2.1, 'de': 4.3}}},
    {'cold': {'110': {'ab': 11, 'bc': 2.8, 'cd': 4.5, 'de': 2.4}}},
    {'cold': {'110': {'ab': 4, 'bc': 5.7, 'cd': 8.7, 'de': 1}}}
]

      

Then create your function.

def get_sum(temp, num, pt):
    accepted = [] # Initialize a list of accepted dictionaries that fit the arguments passed.
    pt_sum = 0 # Initialize the variable for the sum of your parts, starting at 0.

    for dictionary in dictionaries: # Iterate through the dictionary list.
        if temp in dictionary and num in dictionary[temp]: # Check if the dict on the current iteration has what you want.
            accepted.append(dictionary[temp][num]) # It does, so add it to accepted.
            # Let's pause here. Say you are reading the first dict in the list. That means this is what the function is working with:
            # {'hot': {'111': {'ab': 1, 'bc': 3, 'cd': 5, 'de': 7}}}
            # With the append call, we are passing "dictionary[temp][num]".
            # We know that both of these keys exist, because we just checked for them.
            # So this eliminates the need to add the whole dictionary to "accepted".
            # Basically, we are cutting out the innermost section, because that's all we need. So you end up with:
            # "{'ab': 1, 'bc': 3, 'cd': 5, 'de': 7}" in the list "accepted".

    for dictionary in accepted: # Now go through the ones that have the data you need.
        pt_sum += dictionary[pt] # And simply add the value to the sum.

    return pt_sum # Return the part sum.

      



And now you can use it:

print(get_sum("hot", "111", "ab"))

>>> 23.5

      

The simplified code I mentioned at the top would be:

def get_sum(temp, num, pt):
    pt_sum = 0

    for dictionary in dictionaries:
        if temp in dictionary and num in dictionary[temp]:
            pt_sum += dictionary[temp][num][pt]

    return pt_sum

      

This simply adds to pt_sum in the first loop, so the second iteration, which was never required, is gone.
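
The same getter could also be written as a single sum over a generator expression, which is closer to the comprehension style the question asked about. This is only a sketch: the name get_sum_compact and the extra dictionaries parameter (passing the list in rather than reading a global) are assumptions made for illustration, and the sample list is shortened.

```python
def get_sum_compact(dictionaries, temp, num, pt):
    """Sum the values of key pt across all dicts that contain temp -> num."""
    return sum(
        d[temp][num].get(pt, 0)       # a missing pt key counts as 0
        for d in dictionaries
        if num in d.get(temp, {})     # skip dicts without temp or num
    )


# Shortened sample of the list used above.
sample = [
    {'hot': {'111': {'ab': 1, 'bc': 3, 'cd': 5, 'de': 7}}},
    {'hot': {'111': {'ab': 12.5, 'bc': -31, 'cd': 2.5, 'de': 13}}},
    {'hot': {'111': {'ab': 10, 'bc': 3, 'cd': 0, 'de': -2}}},
    {'cold': {'111': {'ab': 25, 'bc': 2, 'cd': 3, 'de': 2.1}}},
]

print(get_sum_compact(sample, "hot", "111", "ab"))  # 23.5
```

Unlike the versions above, a missing pt key is treated as 0 here rather than raising KeyError; drop the .get call if you would rather have the error.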
