Summing nested values in a dictionary

I have a JSON file that I am reading as a dictionary. I have something like:

        "20101021": {
            "4x4": {
                "Central Spectrum": 5, 
                "Full Frame": 5, 
                "Custom": 1
            }, 
            "4x2": {
                "Central Spectrum": 5, 
                "Full Frame": 5
            }, 
            "1x1": {
                "Central Spectrum": 5, 
                "Full Frame": 4
            }
        }, 
        "20101004": {
            "4x4": {
                "Central Spectrum": 5, 
                "Full Frame": 5
            }, 
            "4x2": {
                "Central Spectrum": 5, 
                "Full Frame": 5
            }, 
            "1x1": {
                "Central Spectrum": 5, 
                "Full Frame": 5
            }

      

etc. I am trying to calculate the sums (over all dates) for all combinations of 1x1, 4x2 (etc.) with Central Spectrum and Full Frame; in this example I would like to add up the 5s.

What I have so far (using itertools and Counter()):

import collections
import itertools

# data: the dictionary loaded from the JSON file
bins = map("x".join, itertools.product('124', repeat=2))
rois = ['Full Frame', 'Central Spectrum']
types = itertools.product(bins, rois)
c = collections.Counter(data)
for t in types:
    print "%s : %d" % (t, c[t])

      

This displays a nice list of all combinations, but does not actually sum any of the values. Can you help?


1 answer


I may have misunderstood the expected end result, but you may not need counters. A simple sum might be enough if you know you will only have two levels of nesting.

Suppose you have loaded the JSON file as a dictionary of dictionaries into a variable named data.
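A minimal sketch of that loading step, assuming the JSON is available as text (for a file on disk, json.load on the open file object would play the same role). The sample text is abbreviated from the question, the variable name raw is made up here, and the sketch uses Python 3 syntax.

```python
import json

# Hypothetical, abbreviated sample of the JSON text from the question.
raw = """
{
    "20101021": {
        "1x1": {"Central Spectrum": 5, "Full Frame": 4}
    },
    "20101004": {
        "1x1": {"Central Spectrum": 5, "Full Frame": 5}
    }
}
"""

# json.loads turns the JSON objects into nested Python dictionaries.
data = json.loads(raw)
print(sorted(data.keys()))
print(data["20101021"]["1x1"]["Full Frame"])
```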

Then you can do:

results = {}
for key in data.keys():
    # key is '20101021', '20101004'...
    # data[key].keys() is '4x4', '4x2'... so let's make sure that the
    # results dictionary contains all of those keys, defaulting to
    # zero if nothing better can be calculated.
    results[key] = dict.fromkeys(data[key].keys(), 0)

    for sub_key in data[key].keys():
        # sub_key is '4x4', '4x2'...
        # Only "Central Spectrum" and "Full Frame" count as valid
        # values; anything else (e.g. "Custom") is ignored.
        valid_values = [
            int(v) for k, v in data[key][sub_key].items()
            if k in ["Central Spectrum", "Full Frame"]
        ]
        # Now add up the valid values
        results[key][sub_key] = sum(valid_values)
print results

      

This outputs:

{
  u'20101021': {u'1x1': 9, u'4x4': 10, u'4x2': 10},
  u'20101004': {u'1x1': 10, u'4x4': 10, u'4x2': 10}
}

      



I have used dict.keys() in many places because it perhaps makes the process clearer (and dict.items() once). There is also dict.values() (and all three functions have iterator equivalents), which can shorten your code. Also take a look at what dict.fromkeys does.
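The remark about dict.items() shortening the code can be illustrated roughly as follows. This is only a sketch: the sample data is abbreviated from the question, the function name sum_per_date is made up for this example, and it uses Python 3 syntax (print as a function).

```python
# Sketch only: sample data abbreviated from the question.
data = {
    "20101021": {
        "4x4": {"Central Spectrum": 5, "Full Frame": 5, "Custom": 1},
        "4x2": {"Central Spectrum": 5, "Full Frame": 5},
        "1x1": {"Central Spectrum": 5, "Full Frame": 4},
    },
    "20101004": {
        "4x4": {"Central Spectrum": 5, "Full Frame": 5},
        "4x2": {"Central Spectrum": 5, "Full Frame": 5},
        "1x1": {"Central Spectrum": 5, "Full Frame": 5},
    },
}

VALID_KEYS = ("Central Spectrum", "Full Frame")


def sum_per_date(data):
    """Return {date: {bin: sum of the valid ROI values}} (hypothetical helper)."""
    return {
        date: {
            # items() yields (key, value) pairs, so no repeated lookups
            bin_name: sum(v for k, v in rois.items() if k in VALID_KEYS)
            for bin_name, rois in bins.items()
        }
        for date, bins in data.items()
    }


results = sum_per_date(data)
print(results)
```

Iterating over items() gives each key together with its value, which removes the repeated data[key][sub_key] lookups of the longer version.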

EDIT (as per OP's comments on this answer)

If you want the data to be summed (or "accumulated") across dates, the keys of results need to change from the date strings (as in the answer above) to 1x1, 4x4, ...

VALID_KEYS = ["Central Spectrum", "Full Frame"]
results = {}
for key_1 in data.keys():
    # key_1 is '20101021', '20101004'...

    for key_2 in data[key_1].keys():
        # key_2 is '4x4', '4x2'...
        if key_2 not in results:
            results[key_2] = dict.fromkeys(VALID_KEYS, 0)
        for key_3 in data[key_1][key_2].keys():
            # key_3 is 'Central Spectrum', 'Full Frame', 'Custom'...
            if key_3 in VALID_KEYS:
                results[key_2][key_3] += data[key_1][key_2][key_3]
print results

      

This outputs:

{
    u'1x1': {'Central Spectrum': 10, 'Full Frame': 9},
    u'4x4': {'Central Spectrum': 10, 'Full Frame': 10},
    u'4x2': {'Central Spectrum': 10, 'Full Frame': 10}
}
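Since the question mentions Counter, the same accumulation could also be sketched with collections.defaultdict and collections.Counter. This is an alternative illustration, not part of the answer above: the sample data is abbreviated from the question, the name totals is made up here, and the code uses Python 3 syntax.

```python
from collections import Counter, defaultdict

# Sketch only: sample data abbreviated from the question.
data = {
    "20101021": {
        "4x4": {"Central Spectrum": 5, "Full Frame": 5, "Custom": 1},
        "4x2": {"Central Spectrum": 5, "Full Frame": 5},
        "1x1": {"Central Spectrum": 5, "Full Frame": 4},
    },
    "20101004": {
        "4x4": {"Central Spectrum": 5, "Full Frame": 5},
        "4x2": {"Central Spectrum": 5, "Full Frame": 5},
        "1x1": {"Central Spectrum": 5, "Full Frame": 5},
    },
}

VALID_KEYS = ("Central Spectrum", "Full Frame")

# One Counter per bin ('1x1', '4x2', ...), created on first use.
totals = defaultdict(Counter)
for bins in data.values():
    for bin_name, rois in bins.items():
        for roi, count in rois.items():
            if roi in VALID_KEYS:
                totals[bin_name][roi] += count

# Print every bin/ROI combination that was seen, in sorted bin order.
for bin_name, counter in sorted(totals.items()):
    for roi in VALID_KEYS:
        print("%s, %s : %d" % (bin_name, roi, counter[roi]))
```

A Counter returns 0 for keys it has never seen, so a combination that is missing from the data can still be printed without raising a KeyError.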

      
