Python nested defaultdict with mixed data types
So how can I create a defaultdict for this:
{
    'branch': {
        'count': 23,
        'leaf': {
            'tag1': 30,
            'tag2': 10
        }
    },
}
so that I get zeros for count, tag1 and tag2 by default? I want to populate the dict dynamically as I read the input. When I see a new branch, I want to create a dict with count set to zero and an empty dict as leaf. When I get a leaf, I want to create a key with its name and set the value to zero.
Update: Accepted Martijn's answer as he has more points, but the other answers are equally good.
You cannot do this with defaultdict, because the factory does not have access to the key.
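A minimal sketch of that limitation: defaultdict calls its default_factory with no arguments, so a factory that expects the key fails on the first missing lookup. The name key_aware_factory is made up for illustration.

```python
from collections import defaultdict


def key_aware_factory(key):
    # Hypothetical factory that would like to see the missing key.
    return 0 if key == 'count' else {}


d = defaultdict(key_aware_factory)

try:
    d['count']
    error = None
except TypeError as exc:
    # defaultdict invoked key_aware_factory() without passing the key.
    error = exc

print(type(error).__name__)
```

The failed lookup also leaves the dict unchanged, since no value was ever produced.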
However, you can simply subclass dict to create your own "smart" defaultdict-like class. Provide a custom __missing__ method that adds values based on the key:
class KeyBasedDefaultDict(dict):
    def __init__(self, default_factories, *args, **kw):
        # Maps a key name to the zero-argument factory for its default value.
        self._default_factories = default_factories
        super(KeyBasedDefaultDict, self).__init__(*args, **kw)

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        factory = self._default_factories.get(key)
        if factory is None:
            raise KeyError(key)
        new_value = factory()
        self[key] = new_value
        return new_value
Now you can define your own mapping:
mapping = {'count': int, 'leaf': dict}
mapping['branch'] = lambda: KeyBasedDefaultDict(mapping)
tree = KeyBasedDefaultDict(mapping)
Demo:
>>> mapping = {'count': int, 'leaf': dict}
>>> mapping['branch'] = lambda: KeyBasedDefaultDict(mapping)
>>> tree = KeyBasedDefaultDict(mapping)
>>> tree['branch']['count'] += 23
>>> tree['branch']['leaf']['tag1'] = 30
>>> tree['branch']['leaf']['tag2'] = 10
>>> tree
{'branch': {'count': 23, 'leaf': {'tag1': 30, 'tag2': 10}}}
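In the demo above, 'leaf' maps to a plain dict, so a tag still has to be assigned before it can be incremented. A possible variant, sketched here as an assumption rather than as part of the original answer, maps 'leaf' to a defaultdict(int) so that tags start at zero as well. The class is repeated so the sketch runs on its own, and the key 'twig' is invented to show what happens for a key without a factory.

```python
from collections import defaultdict


class KeyBasedDefaultDict(dict):
    """Same class as above, repeated so this sketch is self-contained."""

    def __init__(self, default_factories, *args, **kw):
        self._default_factories = default_factories
        super(KeyBasedDefaultDict, self).__init__(*args, **kw)

    def __missing__(self, key):
        factory = self._default_factories.get(key)
        if factory is None:
            raise KeyError(key)
        new_value = factory()
        self[key] = new_value
        return new_value


# Assumption: leaves should count from zero, so 'leaf' builds a defaultdict(int).
mapping = {'count': int, 'leaf': lambda: defaultdict(int)}
mapping['branch'] = lambda: KeyBasedDefaultDict(mapping)
tree = KeyBasedDefaultDict(mapping)

tree['branch']['count'] += 1
tree['branch']['leaf']['tag1'] += 30      # no KeyError: tag1 starts at 0
unseen = tree['branch']['leaf']['tag2']   # reading creates tag2 with value 0

try:
    tree['twig']                          # no factory registered for 'twig'
    unknown_ok = True
except KeyError:
    unknown_ok = False
```

Keys without a registered factory still raise KeyError, so typos in key names are not silently turned into new entries.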
Answering my own question, but I think this will work as well:
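from collections import defaultdict
import json

def branch():
    # Each new branch starts with a zero count and zero-defaulting leaf tags.
    return {
        'count': 0,
        'leaf': defaultdict(int)
    }

tree = defaultdict(branch)
tree['first_branch']['leaf']['cat2'] = 2
print(json.dumps(tree, indent=2))
# {
#   "first_branch": {
#     "count": 0,
#     "leaf": {
#       "cat2": 2
#     }
#   }
# }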
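As a rough sketch of how this could be filled in while reading input, assuming the input arrives as (branch name, leaf tag) pairs. The records list and its values are invented for illustration and are not from the question.

```python
from collections import defaultdict


def branch():
    # Fresh branch record: a zero count plus zero-defaulting leaf tags.
    return {'count': 0, 'leaf': defaultdict(int)}


# Hypothetical input: (branch name, leaf tag) pairs.
records = [('b1', 'tag1'), ('b1', 'tag2'), ('b1', 'tag1'), ('b2', 'tag1')]

tree = defaultdict(branch)
for branch_name, tag in records:
    node = tree[branch_name]   # created on first sight of the branch
    node['count'] += 1
    node['leaf'][tag] += 1     # created at 0 on first sight of the tag
```

Each branch record is a plain dict, so only 'count' and 'leaf' exist on it; any other key raises KeyError as usual.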
An object has a __dict__ that stores its data, and a class lets you set default values programmatically. There is also a class called Counter, to which I think you should delegate the leaf counting.
Thus, I recommend using an object with a collections.Counter:
import collections

class Branch(object):
    def __init__(self, leafs=(), count=0):
        # Counter returns 0 for leaves it has not seen yet.
        self.leafs = collections.Counter(leafs)
        self.count = count

    def __repr__(self):
        return 'Branch(leafs={0}, count={1})'.format(self.leafs, self.count)

BRANCHES = [Branch(['leaf1', 'leaf2']),
            Branch(['leaf3', 'leaf4', 'leaf3']),
            Branch(['leaf6', 'leaf7']),
            ]
And usage:
>>> import pprint
>>> pprint.pprint(BRANCHES)
[Branch(leafs=Counter({'leaf1': 1, 'leaf2': 1}), count=0),
Branch(leafs=Counter({'leaf3': 2, 'leaf4': 1}), count=0),
Branch(leafs=Counter({'leaf7': 1, 'leaf6': 1}), count=0)]
>>> first_branch = BRANCHES[0]
>>> first_branch.count += 23
>>> first_branch
Branch(leafs=Counter({'leaf1': 1, 'leaf2': 1}), count=23)
>>> first_branch.leafs['leaf that does not exist']
0
>>> first_branch.leafs.update(['new leaf'])
>>> first_branch
Branch(leafs=Counter({'new leaf': 1, 'leaf1': 1, 'leaf2': 1}), count=23)
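To create branches on demand rather than listing them up front, one option is to keep the Branch objects in a defaultdict keyed by branch name. This is only a sketch: the records list is invented for illustration, and it relies on Branch() being callable with no arguments, which the default parameter values allow.

```python
import collections


class Branch(object):
    def __init__(self, leafs=(), count=0):
        self.leafs = collections.Counter(leafs)
        self.count = count


# Hypothetical input: (branch name, leaf name) pairs.
records = [('b1', 'leaf1'), ('b1', 'leaf1'), ('b2', 'leaf3')]

# defaultdict calls Branch() with its defaults for each new branch name.
branches = collections.defaultdict(Branch)
for name, leaf in records:
    branches[name].count += 1
    branches[name].leafs[leaf] += 1   # Counter treats a missing leaf as 0
```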