Why are (some) dict views hashable?
In Python 3, the keys(), values(), and items() methods provide dynamic views of their respective elements. These were backported to Python 2.7 and are available there as viewkeys, viewvalues, and viewitems. I'm referring to them interchangeably here.
Is there a reasonable explanation for this:
#!/usr/bin/python3.4
In [1]: hash({}.keys())
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-1-3727b260127e> in <module>()
----> 1 hash({}.keys())
TypeError: unhashable type: 'dict_keys'
In [2]: hash({}.items())
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-2-decac720f012> in <module>()
----> 1 hash({}.items())
TypeError: unhashable type: 'dict_items'
In [3]: hash({}.values())
Out[3]: -9223363248553358775
I found this quite unexpected.
The Python docs glossary entry for hashable says:
An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method). Hashable objects which compare equal must have the same hash value.
Well, the first part actually checks out; it doesn't appear that the hash of a dict_values object will change over its lifetime, even though its underlying values certainly can.
In [11]: d = {}
In [12]: vals = d.values()
In [13]: vals.__hash__()
Out[13]: -9223363248553358718
In [14]: d['a'] = 'b'
In [15]: vals
Out[15]: dict_values(['b'])
In [16]: vals.__hash__()
Out[16]: -9223363248553358718
But the part about __eq__()... well, it doesn't actually have one of those.
In [17]: {'a':'a'}.values().__eq__('something else')
Out[17]: NotImplemented
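For contrast, here is a minimal sketch of how the three views respond to ==, assuming CPython 3 semantics. The variable names are just for illustration. The keys and items views compare like sets, while the values view falls back to the identity comparison inherited from object.

```python
# Sketch: compare how the three views answer ==, assuming CPython 3 semantics.
d1 = {'a': 1}
d2 = {'a': 2}

# keys() and items() views implement set-like equality.
keys_equal = d1.keys() == d2.keys()        # same set of keys
items_equal = d1.items() == d2.items()     # values differ, so the pairs differ

# values() views fall back to identity comparison from object.
vals = d1.values()
same_object_equal = vals == vals                # a view is equal to itself
fresh_views_equal = d1.values() == d1.values()  # two distinct view objects

print(keys_equal, items_equal, same_object_equal, fresh_views_equal)
# prints: True False True False
```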
So... yeah. Can anyone make sense of this? Is there a reason for this asymmetry, that of the three viewfoo methods, only dict_values objects are hashable?
I believe this happens because viewitems and viewkeys provide a custom rich comparison function, but viewvalues does not. Here are the definitions of each view type:
PyTypeObject PyDictKeys_Type = {
PyVarObject_HEAD_INIT(&PyType_Type, 0)
"dict_keys", /* tp_name */
sizeof(dictviewobject), /* tp_basicsize */
0, /* tp_itemsize */
/* methods */
(destructor)dictview_dealloc, /* tp_dealloc */
0, /* tp_print */
0, /* tp_getattr */
0, /* tp_setattr */
0, /* tp_reserved */
(reprfunc)dictview_repr, /* tp_repr */
&dictviews_as_number, /* tp_as_number */
&dictkeys_as_sequence, /* tp_as_sequence */
0, /* tp_as_mapping */
0, /* tp_hash */
0, /* tp_call */
0, /* tp_str */
PyObject_GenericGetAttr, /* tp_getattro */
0, /* tp_setattro */
0, /* tp_as_buffer */
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
0, /* tp_doc */
(traverseproc)dictview_traverse, /* tp_traverse */
0, /* tp_clear */
dictview_richcompare, /* tp_richcompare */
0, /* tp_weaklistoffset */
(getiterfunc)dictkeys_iter, /* tp_iter */
0, /* tp_iternext */
dictkeys_methods, /* tp_methods */
0,
};
PyTypeObject PyDictItems_Type = {
PyVarObject_HEAD_INIT(&PyType_Type, 0)
"dict_items", /* tp_name */
sizeof(dictviewobject), /* tp_basicsize */
0, /* tp_itemsize */
/* methods */
(destructor)dictview_dealloc, /* tp_dealloc */
0, /* tp_print */
0, /* tp_getattr */
0, /* tp_setattr */
0, /* tp_reserved */
(reprfunc)dictview_repr, /* tp_repr */
&dictviews_as_number, /* tp_as_number */
&dictitems_as_sequence, /* tp_as_sequence */
0, /* tp_as_mapping */
0, /* tp_hash */
0, /* tp_call */
0, /* tp_str */
PyObject_GenericGetAttr, /* tp_getattro */
0, /* tp_setattro */
0, /* tp_as_buffer */
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
0, /* tp_doc */
(traverseproc)dictview_traverse, /* tp_traverse */
0, /* tp_clear */
dictview_richcompare, /* tp_richcompare */
0, /* tp_weaklistoffset */
(getiterfunc)dictitems_iter, /* tp_iter */
0, /* tp_iternext */
dictitems_methods, /* tp_methods */
0,
};
PyTypeObject PyDictValues_Type = {
PyVarObject_HEAD_INIT(&PyType_Type, 0)
"dict_values", /* tp_name */
sizeof(dictviewobject), /* tp_basicsize */
0, /* tp_itemsize */
/* methods */
(destructor)dictview_dealloc, /* tp_dealloc */
0, /* tp_print */
0, /* tp_getattr */
0, /* tp_setattr */
0, /* tp_reserved */
(reprfunc)dictview_repr, /* tp_repr */
0, /* tp_as_number */
&dictvalues_as_sequence, /* tp_as_sequence */
0, /* tp_as_mapping */
0, /* tp_hash */
0, /* tp_call */
0, /* tp_str */
PyObject_GenericGetAttr, /* tp_getattro */
0, /* tp_setattro */
0, /* tp_as_buffer */
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
0, /* tp_doc */
(traverseproc)dictview_traverse, /* tp_traverse */
0, /* tp_clear */
0, /* tp_richcompare */
0, /* tp_weaklistoffset */
(getiterfunc)dictvalues_iter, /* tp_iter */
0, /* tp_iternext */
dictvalues_methods, /* tp_methods */
0,
};
Note that tp_richcompare is defined as dictview_richcompare for items and keys, but not for values. Now, the documentation for __hash__ says this:
A class that overrides __eq__() and does not define __hash__() will have its __hash__() implicitly set to None. ...
If a class that overrides __eq__() needs to retain the implementation of __hash__() from a parent class, the interpreter must be told this explicitly by setting __hash__ = <ParentClass>.__hash__.
If a class that does not override __eq__() wishes to suppress hash support, it should include __hash__ = None in the class definition.
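The three rules quoted above can be checked in pure Python. This is only a sketch: the class names (Parent, OverridesEq, KeepsHash, SuppressesHash) and the is_hashable helper are made up for illustration and are not part of CPython.

```python
class Parent:
    """Plain class: inherits both __eq__ and __hash__ from object."""


class OverridesEq(Parent):
    """Defines __eq__ only, so Python implicitly sets __hash__ to None."""
    def __eq__(self, other):
        return isinstance(other, OverridesEq)


class KeepsHash(Parent):
    """Defines __eq__ but explicitly keeps the parent's __hash__."""
    def __eq__(self, other):
        return isinstance(other, KeepsHash)
    __hash__ = Parent.__hash__


class SuppressesHash(Parent):
    """Does not override __eq__ but opts out of hashing."""
    __hash__ = None


def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


results = {cls.__name__: is_hashable(cls())
           for cls in (Parent, OverridesEq, KeepsHash, SuppressesHash)}
print(results)
# prints: {'Parent': True, 'OverridesEq': False, 'KeepsHash': True, 'SuppressesHash': False}
```

In these terms, dict_keys and dict_items behave like OverridesEq, and dict_values behaves like Parent.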
So, since items/keys override __eq__() (by providing a tp_richcompare function), they would need to explicitly define __hash__ as equal to the parent's in order to retain an implementation of it. Because values does not override __eq__(), it inherits __hash__ from object, because tp_hash and tp_richcompare get inherited from the parent if they are both NULL:
This field is inherited by subtypes together with tp_richcompare: a subtype inherits both tp_richcompare and tp_hash when the subtype's tp_richcompare and tp_hash are both NULL.
The fact that the implementation of dict_values doesn't prevent this automatic inheritance would probably be considered a bug.
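The inheritance can also be observed from Python itself. A rough sketch, assuming CPython 3 (other implementations may expose these attributes differently); the variable names are illustrative only:

```python
d = {'a': 'b'}
keys_type = type(d.keys())
values_type = type(d.values())

# dict_keys defines a rich comparison but no hash, so __hash__ ends up as None ...
keys_hash_attr = keys_type.__hash__

# ... while dict_values resolves __hash__ and __eq__ to object's implementations.
values_inherits_hash = values_type.__hash__ is object.__hash__
values_inherits_eq = values_type.__eq__ is object.__eq__

vals = d.values()
hash_before = hash(vals)
d['c'] = 'd'             # mutate the underlying dict
hash_after = hash(vals)  # the identity-based hash is unaffected

print(keys_hash_attr, values_inherits_hash, values_inherits_eq,
      hash_before == hash_after)
```

On CPython this should show None for the dict_keys hash attribute and True for the remaining three checks, which matches the slot inheritance described above.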