Why are (some) dict views hashable?

In Python 3, the keys(), values() and items() methods provide dynamic views of their respective elements. They were backported to Python 2.7, where they are available as viewkeys, viewvalues and viewitems. I refer to them interchangeably here.

Is there a reasonable explanation for this:

#!/usr/bin/python3.4
In [1]: hash({}.keys())
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-3727b260127e> in <module>()
----> 1 hash({}.keys())

TypeError: unhashable type: 'dict_keys'

In [2]: hash({}.items())
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-decac720f012> in <module>()
----> 1 hash({}.items())

TypeError: unhashable type: 'dict_items'

In [3]: hash({}.values())
Out[3]: -9223363248553358775

      

I found this quite unexpected.


The Python docs glossary entry for "hashable" says:

An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method). Hashable objects which compare equal must have the same hash value.

Well, the first part actually checks out; it doesn't appear that the hash of a dict_values object changes over its lifetime, although its underlying values certainly can.

In [11]: d = {}

In [12]: vals = d.values()

In [13]: vals.__hash__()
Out[13]: -9223363248553358718

In [14]: d['a'] = 'b'

In [15]: vals
Out[15]: dict_values(['b'])

In [16]: vals.__hash__()
Out[16]: -9223363248553358718

      

But the part about __eq__()... well, it doesn't actually implement one.

In [17]: {'a':'a'}.values().__eq__('something else')
Out[17]: NotImplemented
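To make the missing comparison concrete, here is a minimal sketch of what that NotImplemented result means in practice. The variable names are illustrative, and the behaviour is what CPython 3 appears to do rather than a documented guarantee: when neither side implements the comparison, == falls back to identity.

```python
d = {'a': 'a'}

# Two separate view objects over the same dict.
v1, v2 = d.values(), d.values()
k1, k2 = d.keys(), d.keys()

# dict_values has no comparison of its own, so == falls back to identity.
values_equal = (v1 == v2)   # distinct view objects
values_same = (v1 == v1)    # the very same object

# dict_keys implements a set-like comparison, so contents are compared.
keys_equal = (k1 == k2)

print(values_equal, values_same, keys_equal)
```

So two values views over the same dict do not compare equal, while two keys views do.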

      

So... yes. Can anyone make sense of this? Is there a reason for this asymmetry, where of the three viewfoo methods only dict_values objects are hashable?



1 answer


I believe this happens because viewitems and viewkeys provide a custom rich-comparison function, but viewvalues does not. Here are the definitions of each view type:

PyTypeObject PyDictKeys_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "dict_keys",                                /* tp_name */
    sizeof(dictviewobject),                     /* tp_basicsize */
    0,                                          /* tp_itemsize */
    /* methods */
    (destructor)dictview_dealloc,               /* tp_dealloc */
    0,                                          /* tp_print */
    0,                                          /* tp_getattr */
    0,                                          /* tp_setattr */
    0,                                          /* tp_reserved */
    (reprfunc)dictview_repr,                    /* tp_repr */
    &dictviews_as_number,                       /* tp_as_number */
    &dictkeys_as_sequence,                      /* tp_as_sequence */
    0,                                          /* tp_as_mapping */
    0,                                          /* tp_hash */
    0,                                          /* tp_call */
    0,                                          /* tp_str */
    PyObject_GenericGetAttr,                    /* tp_getattro */
    0,                                          /* tp_setattro */
    0,                                          /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
    0,                                          /* tp_doc */
    (traverseproc)dictview_traverse,            /* tp_traverse */
    0,                                          /* tp_clear */
    dictview_richcompare,                       /* tp_richcompare */
    0,                                          /* tp_weaklistoffset */
    (getiterfunc)dictkeys_iter,                 /* tp_iter */
    0,                                          /* tp_iternext */
    dictkeys_methods,                           /* tp_methods */
    0,
};

PyTypeObject PyDictItems_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "dict_items",                               /* tp_name */
    sizeof(dictviewobject),                     /* tp_basicsize */
    0,                                          /* tp_itemsize */
    /* methods */
    (destructor)dictview_dealloc,               /* tp_dealloc */
    0,                                          /* tp_print */
    0,                                          /* tp_getattr */
    0,                                          /* tp_setattr */
    0,                                          /* tp_reserved */
    (reprfunc)dictview_repr,                    /* tp_repr */
    &dictviews_as_number,                       /* tp_as_number */
    &dictitems_as_sequence,                     /* tp_as_sequence */
    0,                                          /* tp_as_mapping */
    0,                                          /* tp_hash */
    0,                                          /* tp_call */
    0,                                          /* tp_str */
    PyObject_GenericGetAttr,                    /* tp_getattro */
    0,                                          /* tp_setattro */
    0,                                          /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
    0,                                          /* tp_doc */
    (traverseproc)dictview_traverse,            /* tp_traverse */
    0,                                          /* tp_clear */
    dictview_richcompare,                       /* tp_richcompare */
    0,                                          /* tp_weaklistoffset */
    (getiterfunc)dictitems_iter,                /* tp_iter */
    0,                                          /* tp_iternext */
    dictitems_methods,                          /* tp_methods */
    0,
};

PyTypeObject PyDictValues_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "dict_values",                              /* tp_name */
    sizeof(dictviewobject),                     /* tp_basicsize */
    0,                                          /* tp_itemsize */
    /* methods */
    (destructor)dictview_dealloc,               /* tp_dealloc */
    0,                                          /* tp_print */
    0,                                          /* tp_getattr */
    0,                                          /* tp_setattr */
    0,                                          /* tp_reserved */
    (reprfunc)dictview_repr,                    /* tp_repr */
    0,                                          /* tp_as_number */
    &dictvalues_as_sequence,                    /* tp_as_sequence */
    0,                                          /* tp_as_mapping */
    0,                                          /* tp_hash */
    0,                                          /* tp_call */
    0,                                          /* tp_str */
    PyObject_GenericGetAttr,                    /* tp_getattro */
    0,                                          /* tp_setattro */
    0,                                          /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
    0,                                          /* tp_doc */
    (traverseproc)dictview_traverse,            /* tp_traverse */
    0,                                          /* tp_clear */
    0,                                          /* tp_richcompare */
    0,                                          /* tp_weaklistoffset */
    (getiterfunc)dictvalues_iter,               /* tp_iter */
    0,                                          /* tp_iternext */
    dictvalues_methods,                         /* tp_methods */
    0,
};

      

Note that tp_richcompare is set to dictview_richcompare for both items and keys, but not for values. Now, the documentation for __hash__ says the following:

A class that overrides __eq__() and does not define __hash__() will have its __hash__() implicitly set to None.

...

If a class that overrides __eq__() needs to retain the implementation of __hash__() from a parent class, the interpreter must be told this explicitly by setting __hash__ = <ParentClass>.__hash__.

If a class that does not override __eq__() wishes to suppress hash support, it should include __hash__ = None in the class definition.
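The rules quoted above can be checked at the Python level. This is only a sketch: the class names and the is_hashable helper are made up for illustration and are not part of the dict view implementation.

```python
class Base:
    """Defines neither __eq__ nor __hash__, so it uses object's defaults."""


class EqOnly(Base):
    # Overriding __eq__ without __hash__ implicitly sets __hash__ to None.
    def __eq__(self, other):
        return self is other


class EqAndHash(Base):
    def __eq__(self, other):
        return self is other

    # Explicitly keep the parent's hash implementation.
    __hash__ = Base.__hash__


class NoHash(Base):
    # No __eq__ override, but hashing is suppressed on purpose.
    __hash__ = None


def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


results = {cls.__name__: is_hashable(cls())
           for cls in (Base, EqOnly, EqAndHash, NoHash)}
print(results)
```

In this analogy, EqOnly plays the role of dict_keys and dict_items, while Base is closest to dict_values.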



So, since items and keys override __eq__() (by providing a tp_richcompare function), they would need to explicitly set __hash__ to the parent's implementation in order to keep it. Since values does not override __eq__(), it inherits __hash__ from object, because tp_hash and tp_richcompare are inherited from the parent when both are NULL:

This field is inherited by subtypes together with tp_richcompare: a subtype inherits both of tp_richcompare and tp_hash, when the subtype's tp_richcompare and tp_hash are both NULL.

The fact that the implementation of dict_values does not prevent this automatic inheritance would probably be considered a bug.
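The effect of those slots can be observed from Python. This is a rough check, assuming a CPython 3 interpreter like the one in the question; the variable names are illustrative, and other implementations or future versions may behave differently.

```python
d = {}

# keys and items set tp_richcompare but not tp_hash, so __hash__ ends up None.
keys_hash = type(d.keys()).__hash__
items_hash = type(d.items()).__hash__

# values sets neither slot, so it keeps the identity-based hash from object.
vals = d.values()
before = hash(vals)
d['a'] = 'b'          # mutate the underlying dict
after = hash(vals)

print(keys_hash, items_hash, before == after)
```

The hash of the values view stays the same after the mutation because it depends on the identity of the view object, not on the dict's contents.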
