How do I implement __cmp __ () and __hash __ () for my class?

I would like to write a class that can be used as a key in hashable collections (for example in dict

). I know that custom classes are hashable by default, but usage id(self)

would be wrong.

It is contained in my class tuple

as a member variable. Getting from tuple

doesn't look like a parameter, because in my constructor I don't get the same arguments as the constructor tuple

. But perhaps this is not a limitation?

I basically need a hash tuple

of how this would give a real tuple.

hash(self.member_tuple)

does just that.

The idea here is that two tuples can be equal if they are id

not equal.

If I implement mine __cmp__()

like this:

def __cmp__(self, other):
    return cmp(self, other)

      

will this be automatically applied to hash(self)

for comparison? ... or should I implement it like this:

def __cmp__(self, other):
    return cmp(self.member_tuple, other)

      

My function is __hash__()

implemented to return the hash of the held tuple

, i.e .:

def __hash__(self):
    return hash(self.member_tuple)

      

Basically, how do __cmp__()

and interact __hash__()

? I don't know if there will be in the __cmp__()

other

hash or not, and whether it should be compared with "my" hash (which would be one of the held ones tuple

) or against self

.

So which one is correct?

Can anyone shed some light on this and maybe point me to the documentation?

+3


source to share


1 answer


I would not use __cmp__

and would use instead __eq__

. This is enough for hashing and you don't want to extend to sorting here. Moreover, __cmp__

it was removed from the Python 3 to the rich comparison methods ( __eq__

, __lt__

, __gt__

etc.).

Then yours __eq__

should return True if the member tuples are equal:

def __eq__(self, other):
    if not isinstance(other, ThisClass):
        return NotImplemented
    return self.member_tuple == other.member_tuple

      

Returning a NotImplemented

singleton when the object's type other

does not match is good practice because it delegates the equality test to the object other

; if it doesn't implement __eq__

or also returns NotImplemented

Python will fallback to standard test id()

.

Your implementation __hash__

is in place.



Since the hash doesn't have to be unique (it's just a slot picker in the hash table), then equality is used to determine if a matching key is already present or if a hash collision has occurred. Because such __eq__

(or __cmp__

, if __eq__

absent), is not called if the slot to which the object is hashed is empty.

This means that if two objects are considered equal ( a.__eq__(b)

returns True

), then their hash values ​​must also be equal. Otherwise, you may end up with a corrupted dictionary as Python will no longer be able to determine if a key is present in the hash table.

If both methods __eq__

and __hash__

delegate their responsibilities attribute self.member_tuple

, you retain the property; you can trust a base type tuple

that has been correctly implemented.

See hashing glossary definition and object.__hash__()

documentation
. If you're interested, I've written about how types work dict

and set

:

+5


source







All Articles