Hashable Objects in Python

An object is considered hashable if it has a hash value which never changes during its lifetime i.e. it's immutable, and it can be compared to other objects. In Python, integer, boolean, string, & tuple are hashable objects. Hashable objects can be used as keys in dictionaries and elements in sets. Mutable objects, such as lists, sets, dictionaries, and user defined objects are not hashable. The way to generate a hash value is by using the hash() function. Here we will see an example of comparing immutable objects to find duplicates. from typing import Any def findDuplicates(nums: Any): seen = set() duplicates = [] for n in nums: if n in seen: duplicates.append(n) else: seen.add(n) return duplicates numList = [1, 2, 3, 3, 5, 5, 6] print(findDuplicates(numList)) # [3, 5] strList = ["a", "b", "c", "c", "5", "5", "z"] print(findDuplicates(strList)) # ["c", "5"] boolList = [True, True, False] print(findDuplicates(boolList)) # [True] tupleList = [(1, "Adam"), (1, "Adam"), (2, "Bruce")] print(findDuplicates(tupleList)) # [(1, "Adam")] The in keyword compares the contents in a dictionary (keys) and set (values) to check if the new value is present in it or not. Sets, dictionaries or user defined objects cannot be used as keys for a dictionary or added to sets right away, because their values can be changed. In this article, we will look at ways to make some of the non-hashable user defined objects, hashable. User Defined object For an object to be hashable, we should define a way to generate a hash, which will be used to compare it to other objects from the same class. class User: def __init__(self, id: int, name: str): self.id = id self.name = name def __repr__(self): """For readability""" return f"({self.id},{self.name})" userList = [User(1, "Adam"), User(1, "Adam"), User(2, "Bruce")] print(findDuplicates(userList)) # [] In the above example, no duplicates are detected from a list of simple class. The set seen does contain all the 3 classes {(2,Bruce), (1,Adam), (1,Adam)} but it can't find the similarity. So how do you make an object unique in terms of its contents. The answer is by implementing the dunder method __hash__() for the object. For an object, it's not just enough to implement the __hash__() method. It also requires the user to implement __eq__() method to allow comparison of the hash values. That's how the in keyword can do its job. The data members specified in the __hash__() and __eq__() methods will be used to generate and compare the hash values. Let's look at an example of a simple User class on how the __hash__() and __eq__() methods look. class User: def __init__(self, id: int, name: str): self.id = id self.name = name def __eq__(self, other): """Compare to check equality""" return ( # instance of the same class isinstance(other, User) and self.id == other.id and self.name == other.name ) def __hash__(self): """Generate hash value for this instance""" return hash((self.id, self.name)) def __repr__(self): """For readability""" return f"({self.id},{self.name})" userList = [User(1, "Adam"), User(1, "Adam"), User(2, "Bruce")] print(findDuplicates(userList)) # [(1,Adam)] The above class shows that the id and name of an User are used to generate a unique hash value, and they are also used in comparing two different User instances. It is able to detect an object as duplicate. For better understanding, the set seen contains {(2,Bruce), (1,Adam)} only two objects. Just by implementing two dunder methods, we are able to use our findDuplicates method to find duplicates in any arbitrary object. This is good for scalability, where a developer just needs to implement __hash__() and __eq__() methods as per teh application requirement and be assured that the findDuplicates method will do its job. That was for user defined object. What about sets and dictionaries. There are various ways you can achieve this, and there is some good discussion [2] on which way is the best. For completeness, I would like to list the ways I would generate a has value for a set and a dictionary. Set & Dictionary def findDuplicates(nums: List[Any]): """Find duplicates in list of any type""" seen = set() duplicates = [] for n in nums: tmp = n if isinstance(n, set): tmp = hash(frozenset(tmp)) if isinstance(n, dict): tmp = hash(frozenset(tmp.items())) if tmp in seen: duplicates.append(n) else: seen.add(tmp) return duplicates This is how the final findDuplicates method will look like which will work for arrays of integers, strings, sets, dictionaries, and also user defined objects, after implementing the __hash_

Mar 15, 2025 - 19:37
 0
Hashable Objects in Python

An object is considered hashable if it has a hash value which never changes during its lifetime i.e. it's immutable, and it can be compared to other objects.

In Python, integer, boolean, string, & tuple are hashable objects. Hashable objects can be used as keys in dictionaries and elements in sets. Mutable objects, such as lists, sets, dictionaries, and user defined objects are not hashable.

The way to generate a hash value is by using the hash() function.

Here we will see an example of comparing immutable objects to find duplicates.

from typing import Any

def findDuplicates(nums: Any):
    seen = set()
    duplicates = []
    for n in nums:
        if n in seen:
            duplicates.append(n)
        else:
            seen.add(n)
    return duplicates

numList = [1, 2, 3, 3, 5, 5, 6]
print(findDuplicates(numList)) # [3, 5]

strList = ["a", "b", "c", "c", "5", "5", "z"]
print(findDuplicates(strList)) # ["c", "5"]

boolList = [True, True, False]
print(findDuplicates(boolList)) # [True]

tupleList = [(1, "Adam"), (1, "Adam"), (2, "Bruce")]
print(findDuplicates(tupleList)) # [(1, "Adam")]

The in keyword compares the contents in a dictionary (keys) and set (values) to check if the new value is present in it or not.

Sets, dictionaries or user defined objects cannot be used as keys for a dictionary or added to sets right away, because their values can be changed.

In this article, we will look at ways to make some of the non-hashable user defined objects, hashable.

User Defined object

For an object to be hashable, we should define a way to generate a hash, which will be used to compare it to other objects from the same class.

class User:

    def __init__(self, id: int, name: str):
        self.id = id
        self.name = name

    def __repr__(self):
        """For readability"""
        return f"({self.id},{self.name})"

userList = [User(1, "Adam"), User(1, "Adam"), User(2, "Bruce")]
print(findDuplicates(userList)) # []

In the above example, no duplicates are detected from a list of simple class. The set seen does contain all the 3 classes {(2,Bruce), (1,Adam), (1,Adam)} but it can't find the similarity.

So how do you make an object unique in terms of its contents.

The answer is by implementing the dunder method __hash__() for the object.

For an object, it's not just enough to implement the __hash__() method. It also requires the user to implement __eq__() method to allow comparison of the hash values. That's how the in keyword can do its job.

The data members specified in the __hash__() and __eq__() methods will be used to generate and compare the hash values.

Let's look at an example of a simple User class on how the __hash__() and __eq__() methods look.

class User:

    def __init__(self, id: int, name: str):
        self.id = id
        self.name = name

    def __eq__(self, other):
        """Compare to check equality"""
        return (
            # instance of the same class
            isinstance(other, User)
            and self.id == other.id
            and self.name == other.name
        )

    def __hash__(self):
        """Generate hash value for this instance"""
        return hash((self.id, self.name))

    def __repr__(self):
        """For readability"""
        return f"({self.id},{self.name})"

userList = [User(1, "Adam"), User(1, "Adam"), User(2, "Bruce")]
print(findDuplicates(userList)) # [(1,Adam)]

The above class shows that the id and name of an User are used to generate a unique hash value, and they are also used in comparing two different User instances.

It is able to detect an object as duplicate. For better understanding, the set seen contains {(2,Bruce), (1,Adam)} only two objects. Just by implementing two dunder methods, we are able to use our findDuplicates method to find duplicates in any arbitrary object.

This is good for scalability, where a developer just needs to implement __hash__() and __eq__() methods as per teh application requirement and be assured that the findDuplicates method will do its job.

That was for user defined object. What about sets and dictionaries.
There are various ways you can achieve this, and there is some good discussion [2] on which way is the best.

For completeness, I would like to list the ways I would generate a has value for a set and a dictionary.

Set & Dictionary

def findDuplicates(nums: List[Any]):
    """Find duplicates in list of any type"""
    seen = set()

    duplicates = []
    for n in nums:
        tmp = n
        if isinstance(n, set):
            tmp = hash(frozenset(tmp))
        if isinstance(n, dict):
            tmp = hash(frozenset(tmp.items()))
        if tmp in seen:
            duplicates.append(n)
        else:
            seen.add(tmp)

    return duplicates

This is how the final findDuplicates method will look like which will work for arrays of integers, strings, sets, dictionaries, and also user defined objects, after implementing the __hash__() and __eq__() methods.

References

  1. https://hyperskill.org/learn/step/7823
  2. https://stackoverflow.com/questions/1151658/python-hashable-dicts