Are hashing collisions inevitable?

If I create a new Map:

Map<Integer,String> map = new HashMap<Integer,String>();

      

Then I call map.put() a bunch of times, each time with a unique key, say a million times. Will there ever be a collision, or does the Java hashing algorithm guarantee no collisions as long as the keys are unique?

+3




3 answers


Hashing does not guarantee that there will be no collisions, even if every key is unique. In fact, the only requirement is that objects that are equal have the same hash code. The number of collisions determines how efficient a lookup is (fewer collisions, closer to O(1); more collisions, closer to O(n)).

Which hash code an object gets depends on its type. For example, the hash code of a String of length n is computed as

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

      



which necessarily reduces a string of any length to a single int (here ^ means exponentiation). It is therefore definitely possible for two different strings to have the same hash code, although in practice that is fairly rare.
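To make that concrete, here is a small sketch that evaluates the formula by hand and compares it with String.hashCode(). The class and method names (StringHashDemo, manualHash) are made up for illustration. The strings "Aa" and "BB" are a well-known colliding pair.

```java
// Illustrative sketch: StringHashDemo and manualHash are hypothetical names, not JDK APIs.
class StringHashDemo {
    // Evaluates s[0]*31^(n-1) + ... + s[n-1] with Horner's rule.
    // int arithmetic wraps on overflow, which matches String.hashCode().
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(manualHash("Aa") == "Aa".hashCode()); // true
        System.out.println("Aa".hashCode());   // 2112  (65*31 + 97)
        System.out.println("BB".hashCode());   // 2112  (66*31 + 66)
        System.out.println("Aa".equals("BB")); // false: same hash code, different keys
    }
}
```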

If two keys have the same hash code, HashMap uses .equals() to determine whether a particular key matches. This is why it is so important that you override hashCode() and equals() together, and make sure that all objects that are equal have the same hash code.
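As a rough illustration, the sketch below uses a made-up key class (CollidingKey) whose hashCode() deliberately returns the same value for every instance. That still satisfies the contract, since equal objects share a hash code, and HashMap keeps the keys apart through equals(). It only costs performance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key type with a deliberately bad hashCode(): every instance collides.
final class CollidingKey {
    private final int id;

    CollidingKey(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CollidingKey && ((CollidingKey) o).id == id;
    }

    @Override
    public int hashCode() {
        return 42; // legal (equal objects get equal hash codes), but forces collisions
    }
}

class CollidingKeyDemo {
    static Map<CollidingKey, String> build() {
        Map<CollidingKey, String> map = new HashMap<>();
        map.put(new CollidingKey(1), "one");
        map.put(new CollidingKey(2), "two"); // same hash code as key 1, but not equal to it
        return map;
    }

    public static void main(String[] args) {
        Map<CollidingKey, String> map = build();
        System.out.println(map.size());                   // 2
        System.out.println(map.get(new CollidingKey(2))); // two
    }
}
```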

+3




A hash table works roughly like this:

  • A hash map is created with an initial capacity (number of buckets).

  • Every time you add an entry, Java calls the key's hashCode() method to get a number, then takes that number modulo the current number of buckets.

  • The entry is stored in the bucket whose index is that result.



So even if your keys are unique, they can still collide whenever there are fewer buckets than possible hash values for the key.
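Here is a minimal sketch of that bucket selection, using the question's Integer keys. bucketIndex is a hypothetical helper, and the plain modulo is a simplification: as far as I know, the real java.util.HashMap also mixes the hash bits and uses a bit mask over a power-of-two table, but the effect on this example is the same. An Integer's hash code is simply its int value.

```java
class BucketIndexDemo {
    // Simplified model: hash code modulo the number of buckets.
    // floorMod keeps the index non-negative even for negative hash codes.
    // Assumption: this is not HashMap's exact internal computation.
    static int bucketIndex(Object key, int bucketCount) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    public static void main(String[] args) {
        // Unique keys, unique hash codes, same bucket when there are 16 buckets.
        System.out.println(bucketIndex(1, 16));  // 1
        System.out.println(bucketIndex(17, 16)); // 1
        // With more buckets the two keys separate again.
        System.out.println(bucketIndex(17, 32)); // 17
    }
}
```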

+1




You need to know two things:

  • Even when a collision happens, it won't cause a problem, because each bucket holds a list. If you put an entry into a bucket that already contains a value, the entry is simply added to the list. When fetching, the map first works out which bucket to search, then goes through each entry in that bucket's list and finds the one whose key matches (by calling equals()).

  • If you are putting millions of values into a HashMap, you may wonder whether each linked list in the map will end up containing thousands of values, so that every lookup becomes a long, slow linear search. It will not, because the Java HashMap resizes itself (adds more buckets and redistributes the entries) whenever the number of entries exceeds a certain threshold (look at capacity and load factor in the Javadoc). With a properly implemented hash code, the number of entries in each bucket stays small.
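Both points can be sketched with a toy chained hash table. TinyChainedMap is a made-up teaching class, much simpler than java.util.HashMap: it assumes non-null keys, starts with 4 buckets, and doubles the bucket count once the number of entries exceeds 0.75 times the bucket count (0.75 is HashMap's documented default load factor; the rest is my own simplification).

```java
import java.util.ArrayList;
import java.util.List;

// Teaching sketch only: not how java.util.HashMap is actually implemented.
class TinyChainedMap<K, V> {
    private static final double LOAD_FACTOR = 0.75;

    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private List<List<Entry<K, V>>> buckets = newBuckets(4);
    private int size = 0;

    private static <K, V> List<List<Entry<K, V>>> newBuckets(int n) {
        List<List<Entry<K, V>>> b = new ArrayList<>(n);
        for (int i = 0; i < n; i++) b.add(new ArrayList<>());
        return b;
    }

    // Picks the bucket (chain) for a key: hash code modulo bucket count.
    private List<Entry<K, V>> bucketFor(K key) {
        return buckets.get(Math.floorMod(key.hashCode(), buckets.size()));
    }

    void put(K key, V value) {
        List<Entry<K, V>> chain = bucketFor(key);
        for (Entry<K, V> e : chain) {
            if (e.key.equals(key)) { e.value = value; return; } // existing key: overwrite
        }
        chain.add(new Entry<>(key, value)); // collision or empty bucket: append to the chain
        size++;
        if (size > LOAD_FACTOR * buckets.size()) resize();
    }

    V get(K key) {
        for (Entry<K, V> e : bucketFor(key)) { // linear search within one bucket only
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    // Doubles the bucket count and redistributes every entry.
    private void resize() {
        List<List<Entry<K, V>>> old = buckets;
        buckets = newBuckets(old.size() * 2);
        for (List<Entry<K, V>> chain : old) {
            for (Entry<K, V> e : chain) bucketFor(e.key).add(e);
        }
    }

    int size() { return size; }
    int bucketCount() { return buckets.size(); }
}
```

After inserting, say, 100 distinct Integer keys, bucketCount() will have grown well past the initial 4, so each chain stays short and get() only ever scans a handful of entries.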

+1








