Hashmap implementation in C++: hashing for a templated data type

I've been using the STL unordered_map lately, and while it works well, I don't quite understand how the hashing function works, given that the data type is specified as a template parameter. To better understand this data structure, I implemented my own little Hashmap class in C++:

Hashmap interface:

#ifndef _HASHMAP_H_
#define _HASHMAP_H_

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <iostream>
#include <string>
#include <vector>

//Beginning of Hashmap class definition

template <class Key, class Value>
class Hashmap{

    int mappedElementCount;

protected:
    //Storage returned by getKeys()/getValues(). These two declarations are
    //assumed here because the implementation refers to them.
    std::vector<Key>* keys = nullptr;
    std::vector<Value>* values = nullptr;

public:
    explicit Hashmap();
    virtual ~Hashmap();

    virtual void test();

    virtual int hash(Key*);

    int* getSize();

    void putKVPair(Key*,Value*);

    void clearMap();

    //When we use these methods, we'll want a linear vector of keys and values to
    //iterate over, so vector is good here
    std::vector<Key>* getKeys();
    std::vector<Value>* getValues();

}; //end hashmap class definition
#endif /*_HASHMAP_H_*/


Hashmap implementation:

#include "Hashmap.h"

template<class Key,class Value> Hashmap<Key,Value>::Hashmap(){
    mappedElementCount = 0;
}

template<class Key,class Value> Hashmap<Key,Value>::~Hashmap(){
    printf("\nDestroying the base Hashmap object!\n");
}

template<class Key,class Value> void Hashmap<Key,Value>::test(){
    printf("The size of our Key is %zu and the size of our Value is %zu\n",sizeof(Key),sizeof(Value));
}

template<class Key,class Value> int Hashmap<Key,Value>::hash(Key* k_ptr){

    unsigned int hashval;

    /* we start our hash out at 0 */
    hashval = 0;

    //TODO: How do we generate a hash signature when we don't know what data type
    //we're going to be working with?

    //guard against a modulo by zero while the map is still empty
    return mappedElementCount > 0 ? hashval % mappedElementCount : 0;
}

template<class Key,class Value> std::vector<Key>* Hashmap<Key,Value>::getKeys(){
    //TODO: prepare a vector initialized with all Key objects and return it here
    return keys;
}

template<class Key,class Value> std::vector<Value>* Hashmap<Key,Value>::getValues(){
    //TODO: prepare a vector initialized with all Value objects and return it here
    return values;
}

template<class Key,class Value> int* Hashmap<Key,Value>::getSize(){
    return &mappedElementCount;
}

template<class Key,class Value> void Hashmap<Key,Value>::putKVPair(Key* k, Value* v){
    //TODO: implement hashing of the key object k to determine
    //the address of the value object v

    //first step, generate a hash from our key
    int tempHash = hash(k);

    //TODO: store the Value at an address given by or influenced by tempHash

    //If all was successfully completed, increment the mapped records counter
}

template<class Key,class Value> void Hashmap<Key,Value>::clearMap(){
    //TODO: implement a cascading chain of deallocation of stored objects within the
    //MAYBE-- only if we create new objects rather than just mapping reference
    //which is really the goal here...  In the latter case, just empty the Hashmap
}

One possible OOP way to solve this problem is to use Hashmap as the base class and provide derived classes that have well-known Key data types such as the following Stringmap:

Stringmap interface:

#ifndef _STRINGMAP_H_
#define _STRINGMAP_H_

#include <string>
#include "Hashmap.h"

template <class Value>
class Stringmap:public Hashmap<std::string,Value>{

public:
    //Con/de 'structors
    explicit Stringmap();
    ~Stringmap();

    //Here we know our Key will be of type std::string
    //so we can generate our hash sig by char values
    //Override hash from the base class
    int hash(std::string*);

    //override test from base class
    void test();

}; //end Stringmap class definition
#endif /*_STRINGMAP_H_ def*/


Stringmap implementation:

#include "Stringmap.h"

template<class Value> Stringmap<Value>::Stringmap():Hashmap<std::string,Value>(){
}

template<class Value> Stringmap<Value>::~Stringmap(){
    printf("\nDestroying the derived stringmap object!\n");
}

template<class Value> void Stringmap<Value>::test(){
    printf("The size of our Value is %zu\n",sizeof(Value));
}

template<class Value> int Stringmap<Value>::hash(std::string* str_ptr){

    unsigned int hashval;

    /* we start our hash out at 0 */
    hashval = 0;

    /* for each character, we multiply the old hash by 31 and add the current
     * character.  Remember that shifting a number left is equivalent to
     * multiplying it by 2 raised to the number of places shifted.  So we
     * are in effect multiplying hashval by 32 and then subtracting hashval.
     * Why do we do this?  Because shifting and subtraction are much more
     * efficient operations than multiplication.
     */
    for(size_t i=0;i<str_ptr->length();i++) {
        hashval = (*(str_ptr))[i] + ((hashval << 5) - hashval);
    }

    /* we then return the hash value mod the hashmap size so that it will
     * fit into the necessary range
     */
    return hashval % (*(Hashmap<std::string,Value>::getSize()));
}



So the question is: is it possible to create a hash signature when the data type to be hashed is not known in advance? If so, how? Looking at the std::hash docs, it seems that the C++ standard just defines a hash function for each primitive data type, and also for T* (for any type T). What's missing is how this hashing is implemented for a given primitive data type and, moreover, how it is implemented for a generic T*. I suppose I could just call hash(Key) and hope for the best, but it would be nice to understand what's going on behind the scenes.

thanks CCJ



2 answers


std::unordered_map takes 2 explicit template parameters (Key and Value) and also has a bunch of defaulted template parameters, one of which is the hash function, which defaults to std::hash<Key>.


This STL hash function std::hash<Key> takes a Key and returns a std::size_t. It is already specialized for all integral types and for std::string. From this reference site:

The hash template defines a function object that implements a hash function. Instances of this function object define an operator() that:

  • Accepts a single parameter of type Key.
  • Returns a value of type size_t that represents the hash value of the parameter.
  • Does not throw exceptions when called.
  • For two parameters k1 and k2 that are equal, std::hash<Key>()(k1) == std::hash<Key>()(k2).
  • For two different parameters k1 and k2 that are not equal, the probability that std::hash<Key>()(k1) == std::hash<Key>()(k2) should be very small, approaching 1.0/std::numeric_limits<size_t>::max().

The hash template is both CopyConstructible and Destructible. The unordered associative containers std::unordered_set, std::unordered_multiset, std::unordered_map and std::unordered_multimap use specializations of the template std::hash as the default hash function.

The link ends with this quote:

Actual hash functions are implementation dependent and are not required to meet any quality criteria other than those listed above.

So you can look at your own system's implementation, but that guarantees nothing about the implementation on other systems.
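To make the function-object interface quoted above concrete, here is a minimal sketch of calling std::hash directly. The helper name hash_of is made up for illustration and is not part of the standard library; the actual hash values differ between implementations, so the example only relies on equal keys giving equal hashes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hash any key type for which std::hash<Key> is specialized.
// hash_of is a hypothetical helper, not a standard function.
template <class Key>
std::size_t hash_of(const Key& key) {
    // Construct the function object, then invoke its operator().
    return std::hash<Key>()(key);
}

// Equal keys must produce equal hashes within a single program run.
inline void example_usage() {
    assert(hash_of(42) == hash_of(42));
    assert(hash_of(std::string("abc")) == hash_of(std::string("abc")));
}
```

A generic container can call std::hash<Key> the same way without knowing anything about Key, as long as a specialization exists for it.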



There is a template std::hash<T> that is specialized for various types and that you can specialize for your own types.
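As a rough sketch of specializing std::hash for your own type: Point below is a hypothetical key type, and the way the two member hashes are mixed is just one common choice (an assumption on my part, not something the standard prescribes).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Point is a hypothetical user-defined key type.
struct Point {
    int x;
    int y;
    // unordered_map also needs equality to resolve collisions.
    bool operator==(const Point& other) const {
        return x == other.x && y == other.y;
    }
};

namespace std {
// Specialization of std::hash for the user-defined type.
template <>
struct hash<Point> {
    std::size_t operator()(const Point& p) const noexcept {
        std::size_t h1 = std::hash<int>()(p.x);
        std::size_t h2 = std::hash<int>()(p.y);
        // Mix the two member hashes (one common combining formula).
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};
}  // namespace std

// With the specialization in place, Point works as a key out of the box.
inline std::string lookup_example() {
    std::unordered_map<Point, std::string> names;
    names[Point{1, 2}] = "a";
    names[Point{3, 4}] = "b";
    return names.at(Point{1, 2});
}
```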

By default, std::unordered_map<T> just delegates the hashing to std::hash<T> (or you can specify a different hash function as a template argument).
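For instance, the multiply-by-31 string hash from the question could be passed as that template argument. This is only a sketch; Times31Hash and WordCounts are names I made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical hasher reusing the question's multiply-by-31 scheme.
struct Times31Hash {
    std::size_t operator()(const std::string& s) const noexcept {
        std::size_t h = 0;
        for (unsigned char c : s) {
            h = c + ((h << 5) - h);  // same as h * 31 + c
        }
        return h;
    }
};

// The third template argument replaces the default std::hash<std::string>.
using WordCounts = std::unordered_map<std::string, int, Times31Hash>;

// Count a few words and return how often "apple" was seen.
inline int count_word_example() {
    WordCounts counts;
    ++counts["apple"];
    ++counts["apple"];
    ++counts["pear"];
    return counts["apple"];
}
```

Note that the hasher returns the full std::size_t value; reducing it to a bucket index is the container's job, not the hash function's.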

Thus, std::unordered_map doesn't need to know anything about the hashing mechanism.
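Applied to the Hashmap from the question, the same idea might look like the sketch below: the hash function becomes a third template parameter that defaults to std::hash<Key>, so the container itself never needs to know how a Key is hashed. SimpleHashmap and its members are hypothetical names, and separate chaining with a fixed bucket count is just the simplest layout I could pick, not how any particular standard library does it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal map: separate chaining, fixed number of buckets.
template <class Key, class Value, class Hash = std::hash<Key>>
class SimpleHashmap {
public:
    explicit SimpleHashmap(std::size_t bucket_count = 16)
        : buckets_(bucket_count == 0 ? 1 : bucket_count) {}

    // Insert the pair, or overwrite the value if the key already exists.
    void put(const Key& key, const Value& value) {
        auto& bucket = buckets_[index_of(key)];
        for (auto& entry : bucket) {
            if (entry.first == key) {
                entry.second = value;
                return;
            }
        }
        bucket.emplace_back(key, value);
        ++size_;
    }

    // Return a pointer to the stored value, or nullptr if the key is absent.
    const Value* find(const Key& key) const {
        const auto& bucket = buckets_[index_of(key)];
        for (const auto& entry : bucket) {
            if (entry.first == key) return &entry.second;
        }
        return nullptr;
    }

    std::size_t size() const { return size_; }

private:
    // The only place hashing happens: delegate to Hash, then reduce
    // the result to a bucket index.
    std::size_t index_of(const Key& key) const {
        return hasher_(key) % buckets_.size();
    }

    std::vector<std::list<std::pair<Key, Value>>> buckets_;
    Hash hasher_;
    std::size_t size_ = 0;
};
```

With this layout, SimpleHashmap<std::string, int> works without a dedicated Stringmap subclass, because std::hash<std::string> already exists.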

How std::hash is implemented is not specified. However, I find it reasonable to assume that any decent compiler will provide a quality implementation. One thing to keep in mind is that std::hash<char*> does not hash a C string; it only hashes the pointer value (been there :))
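A small sketch of that pitfall, using two helper functions whose names are my own: hashing through std::string looks at the characters, while std::hash<const char*> only sees the address.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hashes the characters: equal text gives equal hashes.
inline bool same_content_hash(const char* a, const char* b) {
    return std::hash<std::string>()(a) == std::hash<std::string>()(b);
}

// Hashes the pointer values: only the addresses matter, so two
// separate buffers with the same text will typically hash differently.
inline bool same_pointer_hash(const char* a, const char* b) {
    return std::hash<const char*>()(a) == std::hash<const char*>()(b);
}
```

So if the keys are C strings, either convert them to std::string or supply a hash function (and an equality predicate) that looks at the characters.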


