What data structure is appropriate for this situation?
I am trying to decide which datastructure to use to store key-value pairs when only the required functionality is needed
- inserts
- Search
In particular, I don't need to remove pairs or iterate over keys / values / pairs.
Keys are whole tuples, values are pointers (links, whatever). I only store a couple million pairs distributed over (many) objects.
I am currently considering using
- hash table
- a kd-tree
- a b-tree
I'm leaning towards a hash table (for insert / lookup times O(1)
), but I wanted to confirm my leanings.
Which structure (higher or higher) would you recommend and why? If you would recommend a hash table, should I create a separate table for each object, or just create a separate table and use the object id as part of the key tuple?
A hash table would be the best choice here, as all the operations that matter to you are O (1) (and therefore you don't have to worry about creating multiple hash tables).
I am a big fan of hash tables as they are lightweight and there are implementations available for almost all major languages. Especially interesting is the O (1) insert / search function.
You should probably use a single table to keep in memory. Hash tables are notoriously memory inefficient, and using a single table will help to minimize this.
Hash tables would be helpful here and I see no reason to have more than one table.
Most trees have O (n ln n) lookup times, but hashtables have O (1) lookup times, so that's the one you want to use. It is also very common and often the implementation is optimized for loading.