MongoDB Index Complexity

I really like MongoDB, I use it both at work and at home, and never once have I encountered a performance, complexity or limitation issue. But I thought a lot about indexes, and I had a question that I could not find an adequate answer to.

One of the big problems with SQL databases at scale is the relative complexity of queries. Specifically, MySQL uses b-trees for most of its indexes, which queries take O (log (n)), better than linear ones, but still mean things take longer than the more data you have.

The big appeal of noSQL databases is removing / mitigating this scaling issue, often relying instead on hash style indexes, which have O (1) lookup times, so having more data doesn't slow down your application. This is where my question comes in:

According to official MongoDB documentation, all indexes in Mongo use b-trees . Even though Mongo does have a hashed index , as far as I can tell, they are still stored in b-trees, the same with the index on the _id field. I couldn't even find anything that pointed to constant time in Mongo's documentation.!

So my question is, are all indexes (including _id and hashed) in Mongo actually stored in b-trees? Does this mean that requesting keys (even for _id) actually takes O (log (n)) time?

Addendum: As a side note, I would be great if the Mongo documentation provided some complexity formulas with example queries. My favorite example of this is the Redis documentation .

Also: It is related . But I have additional specific questions regarding hashed indexes and (more importantly) the _id index.

+3


source to share


1 answer


if you look at the index code in mongodb ( here ) you can easily see that it uses btree for indexing. So the order of the algorithm is O (logN), but the base of the logarithm function is not 2, but 8192, which is in the code here



So, for a million records, we only have two levels (assuming the tree is balanced) and he can quickly find the record. It is generally true that the order is logarithmic, but because the base of the logarithm function is so large, it grows slowly.

+3


source







All Articles