Caching MongoDB Aggregation Results
I have a fairly large collection in MongoDB with about 100,000 documents (not sharded). It is the backend for a web application that basically lets the user browse different views of the same information in this collection.
For one of the views, I am trying to count the number of occurrences of a field using the aggregation framework. This means aggregating over the entire collection. The problem is that this aggregation operation (a simple pipeline of group, sort and limit) takes 2 seconds, which is too slow for a web application.
So my question is: what is the preferred solution for caching the result of this aggregation operation? As far as I have found, it is not possible to "populate" a new collection directly from the aggregation. For now, the only solution I have found is to read the entire result into a variable and then insert that variable into a new collection using insert, but I'm afraid this means sending a lot of data from the database to my application and then back to the database.
Any suggestions?
Pipeline example:
res = items.aggregate([
{ "$group": { "_id": { "item_id": "$item_id", "title": "$title", "category": "$category" }, "count": { "$sum": 1 } } },
{ "$sort": { "count": -1 } },
{ "$limit": 5 }
])
The schema is basically those 3 fields plus a few more that are not relevant here, for example:
doc = {
"item_id": 1000,
"title": "this is the item title",
"category": "this is the item category"
}
I have tried indexing item_id alone and all 3 fields together, with no success.
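The read-then-insert workaround described in the question can be sketched roughly as follows. This is only a minimal sketch: it assumes PyMongo 3 or newer (where aggregate returns a cursor), and the helper name cache_aggregation and the target collection are hypothetical. Since the pipeline ends with a $limit of 5, only five small documents should make the round trip through the application.

```python
# Pipeline from the question: count occurrences and keep the top 5.
TOP_ITEMS_PIPELINE = [
    {"$group": {
        "_id": {"item_id": "$item_id", "title": "$title", "category": "$category"},
        "count": {"$sum": 1},
    }},
    {"$sort": {"count": -1}},
    {"$limit": 5},
]


def cache_aggregation(source, target, pipeline):
    """Run pipeline on source and replace the contents of target with the result.

    source and target are assumed to be pymongo Collection objects
    (hypothetical helper, not part of PyMongo itself).
    """
    docs = list(source.aggregate(pipeline))  # assumes aggregate() returns a cursor
    target.delete_many({})                   # drop the stale cached result
    if docs:                                 # insert_many rejects an empty list
        target.insert_many(docs)
    return len(docs)
```

A periodic job could call cache_aggregation(db.items, db.top_items_cache, TOP_ITEMS_PIPELINE), where top_items_cache is a made-up collection name, and the web view would then read from the cache collection instead of running the pipeline on every request.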
Aggregation returns its result as a single document, which is capped at 16 MB and is sent back to the application.
If you want to "populate" a collection, use map-reduce.
// Emit one partial count per document, keyed by item_id.
var map_function = function () {
    emit(this.item_id, {"item_id": this.item_id, /* any other info */ "count": 1});
};
// Sum the partial counts; the returned object must have the same shape as the emitted value.
var reduce_function = function (key, values) {
    var result = {"item_id": key, /* any other info should be taken from one of the objects in the values array */ "count": 0};
    values.forEach(function (value) {
        result["count"] += value["count"];
    });
    return result;
};
Not sure if you can emit structured values (documents) - try it. BTW, emitting only the key field is fine.
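To actually write the map-reduce output into a collection, the functions have to be sent to the server together with an out target. Below is a rough sketch from Python, assuming a pymongo Database object and the server's mapReduce command with an out: {replace: ...} option. The helper names and the item_counts collection are made up for illustration. Passing the functions as plain strings is also an assumption here; some server or driver versions may require wrapping them in bson.code.Code.

```python
# JavaScript source for the map and reduce functions, sent to the server as strings.
MAP_FN = """
function () {
    emit(this.item_id, {"item_id": this.item_id, "count": 1});
}
"""

REDUCE_FN = """
function (key, values) {
    var result = {"item_id": key, "count": 0};
    values.forEach(function (value) {
        result["count"] += value["count"];
    });
    return result;
}
"""


def build_map_reduce_command(source_name, out_name):
    """Build a mapReduce command document that replaces out_name with the result."""
    return {
        "mapReduce": source_name,      # the command name has to be the first key
        "map": MAP_FN,
        "reduce": REDUCE_FN,
        "out": {"replace": out_name},  # overwrite the cache collection on each run
    }


def run_map_reduce(db, source_name="items", out_name="item_counts"):
    """Run the command; db is assumed to be a pymongo Database (hypothetical helper)."""
    return db.command(build_map_reduce_command(source_name, out_name))
```

Documents in the output collection have the shape {"_id": key, "value": reduced_object}, so the top five would be read with something like db.item_counts.find().sort("value.count", -1).limit(5).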