MongoDB: group subdocuments and count + add a total count
What I want to achieve
Suppose you have the following document:
{
"id":1,
"url":"mysite.com",
"views":
[
{"ip":"1.1.1.1","date":"01-01-2015"},
{"ip":"2.2.2.2","date":"01-01-2015"},
{"ip":"1.1.1.1","date":"01-01-2015"},
{"ip":"1.1.1.1","date":"01-01-2015"}
]
}
I want to count:
- how many times each IP address occurs, based on the value of "ip"
- and also the total number of subdocuments in "views"
if possible in the same query, to achieve the following result:
[
{
"_id":"2.2.2.2",
"count":1
},
{
"_id":"1.1.1.1",
"count":3
},
{
"_id":"total",
"count":4
}
]
What I have achieved so far
Using the MongoDB Aggregation Framework I was able to achieve point 1 as follows:
db.collection.aggregate([
{
"$unwind": "$views"
},
{
"$group": {
"_id": "$views.ip",
"count": {
"$sum": 1
}
}
}
])
which returns:
[
{
"_id":"2.2.2.2",
"count":1
},
{
"_id":"1.1.1.1",
"count":3
}
]
I also want to return this additional document inside the array:
{
"_id":"total",
"count":4
}
to achieve the result I showed above, but I got stuck at this point.
This is not possible within the same aggregation pipeline, because the pipeline transforms documents as they pass through it, and a stage does not have to produce one output document for each input document; some stages generate new documents or filter documents out. In the scenario above, adding one more $group stage to get the grouped IP counts plus the total gives a different result from the one you are after:
db.collection.aggregate([
{
"$unwind": "$views"
},
{
"$group": {
"_id": "$views.ip",
"count": {
"$sum": 1
}
}
},
{
"$group": {
"_id": null,
"total": {
"$sum": "$count"
}
}
}
])
You will only get the total, because $group consumes all of its input documents (here, the documents already grouped by IP) and outputs one document per distinct group. With "_id": null, this additional $group stage collapses all documents from the previous stage into a single group.
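To make the behaviour of the two $group stages concrete, here is a rough sketch in plain JavaScript that imitates them outside MongoDB. The helpers unwindViews and groupSum are made up for illustration and are not MongoDB APIs; only the sample document comes from the question.

```javascript
// Plain JavaScript stand-ins for the pipeline stages (hypothetical helpers,
// not MongoDB APIs). The sample document is the one from the question.
const doc = {
  id: 1,
  url: "mysite.com",
  views: [
    { ip: "1.1.1.1", date: "01-01-2015" },
    { ip: "2.2.2.2", date: "01-01-2015" },
    { ip: "1.1.1.1", date: "01-01-2015" },
    { ip: "1.1.1.1", date: "01-01-2015" }
  ]
};

// Imitates {"$unwind": "$views"}: one output document per array element.
function unwindViews(docs) {
  return docs.flatMap(d => d.views.map(v => ({ ...d, views: v })));
}

// Imitates a $group with a $sum accumulator: one output document per distinct key.
function groupSum(docs, keyOf, valueOf, field) {
  const sums = new Map();
  for (const d of docs) {
    const key = keyOf(d);
    sums.set(key, (sums.get(key) || 0) + valueOf(d));
  }
  return Array.from(sums, ([key, sum]) => ({ _id: key, [field]: sum }));
}

// First $group: count per IP.
const perIp = groupSum(unwindViews([doc]), d => d.views.ip, () => 1, "count");
// Second $group with _id: null: everything collapses into a single group.
const totalOnly = groupSum(perIp, () => null, d => d.count, "total");

console.log(perIp);     // the per-IP counts
console.log(totalOnly); // a single document; the per-IP counts are gone
```

The second call receives only the per-IP documents and returns a single document, which is why the per-IP counts cannot survive the extra stage.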
However, you can get the total as an additional field in each grouped document of the final result. The following example does this by using an initial $project stage with the $size operator to compute the total number of views:
db.collection.aggregate([
{
"$project": {
"views": 1,
"views_size": { "$size": "$views" }
}
},
{
"$unwind": "$views"
},
{
"$group": {
"_id": "$views.ip",
"count": {
"$sum": 1
},
"total": { "$first": "$views_size" }
}
}
])
Output result
[
{
"_id": "2.2.2.2",
"count": 1,
"total": 4
},
{
"_id": "1.1.1.1",
"count": 3,
"total": 4
}
]
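If the exact shape from the question is still needed, one option is to reshape this output in application code. The following is only a sketch: appendTotal is a hypothetical helper, not part of MongoDB or its drivers, and it assumes the results come from a single source document, so that every grouped document carries the same "total".

```javascript
// Hypothetical client-side helper (not a MongoDB API): turns the output above
// into the shape asked for in the question.
// Assumption: all results stem from one source document, so "total" is the
// same in every grouped document.
function appendTotal(results) {
  // Keep only _id and count for the per-IP documents.
  const perIp = results.map(({ _id, count }) => ({ _id, count }));
  // Take the shared total from the first document; use 0 for empty input.
  const total = results.length > 0 ? results[0].total : 0;
  return [...perIp, { _id: "total", count: total }];
}

// Output of the aggregation shown above.
const output = [
  { _id: "2.2.2.2", count: 1, total: 4 },
  { _id: "1.1.1.1", count: 3, total: 4 }
];

const reshaped = appendTotal(output);
console.log(reshaped);
```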