Distinct / Aggregate query: MongoDB array, trim trailing space
I have a MongoDB collection that contains an array of colors, for example:
myCollection:
{
_id : ...,
"colours" : [
{
"colourpercentage" : "42",
"colourname" : "Blue"
},
{
"colourpercentage" : "32",
"colourname" : "Red"
},
{
"colourpercentage" : "10",
"colourname" : "Green "
}
]
}
I would like to get every distinct colour name across all documents in this collection and be able to filter the names when I search.
I tried distinct with no success. I searched further and found that aggregation can help. At the moment I have:
db.getCollection('myCollection').aggregate([
    { "$match": { "colours.colourname": /Gre/ } },   // Gre is my search term
    { "$unwind": "$colours" },
    { "$match": { "colours.colourname": /Gre/ } },   // same search term, applied per array element
    { "$group": {
        "_id": "$colours.colourname"
    }}
])
It works, but I get an array like:
{
"result" : [
{
"_id" : "Grey"
},
{
"_id" : "Light Green "
},
{
"_id" : "Light Green"
},
{
"_id" : "Green "
},
{
"_id" : "Green"
}
],
"ok" : 1.0000000000000000
}
I would like to merge the duplicate entries that differ only by a trailing space and display them as:
["Grey","Light Green","Green"]
One approach you can use is Map-Reduce. The JavaScript interpreter driven by mapReduce is slower than the aggregation framework, but it will work, because it gives you some very useful JavaScript functions that are missing from the aggregation framework. For example, in the map function you can use trim() to remove any trailing spaces in colourname, so that you emit "cleaned" keys.
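To see why trimming the key before emitting collapses the duplicates, here is a minimal pure-Python sketch of the same idea. It does not talk to MongoDB; the sample documents and the helper name count_trimmed_colours are made up for illustration, and str.strip() stands in for JavaScript's trim().

```python
from collections import Counter

# Hypothetical sample documents shaped like the ones in the question.
docs = [
    {"colours": [{"colourpercentage": "42", "colourname": "Blue"},
                 {"colourpercentage": "10", "colourname": "Green "}]},
    {"colours": [{"colourpercentage": "55", "colourname": "Green"},
                 {"colourpercentage": "20", "colourname": "Light Green "}]},
    {"name": "document without a colours array"},
]


def count_trimmed_colours(documents):
    """Emulate map (emit trimmed name, 1) followed by reduce (sum the 1s)."""
    counts = Counter()
    for doc in documents:
        # Skip documents that have no colours array, like the map function does.
        for colour in doc.get("colours", []):
            counts[colour["colourname"].strip()] += 1
    return dict(counts)


counts = count_trimmed_colours(docs)
print(sorted(counts))
```

Because "Green " and "Green" are trimmed to the same key, they end up in one bucket instead of two.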
A Map-Reduce operation for this typically has map and reduce functions like the following:
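var map = function() {
    // Skip documents that have no colours array
    if (!this.colours) return;
    this.colours.forEach(function (c) {
        // Emit the trimmed name so "Green " and "Green" share one key
        emit(c.colourname.trim(), 1);
    });
};
var reduce = function(key, values) {
    // Sum the emitted 1s to count occurrences of each trimmed name
    var count = 0;
    for (var index in values) {
        count += values[index];
    }
    return count;
};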
db.runCommand( { mapreduce : "myCollection", map : map , reduce : reduce , out : "map_reduce_result" } );
Then you can query the map_reduce_result collection with a regular expression to get the result:
var getDistinctKeys = function (doc) { return doc._id };
var result = db.map_reduce_result.find({ "_id": /Gre/ }).map(getDistinctKeys);
printjson(result); // prints ["Green", "Grey", "Light Green"]
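Note that an unanchored regular expression such as /Gre/ matches anywhere in the key and is case-sensitive by default. A small Python sketch of that behaviour, assuming a hypothetical list of already-trimmed keys (re.search is the closest Python analogue here):

```python
import re

# Hypothetical trimmed keys, as they might appear in map_reduce_result.
keys = ["Blue", "Green", "Grey", "Light Green", "green tea"]

# Unanchored and case-sensitive, like /Gre/ in the shell.
case_sensitive = re.compile("Gre")
matches = [k for k in keys if case_sensitive.search(k)]

# Case-insensitive variant, like /Gre/i in the shell.
case_insensitive = re.compile("Gre", re.IGNORECASE)
matches_ignore_case = [k for k in keys if case_insensitive.search(k)]

print(matches)
print(matches_ignore_case)
```

"Light Green" matches because the pattern is not anchored to the start of the string; use ^Gre if only keys beginning with the search term are wanted.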
- UPDATE -
To implement this in Python, the PyMongo API supports all of MongoDB's map/reduce functionality, so you can try this:
import pymongo
import re
from bson.code import Code

client = pymongo.MongoClient("localhost", 27017)
db = client.test

# Named map_fn / reduce_fn to avoid shadowing Python's built-in map()
map_fn = Code("function () {"
              "    if (!this.colours) return;"
              "    this.colours.forEach(function (c) {"
              "        emit(c.colourname.trim(), 1);"
              "    });"
              "};")
reduce_fn = Code("function (key, values) {"
                 "    var count = 0;"
                 "    for (var index in values) {"
                 "        count += values[index];"
                 "    }"
                 "    return count;"
                 "};")

# Collection.map_reduce exists in PyMongo 3.x; it was removed in PyMongo 4.0
result = db.myCollection.map_reduce(map_fn, reduce_fn, "map_reduce_result")

# Note: re.IGNORECASE makes this search case-insensitive, unlike /Gre/ above
regx = re.compile("Gre", re.IGNORECASE)
for doc in result.find({"_id": regx}):
    print(doc)
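If your server is MongoDB 4.0 or later, the aggregation framework has a $trim operator, so the same result may be reachable without Map-Reduce. The following is only a sketch under that assumption: the helper name build_distinct_colour_pipeline is made up for illustration, and the code just builds the pipeline as plain Python data so it can be inspected without a running server.

```python
def build_distinct_colour_pipeline(search):
    """Build an aggregation pipeline that groups on the trimmed colour name.

    Assumes MongoDB 4.0+ (for $trim); `search` is used as a regex pattern.
    """
    name_filter = {"colours.colourname": {"$regex": search}}
    return [
        # Keep only documents that contain at least one matching colour.
        {"$match": name_filter},
        # One output document per array element.
        {"$unwind": "$colours"},
        # Keep only the array elements that match the search.
        {"$match": name_filter},
        # Group on the trimmed name so "Green " and "Green" collapse.
        {"$group": {"_id": {"$trim": {"input": "$colours.colourname"}}}},
        # Sort the distinct names alphabetically.
        {"$sort": {"_id": 1}},
    ]


pipeline = build_distinct_colour_pipeline("Gre")
print(pipeline)
```

With a PyMongo collection you could then try something like [doc["_id"] for doc in db.myCollection.aggregate(pipeline)] to get the flat list of names.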