MongoDB: How to get a field (subdocument) from a document?

Consider this example collection:

{
    "_id": 0,
    "firstname": "Tom",
    "children": {
        "childA": {
            "toys": {
                'toy 1': 'batman',
                'toy 2': 'car',
                'toy 3': 'train'
            },
            "movies": {
                'movie 1': "Ironman",
                'movie 2': "Deathwish"
            }
        },
        "childB": {
            "toys": {
                'toy 1': 'doll',
                'toy 2': 'bike',
                'toy 3': 'xbox'
            },
            "movies": {
                'movie 1': "Frozen",
                'movie 2': "Barbie"
            }
        }
    }
}

      

Now I would like to get ONLY movies from a specific document.

I've tried something like this:

movies = users.find_one({'_id': 0}, {'_id': 0, 'children.ChildA.movies': 1})

      

However, I am getting the entire structure of the field from "children" to "movies" and its contents. How easy is it to query and get only the content of the "movies"?

To be specific, I want to end up with this:

{
    'movie 1': "Frozen",
    'movie 2': "Barbie"
}

      



2 answers


The problem is that your current data structure is not very queryable. This is mainly because you are using "keys" to actually represent "data points", and while this may seem like a logical idea at first glance, it is actually very bad practice.

So instead of assigning "childA" and "childB" as object or "subdocument" keys, you would do better to store those names as "values" under a common key name, in a structure like this:

{
    "_id": 0,
    "firstname": "Tom",
    "children": [
        {
            "name": "childA",
            "toys": [
                "batman",
                "car",
                "train"
            ],
            "movies": [
                "Ironman",
                "Deathwish"
            ]
        },
        {
            "name": "childB",
            "toys": [
                "doll",
                "bike",
                "xbox"
            ],
            "movies": [
                "Frozen",
                "Barbie"
            ]
        }
    ]
}

      

This is not perfect, since there are nested arrays, which can be a potential problem, but there are workarounds for that (covered later). The main point is that it is much better than defining data in "keys". The main problem with "keys" that are not consistently named is that MongoDB does not generally allow any way to "wildcard" these names, so you are stuck with naming an "absolute path" in order to access elements, as in:

children → childA → toys
children → childB → toys

In a nutshell, that is bad, especially compared to this:

"children.toys"

      

From the example above, I would say this is a much better approach to organizing your data.
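If the data already exists in the key-based layout, a one-off conversion could look something like the following. This is only a sketch in plain Python operating on a dict shaped like the question's document; the function name `restructure` is made up for illustration, and writing the result back to the collection is left out.

```python
def restructure(doc):
    """Convert the key-based layout into the array-based layout."""
    children = []
    for name, child in doc.get("children", {}).items():
        children.append({
            "name": name,
            # Keys such as "toy 1" carry no information, so keep only the values.
            "toys": list(child.get("toys", {}).values()),
            "movies": list(child.get("movies", {}).values()),
        })
    converted = dict(doc)          # shallow copy, leaves the input untouched
    converted["children"] = children
    return converted


original = {
    "_id": 0,
    "firstname": "Tom",
    "children": {
        "childA": {
            "toys": {"toy 1": "batman", "toy 2": "car", "toy 3": "train"},
            "movies": {"movie 1": "Ironman", "movie 2": "Deathwish"},
        },
        "childB": {
            "toys": {"toy 1": "doll", "toy 2": "bike", "toy 3": "xbox"},
            "movies": {"movie 1": "Frozen", "movie 2": "Barbie"},
        },
    },
}

converted = restructure(original)
print(converted["children"][1]["movies"])  # ['Frozen', 'Barbie']
```

The child name moves from a key into a "name" value, which is what makes a single path such as "children.toys" usable for every child.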

However, simply returning something like a "unique list of movies" is outside the scope of standard .find() type queries in MongoDB. It actually requires something more in the way of "document manipulation", which is well supported by the aggregation framework for MongoDB. This has extensive manipulation capabilities that are not available to the query methods, and to get a per-document response from the structure above you can do this:

db.collection.aggregate([
    # De-normalize the array content first
    { "$unwind": "$children" },

    # De-normalize the content from the inner array as well
    { "$unwind": "$children.movies" },

    # Group back, well optionally, but just the "movies" per document
    { "$group": {
        "_id": "$_id",
        "movies": { "$addToSet": "$children.movies" }
    }}
])

      

So now the "list" in the response document contains only the "unique" movies, which is more in line with what you are asking for. Alternatively, you could use $push instead of $addToSet and build a "non-unique" list. But, rather pointlessly, that is essentially the same as this:

db.collection.find({},{ "_id": False, "children.movies": True })
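To picture what the two $unwind stages and the $group stage do, here is a rough plain-Python imitation. It is not how MongoDB executes the pipeline, only an illustration on an in-memory dict; the function names are invented, and the duplicate "Ironman" under childB is made-up data added so that the difference between $addToSet and $push is visible.

```python
doc = {
    "_id": 0,
    "children": [
        {"name": "childA", "movies": ["Ironman", "Deathwish"]},
        # "Ironman" is repeated here on purpose (illustrative data only).
        {"name": "childB", "movies": ["Frozen", "Barbie", "Ironman"]},
    ],
}


def unwind_movies(document):
    """Yield one (doc_id, movie) pair per movie, like the two $unwind stages."""
    for child in document["children"]:
        for movie in child["movies"]:
            yield document["_id"], movie


def group_movies(documents, unique=True):
    """Group movies per _id; unique=True mimics $addToSet, False mimics $push."""
    grouped = {}
    for document in documents:
        for doc_id, movie in unwind_movies(document):
            movies = grouped.setdefault(doc_id, [])
            if unique and movie in movies:
                continue
            movies.append(movie)
    return grouped


print(group_movies([doc]))                # duplicates removed
print(group_movies([doc], unique=False))  # duplicates kept
```

Note that the real $addToSet makes no promise about the order of the resulting array, whereas this imitation happens to keep first-seen order.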

      

As a collection-wide concept, you could simplify this by using .distinct(), which basically builds a list of the "distinct" values for the key you provide. This works well with arrays:



db.collection.distinct("children.toys")

      

That is essentially a collection-wide analysis of all the "distinct" occurrences of each "toys" value in the collection, returned as a simple "array".
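As a rough mental model of what .distinct("children.toys") returns, the following plain-Python sketch walks a list of documents and collects each toy once. The list stands in for the collection, the second document (childC) is hypothetical data, and the function name is invented for this example; the server does this work natively and may return the values in a different order.

```python
def distinct_toys(documents):
    """Collect each distinct value found under children.toys across documents."""
    seen = []
    for document in documents:
        for child in document.get("children", []):
            for toy in child.get("toys", []):
                if toy not in seen:
                    seen.append(toy)
    return seen


collection = [
    {"_id": 0, "children": [
        {"name": "childA", "toys": ["batman", "car", "train"]},
        {"name": "childB", "toys": ["doll", "bike", "xbox"]},
    ]},
    # Hypothetical second document, added only to show the cross-document effect.
    {"_id": 1, "children": [
        {"name": "childC", "toys": ["car", "kite"]},
    ]},
]

print(distinct_toys(collection))
```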


As for the existing structure, it deserves an explanation of a solution, but you really have to understand that the solution is a terrible one. The problem is that the "native", optimized operators available to ordinary queries and the aggregation framework cannot be used at all, and the only available option is JavaScript-based processing. That, while somewhat better since the "v8" engine integration, is still a real slouch when compared side by side with the native code methods.

So, working from the "original" form you have (shown in JavaScript, which should be easy enough to translate):

db.collection.mapReduce(
    // Mapper: emit every toy of every child, keyed by the document _id
    function() {
        var id = this._id,
            children = this.children;

        Object.keys(children).forEach(function(child) {
            var toys = children[child]["toys"] || {};
            Object.keys(toys).forEach(function(toy) {
                emit( id, { "toys": [toys[toy]] } );
            });
        });
    },
    // Reducer: merge the emitted lists, keeping each toy only once
    function(key,values) {
        var output = { "toys": [] };

        values.forEach(function(value) {
            value.toys.forEach(function(toy) {
                if ( output.toys.indexOf( toy ) == -1 )
                    output.toys.push( toy );
            });
        });

        return output;
    },
    {
        "out": { "inline": 1 }
    }
)

      

So JavaScript evaluation is the "terrible" approach: it is much slower to execute, and you can see the "traversal" code that has to be implemented. Bad news for performance, so don't do it. Change the structure instead.
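For comparison, the same traversal can be done on the client once the document has been fetched. The sketch below is plain Python over a dict shaped like the original, key-based document; `unique_toys` is an invented name. It still means walking every key by hand, and the whole document has to be transferred first, so it illustrates the cost of the structure rather than recommending a fix.

```python
def unique_toys(doc):
    """Walk children -> <child key> -> toys and collect each toy once."""
    toys = []
    for child in doc.get("children", {}).values():
        for toy in child.get("toys", {}).values():
            if toy not in toys:
                toys.append(toy)
    return {"toys": toys}


original = {
    "_id": 0,
    "firstname": "Tom",
    "children": {
        "childA": {"toys": {"toy 1": "batman", "toy 2": "car", "toy 3": "train"}},
        "childB": {"toys": {"toy 1": "doll", "toy 2": "bike", "toy 3": "xbox"}},
    },
}

print(unique_toys(original))
```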


As a final part, you can model this differently to avoid the "nested array" concept. Understand that the only real problem with a "nested array" is that "updating" a nested element is really not possible without reading the whole document and modifying it.

The $push and $pull operations work fine. But using the "positional" $ operator just doesn't work, because the "outer" array index is always the "first" matched element. So if this was really a problem for you, you could do something like this, for example:

{
    "_id": 0,
    "firstname": "Tom",
    "childtoys": [
        {
            "name": "childA",
            "toy": "batman"
        },
        {
            "name": "childA",
            "toy": "car"
        },
        {
            "name": "childA",
            "toy": "train"
        },
        {
            "name": "childB",
            "toy": "doll"
        },
        {
            "name": "childB",
            "toy": "bike"
        },
        {
            "name": "childB",
            "toy": "xbox"
        }
    ],
    "childMovies": [
        {
            "name": "childA",
            "movie": "Ironman"
        },
        {
            "name": "childA",
            "movie": "Deathwish"
        },
        {
            "name": "childB",
            "movie": "Frozen"
        },
        {
            "name": "childB",
            "movie": "Barbie"
        }
    ]
}

      

This would be one way to avoid the nested update problem if you really need to "update" items on a regular basis, rather than just $push and $pull items in the "toys" and "movies" arrays.
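With the flattened layout there is only one array level, so the positional $ operator can address the matched element. The sketch below only builds the query and update documents as Python dicts; the helper name and the replacement value "truck" are invented for illustration, and it assumes the "childtoys" field name from the structure above. With pymongo the pair would presumably be passed to `update_one` on the collection.

```python
def build_toy_rename(doc_id, child, old_toy, new_toy):
    """Build the (query, update) pair that renames one toy of one child."""
    query = {
        "_id": doc_id,
        # $elemMatch makes both conditions apply to the same array element.
        "childtoys": {"$elemMatch": {"name": child, "toy": old_toy}},
    }
    # "$" stands for the index of the first element matched by the query.
    update = {"$set": {"childtoys.$.toy": new_toy}}
    return query, update


query, update = build_toy_rename(0, "childA", "car", "truck")
# With pymongo this would be used as: users.update_one(query, update)
print(query)
print(update)
```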

But the general message here is to design your data around the access patterns you actually use. MongoDB generally does not cope well with a "strict path" when it comes to querying or otherwise flexibly issuing updates.



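MongoDB projections are conventionally written with 1 and 0, although true/false (True/False in Python) are accepted as well. More importantly, make sure the field names use the correct case: the query in the question asks for "ChildA", while the document contains "childA".

The query should be as follows: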

db.users.findOne({'_id': 0}, {'_id': 0, 'children.childA.movies': 1})

      



This returns:

{
    "children" : {
        "childA" : {
            "movies" : {
                "movie 1" : "Ironman",
                "movie 2" : "Deathwish"
            }
        }
    }
}
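A projection keeps the enclosing path, so to end up with only the contents of "movies" the result still has to be unwrapped on the client. A minimal sketch, assuming the result is the nested dict shown above (the helper name `dig` is made up for this example):

```python
def dig(document, path):
    """Follow a dotted path such as 'children.childA.movies' into nested dicts."""
    value = document
    for key in path.split("."):
        value = value[key]
    return value


# Shape of the document returned by the projected query above.
result = {
    "children": {
        "childA": {
            "movies": {"movie 1": "Ironman", "movie 2": "Deathwish"}
        }
    }
}

movies = dig(result, "children.childA.movies")
print(movies)  # {'movie 1': 'Ironman', 'movie 2': 'Deathwish'}
```

For a fixed path, `result["children"]["childA"]["movies"]` does the same thing; the helper is only convenient when the path is built from a variable child name.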

      
