How to combine a heterogeneous array into one document in MongoDB?

My site's MongoDB keeps a separate document for each user. Each user answers several questionnaires during their visit. The forms are stored in an array, but since their fields do not overlap, a single flat document would be sufficient. For analysis, I want to create a flat table of all responses across all forms.

Consider the following data structure:

{
    "USER_SESSION_ID": 456,
    "forms": [
        {
            "age": 21,
            "gender": "m"
        },
        {
            "job": "Student",
            "years_on_job": "12"
        },
        {
            "Hobby": "Hiking",
            "Twitter": "@my_account"
        }
    ]
},
{
    "USER_SESSION_ID": 678,
    "forms": [
        {
            "age": 46,
            "gender": "f"
        },
        {
            "job": "Bodyguard",
            "years_on_job": "2"
        },
        {
            "Hobby": "Skiing",
            "Twitter": "@bodyguard"
        }
    ]
}

      

All form documents look different and have no conflicting fields, so I would like to merge them to get a tabular, flat structure like this:

{ 'USER_SESSION_ID': 456, 'age': 21, 'gender': 'm', 'job': 'Student', ... 'Twitter': '@my_account' }
{ 'USER_SESSION_ID': 678, 'age': 46, 'gender': 'f', 'job': 'Bodyguard',  ... 'Twitter': '@bodyguard' }

      

In Python this is trivial and looks like this:

for session in sessions:          # Iterate all docs
    for form in session['forms']: # Iterate all children
        session.update(form)      # Integrate to parent doc
    del session['forms']          # Remove nested child

      

In MongoDB, I find this quite difficult to achieve. I am trying to use an aggregation pipeline, which I suppose should be suited to this.

So far I have gotten as far as unwinding my data structure like this:

db.sessions.aggregate(
    {
        '$unwind': '$forms'
    },
    { 
        '$project': {
            'USER_SESSION_ID': true,
            'forms': true
        }
    },
    {
        '$group': {
            '_id': '$USER_SESSION_ID',
            'forms': <magic?!>
        }
    }
)

      

In the unwind stage, I create one document with the parent data for each child element. This should be roughly equivalent to the double for loop in my Python code. However, what I feel I'm conceptually missing is a "merge" accumulator for the grouping. In Python this is done with dict.update(), in underscore.js it would be _.extend(destination, *sources).

How can I achieve this in MongoDB?


2 answers


I played around with the aggregate pipeline for ages until I gave mapReduce a try. This is what I came up with:

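db.sessions.mapReduce(
    function () {
        // map: merge the keys of every form into one object
        var merged = {};
        this.forms.forEach(function (form) {
            for (var key in form) {
                merged[key] = form[key];
            }
        });
        emit(this.USER_SESSION_ID, merged);
    },
    // reduce: required, but not called for a key that is emitted only once
    function (key, values) {},
    {
         "out": {"inline": true}
    }
)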

      



The map step does the merging, because there is no $merge operator available as a stage in the aggregation pipeline. An empty reduce function is required. out either writes to another collection or just returns the result (inline, which is what I'm doing here).
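For readers on MongoDB 3.6 or later, the missing merge accumulator from the question exists as $mergeObjects, so the pipeline approach may be workable after all. The following is only a sketch and not part of what I originally tried; it assumes the sessions collection and the field names from the question, and the intermediate field name merged is made up.

```javascript
// Sketch assuming MongoDB 3.6+ ($mergeObjects is not available in older versions).
const pipeline = [
  // One document per form, each still carrying its parent's USER_SESSION_ID.
  { $unwind: "$forms" },
  // As a $group accumulator, $mergeObjects combines the forms of each session.
  { $group: { _id: "$USER_SESSION_ID", merged: { $mergeObjects: "$forms" } } },
  // Promote the merged fields to the top level, next to the session id.
  {
    $replaceRoot: {
      newRoot: { $mergeObjects: [{ USER_SESSION_ID: "$_id" }, "$merged"] },
    },
  },
];

// In the mongo shell: db.sessions.aggregate(pipeline)
```

Like the mapReduce call, this only reads from the collection. If two forms ever shared a key, one value would overwrite the other, so it relies on the forms not overlapping.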

This is very similar to the approach chridam showed in his answer, but it produces a projection instead of rewriting the documents. His version is much closer to how my Python code works, but for what I am trying to do this is fine, and it does not change the original collection. Note that the Python code does modify its input, but leaving the input collection untouched is quite useful!



Try the following, which uses nested forEach() calls to iterate over the find() cursor's result and gets the object keys for the elements in the forms array using Object.keys():



db.sessions.find().forEach(function (doc){
    doc.forms.forEach(function (e){ 
        var keys = Object.keys(e); 
        keys.forEach(function(key){ doc[key] = e[key] });
    });
    delete doc.forms;
    db.sessions.save(doc);
});
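Note that the loop above overwrites every document in sessions. If the originals should be kept, one possible variant builds flattened copies instead. This is a sketch in plain JavaScript; the helper name flattenAll and the target collection sessions_flat are made up for illustration, and a plain array stands in for the cursor result.

```javascript
// Hypothetical helper: build flattened copies, leaving the inputs untouched.
function flattenAll(docs) {
  return docs.map(function (doc) {
    // Split off the nested array; "parent" holds the remaining top-level fields.
    const { forms = [], ...parent } = doc;
    // Copy the parent fields, then each form's fields, into a fresh object.
    return Object.assign({}, parent, ...forms);
  });
}

// Plain-array stand-in for db.sessions.find().toArray()
const docs = [
  { USER_SESSION_ID: 456, forms: [{ age: 21, gender: "m" }, { job: "Student", years_on_job: "12" }] },
  { USER_SESSION_ID: 678, forms: [{ age: 46, gender: "f" }, { job: "Bodyguard", years_on_job: "2" }] },
];
const flatDocs = flattenAll(docs);
console.log(flatDocs);

// In the shell the copies could then be written elsewhere, for example:
// db.sessions_flat.insertMany(flatDocs)   // collection name is an assumption
```

Object.assign applies the forms left to right, so a later form would win on a key collision, the same as the key-by-key copy in the loop.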

      
