How to deal with relationships when using MongoDB

I know, I should think in a "denormalized way" or "NoSQL way".

But tell me about this simple case.



A user posts a comment, and I want to get some user data when I fetch the comment. Let's say I want to show dynamic data like "userlevel" and static data like "username".

I will never have problems with static data, but what about dynamic data?

userlevel is a kind of user ranking that changes over time. I need denormalized data, duplicated in the comment, to improve read performance, but I also need the user level there to stay up to date.

Is it achievable in some way?



1 answer


Just found an answer by Brendan McAdams, a guy at 10gen who is obviously more authoritative than me, and he recommends embedding documents.

old text:

The first way is to manually include, in each comment, the ObjectId of the user it belongs to.

comment: { text : "...", 
           date: "...", 
           user: ObjectId("4b866f08234ae01d21d89604"),
           votes: 7 }
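To make the cost of a manual reference concrete, here is a minimal sketch that uses plain in-memory structures in place of real collections. The `users` and `comments` names, the sample username and level, and the `loadCommentWithUser` helper are all assumptions for illustration, not driver API, and ids are plain strings rather than real ObjectId values.

```javascript
// In-memory stand-ins for two collections; a real app would query MongoDB instead.
const users = new Map([
  // Hypothetical user document keyed by its id (a string here, not a real ObjectId).
  ["4b866f08234ae01d21d89604", { username: "alice", userlevel: 3 }],
]);

const comments = [
  { text: "...", date: "...", user: "4b866f08234ae01d21d89604", votes: 7 },
];

// Hypothetical helper: resolve the stored id with a second lookup at read time.
function loadCommentWithUser(comment) {
  const author = users.get(comment.user) || null;
  return { ...comment, author };
}

// Changing the level touches a single user document; no comment is rewritten.
users.get("4b866f08234ae01d21d89604").userlevel = 4;

const view = loadCommentWithUser(comments[0]);
console.log(view.author.username, view.author.userlevel);
```

The trade-off is the one discussed in this answer: each read pays for one extra lookup, but dynamic fields such as userlevel are never stale.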


The second, and cleverer, way is to use DBRefs.
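By convention a DBRef is just a small sub-document carrying the target collection name in `$ref` and the target id in `$id`. The sketch below shows that shape and what following it amounts to, again with in-memory stand-ins. The `db` object, the sample user, and the `resolveRef` helper are hypothetical; a real driver would do the extra lookup for you where it supports DBRefs.

```javascript
// In-memory stand-in for a database: collection name -> (id -> document).
const db = {
  users: new Map([
    // Hypothetical user; the id is a plain string here, not a real ObjectId.
    ["4b866f08234ae01d21d89604", { username: "alice", userlevel: 3 }],
  ]),
};

// The comment stores a DBRef-shaped sub-document instead of a bare id.
const comment = {
  text: "...",
  votes: 7,
  user: { $ref: "users", $id: "4b866f08234ae01d21d89604" },
};

// Hypothetical resolver: follows the reference with one extra lookup.
function resolveRef(database, ref) {
  const collection = database[ref.$ref];
  return collection ? collection.get(ref.$id) || null : null;
}

const author = resolveRef(db, comment.user);
console.log(author.username, author.userlevel);
```

Compared with a bare id, the reference also records which collection it points to, so the same resolver works for any referenced document.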

we add extra I/O to our disk, losing performance, am I right? (I'm not sure how it works internally.) So we need to avoid linking if possible, right?

Yes, there will be another request, but the driver will do it for you; you can think of it as syntactic sugar. Does this affect performance? Actually, it depends :) One of the reasons why Mongo is so freaking fast is that it uses memory-mapped files, and Mongo works best when the entire working set (plus indexes) fits directly in RAM. Every 60 seconds (by default) it syncs the in-memory data to the files on disk.
When I talk about the working set I mean what you are actually working with: you can have three collections, foo, bar and baz, but if you are currently working only with foo and bar, those should be loaded into RAM while baz stays untouched on disk. Moreover, memory-mapped files make it possible to load only part of a collection. So if you are building something like Engadget or TechCrunch, there is a high chance that the working set will be the comments from the last few days, and old pages will be visited much less often (their comments will be loaded into memory on demand), so it doesn't affect performance.

So, to repeat: as long as your working set stays in memory (you can think of it as a read/write cache), fetching these things is super fast and another query won't be a problem. If you are working with chunks of data that don't fit into memory, performance will degrade, but I don't know your circumstances and this may be acceptable, so in both cases I would prefer to use a reference.


