Mongoose multi-update on a geospatial query with no limit

I have some Mongoose models with geospatial indexes:

var mongoose = require('mongoose'),
    Schema = mongoose.Schema;

var UserSchema = new Schema({
  "name" : String,
  "location" : {
     "id" : String,
     "name" : String,
     "loc" : { type : Array, index : '2d' }
  }
});

var User = mongoose.model('User', UserSchema);


I am trying to update all documents that match a geospatial query - for example:

User.update(
  { "location.loc" : { "$near" : [ -122.4192, 37.7793 ], "$maxDistance" : 0.4 } },
  { "foo" : "bar" }, { "multi" : true },
  function(err){ console.log("done!"); }
);


However, this appears to update only the first 100 records. Looking through the docs, there seems to be a native limit on find() for geospatial indexes that applies whenever you don't set a limit yourself.

(from the docs: Use limit() to specify the maximum number of points to return (a default limit of 100 applies if not specified))

This limit also applies to updates, regardless of the multi flag, which is a giant drag. If I run the update above, it only updates the first 100 matching documents.
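
To make the behaviour concrete, here is a rough sanity check I would expect to show the cap (a sketch only, assuming the User model above and well over 100 matching documents; the extra arguments follow the old Mongoose find(conditions, fields, options, callback) signature):

// Default behaviour: the $near query silently caps the result set.
User.find({ "location.loc" : { "$near" : [ -122.4192, 37.7793 ], "$maxDistance" : 0.4 } }, function(err, docs){
  console.log(docs.length); // never more than 100
});

// With an explicit (positive) limit, more documents come back.
User.find({ "location.loc" : { "$near" : [ -122.4192, 37.7793 ], "$maxDistance" : 0.4 } }, null, { limit : 1000 }, function(err, docs){
  console.log(docs.length); // up to 1000
});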

Right now, the only way I can think of to do this is something hideous like this:

Model.find({"location.loc" : { "$near" : [ -122.4192, 37.7793 ], "$maxDistance" : 0.4 } },{limit:0},function(err,results){
   // collect the ids of everything the geo query matched...
   var ids = results.map(function(r){ return r._id; });
   // ...then update by _id instead, which is not subject to the geo limit
   Model.update({"_id" : { $in : ids }},{"foo":"bar"},{multi:true},function(){
      console.log("I have enjoyed crippling your server.");
   });
});


While I'm not even sure that would work (and it could be mildly optimized by only selecting _id), I would really like to avoid holding an array of n ids in memory, since that number can get very large.
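
For what it's worth, the _id-only variant of that hack would look roughly like this (a sketch only; as the edit below explains, the underlying find is still subject to the same cap):

Model.find(
  { "location.loc" : { "$near" : [ -122.4192, 37.7793 ], "$maxDistance" : 0.4 } },
  ['_id'],          // only pull back the ids
  { limit : 0 },    // intended to mean "no limit"
  function(err, results){
    var ids = results.map(function(r){ return r._id; });
    Model.update({ "_id" : { $in : ids } }, { "foo" : "bar" }, { multi : true }, function(){
      console.log("done");
    });
  }
);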

Edit: The above hack doesn't even work; find with {limit:0} still returns only 100 results. So, out of sheer despair and frustration, I wrote a recursive method that paginates through the ids and then updates them using the approach above. I added that method as an answer below, but haven't accepted it in the hope that someone will find a better way.

This is a problem in the core mongo server as far as I can tell, so mongoose and node-mongodb-native are not to blame. However, it is really silly, as geospatial indexes are one of the few reasons to use mongo over other, more reliable NoSQL stores.

Is there a way to achieve this? Even in node-mongodb-native or the mongo shell, I can't seem to find a way to set (or, in this case, remove by setting it to 0) the limit on an update.
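
For reference, the closest driver-level equivalent I can see looks something like this (a sketch only, assuming an open node-mongodb-native db connection called db and a users collection; per the behaviour described above, it still hits the same server-side cap):

db.collection('users', function(err, users){
  users.update(
    { "location.loc" : { "$near" : [ -122.4192, 37.7793 ], "$maxDistance" : 0.4 } },
    { "$set" : { "foo" : "bar" } },   // $set so the driver allows a multi update
    { multi : true, safe : true },
    function(err, count){
      console.log('updated', count, 'documents');
    }
  );
});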

1 answer


I would really like to see this problem fixed, but I couldn't figure out how to set the limit on an update, and after extensive research it doesn't seem to be possible. Also, the hack in the question doesn't even work; I still get only 100 records back from find with limit set to 0.

Until this is fixed in mongo, here's how I get around it (!! WARNING: UGLY HACKS AHEAD !!):

// Call this with the model as its context, e.g. getIdsPaginated.call(Model, query, callback).
var getIdsPaginated = function(query,batch,callback){
  // set a default batch if it isn't passed.
  if(!callback){
    callback = batch;
    batch = 10000;
  }
  // define our array and a find method we can call recursively.
  var all = [],
      find = function(skip){
        // skip defaults to 0
        skip = skip || 0;
        this.find(query,['_id'],{limit:batch,skip:skip},function(err,items){
          if(err){
            // if an error is thrown, call back with it and how far we got in the array.
            callback(err,all);
          } else if(items && items.length){
            // if we returned any items, grab their ids and put them in the 'all' array
            var ids = items.map(function(i){ return i._id.toString(); });
            all = all.concat(ids);
            // recurse
            find.call(this,skip+batch);
          } else {
            // we have recursed and returned no more ids, so we have them all.
            callback(err,all);
          }
        }.bind(this));
      };
  // start the recursion
  find.call(this);
};




This method will call back with a giant array of _ids. Since _id is already indexed, this is actually pretty fast, but it still hits the db many more times than it should. When it calls back, you can run the update against those ids like so:

Model.update({ "_id" : { $in : ids } }, {'foo':'bar'}, {multi:true}, function(err){ console.log('hooray, more than 100 records updated.'); });
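
Putting the two pieces together, end-to-end usage would look roughly like this (a sketch, assuming a registered Model; note that getIdsPaginated uses this.find, so it has to be called with the model as its context):

var query = { "location.loc" : { "$near" : [ -122.4192, 37.7793 ], "$maxDistance" : 0.4 } };

getIdsPaginated.call(Model, query, 5000, function(err, ids){
  if(err){ return console.error(err); }
  Model.update({ "_id" : { $in : ids } }, { 'foo' : 'bar' }, { multi : true }, function(err){
    console.log('updated', ids.length, 'candidate records');
  });
});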


This is not the most elegant way to solve the problem, and you can tune its efficiency by adjusting the batch size based on the expected number of results, but obviously being able to simply call update (or find) on $near queries without a limit would really help.
