MongoDB oplog has entries with dots in key names that cannot be queried, as far as I can tell

Consider: Mongo allows you to set nested fields using dot notation, for example:

rs0:PRIMARY> db.tmp.update({ a: 1 }, { $set: { 'b.c': 2 } }, { upsert: true })
rs0:PRIMARY> db.tmp.findOne()
{
    "_id" : ObjectId("558251c6a3354af70d70f3cc"),
    "a" : 1,
    "b" : {
        "c" : 2
    }
}

      

In this example, the entry was created with upsert, which I can check in the oplog:

rs0:PRIMARY> use local
rs0:PRIMARY> db.oplog.rs.find().sort({ts:-1}).limit(1).pretty()
{
    "ts" : Timestamp(1434603974, 2),
    "h" : NumberLong("2071516013149720999"),
    "v" : 2,
    "op" : "i",
    "ns" : "test.tmp",
    "o" : {
        "_id" : ObjectId("558251c6a3354af70d70f3cc"),
        "a" : 1,
        "b" : {
            "c" : 2
        }
    }
}

      

When I do the same and the entry is just updated rather than created, I seem to get the same behavior:

rs0:PRIMARY> db.tmp.update({ a: 1 }, { $set: { 'b.d': 3 } }, { upsert: true })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
rs0:PRIMARY> db.tmp.findOne()
{
    "_id" : ObjectId("558251c6a3354af70d70f3cc"),
    "a" : 1,
    "b" : {
        "c" : 2,
        "d" : 3
    }
}

      

However, this time the oplog entry is structured differently:

rs0:PRIMARY> use local
rs0:PRIMARY> db.oplog.rs.find().sort({ts:-1}).limit(1).pretty()
{
    "ts" : Timestamp(1434604173, 1),
    "h" : NumberLong("-4353495487634403370"),
    "v" : 2,
    "op" : "u",
    "ns" : "test.tmp",
    "o2" : {
        "_id" : ObjectId("558251c6a3354af70d70f3cc")
    },
    "o" : {
        "$set" : {
            "b.d" : 3
        }
    }
}

      

(Note the key "b.d".)

This is causing me problems because I am trying to investigate some of the dropped updates by checking the corresponding oplog entries, but AFAICT there is no way to query the oplog entries that set specific nested fields:

rs0:PRIMARY> db.oplog.rs.findOne({ 'o.$set.b.d': { $exists: true } })
null

      

Is there a way to query the oplog for entries related to updates to a specific subfield (in this case b.d)?

It seems that I am running into Mongo's inconsistent enforcement of the rule against dots in field names: on the one hand, I cannot create such fields (via the official clients or directly in the Mongo shell) or query for them, but on the other hand Mongo itself creates them in the oplog, leaving oplog entries that cannot be queried.
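
One way to picture why the 'o.$set.b.d' query finds nothing: a query path is split on every dot, so it looks for a field d nested under b nested under $set, whereas the oplog stores a single key literally named "b.d". The plain JavaScript sketch below imitates that path walk on an ordinary object. resolvePath is a hypothetical helper written for illustration, not a MongoDB function, and it ignores array traversal and other details of real query matching.

```javascript
// Hypothetical helper: walk a dotted path one segment per dot,
// roughly the way a query path is interpreted. Illustration only.
function resolvePath(doc, path) {
    var current = doc;
    var parts = path.split(".");
    for (var i = 0; i < parts.length; i++) {
        if (current === null || typeof current !== "object" || !(parts[i] in current)) {
            return undefined; // the path does not exist in this document
        }
        current = current[parts[i]];
    }
    return current;
}

// Shape of the "u" oplog entry from the question (ts, h, v omitted).
var updateEntry = { op: "u", ns: "test.tmp", o: { "$set": { "b.d": 3 } } };
// Shape of an update entry that sets a top-level field with no dot in its name.
var plainEntry = { op: "u", ns: "test.tmp", o: { "$set": { "e": 4 } } };

console.log(resolvePath(plainEntry, "o.$set.e"));    // 4
console.log(resolvePath(updateEntry, "o.$set.b.d")); // undefined
console.log(updateEntry.o.$set["b.d"]);              // 3
```

The value is there, but only under the literal key "b.d", which a dotted path cannot address.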

Any help would be much appreciated.

For completeness, please note that I can successfully query oplog entries by keys that include the $set part:

rs0:PRIMARY> db.tmp.update({ a: 1 }, { $set: { e: 4 } }, { upsert: true })
rs0:PRIMARY> use local
rs0:PRIMARY> db.oplog.rs.findOne({ 'o.$set.e': { $exists: true } })
{
    "ts" : Timestamp(1434604486, 1),
    "h" : NumberLong("1819316318253662899"),
    "v" : 2,
    "op" : "u",
    "ns" : "test.tmp",
    "o2" : {
        "_id" : ObjectId("558251c6a3354af70d70f3cc")
    },
    "o" : {
        "$set" : {
            "e" : 4
        }
    }
}

      



2 answers


You are correct: there is an inconsistency in MongoDB's oplog implementation. It writes log entries in a document format that technically does not allow such a document to be queried properly.

Even inserting the same record is not possible, since it has the field name $set:



db.tmp2.insert({ 
    "ts" : Timestamp(1450117240, 1), 
    "h" : NumberLong(2523649590228245285), 
    "v" : NumberInt(2), 
    "op" : "u", 
    "ns" : "test.tmp", 
    "o2" : {
        "_id" : ObjectId("566f069e63d6a355b2c446af")
    }, 
    "o" : {
        "$set" : {
            "b.d" : NumberInt(4)
        }
    }
})

2015-12-14T10:27:04.616-0800 E QUERY    Error: field names cannot start with $ [$set]
    at Error (<anonymous>)
    at DBCollection._validateForStorage (src/mongo/shell/collection.js:161:19)
    at DBCollection._validateForStorage (src/mongo/shell/collection.js:165:18)
    at insert (src/mongo/shell/bulk_api.js:646:20)
    at DBCollection.insert (src/mongo/shell/collection.js:243:18)
    at (shell):1:9 at src/mongo/shell/collection.js:161
      



and b.d is not a valid key either:



db.tmp.update({ a: 1 }, { $set: { 'b.d': 4 } }, { upsert: true })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

db.oplog.rs.find()


db.tmp2.insert({ 
    "ts" : Timestamp(1450117240, 1), 
    "h" : NumberLong(2523649590228245285), 
    "v" : NumberInt(2), 
    "op" : "u", 
    "ns" : "test.tmp", 
    "o2" : {
        "_id" : ObjectId("566f069e63d6a355b2c446af")
    }, 
    "o" : {
        "set" : {
            "b.d" : NumberInt(4)
        }
    }
})

2015-12-14T10:23:26.491-0800 E QUERY    Error: can't have . in field names [b.d]
    at Error (<anonymous>)
    at DBCollection._validateForStorage (src/mongo/shell/collection.js:157:19)
    at DBCollection._validateForStorage (src/mongo/shell/collection.js:165:18)
    at DBCollection._validateForStorage (src/mongo/shell/collection.js:165:18)
    at insert (src/mongo/shell/bulk_api.js:646:20)
    at DBCollection.insert (src/mongo/shell/collection.js:243:18)
    at (shell):1:9 at src/mongo/shell/collection.js:157
      



Perhaps there should be a Jira issue recommending an oplog syntax in which the field targeted by $set is stored as a value:



{ 
    "ts" : Timestamp(1450117240, 1), 
    "h" : NumberLong(2523649590228245285), 
    "v" : NumberInt(2), 
    "op" : "u", 
    "ns" : "test.tmp", 
    "o2" : {
        "_id" : ObjectId("566f069e63d6a355b2c446af")
    }, 
    "o" : {
        "$set" : {
            "key" : "b.d",
            "value" : NumberInt(4)
        }
    }
}
      



Update: there is now a Jira issue for this:

https://jira.mongodb.org/browse/SERVER-21889



While it is true that you cannot query for "b.d" directly through find, there are workarounds for the problem (since you are doing this for debugging purposes, a workaround will let you find all entries that match the format of the update you want).

Edit: see the bottom of the answer for a way to run the query directly through the aggregation framework.

Use mapReduce to output the ts (Timestamp) values of the oplog records you want to match:

map = function() {
    // emit the timestamp of every update whose $set contains the literal key "b.d"
    for (var i in this.o.$set)
       if (i=="b.d") emit(this.ts, 1);
}

reduce = function(k, v) { return v; }

db.oplog.rs.mapReduce(map,reduce,{out:{inline:1},query:{op:"u","o.$set":{$exists:true}}})
{
  "results" : [
    {
        "_id" : Timestamp(1406409018, 1),
        "value" : 1
    },
    {
        "_id" : Timestamp(1406409030, 1),
        "value" : 1
    },
    {
        "_id" : Timestamp(1406409042, 1),
        "value" : 1
    },
    {
        "_id" : Timestamp(1406409053, 1),
        "value" : 1
    }
  ],
  "timeMillis" : 117,
  "counts" : {
    "input" : 9,
    "emit" : 4,
    "reduce" : 0,
    "output" : 4
  },
  "ok" : 1
}

db.oplog.rs.find({ts:{$in:[Timestamp(1406409018, 1), Timestamp(1406409030, 1), Timestamp(1406409042, 1), Timestamp(1406409053, 1)]}})
< your results if any here >

      

In the map function, replace "b.d" with the desired dotted field name.
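
The check performed by the map function can also be tried outside the database. The sketch below is plain JavaScript over a few hand-written sample entries; the sample data and the helper name setsField are made up for illustration. It keeps the entries whose $set document contains the literal key "b.d".

```javascript
// Hypothetical helper mirroring the map function's test: does this oplog
// entry's $set document contain the given literal key (dots included)?
function setsField(entry, field) {
    var set = entry.o && entry.o.$set;
    if (!set) return false;
    return Object.prototype.hasOwnProperty.call(set, field);
}

// Hand-written sample entries, shaped like the ones shown in the question.
var sampleEntries = [
    { op: "i", o: { a: 1, b: { c: 2 } } },
    { op: "u", o: { "$set": { "b.d": 3 } } },
    { op: "u", o: { "$set": { "e": 4 } } }
];

// Keep only the updates that set the dotted field.
var matches = sampleEntries.filter(function (entry) {
    return setsField(entry, "b.d");
});
console.log(matches.length); // 1
```

In the shell, the same predicate could presumably be applied while iterating a cursor, for example inside db.oplog.rs.find({ op: "u" }).forEach(...), at the cost of pulling every update entry to the client.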

If you want to get fancy, you can map to a constant key and emit a "$in" document, then use that in your query (same result, slightly different format):



map2=function () {
   for (i in this.o.$set)
      if (i=="b.d") emit(1, {"$in": [ this.ts ]});
}
reduce2=function (k, v) {
   result={"$in": [ ] };
   v.forEach(function(val) {
      val.$in.forEach(function(ts) {
          result.$in.push(ts);
      });
   });
   return result;
}

      

I can run this version in a shell and I get something like this in my test data:

tojsononeline(db.oplog.rs.mapReduce(map2, reduce2, { out:{inline:1}, query:{op:"u","o.$set":{$exists:true}}}).results[0].value)
{  "$in" : [ Timestamp(1406409042, 1), Timestamp(1406409018, 1), Timestamp(1406409030, 1), Timestamp(1406409053, 1) ] }

      

EDIT: It turns out there is also a way to run a query directly through the aggregation framework:

db.oplog.rs.aggregate( [
    {$match:{"o.$set":{$exists:true}}},
    {$project: { doc:"$$ROOT", 
                 matchMe:{$eq:["$o",{$literal:{$set:{"b.d":1 }}}]}
    }},
    {$match:{matchMe:true}}
] ).pretty()
< your matching records if any >
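
Note that $eq in this pipeline appears to compare the whole o document against the literal, so it should only match updates whose o is exactly { $set: { "b.d": 1 } }: same key, same value, nothing else. Replace the 1 with the value you are looking for. The plain JavaScript sketch below uses Node's util.isDeepStrictEqual as a stand-in for that comparison. It is only an analogy, since BSON comparison differs in details such as numeric types, and matchesLiteral is a hypothetical helper.

```javascript
var util = require("util");

// The literal document the pipeline compares "$o" against.
var literal = { "$set": { "b.d": 1 } };

// Hypothetical helper: whole-document equality, analogous to the $eq stage.
function matchesLiteral(o) {
    return util.isDeepStrictEqual(o, literal);
}

console.log(matchesLiteral({ "$set": { "b.d": 1 } }));         // true
console.log(matchesLiteral({ "$set": { "b.d": 3 } }));         // false: different value
console.log(matchesLiteral({ "$set": { "b.d": 1, "e": 4 } })); // false: extra field
```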

      
