Find mongodb document using partial _id string
I need to find one or more documents in a collection that have a specific string in their _id field.
this turns out to be a problem since the _id field is an object and not a string, so I can't just set it.
for example, let's say I have these docs with these _id:
54060b811e8e813c55000058
54060e9e1e8e813c55000059
540738e082fa085e5f000015
and I want to search for "00005" then the result should be
54060b811e8e813c55000058 54060e9e1e8e813c55000059
Is there anyway to accomplish this?
I need this for a jQuery datatables implementation that uses server side processing from php.
this means that I need to add something to this part of the code:
if ( !empty($input['sSearch']) ) {
$sSearch = $input['sSearch'];
for ( $i=0 ; $i < $iColumns ; $i++ ) {
if ($input['bSearchable_'.$i] == 'true') {
if ($input['bRegex'] == 'true') {
$sRegex = str_replace('/', '\/', $sSearch);
} else {
$sRegex = preg_quote($sSearch, '/');
}
$searchTermsAny[] = array(
$dataProps[$i] => new MongoRegex( '/'.$sRegex.'/i' )
);
}
}
}
any advice will be abrived
UPDATE:
Thanks to saj, it seems like it is possible to find elements by partial _id using the $ where clause something like this:
$where: "this._id.toString().match(/pattern/i)"
I tried to add this to php code like this:
if ( !empty($input['sSearch']) ) {
$sSearch = $input['sSearch'];
for ( $i=0 ; $i < $iColumns ; $i++ ) {
if ($input['bSearchable_'.$i] == 'true') {
if ($input['bRegex'] == 'true') {
$sRegex = str_replace('/', '\/', $sSearch);
} else {
$sRegex = preg_quote($sSearch, '/');
}
$searchTermsAny[] = array(
$dataProps[$i] => new MongoRegex( '/'.$sRegex.'/i',
'$where: "this._id.toString().match(/'.$sRegex.'/i)"' )
);
}
}
}
however now every query returns all records, not just those suppsoed matches.
any ideas?
DECISION:
thanks to your help i figured it out, in order to add open search in _id field i need to add $ where clause in $ or query array section.
In my situation, I used the following code:
if ( !empty($input['sSearch']) ) {
$sSearch = $input['sSearch'];
for ( $i=0 ; $i < $iColumns ; $i++ ) {
if ($input['bSearchable_'.$i] == 'true') {
if ($input['bRegex'] == 'true') {
$sRegex = str_replace('/', '\/', $sSearch);
} else {
$sRegex = preg_quote($sSearch, '/');
}
$searchTermsAny[] = array(
$dataProps[$i] => new MongoRegex( '/'.$sRegex.'/i')
);
}
}
// add this line for string search inside the _id field
$searchTermsAny[]['$where'] = "this._id.str.match(/$sSearch/)";
}
Thank you for your help :)
As far as performance goes, I agree that this is the WRONG way to go and I will make sure to add an added string field with _id in it to make the performance much better, but for now, at least I have a working solution to the problem.
source to share
$regex
and MongoRegex (that is, the BSON regex type used in equality matching) only support string matching, so you cannot use them directly with ObjectId.
Regarding your last code example, you tried using $where
in the MongoRegex constructor:
$searchTermsAny[] = array(
$dataProps[$i] => new MongoRegex( '/'.$sRegex.'/i',
'$where: "this._id.toString().match(/'.$sRegex.'/i)"' )
);
The MongoRegex constructor takes one string (for example /foo/i
), from which it gets the template and flags. $where
is intended to be used as a top-level query operator (not associated with any field name). I disagree with what you are doing with $dataProps[$i]
, but suppose you were building one query $where
to match the string representation of the ObjectId. The request document will look like this:
{ $where: 'this._id.str.match(/00005/)' }
Note that I am accessing the property str
here instead of calling toString()
. This is because it toString()
actually returns the ObjectId wrapper view. You can see this by checking its source in the shell:
> x = new ObjectId()
ObjectId("5409ddcfd95d6f6a2eb33e7f")
> x.toString
function (){
return "ObjectId(" + tojson(this.str) + ")";
}
Also, if you are just checking if a substring exists in hex _id
, you can use indexOf()
(with matching != -1
) instead of match()
with regex.
However, usage is $where
usually bad unless you combine it with additional query criteria that the index might use. This is due to what the $where
JavaScript interpreter calls for each document considered in the result set. If you combine it with other, more selective criteria, MongoDB can use the index and narrow down the documents that need to be evaluated with $where
; however, you are in a bad time if you use $where
and scan many documents or spreadsheet scans at worst.
You are probably better off creating a second field in each document that contains the hexadecimal representation of the string _id
. Then you can index that field and query it using a regular expression. Queries not tied to regex will still be a bit inefficient (see using the regex index in the docs), but it will still be much faster than using $where
.
This solution (string duplication _id
) will contain some added storage per document, but you might decide that the extra 24-30 bytes (string payload and short field name) are negligible.
source to share