How does OpenCV's SiftDescriptorExtractor convert descriptor values?

I have a question about the last step of what SiftDescriptorExtractor does.

I am doing the following:

    SiftDescriptorExtractor extractor;
    Mat descriptors_object;
    extractor.compute( img_object, keypoints_object, descriptors_object );

      

Now I want to check the elements of the descriptors_object Mat:

    std::cout << descriptors_object.row(1) << std::endl;

The output looks like this:

    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 0, 0, 0, 0, 0, 0, 32, 15, 0, 0, 0, 0, 0, 0, 73, 33, 11, 0, 0, 0, 0, 0, 0, 5, 114, 1, 0, 0, 0, 0, 51, 154, 20, 0, 0, 0, 0, 0, 154, 154, 1, 2, 1, 0, 0, 0, 154, 148, 18, 1, 0, 0, 0, 0, 0, 2, 154, 61, 0, 0, 0, 0, 5, 60, 154, 30, 0, 0, 0, 0, 34, 70, 6, 15, 3, 2, 1, 0, 14, 16, 2, 0, 0, 0, 0, 0, 0, 0, 154, 84, 0, 0, 0, 0, 0, 0, 154, 64, 0, 0, 0, 0, 0, 0, 6, 6, 1, 0, 1, 0, 0, 0]

But the Lowe paper states:

"Therefore, we reduce the influence of large gradient magnitudes by thresholding the values in the unit feature vector to each be no larger than 0.2, and then renormalizing to unit length. This means that matching the magnitudes for large gradients is no longer as important, and that the distribution of orientations has greater emphasis. The value of 0.2 was determined experimentally using images containing differing illuminations for the same 3D objects."

So the values in the feature vector should be no larger than 0.2.

The question is: how were these values converted in the Mat object?



1 answer


So the values in the feature vector should be no larger than 0.2.

No. The paper says that SIFT descriptors are:

  • normalized (with the L2 norm)
  • truncated using 0.2 as the threshold (i.e. loop over the normalized values and truncate where needed)
  • normalized again

So, in theory, any component of a SIFT descriptor lies in [0, 1], although in practice the observed range is smaller (see below).
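The three steps listed above can be sketched in plain C++, without OpenCV. This is only an illustration: `normalizeClampRenormalize` is a hypothetical helper name, not an OpenCV function, and the default threshold of 0.2 is taken from the paper quote in the question. It shows why the final renormalization can push components back above 0.2.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical helper (not part of OpenCV): L2-normalize, clamp each
// component at `threshold`, then L2-normalize again.
std::vector<float> normalizeClampRenormalize(std::vector<float> v,
                                             float threshold = 0.2f)
{
    assert(!v.empty());
    const float eps = std::numeric_limits<float>::epsilon();

    // L2 norm, guarded against division by zero.
    auto l2norm = [&](const std::vector<float>& x) {
        float sum = 0.f;
        for (float e : x) sum += e * e;
        return std::max(std::sqrt(sum), eps);
    };

    // Step 1: normalize to unit length.
    float n = l2norm(v);
    for (float& e : v) e /= n;

    // Step 2: clamp every component at the threshold.
    for (float& e : v) e = std::min(e, threshold);

    // Step 3: renormalize; the clamped vector is shorter than unit length,
    // so this step scales components up, possibly above the threshold.
    n = l2norm(v);
    for (float& e : v) e /= n;
    return v;
}
```

For the input {10, 1, 1, 1}, the first component is clamped to 0.2 in step 2 and ends up at roughly 0.76 after step 3, so the 0.2 limit does not hold for the final descriptor.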

The question is: how were these values converted in the Mat object?

They are converted from floating-point values to unsigned chars.


Below is the relevant section of the calcSIFTDescriptor method in OpenCV's modules/nonfree/src/sift.cpp:

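    // First pass: squared L2 norm of the raw descriptor.
    float nrm2 = 0;
    len = d*d*n;
    for( k = 0; k < len; k++ )
        nrm2 += dst[k]*dst[k];
    // Clamp threshold scaled by the norm (same as clamping the unit vector).
    float thr = std::sqrt(nrm2)*SIFT_DESCR_MAG_THR;
    // Second pass: clamp and accumulate the new squared norm (k == len here).
    for( i = 0, nrm2 = 0; i < k; i++ )
    {
        float val = std::min(dst[i], thr);
        dst[i] = val;
        nrm2 += val*val;
    }
    // Fold the renormalization and the integer scale factor into one multiplier.
    nrm2 = SIFT_INT_DESCR_FCTR/std::max(std::sqrt(nrm2), FLT_EPSILON);
    // Round and saturate each component to the uchar range [0, 255].
    for( k = 0; k < len; k++ )
    {
        dst[k] = saturate_cast<uchar>(dst[k]*nrm2);
    }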

with

    static const float SIFT_INT_DESCR_FCTR = 512.f;

This is because classic SIFT implementations quantize the normalized floating-point values to unsigned char integers using a multiplication factor of 512. This amounts to assuming that every SIFT component lies in [0, 1/2], which avoids losing precision by trying to encode the full range [0, 1].
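To relate the stored integers back to the normalized values, a minimal sketch of the quantization follows. The names `kDescrFactor`, `quantize` and `dequantize` are made up for this example; `quantize` only approximates what `saturate_cast<uchar>` does (round to nearest, then clamp to [0, 255]) and may differ from OpenCV's rounding in exact tie cases.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed to mirror SIFT_INT_DESCR_FCTR from the excerpt above.
constexpr float kDescrFactor = 512.f;

// Hypothetical stand-in for saturate_cast<uchar>(value * factor):
// scale, round to the nearest integer, clamp to [0, 255].
inline std::uint8_t quantize(float unitValue)
{
    assert(std::isfinite(unitValue));
    long r = std::lround(unitValue * kDescrFactor);
    if (r < 0) r = 0;
    if (r > 255) r = 255;  // anything above about 0.498 saturates
    return static_cast<std::uint8_t>(r);
}

// Approximate inverse: recover the normalized value from the stored byte.
inline float dequantize(std::uint8_t stored)
{
    return static_cast<float>(stored) / kDescrFactor;
}
```

Under these assumptions, the largest entry in the question's output, 154, maps back to 154 / 512 ≈ 0.30, which is above 0.2 but well inside [0, 1/2]. A component of 0.6 would saturate at 255.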
