Is dense SIFT better for Bag-Of-Words than SIFT?

Question

I am implementing a Bag-of-Words image classifier using OpenCV. Initially I tested SURF descriptors extracted at SURF keypoints. I have heard that Dense SIFT (or PHOW) descriptors can work better for my purposes, so I tried them too.

To my surprise, they performed significantly worse, in fact almost 10 times worse. What could I be doing wrong? I am using DenseFeatureDetector from OpenCV to get keypoints. I am extracting about 5000 descriptors per image from 9 layers and clustering them into 500 clusters.

Should I be using PHOW descriptors from the VLFeat library? Also, I cannot use the chi-square kernel in OpenCV's SVM implementation, which many sources recommend. Is this critical to the quality of the classifier? Should I try another library?

Another question is scale invariance: I suspect it could be affected by dense extraction. Am I right?

opencv computer-vision classification feature-extraction

lizarisk 05 Feb At 14:38

1 answer

min.yong.yoon · Accepted Answer · 2013-02-05T15:31:49+0000

It depends on the problem. You should try different methods to find out which one works best for your problem. PHOW is usually very useful when you need to classify scenes. Be aware that PHOW is slightly different from Dense SIFT. I used VLFeat's PHOW a few years ago, and from reading the code it simply calls dense SIFT at several sizes with some smoothing. This may be one key to scale invariance. Also, in my experiments I used libsvm, and the histogram intersection kernel turned out to be the best for me. By default, the chi-square and histogram intersection kernels are not included in libsvm or in OpenCV's SVM (which is based on libsvm). It is up to you whether to try them. I can tell you that the RBF kernel reached almost 90% accuracy, histogram intersection 93%, and chi-square 91%. But these results were from my specific experiments. You should start with RBF with auto-tuned parameters and see whether that is enough.
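For reference, the histogram intersection and chi-square kernels are simple to compute by hand on Bag-of-Words histograms. The following is only a minimal sketch using the standard library: the names `Histogram`, `histogramIntersection`, `chiSquareKernel` and `intersectionGramMatrix` are made up for illustration, the exponential form of the chi-square kernel is one common variant (not necessarily the one used in the experiments above), and `gamma` is a free parameter you would have to tune.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// A Bag-of-Words histogram: one non-negative bin per visual word.
typedef std::vector<double> Histogram;

// Histogram intersection kernel: sum of the element-wise minima.
double histogramIntersection(const Histogram& a, const Histogram& b) {
    const std::size_t n = std::min(a.size(), b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += std::min(a[i], b[i]);
    }
    return sum;
}

// Exponential chi-square kernel (one common variant, assumed here):
// exp(-gamma * sum((a_i - b_i)^2 / (a_i + b_i))), skipping empty bins.
double chiSquareKernel(const Histogram& a, const Histogram& b, double gamma) {
    const std::size_t n = std::min(a.size(), b.size());
    double dist = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double denom = a[i] + b[i];
        if (denom > 0.0) {
            const double diff = a[i] - b[i];
            dist += diff * diff / denom;
        }
    }
    return std::exp(-gamma * dist);
}

// Pairwise intersection kernel values for a set of histograms.
std::vector<Histogram> intersectionGramMatrix(const std::vector<Histogram>& hists) {
    std::vector<Histogram> gram(hists.size(), Histogram(hists.size(), 0.0));
    for (std::size_t i = 0; i < hists.size(); ++i) {
        for (std::size_t j = 0; j < hists.size(); ++j) {
            gram[i][j] = histogramIntersection(hists[i], hists[j]);
        }
    }
    return gram;
}
```

A matrix like this can be passed to an SVM that accepts precomputed kernels; libsvm has a precomputed kernel type for that purpose. Histograms should be normalized the same way (for example to sum to 1) before comparing them.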

To summarize, it all depends on your specific experiments. But if you are using Dense SIFT, you could try increasing the number of clusters and running Dense SIFT at several scales (I recommend the PHOW approach for you).

EDIT: I was looking at OpenCV DenseSift and maybe you could start with

m_detector=new DenseFeatureDetector(4, 4, 1.5);

This is based on VLFeat's PHOW using [4 6 8 10] as its bin sizes.
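To see which scales a call like the one above would sample, here is a small standalone sketch that does not use OpenCV. It assumes the first three constructor arguments are the initial feature scale, the number of scale levels and the scale multiplier, as in the OpenCV 2.4 `DenseFeatureDetector`. `GridPoint`, `scaleLevels` and `denseGrid` are hypothetical names, and the grid step is kept fixed across scales as a simplification.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a keypoint: a position plus a feature scale.
struct GridPoint {
    int x;
    int y;
    double scale;
};

// Geometric progression of scales: initScale, initScale * multiplier, ...
std::vector<double> scaleLevels(double initScale, int levels, double multiplier) {
    std::vector<double> scales;
    double s = initScale;
    for (int i = 0; i < levels; ++i) {
        scales.push_back(s);
        s *= multiplier;
    }
    return scales;
}

// Regular grid of points over a width x height image, repeated per scale.
// Simplification: the same step is used at every scale.
std::vector<GridPoint> denseGrid(int width, int height, int step,
                                 const std::vector<double>& scales) {
    std::vector<GridPoint> points;
    if (step <= 0) {
        return points;
    }
    for (std::size_t s = 0; s < scales.size(); ++s) {
        for (int y = 0; y < height; y += step) {
            for (int x = 0; x < width; x += step) {
                GridPoint p = {x, y, scales[s]};
                points.push_back(p);
            }
        }
    }
    return points;
}
```

With `scaleLevels(4, 4, 1.5)` the scales are 4, 6, 9 and 13.5, which is a somewhat wider spread than the [4 6 8 10] quoted for PHOW. Keypoint scale and descriptor bin size are not necessarily the same unit, so treat the comparison as a rough guide only.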
