Ratio of positive and negative data for use in training a cascading classifier (opencv)

Question


I am using the OpenCV LBP detector. The shapes I am trying to detect are roughly circular (differing mostly in aspect ratio), with wide variations in brightness and contrast and a small amount of occlusion.

The OpenCV manual on how to train a detector is here

My main question for anyone who has used it: how should numPos and numNeg relate to each other? I have approximately 1000 positive samples (so about 900 are used at each stage).
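To show where the "about 900 of 1000" figure comes from, here is a rough sketch of a commonly cited rule of thumb for how large numPos can be for a given .vec file. The formula is a community estimate, not something taken from the manual; the function name and the `skipped_as_background` parameter are hypothetical, and the defaults (20 stages, minHitRate of 0.995) are assumed to match the opencv_traincascade defaults.

```python
import math


def max_num_pos(vec_count, num_stages=20, min_hit_rate=0.995,
                skipped_as_background=0):
    """Estimate the largest usable numPos for a .vec file with vec_count samples.

    Rule of thumb (an assumption, not an official formula):
        vec_count >= numPos + (numStages - 1) * (1 - minHitRate) * numPos + S
    where S is the number of positives the earlier stages already reject
    as background (unknown in advance, so 0 by default here).
    """
    # Each later stage may need to draw a few extra positives to replace
    # the ones that the previous stages misclassified.
    growth = 1 + (num_stages - 1) * (1 - min_hit_rate)
    return math.floor((vec_count - skipped_as_background) / growth)


print(max_num_pos(1000))  # 913 under these assumptions
```

Rounding that down to about 900 leaves some margin for the unknown S term.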

I need to decide how many negative samples to use at each stage of the training. I have about 20,000 images from which negative data can be obtained, so redundancy is not an issue.

The rule of thumb I hear is 1:2, but that seems like underutilization given how much negative data I have. On the other hand, what consequences should I expect if I train my detector at 1:20? How do I determine the correct ratio?
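To make the two candidate ratios concrete, this is a small sketch of the opencv_traincascade invocations I am comparing. The helper name, the file names (`cascade_out`, `positives.vec`, `negatives.txt`) and the 24x24 sample size are placeholders, not my real setup; only the command-line flags are the tool's own.

```python
def traincascade_args(num_pos, neg_per_pos, num_stages=20):
    """Build an opencv_traincascade argument list for a given pos:neg ratio.

    neg_per_pos is the number of negatives per positive at each stage,
    e.g. 2 for a 1:2 ratio or 20 for a 1:20 ratio.
    """
    num_neg = num_pos * neg_per_pos
    return [
        "opencv_traincascade",
        "-data", "cascade_out",     # placeholder output directory
        "-vec", "positives.vec",    # placeholder positive sample file
        "-bg", "negatives.txt",     # placeholder list of negative images
        "-numPos", str(num_pos),
        "-numNeg", str(num_neg),
        "-numStages", str(num_stages),
        "-featureType", "LBP",
        "-w", "24", "-h", "24",     # placeholder sample size
    ]


# Print the command line for 1:2 and for 1:20, with 900 positives per stage.
for ratio in (2, 20):
    print(" ".join(traincascade_args(900, ratio)))
```

With 900 positives per stage, 1:2 means numNeg = 1800 and 1:20 means numNeg = 18000.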


c++ opencv training-data detection adaboost

user3765410 Dec 11 '14 at 14:34
