How to create a dataset for object detection in Caffe?

Creating a database (LMDB/LevelDB) for images is trivial in Caffe. But how do we create such a dataset for object detection?
Is this sequence correct?

  • Put all images in a folder
  • For each image, create a text file with the same name as the corresponding image
  • Put the bounding box coordinates of each object in the image on a separate line of that file

Now, how do I convert such a structure to LMDB?
Should I convert all the txt files to bytes and store the whole byte stream as one label for each image?
Will Caffe be able to read such a converted database automatically, or should I create a specific layer to read it and feed the necessary information to the network?

1 answer


You need to create a custom layer to handle the extra data you put in the LMDB file. You can take a look at the existing Faster R-CNN implementation in Caffe, which does end-to-end detection: https://github.com/rbgirshick/py-faster-rcnn/tree/master/models/coco/VGG_CNN_M_1024/faster_rcnn_end2end

Looking at the input layer in the prototxt file, you can see that a custom layer type is used for the input:

layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 81"
  }
}
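
For illustration, here is a rough sketch of what the 'im_info' and 'gt_boxes' tops might carry for a single image. It assumes that gt_boxes rows are laid out as (x1, y1, x2, y2, class) and im_info as (height, width, scale), and that the image is resized so its shorter side is 600 px with the longer side capped at 1000 px. The function name make_blobs and those defaults are assumptions made for this example, not something stated in the prototxt above, and the pixel data for the 'data' top is left out.

```python
def make_blobs(height, width, boxes, target_size=600, max_size=1000):
    """Return (im_info, gt_boxes) for one image.

    boxes is a list of (x1, y1, x2, y2, class_id) in original pixel units.
    The layout and the resize rule are assumptions for this sketch.
    """
    # Scale so the shorter side reaches target_size ...
    scale = float(target_size) / min(height, width)
    # ... unless that would push the longer side past max_size.
    if round(scale * max(height, width)) > max_size:
        scale = float(max_size) / max(height, width)

    # Size of the resized image plus the scale that was applied.
    im_info = (round(height * scale), round(width * scale), scale)

    # Box coordinates must follow the image into the resized frame;
    # the class id is carried along unchanged.
    gt_boxes = [
        (x1 * scale, y1 * scale, x2 * scale, y2 * scale, cls)
        for (x1, y1, x2, y2, cls) in boxes
    ]
    return im_info, gt_boxes


if __name__ == "__main__":
    info, gt = make_blobs(300, 400, [(10.0, 20.0, 110.0, 220.0, 3)])
    print(info)
    print(gt)
```

The point is only that the ground-truth boxes travel with the image as a separate top, so whatever resizing the layer applies to the image has to be applied to the boxes too.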

Also, you can see the details of this custom layer here: https://github.com/rbgirshick/fast-rcnn/tree/master/lib/roi_data_layer
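
To tie this back to the question about storing the txt files as bytes: one possible approach is to pack each image's boxes into a fixed-size binary record and let your custom layer decode it. The sketch below assumes each annotation line looks like "x1 y1 x2 y2 class_id"; that line format, the record layout, and the function names are all assumptions for the example. It uses only the standard library and deliberately makes no LMDB or Caffe calls, so writing the bytes to the database is left to you.

```python
import struct

# Assumed record layout: four little-endian float32 coordinates
# followed by one int32 class id (20 bytes per box).
BOX_FORMAT = "<4fi"
BOX_SIZE = struct.calcsize(BOX_FORMAT)


def parse_annotation(text):
    """Parse lines of 'x1 y1 x2 y2 class_id' into a list of tuples."""
    boxes = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 5:
            raise ValueError("expected 5 fields, got: %r" % line)
        x1, y1, x2, y2 = (float(v) for v in fields[:4])
        boxes.append((x1, y1, x2, y2, int(fields[4])))
    return boxes


def pack_boxes(boxes):
    """Serialize the boxes into one byte string to store with the image."""
    return b"".join(struct.pack(BOX_FORMAT, *box) for box in boxes)


def unpack_boxes(blob):
    """Inverse of pack_boxes; a custom layer would call something like this."""
    if len(blob) % BOX_SIZE != 0:
        raise ValueError("byte string is not a whole number of box records")
    return [
        struct.unpack_from(BOX_FORMAT, blob, offset)
        for offset in range(0, len(blob), BOX_SIZE)
    ]


if __name__ == "__main__":
    boxes = parse_annotation("10 20 110 220 3\n5.5 6 50 60 1\n")
    blob = pack_boxes(boxes)
    print(len(blob), unpack_boxes(blob))
```

Because the number of boxes differs from image to image, the byte string has variable length, which is one reason a stock data layer expecting a single integer label cannot read it and a custom layer is needed.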
