How to create a dataset for object detection in Caffe?
Creating a database (LMDB/LevelDB) for images is trivial in Caffe. But how do we create such a dataset for object detection?
Is this sequence correct?
- put all images in a folder
- for each image, create a text file with the same name as the corresponding image
- place the coordinates of the bounding box for each object in the image on a separate line
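The layout described in the list above can be read back with a few lines of Python. This is only a minimal sketch: it assumes each line of the text file holds `x1 y1 x2 y2 class_id` separated by whitespace, which is a format chosen here for illustration, and the names `parse_boxes` and `load_annotations` are hypothetical helpers, not part of Caffe.

```python
import os

# Image extensions to look for (an assumption; adjust to your data).
IMAGE_EXTS = (".jpg", ".jpeg", ".png")


def parse_boxes(text):
    """Parse lines of 'x1 y1 x2 y2 class_id' into a list of tuples."""
    boxes = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 5:
            raise ValueError("expected 5 fields, got: %r" % line)
        x1, y1, x2, y2 = (float(v) for v in fields[:4])
        boxes.append((x1, y1, x2, y2, int(fields[4])))
    return boxes


def load_annotations(image_dir):
    """Map each image path to the boxes in its same-named .txt file."""
    annotations = {}
    for name in sorted(os.listdir(image_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in IMAGE_EXTS:
            continue
        txt_path = os.path.join(image_dir, stem + ".txt")
        if not os.path.isfile(txt_path):
            continue  # images without an annotation file are skipped
        with open(txt_path) as f:
            annotations[os.path.join(image_dir, name)] = parse_boxes(f.read())
    return annotations
```

Calling `load_annotations("images")` would then return a dictionary from image path to a list of `(x1, y1, x2, y2, class_id)` tuples.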
Now, how do I convert such a structure to LMDB?
Should I convert each txt file to bytes and store the entire byte stream as one label for each image?
Will Caffe be able to read from such a converted database automatically, or should I create a custom layer to read the necessary information and feed it to the network?
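For the byte-stream idea, one possible encoding is a fixed-size record per box. The sketch below is an assumption, not a Caffe format: it packs each box as four float32 coordinates plus an int32 class id using the standard `struct` module, and `boxes_to_bytes` / `bytes_to_boxes` are hypothetical names. Whatever reads the database afterwards would have to decode the same layout.

```python
import struct

# Assumed record layout: x1, y1, x2, y2 as little-endian float32,
# followed by the class id as little-endian int32.
BOX_FORMAT = "<4fi"
BOX_SIZE = struct.calcsize(BOX_FORMAT)  # 20 bytes per box


def boxes_to_bytes(boxes):
    """Serialize a list of (x1, y1, x2, y2, class_id) tuples."""
    return b"".join(struct.pack(BOX_FORMAT, *box) for box in boxes)


def bytes_to_boxes(blob):
    """Inverse of boxes_to_bytes; the box count is implied by the length."""
    if len(blob) % BOX_SIZE != 0:
        raise ValueError("byte stream is not a whole number of boxes")
    return [
        struct.unpack_from(BOX_FORMAT, blob, offset)
        for offset in range(0, len(blob), BOX_SIZE)
    ]
```

Because the number of boxes differs from image to image, the record length varies too, which is one reason a fixed-shape label field is a poor fit for detection data.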
You need to create a custom layer to handle the additional data stored in the LMDB file. You can take a look at the existing Faster R-CNN implementation in Caffe, which does end-to-end detection: https://github.com/rbgirshick/py-faster-rcnn/tree/master/models/coco/VGG_CNN_M_1024/faster_rcnn_end2end
Looking at the input layer in the prototxt file, you can see that a custom Python layer type is used for the input:
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 81"
  }
}
Also, you can see the details of this custom layer here: https://github.com/rbgirshick/fast-rcnn/tree/master/lib/roi_data_layer
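To give an idea of what such a layer has to produce besides the image itself, here is a rough sketch of the contents of the `im_info` and `gt_boxes` tops for a single image. It assumes that `im_info` holds `(height, width, scale)` of the resized image and that each `gt_boxes` row is `(x1, y1, x2, y2, class_id)` in resized-image coordinates; the function name `make_blob_contents` and the default sizes of 600 and 1000 are illustrative assumptions, so check the linked code for the exact behaviour.

```python
def make_blob_contents(height, width, boxes, target_size=600, max_size=1000):
    """Return (im_info, gt_boxes) for one image.

    The image is assumed to be resized so that its shorter side equals
    target_size, unless that would push the longer side past max_size.
    """
    short_side = min(height, width)
    long_side = max(height, width)
    scale = float(target_size) / short_side
    if scale * long_side > max_size:
        scale = float(max_size) / long_side

    # Size of the resized image plus the scale factor that was applied.
    im_info = [round(height * scale), round(width * scale), scale]

    # Box coordinates are scaled with the image; class ids are unchanged.
    gt_boxes = [
        [x1 * scale, y1 * scale, x2 * scale, y2 * scale, cls]
        for (x1, y1, x2, y2, cls) in boxes
    ]
    return im_info, gt_boxes
```

In a real Python layer these values would be copied into the `im_info` and `gt_boxes` top blobs in the layer's forward pass, next to the image data itself.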