I am trying to build a deep learning model for saliency analysis using Caffe (through the Python wrapper), but I don't understand how to generate the LMDB data structure for this purpose. I have gone through the ImageNet and MNIST examples, and I understand that labels should be listed in the format
my_test_dir/picture-foo.jpg 0
But in my case, I will be labeling each pixel with 0 or 1 to indicate whether that pixel is salient, so there won't be a single label per image.
How do I generate LMDB files for per-pixel labeling?
In Caffe, both LMDB and HDF5 support multiple labels per image (matrices, if you like); see this thread:
https://github.com/BVLC/caffe/issues/1698#issue-53768814
See this tutorial on how to create a multi-label dataset (lmdb here) for caffe with python code:
http://www.kostyaev.me/article/Multilabel%20Dataset/
EDIT: For the labels, for example, it uses the Caffe Python function that converts a 3-dimensional array to a Datum, found in caffe/python/caffe/io.py: array_to_datum(arr, label=None)
You can approach this problem in two ways:
1. Use an HDF5 data layer instead of LMDB. HDF5 is more flexible and can support labels the size of the image. You can see this answer for an example of constructing and using an HDF5 input data layer.
2. You can have two LMDB input layers: one for the image and one for the label. Note that when you build the LMDB you must not use the 'shuffle' option in order to have the images and their labels in sync.
Update: I recently gave a more detailed answer here.
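A minimal sketch of the HDF5 route (option 1), assuming the images are H x W x C arrays and the saliency masks are H x W arrays of 0/1. The function names, the file name train.h5, and the dataset names data and label are my own choices for illustration. As far as I know, the dataset names need to match the top blob names of the HDF5Data layer, and float32 is a safe dtype for Caffe to read.

```python
import numpy as np


def to_caffe_arrays(images, masks):
    """Stack H x W x C images and H x W masks into N x C x H x W float32 arrays."""
    # Caffe expects channel-first blobs, so move the channel axis to the front.
    data = np.stack([img.transpose(2, 0, 1) for img in images]).astype(np.float32)
    # Each mask becomes a single-channel "image" with the same spatial size.
    label = np.stack([m[np.newaxis, :, :] for m in masks]).astype(np.float32)
    return data, label


def write_hdf5(path, data, label):
    """Write both arrays to one HDF5 file (dataset names are an assumption)."""
    import h5py  # imported here so the helper above works without h5py installed

    with h5py.File(path, "w") as f:
        f.create_dataset("data", data=data)
        f.create_dataset("label", data=label)


# Hypothetical usage:
#   data, label = to_caffe_arrays(images, masks)
#   write_hdf5("train.h5", data, label)
```

The source parameter of the HDF5Data layer should then point to a text file that lists the path to train.h5, one file per line.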
Check this one: http://deepdish.io/2015/04/28/creating-lmdb-in-python/
Just load all images in X and the corresponding labels in Y.
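For the per-pixel case in the question, a rough sketch of how this could look with two LMDBs, one for the images and one for the label maps. It assumes X has shape N x C x H x W and Y has shape N x 1 x H x W, and that lmdb and pycaffe are installed. The helper names, the LMDB directory names, and the 1 GiB map_size are my own assumptions, not something taken from the linked post. Writing both databases with the same keys in the same order keeps each image aligned with its label map.

```python
def make_key(index):
    """Zero-padded key so LMDB's lexicographic key order matches insertion order."""
    return "{:08d}".format(index).encode("ascii")


def write_lmdb(path, arrays, map_size=1 << 30):
    """Write a sequence of C x H x W arrays to an LMDB, one Datum per array.

    map_size is the maximum database size in bytes; 1 GiB is an arbitrary
    default here and has to be raised for larger datasets.
    """
    import lmdb   # imported lazily so make_key can be used on its own
    import caffe

    env = lmdb.open(path, map_size=map_size)
    with env.begin(write=True) as txn:
        for i, arr in enumerate(arrays):
            # array_to_datum expects a 3-dimensional array (C x H x W).
            datum = caffe.io.array_to_datum(arr)
            txn.put(make_key(i), datum.SerializeToString())
    env.close()


# Hypothetical usage, with X and Y already loaded as described above:
#   write_lmdb("train_images_lmdb", X)
#   write_lmdb("train_labels_lmdb", Y)
```

The network then needs two Data layers, one per LMDB, and neither database should be shuffled after it is written.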