I'd like to compare the performance of the following types of CNNs on two different large image data sets. The goal is to measure the similarity between two images, neither of which has been seen during training. I have access to 2 GPUs and 16 CPU cores.
- Triplet CNN (Input: Three images, Label: encoded in position)
- Siamese CNN (Input: Two images, Label: one binary label)
- Softmax CNN for Feature Learning (Input: One image, Label: one integer label)
For the softmax CNN, I can store the data in a binary format (label and image stored sequentially) and then read it with a TensorFlow reader.
To use the same method for the triplet and Siamese networks, I'd have to generate the combinations in advance and store them to disk. That would cause a large overhead, both in the time it takes to create the file and in disk space. How can this be done on the fly?
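For illustration, the combination step itself is cheap to do lazily; a minimal sketch of on-the-fly triplet sampling over example indices (function names and the sampling scheme are my own, not from any particular library):

```python
import random
from collections import defaultdict

def index_by_label(labels):
    """Map each class label to the list of example indices carrying it."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    return by_label

def triplet_generator(labels, seed=None):
    """Yield (anchor, positive, negative) index triples endlessly,
    so no combinations ever need to be materialized on disk."""
    rng = random.Random(seed)
    by_label = index_by_label(labels)
    # anchor/positive pairs need a class with at least two examples
    pos_classes = [c for c, idxs in by_label.items() if len(idxs) >= 2]
    all_classes = list(by_label)
    while True:
        pos_class = rng.choice(pos_classes)
        neg_class = rng.choice([c for c in all_classes if c != pos_class])
        anchor, positive = rng.sample(by_label[pos_class], 2)
        negative = rng.choice(by_label[neg_class])
        yield anchor, positive, negative
```

A generator like this could presumably be hooked into the input pipeline (e.g. via `tf.data.Dataset.from_generator` in newer TensorFlow versions) and the yielded indices used to look up the corresponding image records, though whether that fits a queue-based reader pipeline is exactly what I'm unsure about.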
Another easy option would be feed_dict, but that would be slow. The problem would therefore be solved if it were possible to run the same function I'd use for feed_dict in parallel and convert the result to a TensorFlow tensor as a last step. But as far as I know, no such conversion exists, so one has to read the files with a TensorFlow reader in the first place and do the whole process with TensorFlow ops. Is this correct?