I'm using the Keras VGG16 model.
I've seen that there is a preprocess_input method to use in conjunction with the VGG16 model. This method appears to call the preprocess_input method in imagenet_utils.py, which (depending on the case) calls the _preprocess_numpy_input method in imagenet_utils.py.
The preprocess_input function has a mode argument which expects "caffe", "tf", or "torch". If I'm using the model in Keras with the TensorFlow backend, should I absolutely use mode="tf"?
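For reference, here is roughly how I'm calling it at the moment (just a sketch; the image file is a placeholder):

```python
import numpy as np
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.preprocessing import image

model = VGG16(weights='imagenet')

img = image.load_img('example.jpg', target_size=(224, 224))  # placeholder image
x = image.img_to_array(img)       # shape (224, 224, 3), values in [0, 255]
x = np.expand_dims(x, axis=0)     # add batch dimension -> (1, 224, 224, 3)
x = preprocess_input(x)           # or should this be preprocess_input(x, mode="tf")?
preds = model.predict(x)
```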
If yes, is this because the VGG16 model loaded by Keras was trained with images that underwent the same preprocessing (i.e. had their pixel range rescaled from [0, 255] to [-1, 1])?
Also, should the input images at test time undergo the same preprocessing? I'm fairly confident the answer to this last question is yes, but I would like some reassurance.
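For concreteness, this is what I understand mode="tf" to do; it's my own rough sketch of the scaling, not the actual Keras code:

```python
import numpy as np

def tf_mode_preprocess(x):
    # My understanding of mode="tf": scale pixels from [0, 255] to [-1, 1].
    x = x / 127.5
    x -= 1.0
    return x

batch = np.random.randint(0, 256, size=(1, 224, 224, 3)).astype('float32')
out = tf_mode_preprocess(batch)
print(out.min(), out.max())  # roughly -1.0 and 1.0
```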
I would expect Francois Chollet to have done it correctly, but looking at https://github.com/fchollet/deep-learning-models/blob/master/vgg16.py, either he is wrong or I am wrong about using mode="tf".
Updated info
@FalconUA directed me to the VGG page at Oxford, which has a Models section with links for the 16-layer model. The information about the preprocess_input mode argument (tf scaling to [-1, 1] and caffe subtracting some mean values) is found by following the link to the 16-layer model's information page in the Models section. In the Description section it says:
"In the paper, the model is denoted as the configuration D trained with scale jittering. The input images should be zero-centered by mean pixel (rather than mean image) subtraction. Namely, the following BGR values should be subtracted: [103.939, 116.779, 123.68]."