I have been trying TFLite to increase detection speed on Android but strangely my .tflite model now almost only detects 1 category.
I have done testing on the .pb model that I got after retraining a mobilenet and the results are good but for some reason, when I convert it to .tflite the detection is way off...
For the retraining I used the retrain.py file from Tensorflow for poets 2
I am using the following commands to retrain, optimize for inference and convert the model to tflite:
python retrain.py \
--image_dir ~/tf_files/tw/ \
--tfhub_module https://tfhub.dev/google/imagenet/mobilenet_v1_100_224/feature_vector/1 \
--output_graph ~/new_training_dir/retrainedGraph.pb \
-–saved_model_dir ~/new_training_dir/model/ \
--how_many_training_steps 500
sudo toco \
--input_file=retrainedGraph.pb \
--output_file=optimized_retrainedGraph.pb \
--input_format=TENSORFLOW_GRAPHDEF \
--output_format=TENSORFLOW_GRAPHDEF \
--input_shape=1,224,224,3 \
--input_array=Placeholder \
--output_array=final_result \
sudo toco \
--input_file=optimized_retrainedGraph.pb \
--input_format=TENSORFLOW_GRAPHDEF \
--output_format=TFLITE \
--output_file=retrainedGraph.tflite \
--inference_type=FLOAT \
--inference_input_type=FLOAT \
--input_arrays=Placeholder \
--output_array=final_result \
--input_shapes=1,224,224,3
Am I doing anything wrong here? Where could the loss in accuracy come from?
Please file an issue on GitHub https://github.com/tensorflow/tensorflow/issues and add the link here.
Also please add more details on what you are retraining the last layer for.
I faced the same issue while I was trying to convert a .pb model into .lite.
In fact, my accuracy would come down from 95 to 30!
Turns out the mistake I was committing was not during the conversion of .pb to .lite or in the command involved to do so. But it was actually while loading the image and pre-processing it before it is passed into the lite model and inferred using
interpreter.invoke()
command.
The below code you see is what I meant by pre-processing:
test_image=cv2.imread(file_name)
test_image=cv2.resize(test_image,(299,299),cv2.INTER_AREA)
test_image = np.expand_dims((test_image)/255, axis=0).astype(np.float32)
interpreter.set_tensor(input_tensor_index, test_image)
interpreter.invoke()
digit = np.argmax(output()[0])
#print(digit)
prediction=result[digit]
As you can see there are two crucial commands/pre-processing done on the image once it is read using "imread()":
i) The image should be resized to the size that is the "input_height" and "input_width" values of the input image/tensor that was used during the training. In my case (inception-v3) this was 299 for both "input_height" and "input_width". (Read the documentation of the model for this value or look for this variable in the file that you used to train or retrain the model)
ii) The next command in the above code is:
test_image = np.expand_dims((test_image)/255, axis=0).astype(np.float32)
I got this from the "formulae"/model code:
test_image = np.expand_dims((test_image-input_mean)/input_std, axis=0).astype(np.float32)
Reading the documentation revealed that for my architecture input_mean = 0 and input_std = 255.
When I did the said changes to my code, I got the accuracy that was expected (90%).
Hope this helps.