I have trained a TensorFlow model with fake quantization and frozen it to a .pb file. Now I want to feed this .pb file to TensorFlow Lite's toco for full quantization and get a .tflite file.
I am using this tensorflow example: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/micro/examples/micro_speech
The part I have a question about:
bazel run tensorflow/lite/toco:toco -- \
--input_file=/tmp/tiny_conv.pb --output_file=/tmp/tiny_conv.tflite \
--input_shapes=1,49,43,1 --input_arrays=Reshape_1 --output_arrays='labels_softmax' \
--inference_type=QUANTIZED_UINT8 --mean_values=0 --std_values=2 \
--change_concat_input_ranges=false
The command above calls toco and performs the conversion. Note that mean_values is set to 0 and std_values to 2 by Google. How did they calculate these two values? This particular model is trained to recognize the words "yes" and "no". If I instead want to recognize the ten digits, do I need to change the mean and std values? I couldn't find any official documentation covering this part. Any help would be appreciated.
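For context on what these flags mean: toco documents the convention that the dequantized input value is recovered as real_value = (quantized_value - mean_value) / std_value. Under that convention, mean and std are determined entirely by the float range the model's input features occupy, so they can be derived from the input range (and vice versa). The following is a minimal sketch of that arithmetic; the function names are my own, not part of any TensorFlow API:

```python
# Sketch of toco's input quantization convention:
#   real_value = (quantized_value - mean_value) / std_value
# where quantized_value is a uint8 in [0, 255].

def quantization_params(real_min, real_max):
    """Derive (mean, std) that map the float range [real_min, real_max]
    onto the uint8 range [0, 255]."""
    std = 255.0 / (real_max - real_min)   # quant 255 must land on real_max
    mean = -real_min * std                # quant 0 must land on real_min
    return mean, std

def real_range(mean, std):
    """Inverse direction: the float range covered by given (mean, std)."""
    return (0 - mean) / std, (255 - mean) / std

# The values from the command above:
print(real_range(mean=0, std=2))              # -> (0.0, 127.5)
print(quantization_params(0.0, 127.5))        # -> (0.0, 2.0)
```

So mean=0, std=2 implies the model expects float inputs in roughly [0, 127.5], which would correspond to the range of the spectrogram features, not to which words are in the label set.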