I want to reuse not only the pre-trained weights of the feature extractor but also the pre-trained classification/localization weights of the feature-map layers when fine-tuning SSD models with the TensorFlow Object Detection API. When my new model has a different number of classes from the pre-trained model whose checkpoint I fine-tune from, how does the API handle the classification weight tensors?
When fine-tuning a pre-trained object detection model such as SSD, I can initialize the feature extractor with the pre-trained weights, and I can also initialize the localization and classification layers of the feature maps. For the classification layers I would keep only the pre-trained weights of the classes I choose, which reduces the number of classes the model can identify from the start (for example, from the 90 MS COCO classes down to a subset of them, such as cars and pedestrians only).
https://github.com/pierluigiferrari/ssd_keras/blob/master/weight_sampling_tutorial.ipynb
This is how it's done for Keras models (i.e. in h5 files), and I want to do the same in the TensorFlow Object Detection API. It seems that at training time I can specify the number of classes of the new model in the config protobuf file, but since I'm new to the API (and to TensorFlow) I haven't been able to follow the source structure and understand how that number is handled during fine-tuning. Most SSD implementations I know simply discard the pre-trained classification weights and re-initialize the tensor when its shape differs from the new model's classification weight shape, but I want to retain the relevant classification weights and train on top of those. How would I do that within the API structure?
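To make concrete what I mean by retaining only some class weights, here is a minimal NumPy sketch of the sub-sampling idea. It is not code from the API or from the tutorial linked above. The function name `subsample_class_weights` is made up, and the layout is an assumption: the last axis of the classification kernel is taken to hold `n_boxes * n_source_classes` outputs, grouped per box, with the background slot counted as one of the classes.

```python
import numpy as np


def subsample_class_weights(kernel, bias, n_source_classes, keep_classes):
    """Keep only selected class outputs of an SSD-style classification layer.

    Assumed layout (hypothetical, check against the real model): the last
    axis of `kernel` and the only axis of `bias` have length
    n_boxes * n_source_classes, ordered box by box, class by class.
    """
    n_out = kernel.shape[-1]
    if n_out % n_source_classes != 0:
        raise ValueError("output axis is not a multiple of n_source_classes")
    if bias.shape != (n_out,):
        raise ValueError("bias shape does not match the kernel's output axis")

    n_boxes = n_out // n_source_classes
    keep = np.asarray(keep_classes, dtype=np.int64)
    if keep.min() < 0 or keep.max() >= n_source_classes:
        raise ValueError("keep_classes contains an out-of-range class index")

    # For every box, pick the flat output indices of the classes to keep.
    box_offsets = np.arange(n_boxes)[:, None] * n_source_classes
    flat_idx = (box_offsets + keep[None, :]).ravel()

    return kernel[..., flat_idx], bias[flat_idx]


# Example: 6 boxes per cell, 91 source slots (90 classes + background),
# keeping background (0) and two arbitrary class indices (1 and 3).
kernel = np.random.rand(3, 3, 16, 6 * 91).astype(np.float32)
bias = np.zeros(6 * 91, dtype=np.float32)
new_kernel, new_bias = subsample_class_weights(kernel, bias, 91, [0, 1, 3])
print(new_kernel.shape, new_bias.shape)  # (3, 3, 16, 18) (18,)
```

The resulting tensors have the shape a model with three class slots would expect, so in principle they could be written into a new checkpoint under the same variable names. How to do that inside the API is the part I'm asking about.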
Thanks!
Reading through the source, I found the code responsible: it retains a pre-trained weight only if the corresponding variable has the same shape in the newly defined model and in the pre-trained checkpoint. So if I change the number of classes, the shape of the classifier layers changes and their pre-trained weights are not retained.
https://github.com/tensorflow/models/blob/master/research/object_detection/utils/variables_helper.py#L133
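The behavior I describe can be illustrated with a small standalone sketch. This is my own simplification in plain Python, not the code at the link, and the function name `split_by_shape` and the variable names in the example are hypothetical. It only shows the rule "restore a variable if and only if its name exists in the checkpoint with an identical shape".

```python
def split_by_shape(model_shapes, checkpoint_shapes):
    """Split model variables into restorable and skipped ones.

    Both arguments map variable name -> shape (any sequence of ints).
    A variable is restorable only when the checkpoint contains the same
    name with exactly the same shape; everything else is skipped and
    would keep its fresh initialization.
    """
    restorable, skipped = {}, {}
    for name, shape in model_shapes.items():
        ckpt_shape = checkpoint_shapes.get(name)
        if ckpt_shape is not None and tuple(ckpt_shape) == tuple(shape):
            restorable[name] = tuple(shape)
        else:
            skipped[name] = tuple(shape)
    return restorable, skipped


# Hypothetical names and shapes: the feature extractor matches, while the
# class predictor shrinks from 6 * 91 to 6 * 3 outputs and is skipped.
model_shapes = {
    "FeatureExtractor/conv1/weights": (3, 3, 3, 32),
    "BoxPredictor_0/ClassPredictor/weights": (1, 1, 512, 18),
}
checkpoint_shapes = {
    "FeatureExtractor/conv1/weights": (3, 3, 3, 32),
    "BoxPredictor_0/ClassPredictor/weights": (1, 1, 512, 546),
}
restorable, skipped = split_by_shape(model_shapes, checkpoint_shapes)
print(sorted(restorable))
print(sorted(skipped))
```

Under this rule, the only way for the classifier weights to survive is for the checkpoint itself to already contain tensors with the new, smaller shape.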