I am running the Docker image tensorflow:1.1.0. I cloned the TensorFlow Object Detection API repository from GitHub locally and mounted the folder into my Docker container. I am trying to recreate their pet example.
I believe I have all the code in the right places. However, when I try to retrain, TensorFlow gets killed before it starts training, without reporting any warnings or errors.
INFO:tensorflow:Starting Session.
INFO:tensorflow:Starting Queues.
INFO:tensorflow:global_step/sec: 0
Killed
I imagine I have something out of order, but without any errors or output I don't know where to look!
I am following the guide here to run things locally: link. The pet data was obtained from the same GitHub: link. I got my model configuration from the same GitHub as well. link
I chose inception_v2.
I now realize that I was simply running out of memory as soon as I started. When I switched to ssd_mobilenet_v1_coco_2017_11_17 and changed batch_size = 1, it all worked. This still doesn't entirely fix my problem though. I'll have to figure out how to provide more memory for the call.
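For anyone hitting the same thing: a bare "Killed" with no Python traceback usually means the process received SIGKILL, most often from the Linux out-of-memory killer. A quick way to see how much memory is free before launching training is to read /proc/meminfo. This is only a minimal sketch: the helper name available_memory_mib is my own, it is Linux-only, and inside a container /proc/meminfo typically shows the host's (or the Docker VM's) memory rather than any per-container limit, so treat the number as a rough upper bound.

```python
import os


def available_memory_mib(meminfo_text):
    """Return MemAvailable from /proc/meminfo-style text in MiB, or None if absent.

    Hypothetical helper for illustration; not part of TensorFlow or the
    Object Detection API.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # The line looks like "MemAvailable:    2097152 kB".
            kib = int(line.split()[1])
            return kib // 1024
    return None


if __name__ == "__main__":
    meminfo_path = "/proc/meminfo"  # only exists on Linux
    if os.path.exists(meminfo_path):
        with open(meminfo_path) as f:
            print("Available memory (MiB):", available_memory_mib(f.read()))
    else:
        print("No /proc/meminfo here; this check only works on Linux.")
```

If the reported value is small compared to what the model needs, lowering batch_size or choosing a lighter model (as I did above) is the cheap fix. Otherwise the container or the Docker VM needs more memory, for example through Docker's memory settings.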