TensorRT and Tensorflow 2

2020-07-22 03:15发布

问题:

I am trying to speed up the inference of yolov3 TF2 with TensorRT. I am using the TrtGraphConverter function in tensorflow 2.

My code is essentially this:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

tf.keras.backend.set_learning_phase(0)
converter = trt.TrtGraphConverter(
    input_saved_model_dir="./tmp/yolosaved/",
    precision_mode="FP16",
    is_dynamic_op=True)
converter.convert()


saved_model_dir_trt = "./tmp/yolov3.trt"
converter.save(saved_model_dir_trt)

And this generates the following error:

Traceback (most recent call last):
  File "/home/pierre/Programs/anaconda3/envs/Deep2/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 427, in import_graph_def
    graph._c_graph, serialized, options)  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 1 of node StatefulPartitionedCall was passed float from conv2d/kernel:0 incompatible with expected resource.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pierre/Documents/GitHub/yolov3-tf2/tensorrt.py", line 23, in <module>
    converter.save(saved_model_dir_trt)
  File "/home/pierre/Programs/anaconda3/envs/Deep2/lib/python3.6/site-packages/tensorflow/python/compiler/tensorrt/trt_convert.py", line 822, in save
    super(TrtGraphConverter, self).save(output_saved_model_dir)
  File "/home/pierre/Programs/anaconda3/envs/Deep2/lib/python3.6/site-packages/tensorflow/python/compiler/tensorrt/trt_convert.py", line 432, in save
    importer.import_graph_def(self._converted_graph_def, name="")
  File "/home/pierre/Programs/anaconda3/envs/Deep2/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/pierre/Programs/anaconda3/envs/Deep2/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 431, in import_graph_def
    raise ValueError(str(e))
ValueError: Input 1 of node StatefulPartitionedCall was passed float from conv2d/kernel:0 incompatible with expected resource.

Does this mean that some of my nodes can't be converted? In this case, why does my code error out during the .save step?

回答1:

I ended up solving this issue with the following code. Also I switched from tf 2.0.-beta0 to tf-nightly-gpu-2.0-preview

params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode='FP16',
    is_dynamic_op=True)
    
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir=saved_model_dir,
    conversion_params=params)
converter.convert()
saved_model_dir_trt = "/tmp/model.trt"
converter.save(saved_model_dir_trt)

thanks for your help



回答2:

When you are using TensorRT please keep in mind that there might be unsupported layers in your model architecture. There is TensorRT support matrix for your reference. YOLO consist a lot of unimplemented custom layers such as "yolo layer".

So, if you want to convert YOLO to TensorRT optimized model, you need to choose from alternative ways.

  1. Try TF-TRT which optimizes and executes compatible subgraphs, allowing TensorFlow to execute the remaining graph. While you can still use TensorFlow's wide and flexible feature set, TensorRT will parse the model and apply optimizations to the portions of the graph wherever possible.
  2. Implement your custom layers with Plugin API like this example.


回答3:

Might be a bit of a reach, but which gpu are you using? I know that precision_mode="FP16" is just supported in certain architectures, like Pascal (tx2 series) and Turing (~2080 series). I've had good results porting from TF2 to trt with fp16.