I have the following situation:
- I want to deploy a face detector model using TensorFlow Serving: https://www.tensorflow.org/serving/.
- TensorFlow Serving has a command line option called `--enable_batching`. This causes the model server to automatically batch incoming requests to maximize throughput. I want this enabled.
- My model takes in a set of images (called `images`), which is a tensor of shape `(batch_size, 640, 480, 3)`.
- The model has two outputs, of shapes `(number_of_faces, 4)` and `(number_of_faces,)`. The first output will be called `faces`. The second output, which we can call `partitions`, holds the index in the original batch of the image each face came from. For example, if I pass in a batch of 4 images and get 7 faces, this tensor might be `[0, 0, 1, 2, 2, 2, 3]`: the first two faces belong to the first image, the third face to the second image, the next three faces to the third image, and the last face to the fourth image.
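As a concrete check of that example, the grouping implied by `partitions` can be reproduced in NumPy (the values here are made up to match the example above):

```python
import numpy as np

# 7 faces from a batch of 4 images, as in the example above.
partitions = np.array([0, 0, 1, 2, 2, 2, 3])

# Number of faces detected in each of the 4 images.
faces_per_image = np.bincount(partitions, minlength=4)
print(faces_per_image)  # [2 1 3 1]
```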
My issue is this:
- In order for the `--enable_batching` flag to work, the output of my model needs to have the same 0th dimension as the input. That is, I need a tensor of shape `(batch_size, ...)`. I suppose this is so the model server knows which gRPC connection to send each output in the batch to.
- What I want to do is convert the face detector's output from shape `(number_of_faces, 4)` to shape `(batch_size, None, 4)`. That is, an array of per-image results, where each image can have a variable number of faces (e.g. one image in the batch may have no faces, and another might have 3).
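To make the target shape concrete: a served output cannot literally have a `None` dimension, so in practice each image's faces get padded to a common length. Here is a NumPy sketch of the intended regrouping; the variable names and the zero-padding convention are my own assumptions, not from the original post:

```python
import numpy as np

# Hypothetical detector output for a batch of 4 images with 7 faces total.
faces = np.arange(28, dtype=np.float32).reshape(7, 4)  # (number_of_faces, 4)
partitions = np.array([0, 0, 1, 2, 2, 2, 3])           # batch index per face
batch_size = 4

groups = [faces[partitions == i] for i in range(batch_size)]
max_faces = max(len(g) for g in groups)                # 3 in this example

# (batch_size, max_faces, 4): zero-padded so dim 0 matches the input batch.
batched = np.zeros((batch_size, max_faces, 4), dtype=np.float32)
for i, g in enumerate(groups):
    batched[i, :len(g)] = g

print(batched.shape)  # (4, 3, 4)
```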
What I tried: `tf.dynamic_partition`. On the surface, this function looks perfect. However, I ran into difficulties after realizing that the `num_partitions` parameter cannot be a tensor, only a Python integer:

```python
tensorflow_serving_output = tf.dynamic_partition(faces, partitions, batch_size)
```

If `tf.dynamic_partition` accepted a tensor value for `num_partitions`, then it seems my problem would be solved. However, since this is not the case, I am back to square one.
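For reference, `tf.dynamic_partition` does exactly the right thing when the batch size is a static Python integer; the limitation is only that `num_partitions` cannot be a tensor such as `tf.shape(images)[0]`. A minimal illustration with made-up face values:

```python
import tensorflow as tf

faces = tf.reshape(tf.range(28, dtype=tf.float32), (7, 4))
partitions = tf.constant([0, 0, 1, 2, 2, 2, 3])

# Works because num_partitions is a compile-time Python int, not a tensor.
parts = tf.dynamic_partition(faces, partitions, num_partitions=4)
print([int(tf.shape(p)[0]) for p in parts])  # [2, 1, 3, 1]
```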
Thank you all for your help! Let me know if anything is unclear.
P.S. Here is a visual representation of the intended process:
I ended up finding a solution to this using `TensorArray` and `tf.while_loop`:
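The actual code was not included in the post, so here is a sketch of how such a solution might look. The fixed padding length `max_faces` and the zero-padding convention are my assumptions (and every image is assumed to have at most `max_faces` detections), not details from the original:

```python
import tensorflow as tf

def batch_faces(faces, partitions, batch_size, max_faces):
    """Regroup `faces` of shape (number_of_faces, 4) into a tensor of
    shape (batch_size, max_faces, 4), zero-padding images that have
    fewer than `max_faces` detections, so that the 0th output dimension
    matches the input batch as --enable_batching requires."""
    ta = tf.TensorArray(tf.float32, size=batch_size,
                        element_shape=(max_faces, 4))

    def body(i, ta):
        # Select the faces belonging to image i and pad to max_faces rows.
        # Assumes no image produces more than max_faces detections.
        img_faces = tf.boolean_mask(faces, tf.equal(partitions, i))
        n = tf.shape(img_faces)[0]
        padded = tf.pad(img_faces, [[0, max_faces - n], [0, 0]])
        return i + 1, ta.write(i, padded)

    _, ta = tf.while_loop(lambda i, _: i < batch_size, body,
                          [tf.constant(0), ta])
    return ta.stack()  # (batch_size, max_faces, 4)

faces = tf.reshape(tf.range(28, dtype=tf.float32), (7, 4))
partitions = tf.constant([0, 0, 1, 2, 2, 2, 3])
out = batch_faces(faces, partitions, batch_size=4, max_faces=3)
print(out.shape)  # (4, 3, 4)
```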