I am working on object detection using TensorFlow.js. I am trying to run a custom object detection TensorFlow.js model in a browser. I was able to convert the TensorFlow model to a TensorFlow.js model (in Google Colab) using the following command:
!tensorflowjs_converter \
--input_format=tf_frozen_model \
--output_node_names='detection_boxes,detection_scores,detection_classes,num_detections' \
/content/frozen_inference_graph.pb \
/content/web_model
I am sharing the code snippet of the inference.html file [Updated]:
<html>
  <head>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@latest"></script>
    <!--<script src="webcam.js"></script>-->
  </head>
  <body>
    <img id="img" src="257.jpg" width="300" height="300"/>
    <button type="button" id="startPredicting" onclick="startPredicting()">Start Predicting</button>
    <button type="button" id="stopPredicting" onclick="stopPredicting()">Stop Predicting</button>
    <div id="prediction"></div>
    <script src="index.js"></script>
  </body>
</html>
The code snippet of the index.js file is as follows [Updated]:
let model;
let isPredicting = false;

async function init() {
  try {
    model = await tf.loadGraphModel('http://127.0.0.1:8887/uno_model/model.json');
  } catch (err) {
    console.log(err);
  }
}

async function predict() {
  console.log("executing model");
  const img = document.getElementById('img');
  let tf_img = tf.browser.fromPixels(img);
  tf.print(tf_img);
  tf_img = tf_img.expandDims(0);
  console.log(tf_img.shape); // Image dimension is [1, 300, 300, 3]
  let output = await model.executeAsync(
    { 'image_tensor': tf_img },
    ['detection_boxes', 'detection_scores', 'detection_classes', 'num_detections']);
  for (let i = 0; i < output.length; i++) {
    console.log(output[i].dataSync());
  }
}

init();

function startPredicting() {
  isPredicting = true;
  predict();
}

function stopPredicting() {
  isPredicting = false;
  predict();
}
It produces output in the browser console [Updated] (the flat arrays logged by dataSync), but I couldn't get class labels etc. from it. How can I extract detection_classes, detection_scores, and detection_boxes? This model works properly with the Python code.
[Updated]: It seems I am getting the output after providing a [1, 300, 300, 3] image as input to the model.
Could you please guide me? Am I missing something?
Finally, I could figure out the problem: it was related to the size of the input frame. The SSD model needs an image/frame of shape [1, 300, 300, 3] as input. I added this to my code and got the solution. The img tag in inference.html feeds a (300, 300, 3) image to the model, and the lines in index.js expand it to the [1, 300, 300, 3] shape that SSD needs, as sketched below.
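For reference, the relevant lines are the ones already shown in the question. In inference.html the image element is sized to 300x300:
<img id="img" src="257.jpg" width="300" height="300"/>
and in index.js:
// convert the 300x300 <img> element to a tensor, then add a batch dimension
let tf_img = tf.browser.fromPixels(document.getElementById('img')); // shape [300, 300, 3]
tf_img = tf_img.expandDims(0);                                      // shape [1, 300, 300, 3]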
Although the Python model is not included in the question, the extracted output nodes of the model, the size of the tensors, and their types provide enough insight to identify detection_classes, detection_scores, and detection_boxes.
The first tensor has size 400 and corresponds to detection_boxes. The dataSync operator returns a completely flattened array, and a size of 400 most likely corresponds to the shape [100, 4]; this is corroborated by the shapes of the other tensors discussed below. [100, 4] means there are 100 bounding boxes for the input (most likely an image). Concretely, the first four elements correspond to the first bounding box, and so on.

The second tensor corresponds to detection_scores. There are 100 detection scores, one for each of the 100 bounding boxes; the first element of this array corresponds to the first four elements of the first array (the detection_boxes array).

The third array corresponds to detection_classes. It is an array of 100 integers where each value is the index of the matched label.

The fourth array corresponds to num_detections. It contains how many detections there are: 100.
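Putting that together, here is a rough sketch of how the four flat arrays could be read in index.js, assuming the output order requested in executeAsync above and an arbitrary 0.5 score threshold (boxes are typically normalized [ymin, xmin, ymax, xmax] for models exported from the TF Object Detection API):
const [boxes, scores, classes, numDetections] = output.map(t => t.dataSync());
// boxes is flat with length 400: 100 boxes of 4 values each
for (let i = 0; i < scores.length; i++) {
  if (scores[i] > 0.5) { // keep only reasonably confident detections
    const box = boxes.slice(i * 4, (i + 1) * 4); // [ymin, xmin, ymax, xmax]
    console.log('class index:', classes[i], 'score:', scores[i], 'box:', box);
  }
}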
To get the label (string), the index taken from detection_classes should be looked up in the JSON object (a dictionary in Python) or array that contains all the labels and their indexes, as sketched below.
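For illustration only, with a hypothetical labelMap whose names are made up (use the label map the model was actually trained with):
// hypothetical label map; replace with the labels from training
const labelMap = { 1: 'red_2', 2: 'blue_5', 3: 'green_skip' };
// classes is the flat detection_classes array from the previous sketch
const firstLabel = labelMap[classes[0]];
console.log('first detection label:', firstLabel);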
It is also worth noting that, for the js model to return the same output as the Python model, all the processing done to the image in Python before feeding the model should be replicated in js.