-
Hello again, I had been trying to do almost the same steps before reading this post, so this might be a good sign that I am going in the right direction. I will keep trying to solve it; just let me know if you have any idea to try, or even better, if you have a solution that you have already tested. Thank you for your help, I will also let you know if I make any progress.
-
Hi guys, how are you doing? Has anyone managed to parse the Yolo output? I think I can extract the results, but I still haven't been able to handle the detections and apply the bounding boxes. Does anyone have a tip on how to do it? hahaha Thanks guys!
-
This is an informal guide to help users understand how to handle the output of a quantized yolov5 model obtained by following the guide in this repository.
It is not polished and might contain errors.
You might be used to a different output format, because the scripts in the yolov5 repository usually post-process the output for you.
Once you have your model, you can create an ACAP that runs it on the device.
There are currently no examples of an ACAP running Yolov5 specifically, but there are many examples showing how to run other models that can be adapted; see minimal ml inference, object detection python, pose estimator with flask and the native object detector.
One tricky part of integrating Yolo in your ACAP can be understanding the output format.
As output, you will get an array of `N x (5+C)` values, where `C` is the number of classes specified in your training and `N` is a fixed number of detections (typically 25200).

So for each detection, you will get an array of `(5+C)` elements that will look like this:

`[x, y, w, h, object_likelihood, class1_likelihood, class2_likelihood, class3_likelihood, ... ]`

`x` and `y` are the coordinates of the center of your detection box, `w` and `h` are the width and height of your detection box, `object_likelihood` can be interpreted as the likelihood that the box is "something", `class1_likelihood` is the likelihood that the object is of class 1, and so on.
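To make the layout concrete, here is a small sketch (not from the original post) of how one detection row could be unpacked in Python; the array below is a random stand-in for the real, already dequantized model output:

```python
import numpy as np

N, C = 25200, 3                          # number of candidate detections, number of classes
output = np.random.rand(N, 5 + C)        # stand-in for the real (already dequantized) model output

detection = output[0]                    # one of the N candidate detections
x, y, w, h = detection[:4]               # box center coordinates, width and height
object_likelihood = detection[4]         # likelihood that the box is "something"
class_likelihoods = detection[5:]        # one likelihood per class
best_class = int(np.argmax(class_likelihoods))
```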
Another thing to keep in mind is that the output is quantized.
To obtain meaningful values (especially for the box locations), you need to apply a shift and a scale to the output values.
The quantization parameters can be found before developing the ACAP with a code snippet like the one below.
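The exact snippet is not reproduced in this thread; as a rough sketch, assuming the quantized model was exported to TFLite (the file name below is a placeholder), the parameters can be read with the TensorFlow Lite interpreter:

```python
import tensorflow as tf

# Load the quantized model and inspect the metadata of its output tensor.
interpreter = tf.lite.Interpreter(model_path="yolov5_quantized.tflite")  # placeholder path
interpreter.allocate_tensors()

output_details = interpreter.get_output_details()
scale, zero_point = output_details[0]["quantization"]  # (scale, zero_point) of the output tensor
print("scale:", scale, "zero_point:", zero_point)
```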
The quantization parameters will not change unless the model changes, so they could be hardcoded as part of your ACAP.
Here is another pseudo-code snippet to give an idea of how to process your output:
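The original snippet is not included here; the following is a rough, illustrative sketch of the usual steps (dequantize with the scale and zero point from above, filter by confidence, convert center-format boxes to corners). The threshold value and the omission of non-maximum suppression are simplifications, and it assumes the box coordinates are normalized to [0, 1]:

```python
import numpy as np

CONF_THRESHOLD = 0.5  # assumption: tune this for your use case

def process_output(raw_output, scale, zero_point, img_width, img_height):
    """Turn the raw quantized (N, 5 + C) output into a list of detections."""
    # 1. Dequantize: shift and scale the integer output back to real values.
    output = (raw_output.astype(np.float32) - zero_point) * scale

    detections = []
    for det in output:
        x, y, w, h, object_likelihood = det[:5]
        class_likelihoods = det[5:]

        # 2. Keep only boxes that are likely to be "something" of some class.
        score = object_likelihood * np.max(class_likelihoods)
        if score < CONF_THRESHOLD:
            continue

        # 3. Convert from center format to corner format, scaled to pixels
        #    (assuming normalized coordinates).
        x_min = (x - w / 2) * img_width
        y_min = (y - h / 2) * img_height
        x_max = (x + w / 2) * img_width
        y_max = (y + h / 2) * img_height

        detections.append({
            "box": (x_min, y_min, x_max, y_max),
            "score": float(score),
            "class_id": int(np.argmax(class_likelihoods)),
        })

    # A real implementation would also apply non-maximum suppression here.
    return detections
```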
Keep in mind that these snippets are meant to offer high-level guidance; they are "pseudo code".
Feedback is always appreciated!