
Any way to boost YOLO performance on Jetson Orin? #605

Open
lida2003 opened this issue Dec 18, 2024 · 5 comments

Comments

@lida2003

lida2003 commented Dec 18, 2024

Hi

I know there is a performance test in the YOLO documentation: deepstream-nvidia-jetson.

Yes, it's pretty cool. Here is my situation, which is about real-time FPV performance.

Pre-conditions (Jetson Orin Nano 8GB):

  • Camera stream: 60 FPS (stable; can even go up to 120 FPS)
  • Stream codec: H264/H265, currently testing with H264
  • Resolution: 1920 x 1080
  • Model: YOLOv8n
  • DeepStream 6.3 / JetPack 5.1.4 / L4T 35.6 (Ubuntu 20.04)

Right now I get ~50 FPS under the above pre-conditions. Is there any way to boost the FPS to 60 FPS or above without changing the resolution?


EDIT1: With the ONNX model file exported via utils/export_*, I'm NOT sure how to configure the engine file for FP32, FP16, or INT8 precision. It seems INT8 precision can reach a higher FPS.

EDIT2: I found two issues:

  1. When using ByteTrack, I noticed that the FPS did not improve.
  2. When the interval is set to 2 or 3, the bounding box flashes, and it seems the target is only tracked on the frames where detection actually runs (every nth frame). How can I make ByteTrack keep the bounding box continuously on the target, just like when the interval is set to 0?
@marcoslucianops
Owner

marcoslucianops commented Dec 19, 2024

Right now I get ~50 FPS under the above pre-conditions. Is there any way to boost the FPS to 60 FPS or above without changing the resolution?

Try to set the board to MAXN power.
What is the resolution of your model (exported ONNX)?

I'm NOT sure how to configure the engine file for FP32, FP16, or INT8 precision

You can set it via network-mode in the config infer file (0=FP32, 1=INT8, 2=FP16). INT8 requires calibration, and its accuracy is lower than FP16/FP32; I recommend using FP16. Also remember to update the engine path and remove the old generated engine files (on the next run, DeepStream will generate a new engine with FP16 precision).
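
For example, a minimal sketch of the relevant lines in the config infer file (the file contents and ONNX name below are placeholders, not taken from this thread):

    [property]
    onnx-file=yolov8n.onnx
    model-engine-file=model_b1_gpu0_fp16.engine
    # 0=FP32, 1=INT8 (requires calibration), 2=FP16
    network-mode=2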

I found two issues

  1. ByteTracker isn't faster than the NvTracker with basic settings.

  2. This is expected when using interval. It skips frames on inference (see the sketch below).
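
For reference, interval is set in the same config infer file; a minimal sketch, with an illustrative value:

    # run inference only on every 3rd frame; the tracker covers the skipped frames
    interval=2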

@lida2003
Author

lida2003 commented Dec 20, 2024

Try to set the board to MAXN power.

I think I already have max power set on the Jetson Orin:

    sudo nvpmodel -m 0
    sudo jetson_clocks
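
(To double-check, the stock Jetson tools can report the current state; the flags below are assumed from the standard nvpmodel/jetson_clocks utilities:)

    sudo nvpmodel -q           # query the active power mode
    sudo jetson_clocks --show  # show the current clock settings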

What is the resolution of your model (exported ONNX)?

    python3 ./utils/export_yoloV8.py -w yolov8s.pt --dynamic

As the guide says, the default is 640.

You can set in the network-mode on the config infer file. (0=FP32, 1=INT8 and 2=FP16).

Great! That sped it up from ~50 FPS to almost 60 FPS:

    network-mode=2
    model-engine-file=model_b1_gpu0_fp16.engine
  1. ByteTracker isn't faster than the NvTracker with basic settings.

OK

  2. This is expected when using interval. It skips frames on inference.

Is there any way to keep the bounding box (an estimated box) when using interval, or by some equivalent method?


Finally, which ONNX model is suitable for my Jetson Orin 8GB board? The real question right now is: I don't know how to get the best performance out of the hardware.

My goal is to optimize performance as follows:

  • First, maximize hardware performance.
  • Second, tune for an FPS above 60; stable at 65 or above is acceptable, which can handle an FPV video feed.
  • Finally, consider other parameters such as accuracy, algorithm, etc.

From the ./utils/export_yoloV8.py script, what do you recommend for:

    --batch ???
    --size ???

And is it possible (under the above conditions, with BYTETrack or NvTracker) to track only specified objects, e.g. only person and car instead of all 80 classes (which might be more efficient and squeeze out a little more performance)?

EDIT: Should maxShadowTrackingAge be larger than interval? I saw this param set in NvMOT_Query of the BYTETrack lib.

@lida2003
Author

INT8 requires calibration, and its accuracy is lower than FP16/FP32.

Is there any documentation or a link for calibration?
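
(For reference, a minimal sketch of what the INT8 switch looks like in the config infer file, assuming DeepStream's standard nvinfer keys; the calibration table itself still has to be generated from a set of sample images, per the calibration docs:)

    network-mode=1
    # generated during calibration; the INT8 engine cannot be built without it
    int8-calib-file=calib.table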

@marcoslucianops
Owner

marcoslucianops commented Dec 26, 2024

Is there any way to keep the bounding box (an estimated box) when using interval, or by some equivalent method?

Only with custom code; it requires more complex changes. See https://github.com/NVIDIA-AI-IOT/deepstream_python_apps

From the ./utils/export_yoloV8.py script, what do you recommend for

You can use --dynamic (for dynamic batch), and the size depends on the training size. For the COCO model, it should be the default (640). You can use a lower resolution, but it will decrease the accuracy.
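
For example (flag names assumed from the repo's export scripts; check --help for the exact options):

    python3 ./utils/export_yoloV8.py -w yolov8n.pt --dynamic --size 640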

And is it possible (under the above conditions, with BYTETrack or NvTracker) to track only specified objects, e.g. only person and car instead of all 80 classes (which might be more efficient and squeeze out a little more performance)?

By default, I don't think so.

EDIT: Should maxShadowTrackingAge be larger than interval?

This isn't related to interval. It's related to how many frames the tracker keeps an object in its history when the object isn't detected by the inference. I recommend reading the Gst-nvtracker section on the DeepStream website.
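
For reference, a minimal sketch of the relevant section of an NvDCF-style tracker config (key names from the stock configs; values illustrative only, not recommendations):

    TargetManagement:
      # frames an undetected object is kept alive (shadow-tracked)
      maxShadowTrackingAge: 30
      # frames before a new target is confirmed
      probationAge: 3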
