Trying to understand the FPS limit when using YOLOv8 nano for object counting/tracking.

I'm running a script that basically counts objects once they cross a line.
I'm using a custom single-class YOLOv8 nano model. I'm only able to get about 12-13 FPS, and I want to know how much I can expect to improve it.
Technical details: I'm running on a CM4-based setup, with the model input at 640x640.

In addition, the Python script does several other things (including writing to a file every once in a while).

I would appreciate any suggestions on how to improve it or what I can expect as the FPS limit.

Hi @leeor
You can time the script:

  • pipeline (device) latency: DEPTHAI_LEVEL=TRACE python3 script.py will tell you how much time each operation takes
  • host loop: use time.monotonic() to check whether the host loop itself is taking too long (see the sketch below).
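
A minimal host-loop timing sketch (camera-only pipeline so the loop itself is isolated; the stream name "rgb" and the 640x640 preview size are assumptions, and your counting logic would replace the placeholder):

```python
import time

import cv2
import depthai as dai

# Camera-only pipeline: no NN, just to time the host loop itself
pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(640, 640)
cam.setInterleaved(False)
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("rgb")
cam.preview.link(xout.input)

with dai.Device(pipeline) as device:
    q_rgb = device.getOutputQueue("rgb", maxSize=4, blocking=False)
    prev = time.monotonic()
    while True:
        frame = q_rgb.get().getCvFrame()
        # ... your detection/counting logic would go here ...
        cv2.imshow("preview", frame)  # comment this out to see what display alone costs
        if cv2.waitKey(1) == ord("q"):
            break
        now = time.monotonic()
        print(f"loop: {(now - prev) * 1000:.1f} ms (~{1.0 / (now - prev):.1f} FPS)")
        prev = now
```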

Generally speaking, 640x640 is a relatively big input for the v8 model on the RVC2. But since you are using a CM4, it might be that the display is taking up the time.

Thanks,
Jaka

    jakaskerl
    I'll try the trace command you gave, thanks!
    I also removed all the logic from my code and just tested the loop (while still displaying the image), and the FPS was still about 12-13 😐

    What is the recommended size for v8? Is it 320x320?
    I want to display the image at the end, and when I tried 320x320 (the FPS was about 19-20), the image was too small to see anything.
    How can I run the detections on a smaller size but then scale up the output (with bounding boxes, etc.) to at least 640 while keeping good quality? I saw this link, but it looks like it does the opposite (from a larger image to a smaller one)? Or which one of the options is more relevant?

    @jakaskerl Last question: since I do transfer learning, and on the Luxonis site for converting weights to blobs I saw only YOLO versions, I have limited myself to YOLO. Is there an option to use different models? (I would need to train them and somehow convert them to .blob.)

      leeor What is the recommended size for v8? Is it 320x320?

      Check https://docs.luxonis.com/projects/hardware/en/latest/pages/rvc/rvc2/#rvc2-nn-performance and the estimation below.
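
      If you want to try an intermediate size as a compromise, note that the input resolution is fixed when the blob is compiled, so the preview size in the pipeline must match it. A sketch, assuming a hypothetical blob exported at 416x416 for your single-class model:

      ```python
      import depthai as dai

      pipeline = dai.Pipeline()

      cam = pipeline.create(dai.node.ColorCamera)
      cam.setPreviewSize(416, 416)  # must match the input size the blob was compiled for
      cam.setInterleaved(False)

      nn = pipeline.create(dai.node.YoloDetectionNetwork)
      nn.setBlobPath("yolov8n_416x416.blob")  # hypothetical blob exported at 416x416
      nn.setNumClasses(1)                     # custom single-class model
      nn.setCoordinateSize(4)
      nn.setConfidenceThreshold(0.5)
      nn.setIouThreshold(0.5)
      cam.preview.link(nn.input)

      xout = pipeline.create(dai.node.XLinkOut)
      xout.setStreamName("nn")
      nn.out.link(xout.input)
      ```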

      leeor How can I run the detections on a smaller size but then scale up the output (with bounding boxes, etc.) to at least 640 while keeping good quality? I saw this link, but it looks like it does the opposite (from a larger image to a smaller one)? Or which one of the options is more relevant?

      You were looking at the correct guide; the 4th option scales bboxes from the smaller frame and displays them on the larger frame:
      luxonis/depthai-experiments/blob/master/gen2-display-detections/4-edit_bb.py
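
      The gist of that option, as a sketch: DepthAI detections come out normalized to 0..1, so you can feed the NN the small 320x320 preview and draw the boxes on a larger stream by scaling with that frame's own dimensions. Here the video output is cropped square (720x720, an assumption) so it covers the same field of view as the square preview, and the blob path is hypothetical:

      ```python
      import cv2
      import numpy as np
      import depthai as dai

      def frame_norm(frame, bbox):
          # Map a normalized (0..1) bbox (xmin, ymin, xmax, ymax) onto any frame size
          norm = np.full(len(bbox), frame.shape[0])  # frame height for y coordinates
          norm[::2] = frame.shape[1]                 # frame width for x coordinates
          return (np.clip(np.array(bbox), 0, 1) * norm).astype(int)

      pipeline = dai.Pipeline()
      cam = pipeline.create(dai.node.ColorCamera)
      cam.setPreviewSize(320, 320)            # small frame fed to the NN
      cam.setInterleaved(False)
      cam.setVideoSize(720, 720)              # square crop, same field of view as preview

      nn = pipeline.create(dai.node.YoloDetectionNetwork)
      nn.setBlobPath("yolov8n_320x320.blob")  # hypothetical blob exported at 320x320
      nn.setNumClasses(1)
      nn.setCoordinateSize(4)
      nn.setConfidenceThreshold(0.5)
      nn.setIouThreshold(0.5)
      cam.preview.link(nn.input)

      xout_video = pipeline.create(dai.node.XLinkOut)
      xout_video.setStreamName("video")
      cam.video.link(xout_video.input)
      xout_nn = pipeline.create(dai.node.XLinkOut)
      xout_nn.setStreamName("nn")
      nn.out.link(xout_nn.input)

      with dai.Device(pipeline) as device:
          q_video = device.getOutputQueue("video", maxSize=4, blocking=False)
          q_nn = device.getOutputQueue("nn", maxSize=4, blocking=False)
          while True:
              frame = q_video.get().getCvFrame()   # large frame, used only for display
              for det in q_nn.get().detections:    # detections from the 320x320 input
                  x1, y1, x2, y2 = frame_norm(frame, (det.xmin, det.ymin, det.xmax, det.ymax))
                  cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
              cv2.imshow("detections", frame)
              if cv2.waitKey(1) == ord("q"):
                  break
      ```

      If you display a non-square frame instead, the boxes also need the offset/aspect-ratio correction that 4-edit_bb.py demonstrates.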

      leeor Last question: since I do transfer learning, and on the Luxonis site for converting weights to blobs I saw only YOLO versions, I have limited myself to YOLO. Is there an option to use different models? (I would need to train them and somehow convert them to .blob.)

      It's possible to use other models, but you will need custom decoding functions on the host. Currently, MobileNet(-SSD) and YOLO models can run their decoding on-device, which is why most supported models come from those two families.
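
      In practice that means using the generic NeuralNetwork node and decoding the raw output tensors yourself on the host. A rough sketch (the blob path, input size, and output layer name are all model-specific placeholders):

      ```python
      import numpy as np
      import depthai as dai

      pipeline = dai.Pipeline()
      cam = pipeline.create(dai.node.ColorCamera)
      cam.setPreviewSize(320, 320)  # whatever size the custom model was compiled for
      cam.setInterleaved(False)

      nn = pipeline.create(dai.node.NeuralNetwork)  # generic node: no on-device decoding
      nn.setBlobPath("custom_model.blob")           # hypothetical non-YOLO blob
      cam.preview.link(nn.input)

      xout = pipeline.create(dai.node.XLinkOut)
      xout.setStreamName("nn")
      nn.out.link(xout.input)

      with dai.Device(pipeline) as device:
          q_nn = device.getOutputQueue("nn", maxSize=4, blocking=False)
          while True:
              data = q_nn.get()                            # dai.NNData with raw tensors
              raw = np.array(data.getLayerFp16("output"))  # layer name is model-specific
              # ... custom decoding here: reshape, sigmoid/softmax, box transform, NMS ...
      ```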

      Thanks,
      Jaka