Cannot keep FPS at 60 on OAK-D-PRO-POE-AF
YunyaHsu
The issue is that on a 1Gbps Ethernet connection you have about 900 Mbps of usable bandwidth.
1080P NV12/YUV420 frames: 1920 * 1080 * 1.5 * 30fps * 8bits = 747 Mbps (when encoded this is about half)
400P depth frames: 640 * 400 * 2 * 30fps * 8bits = 123 Mbps
If you want higher FPS, you essentially need more bandwidth or smaller frames.
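The arithmetic above can be sketched in a few lines (a rough estimate of raw stream size only; real link throughput also depends on protocol overhead):

```python
# Rough raw-stream bandwidth estimate, matching the figures above
def stream_mbps(width, height, bytes_per_pixel, fps):
    """Megabits per second of an uncompressed stream."""
    return width * height * bytes_per_pixel * fps * 8 / 1e6

rgb_mbps = stream_mbps(1920, 1080, 1.5, 30)   # NV12/YUV420 = 1.5 bytes/pixel
depth_mbps = stream_mbps(640, 400, 2, 30)     # 16-bit depth = 2 bytes/pixel
print(f"{rgb_mbps:.0f} Mbps RGB, {depth_mbps:.0f} Mbps depth")  # 746 Mbps RGB, 123 Mbps depth
```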
YunyaHsu Ability to save video only when an object is detected, for example: save 5 seconds before AND after a person is detected walking by
Don't send the streams back to host (or only send back small preview frame). This will enable you to run the pipeline at higher FPS.
Then use YoloSpatialDetectionNetwork to only send back the depth of the object when it is detected (once the recording is triggered) .
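One possible host-side shape for the "5 seconds before and after" behavior is a ring buffer of encoded packets that is flushed when a detection fires. A minimal sketch, not DepthAI API: `on_packet` and the `detected` flag are hypothetical names you would wire to your queue callback and detection trigger.

```python
from collections import deque
import time

FPS = 30
PRE_POST_SECONDS = 5

pre_buffer = deque(maxlen=FPS * PRE_POST_SECONDS)  # rolling "before" window
saved = []           # packets that will be written to disk
recording_until = 0  # monotonic deadline for the "after" window

def on_packet(packet, detected, now=None):
    """Feed each encoded packet here; `detected` is your trigger signal."""
    global recording_until
    now = time.monotonic() if now is None else now
    if detected:
        if now >= recording_until:      # new event: flush the pre-roll
            saved.extend(pre_buffer)
            pre_buffer.clear()
        recording_until = now + PRE_POST_SECONDS
    if now < recording_until:
        saved.append(packet)            # still inside the "after" window
    else:
        pre_buffer.append(packet)       # idle: keep only the last 5 s
```

Repeated detections simply push the deadline further out, so overlapping events produce one continuous clip.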
Thanks,
Jaka
Thanks, I removed all detection networks and updated the pipeline as follows:
RGB camera (1080P) ---video---> video encoder (H.265) ---bitstream---> xout
Mono Left & Mono Right camera (both 400P) ---out---> StereoDepth ---depth---> xout
- With fps = 35 on both the mono and RGB cameras, it works as expected.
- With fps = 40, the real fps is limited to 30 only.
If my understanding is correct, at fps = 40 the total bandwidth should only be around 662 Mbps, so I would like to know why I cannot receive the encoded frames and depth frames as expected:
1080P NV12/YUV420 frames: 1920 * 1080 * 1.5 * 40fps * 8bits * 0.5 (when encoded this is about half) = 498 Mbps
400P depth frames: 640 * 400 * 2 * 40fps * 8bits = 164 Mbps
Additionally, during testing there was a phenomenon I didn't understand. I removed the RGB camera and video encoder nodes, and wanted to test the highest achievable fps with only the StereoDepth node.
The pipeline is:
Mono Left & Mono Right camera (both 400P) ---out---> StereoDepth ---depth---> xout
- With fps = 30, 40, or 50, it works as expected.
- With fps = 60, the real fps drops to around 50, even though the bandwidth in this case should only be 246 Mbps.
@jakaskerl
When running the RGB camera + video encoder only at fps = 60, CPU usage is around 68%, which is good, and the received fps is also around 60.
If I add the stereo depth node, i.e. running RGB camera + video encoder + 2 mono cameras + stereo depth at fps = 40, CPU usage is pretty high, around 94 - 95%, so I think that's the root cause.
However, according to the specifications, the OAK-D-PRO-POE-AF's RGB camera should be capable of up to 60 fps, and the mono cameras can even reach 120 fps. If CPU limitations restrict us to only 35 fps, the observed performance doesn't quite align with the camera's specified capabilities, IMHO.
Any suggestion to increase the fps?
[18443010D15F9D0F00] [192.168.0.137] [29.597] [system] [info] Memory Usage - DDR: 63.01 / 333.28 MiB, CMX: 2.41 / 2.50 MiB, LeonOS Heap: 64.56 / 81.76 MiB, LeonRT Heap: 4.93 / 39.90 MiB / NOC ddr: 1408 MB/s
[18443010D15F9D0F00] [192.168.0.137] [29.597] [system] [info] Temperatures - Average: 50.69C, CSS: 52.46C, MSS 49.59C, UPA: 49.81C, DSS: 50.92C
[18443010D15F9D0F00] [192.168.0.137] [29.597] [system] [info] Cpu Usage - LeonOS 94.33%, LeonRT: 23.64%
[18443010D15F9D0F00] [192.168.0.137] [30.598] [system] [info] Memory Usage - DDR: 63.01 / 333.28 MiB, CMX: 2.41 / 2.50 MiB, LeonOS Heap: 64.56 / 81.76 MiB, LeonRT Heap: 4.93 / 39.90 MiB / NOC ddr: 1413 MB/s
[18443010D15F9D0F00] [192.168.0.137] [30.598] [system] [info] Temperatures - Average: 50.30C, CSS: 52.90C, MSS 48.92C, UPA: 49.14C, DSS: 50.25C
[18443010D15F9D0F00] [192.168.0.137] [30.598] [system] [info] Cpu Usage - LeonOS 94.97%, LeonRT: 23.55%
[18443010D15F9D0F00] [192.168.0.137] [31.600] [system] [info] Memory Usage - DDR: 63.01 / 333.28 MiB, CMX: 2.41 / 2.50 MiB, LeonOS Heap: 64.56 / 81.76 MiB, LeonRT Heap: 4.93 / 39.90 MiB / NOC ddr: 1420 MB/s
[18443010D15F9D0F00] [192.168.0.137] [31.600] [system] [info] Temperatures - Average: 50.20C, CSS: 52.02C, MSS 49.14C, UPA: 49.37C, DSS: 50.25C
[18443010D15F9D0F00] [192.168.0.137] [31.600] [system] [info] Cpu Usage - LeonOS 94.03%, LeonRT: 24.51%
In another test with two mono cameras + stereo depth at fps = 60, the CPU usage is only around 80 - 85%, but the received fps is still lower than 60. Do you know why?
YunyaHsu
It's generally either:
- ISP limitation (500MP/s)
- Node processing
- Bandwidth (especially on POE)
- Host-side loop (writing the stream to an mp4 file is costly)
I'd say it's best to run the pipeline with DEPTHAI_LEVEL=TRACE; this will tell you how long each operation takes. Then you can compare with the FPS you are getting and see if it makes sense. Stereo and encoding will probably take the most time, then image acquisition if a high resolution is used.
I doubt the host side is the problem, but to check, simply time the while-true loop.
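Timing the receive loop can be sketched like this; a minimal harness, where the stand-in callable would be replaced by your real queue's `get` (e.g. `device.getOutputQueue("depth").get`):

```python
import time

def host_loop_fps(get_frame, n=300):
    """Measure how fast the host-side loop alone can drain frames."""
    start = time.monotonic()
    for _ in range(n):
        get_frame()                      # your real queue.get goes here
    return n / (time.monotonic() - start)

# Stand-in that "costs" ~1 ms per frame instead of a real queue:
print(f"{host_loop_fps(lambda: time.sleep(0.001)):.0f} fps")
```

If this number is far above your target FPS, the host loop is not the bottleneck.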
Thanks,
Jaka
@jakaskerl
I suspect the stereo depth node is our primary bottleneck.
My pipeline consists of 2 mono cameras (400P resolution) + stereo depth. I tested it at 20, 40, 80, and 120 fps, running each configuration for approximately 30 seconds.
The logs suggest:
- The mono ISP struggles to maintain the expected fps if I set it higher than 40 fps.
- Frame loss occurs in the 'Stereo rectification' stage at higher frame rates.
Set fps at 20: received about 620 frames, real fps meets expectations, CPU usage around 50 - 53%
[MonoCamera(0)] [trace] Mono ISP took xxx: 623 times
[MonoCamera(1)] [trace] Mono ISP took xxx: 623 times
Stereo rectification took xxx: 623 times
Stereo took xxx: 622 times
'Median+Disparity to depth' pipeline took xxx: 622 times
Stereo post processing xxx: 622 times
Received message from device (depth): 623 times
Set fps at 40: received only about 860 frames, so the real fps is only about 27 - 28, CPU usage around 85%.
[MonoCamera(0)] [trace] Mono ISP took xxx: 922 times <--- shouldn't this be 1200?
[MonoCamera(1)] [trace] Mono ISP took xxx: 926 times
Stereo rectification took xxx: 872 times <--- decreased a bit
Stereo took xxx: 869 times
'Median+Disparity to depth' pipeline took xxx: 869 times
Stereo post processing xxx: 869 times
Received message from device (depth): 865 times
Set fps at 80: received only about 586 frames, so the real fps is only about 19 - 20, CPU usage around 91 - 95%
[MonoCamera(0)] [trace] Mono ISP took xxx: 1,162 times <--- shouldn't this be 2400?
[MonoCamera(1)] [trace] Mono ISP took xxx: 1,084 times
Stereo rectification took xxx: 597 times <--- decreased dramatically?
Stereo took xxx: 594 times
'Median+Disparity to depth' pipeline took xxx: 594 times
Stereo post processing xxx: 594 times
Received message from device (depth): 591 times
Set fps at 120: after running about 30 seconds, received only about 432 frames, real fps is around 14, CPU usage around 97 - 99%, pretty high.
[MonoCamera(0)] [trace] Mono ISP took xxx: 1,578 times <--- shouldn't this be 3600?
[MonoCamera(1)] [trace] Mono ISP took xxx: 1,265 times
Stereo rectification took xxx: 444 times <--- decreased dramatically as well?
Stereo took xxx: 442 times
'Median+Disparity to depth' pipeline took xxx: 442 times
Stereo post processing xxx: 442 times
Received message from device (depth): 437 times
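The per-stage counts above can be reproduced by tallying a saved trace log; a small sketch (the exact trace line wording is taken from the snippets in this thread, and `trace.log` is an assumed file name):

```python
from collections import Counter

STAGES = ("Mono ISP took", "Stereo rectification took",
          "Stereo took", "Stereo post processing",
          "Received message from device (depth)")

def count_stages(lines):
    """Count how many trace lines mention each pipeline stage."""
    counts = Counter()
    for line in lines:
        for stage in STAGES:
            if stage in line:
                counts[stage] += 1
    return counts

# e.g. with open("trace.log") as f: print(count_stages(f))
demo = ["[MonoCamera(0)] [trace] Mono ISP took '2.135' ms.",
        "[StereoDepth(2)] [trace] Stereo rectification took '1.3' ms.",
        "[MonoCamera(1)] [trace] Mono ISP took '2.001' ms."]
print(count_stages(demo))
```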
YunyaHsu
If you check the operation times, you will see the longest processing step is stereo (about 7.5ms), which is fine even for 100 FPS if you want. I tested the code over USB and got 100 FPS no problem.
```python
import depthai as dai
import time

fps = 100

pipeline = dai.Pipeline()
pipeline.setXLinkChunkSize(0)

mono_left = pipeline.create(dai.node.MonoCamera)
mono_left.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
mono_left.setBoardSocket(dai.CameraBoardSocket.LEFT)
mono_left.setFps(fps)

mono_right = pipeline.create(dai.node.MonoCamera)
mono_right.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
mono_right.setBoardSocket(dai.CameraBoardSocket.RIGHT)
mono_right.setFps(fps)

stereo_depth = pipeline.create(dai.node.StereoDepth)
stereo_depth.setMedianFilter(dai.StereoDepthProperties.MedianFilter.KERNEL_7x7)
mono_left.out.link(stereo_depth.left)
mono_right.out.link(stereo_depth.right)

# Measure FPS on-device with a Script node, so the host/link is not involved
script = pipeline.create(dai.node.Script)
script.setScript("""
import time
queue = node.io['depth']
frames = 0
start = time.monotonic()
while True:
    msg = queue.get()
    frames += 1
    if frames % 100 == 0:
        node.warn(f"{frames / (time.monotonic() - start)} fps")
        start = time.monotonic()
        frames = 0
""")
stereo_depth.depth.link(script.inputs['depth'])

with dai.Device(pipeline) as device:
    while device.isPipelineRunning():
        time.sleep(1)
```
In NETWORK bootloader mode the network stack is initialized on the same CPU as the rest of the pipeline. The processing hit is significant enough to drop the framerate from 100 FPS to about 60 FPS without doing anything else at all.
This will have to be debugged in the FW (this resource usage is far too high).
Thanks,
Jaka
@jakaskerl
Thank you for the test code. I confirmed that when running the same script on my OAK-D-PRO-POE-AF, I also only got around 60 fps.
Could you provide an estimated timeframe for when I might expect a follow-up response or solution to this concern?
As we need higher fps on the RGB camera, our team is considering purchasing the OAK-D Pro PoE OV9782 for the project. However, our decision heavily depends on the resolution of this NETWORK bootloader mode CPU consumption issue.
Can you confirm whether this problem will be addressed, and if so, by when? This information is crucial for our purchase decision.
Thank you in advance.
@jakaskerl
It's sad to hear that there's no clear timeline for resolving this issue.
However, a USB device is not our primary choice for this application. Our use case is outdoors, where the distance between the camera and the host computer will likely exceed 5 meters, and possibly be much further.
Given these circumstances, PoE devices remain our preferred option. The extended range and single-cable solution for both power and data make PoE cameras much more suitable for our outdoor deployment needs.
@jakaskerl
Thank you for providing the information. After checking, unfortunately the CM4 PoE device does not meet our requirements, as we need a global shutter to avoid the jello (rolling-shutter) effect that can occur during high-speed capture.