DepthAI-v2
Maintaining Frame Rate While Adding Depth Post-Processing Features to Code

Dear Luxonis Community,

I hope this message finds you well. I am currently working on a project using your exceptional depth sensing technologies, and I am facing a challenge that I hope to get your guidance on.

Recently, I added Depth Post-Processing features to my code to reduce noise, smooth the depth map, and improve its overall quality. These features are configured on the StereoDepth node. However, since integrating them, I've noticed a substantial drop in the detection frame rate, which is a significant hurdle for my project's performance goals.
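
For context, my pipeline follows the standard DepthAI-v2 pattern of linking the mono cameras into a StereoDepth node; the sketch below is illustrative rather than my exact code:

import depthai as dai

pipeline = dai.Pipeline()

# Mono cameras feed the stereo depth node
monoLeft = pipeline.create(dai.node.MonoCamera)
monoRight = pipeline.create(dai.node.MonoCamera)
monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT)

stereo = pipeline.create(dai.node.StereoDepth)
monoLeft.out.link(stereo.left)
monoRight.out.link(stereo.right)

# Depth stream out to the host
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("depth")
stereo.depth.link(xout.input)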

Given the importance of these Depth Post-Processing features for the quality of the depth map, I'm eager to find a solution that allows their inclusion without compromising the frame rate.

Considering your expertise in this area, I am writing to seek your advice on how I can best handle this issue. Are there specific optimization techniques or strategies that you could suggest that might help maintain the original frame rate while still enabling the addition of Depth Post-Processing features? Any suggestions or insights would be incredibly valuable.

I am more than willing to modify my current code and methodologies to achieve the desired performance, and any examples or resources that you could provide to guide me in this would be greatly appreciated.

Thank you in advance for your time and support. Your community has been a valuable resource so far, and I am confident that your guidance will be indispensable in overcoming this current challenge.

Best regards,

susage

    Hi susage
    Which postprocessing features have you added to the pipeline? Usually, node-based (hardware-accelerated) postprocessing should have minimal effect on the end framerate.

    Thanks,
    Jaka

    12 days later

    In my code, I have added the following post-processing configuration:

    # Read the current config, adjust the post-processing filters, write it back.
    config = stereo.initialConfig.get()
    config.postProcessing.speckleFilter.enable = True
    config.postProcessing.speckleFilter.speckleRange = 50
    config.postProcessing.temporalFilter.enable = False
    config.postProcessing.spatialFilter.enable = False
    config.postProcessing.spatialFilter.holeFillingRadius = 2
    config.postProcessing.spatialFilter.numIterations = 1
    config.postProcessing.thresholdFilter.minRange = 200
    config.postProcessing.thresholdFilter.maxRange = 1500
    config.postProcessing.decimationFilter.decimationFactor = 1
    stereo.initialConfig.set(config)

    I found that I can't use the spatialFilter: as soon as I enable it, the frame rate drops by about a third.
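
    From my reading of the DepthAI docs (which I may be misunderstanding), the decimation filter shrinks the depth map before the other filters run, so pairing it with the spatial filter might recover some speed; something like:

    # Sketch: decimate depth first so the spatial filter processes fewer pixels.
    # decimationFactor = 2 halves each dimension of the depth map.
    config = stereo.initialConfig.get()
    config.postProcessing.decimationFilter.decimationFactor = 2
    config.postProcessing.spatialFilter.enable = True
    stereo.initialConfig.set(config)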

      Hi susage
      What fps are you striving to achieve?
      Try swapping the spatial filter for the left-right (LR) check, since they do a similar thing, to see if there are any speed improvements.
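
      In code, that swap would look roughly like this (a sketch, assuming your StereoDepth node is named stereo):

      # Sketch: disable the spatial filter and enable the left-right
      # consistency check instead (set before the pipeline is started).
      config = stereo.initialConfig.get()
      config.postProcessing.spatialFilter.enable = False
      stereo.initialConfig.set(config)
      stereo.setLeftRightCheck(True)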

      Thanks,
      Jaka

      Hi,

      I want to reach at least 60 FPS. Currently, using the lowest resolution with the YOLO algorithm, I get 45 FPS; I hope it can be faster.
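
      By "lowest resolution" I mean the mono cameras feeding depth, configured roughly like this (a sketch, not my exact code):

      # Sketch: 400p is the lowest standard mono resolution, with a 60 fps target.
      monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
      monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
      monoLeft.setFps(60)
      monoRight.setFps(60)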

      Thanks

      Sage

        Hi susage
        Can you share some reproducible code so we can try to tweak it?

        Thanks,
        Jaka

        7 days later

        I'm not sure how to create reproducible code for this issue; could you give me some tips or guidance? I believe the frame rate is related to the YOLO algorithm I'm using, YOLOv6, which currently provides the highest detection frame rate among the YOLO-series algorithms.

        • erik replied to this.

          susage could you share the model (original, not blob), together with the minimal code you are using to measure the FPS?

          1. The code I use to measure FPS follows the official script (frame and color come from the parts I have elided):

          import time
          import cv2

          startTime = time.monotonic()
          counter = 0
          fps = 0
          ...
          while True:
              ...
              counter += 1
              current_time = time.monotonic()
              if (current_time - startTime) > 1:
                  fps = counter / (current_time - startTime)
                  counter = 0
                  startTime = current_time
              ...
              cv2.putText(frame, "NN fps: {:.2f}".format(fps), (2, frame.shape[0] - 4),
                          cv2.FONT_HERSHEY_TRIPLEX, 0.4, color)

          2. The model I use for object detection was trained with YOLOv6, and the input size is 320. I have placed the model on Google Drive; you can access it through the following link: https://drive.google.com/file/d/1IBBZ1G9plWq77o01H24GI9mdWWw_DNJ4/view?usp=drive_link

            Hi susage
            I am able to run the inference at 60 FPS by feeding it RGB frames, with the model compiled for 6 SHAVEs. Since I don't have your code, I cannot run it on depth. What framerate are you getting with the stock depth configuration?
            I have a feeling it is the depth part that is the bottleneck here.

            Thoughts?
            Jaka
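
            (For reference, a 6-SHAVE blob can be compiled with the blobconverter package; a sketch, with the model file name as a placeholder:)

            import blobconverter

            # Sketch: compile an ONNX export into a .blob targeting 6 SHAVE cores.
            blob_path = blobconverter.from_onnx(
                model="model.onnx",  # placeholder path
                shaves=6,
            )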

            The bottleneck indeed comes from the depth. When I use the stock depth configuration for real-time detection, the highest I can get is only 45 FPS.

              Hi susage
              I'd suggest running your script in trace mode with the DEPTHAI_LEVEL=trace environment variable. This way you should be able to see how long each node takes to process a frame.
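
              For example, on Linux/macOS the variable can be set for a single run like this (your_script.py stands in for your script name):

              DEPTHAI_LEVEL=trace python3 your_script.py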

              Thanks,
              Jaka

              Okay, thank you for your suggestion; I will try it next.