OAK-D Pro - Depth frames lagging behind

Hello!

I am trying to get depth information from the camera using the code below. It works, but the frames are lagging behind. It runs at around 5 FPS, which would be good enough for me. The problem is that the frame recorded now lags about 15 frames behind, so it is only displayed about 3 seconds later...

Is there a way to solve this problem? What am I doing wrong?

(Please ignore the mess in the code; we are trying multiple different things... 🙂)

Thank you


import math

import cv2
import depthai as dai
import time
import numpy as np

CUSTOM_DEPTH_CALC = False  # True: average ROI depth on the host instead of using SpatialLocationCalculator
AVG_QTY = 1  # number of frames to average per measurement
GROUND = 1120  # camera-to-floor distance in mm
XL = [0, 0]
YL = [0, 0]
XR = [0, 0]
YR = [0, 0]

XL[0] = 0.35
YL[0] = 0.25
XR[0] = 0.55
YR[0] = 0.45

XL[1] = 0.35
YL[1] = 0.55
XR[1] = 0.55
YR[1] = 0.75
MIDDLE = 250  # y-pixel threshold separating the upper ROI from the lower one

# Create pipeline
pipeline = dai.Pipeline()

# Define sources and outputs
monoLeft = pipeline.create(dai.node.MonoCamera)
monoRight = pipeline.create(dai.node.MonoCamera)
stereo = pipeline.create(dai.node.StereoDepth)
spatialLocationCalculator = pipeline.create(dai.node.SpatialLocationCalculator)
spatialLocationCalculator2 = pipeline.create(dai.node.SpatialLocationCalculator)

xoutDepth = pipeline.create(dai.node.XLinkOut)
xoutSpatialData = pipeline.create(dai.node.XLinkOut)
xinSpatialCalcConfig = pipeline.create(dai.node.XLinkIn)
xinSpatialCalcConfig2 = pipeline.create(dai.node.XLinkIn)

xoutDepth.setStreamName("depth")
xoutSpatialData.setStreamName("spatialData")
xinSpatialCalcConfig.setStreamName("spatialCalcConfig")
xinSpatialCalcConfig2.setStreamName("spatialCalcConfig2")

# Properties
monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_800_P)
monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_800_P)
monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT)

lrcheck = True
subpixel = False

stereo.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.HIGH_DENSITY)
stereo.setLeftRightCheck(lrcheck)
stereo.setSubpixel(subpixel)
stereo.setExtendedDisparity(True)

# Config
topLeft = dai.Point2f(XL[0], YL[0])
bottomRight = dai.Point2f(XR[0], YR[0])

topLeft2 = dai.Point2f(XL[1], YL[1])
bottomRight2 = dai.Point2f(XR[1], YR[1])

config = dai.SpatialLocationCalculatorConfigData()
config.depthThresholds.lowerThreshold = 100
config.depthThresholds.upperThreshold = 1500
config.roi = dai.Rect(topLeft, bottomRight)

config2 = dai.SpatialLocationCalculatorConfigData()
config2.depthThresholds.lowerThreshold = 100
config2.depthThresholds.upperThreshold = 1500
config2.roi = dai.Rect(topLeft2, bottomRight2)

spatialLocationCalculator.inputConfig.setWaitForMessage(False)
spatialLocationCalculator.initialConfig.addROI(config)

spatialLocationCalculator2.inputConfig.setWaitForMessage(False)
spatialLocationCalculator2.initialConfig.addROI(config2)

# Linking
monoLeft.out.link(stereo.left)
monoRight.out.link(stereo.right)

spatialLocationCalculator.passthroughDepth.link(xoutDepth.input)
stereo.depth.link(spatialLocationCalculator.inputDepth)

spatialLocationCalculator.out.link(xoutSpatialData.input)
xinSpatialCalcConfig.out.link(spatialLocationCalculator.inputConfig)

spatialLocationCalculator2.passthroughDepth.link(xoutDepth.input)
stereo.depth.link(spatialLocationCalculator2.inputDepth)

spatialLocationCalculator2.out.link(xoutSpatialData.input)
xinSpatialCalcConfig2.out.link(spatialLocationCalculator2.inputConfig)

# Connect to device and start pipeline
with dai.Device(pipeline) as device:
    # Non-blocking host queues: when full, the newest packet replaces the oldest
    depthQueue = device.getOutputQueue(name="depth", maxSize=4, blocking=False)
    spatialCalcQueue = device.getOutputQueue(name="spatialData", maxSize=4, blocking=False)

    device.setIrLaserDotProjectorBrightness(600)  # in mA, 0..1200
    #device.setIrLaserDotProjectorBrightness(100) # in mA, 0..1200

    start_time = time.time()
    counter = 0
    fps = 0

    while True:
        zavg0 = []
        zavg1 = []
        debugPixel = []
        debugPixel0 = []
        debugPixel1 = []

        xmin0 = 1
        xmax0 = 1
        ymin0 = 1
        ymax0 = 1

        xmin1 = 1
        xmax1 = 1
        ymin1 = 1
        ymax1 = 1

        for cnt in range(AVG_QTY):
            depthFrame = depthQueue.get().getFrame()  # depthFrame values are in millimeters

            if CUSTOM_DEPTH_CALC:
                for i in range(2):
                    topLeftX = XL[i] * depthFrame.shape[1]
                    topLeftY = YL[i] * depthFrame.shape[0]
                    bottomRightX = XR[i] * depthFrame.shape[1]
                    bottomRightY = YR[i] * depthFrame.shape[0]
                    n, m = int(abs(topLeftX - bottomRightX)), int(abs(topLeftY - bottomRightY))
                    roi_center = (int((topLeftX + bottomRightX) / 2), int((topLeftY + bottomRightY) / 2))
                    bottom = roi_center[1] + int(np.floor(m / 2))
                    top = roi_center[1] - int(np.ceil(m / 2))
                    left = roi_center[0] - int(np.floor(n / 2))
                    right = roi_center[0] + int(np.ceil(n / 2))
                    depthRoi = depthFrame[top:bottom, left:right]
                    # Drop invalid (zero) depth pixels, then keep the closest 5%
                    depthRoiFlat = np.sort(depthRoi.flatten())
                    depthRoiFlat = depthRoiFlat[depthRoiFlat != 0]
                    depthRoiSmall = depthRoiFlat[:int(len(depthRoiFlat) / 20)]
                    avg = np.average(depthRoiSmall)

                    if i == 0:
                        if len(depthRoiSmall) > 0:
                            zavg0.append(int(avg))
                        else:
                            zavg0.append(0)
                        xmin0 = left
                        ymin0 = top
                        xmax0 = right
                        ymax0 = bottom
                    else:
                        if len(depthRoiSmall) > 0:
                            zavg1.append(int(avg))
                        else:
                            zavg1.append(0)
                        xmin1 = left
                        ymin1 = top
                        xmax1 = right
                        ymax1 = bottom

            else:
                for i in range(2):
                    spatialData = spatialCalcQueue.get().getSpatialLocations()
                    for depthData in spatialData:
                        roi = depthData.config.roi
                        roi = roi.denormalize(width=depthFrame.shape[1], height=depthFrame.shape[0])
                        print(roi.topLeft().y, MIDDLE)
                        if roi.topLeft().y < MIDDLE:
                            xmin0 = int(roi.topLeft().x)
                            ymin0 = int(roi.topLeft().y)
                            xmax0 = int(roi.bottomRight().x)
                            ymax0 = int(roi.bottomRight().y)
                            #zavg0.append(int(depthData.spatialCoordinates.z))
                            zavg0.append(int(depthData.depthAverage))
                        else:
                            xmin1 = int(roi.topLeft().x)
                            ymin1 = int(roi.topLeft().y)
                            xmax1 = int(roi.bottomRight().x)
                            ymax1 = int(roi.bottomRight().y)
                            #zavg1.append(int(depthData.spatialCoordinates.z))
                            zavg1.append(int(depthData.depthAverage))

            depthFrameColor = cv2.normalize(depthFrame, None, 255, 0, cv2.NORM_INF, cv2.CV_8UC1)
            depthFrameColor = cv2.equalizeHist(depthFrameColor)
            depthFrameColor = cv2.applyColorMap(depthFrameColor, cv2.COLORMAP_JET)

        fontType = cv2.FONT_HERSHEY_DUPLEX

        avg0 = 0
        if len(zavg0) > 0:
            avg0 = GROUND - (sum(zavg0) / len(zavg0))
        crates0 = math.ceil((avg0-15) / 30.3)

        avg1 = 0
        if len(zavg1) > 0:
            avg1 = GROUND - (sum(zavg1) / len(zavg1))
        crates1 = math.ceil((avg1-15) / 30.3)

        color = (255, 255, 255)
        textColor = (0, 255, 0)

        cv2.rectangle(depthFrameColor, (xmin0, ymin0), (xmax0, ymax0), color, 1)  # last argument is the line thickness, not a font
        cv2.putText(depthFrameColor, "Crate h: " + "{0:.2f}".format(avg0) + "mm", (xmin0 + 10, ymin0 - 50), fontType, 0.5, textColor)
        cv2.putText(depthFrameColor, "Crates: " + str(crates0), (xmin0 + 10, ymin0 - 100), fontType, 0.5, textColor)

        cv2.rectangle(depthFrameColor, (xmin1, ymin1), (xmax1, ymax1), color, 1)
        cv2.putText(depthFrameColor, "Crate h: " + "{0:.2f}".format(avg1) + "mm", (xmin1 + 10, ymin1 - 50), fontType, 0.5, textColor)
        cv2.putText(depthFrameColor, "Crates: " + str(crates1), (xmin1 + 10, ymin1 - 100), fontType, 0.5, textColor)

        # Update the FPS estimate once per second
        counter += 1
        if time.time() - start_time > 1:
            fps = counter / (time.time() - start_time)
            counter = 0
            start_time = time.time()

        label_fps = "Fps: {:.2f}".format(fps)
        cv2.putText(depthFrameColor, label_fps, (10, 10), cv2.FONT_HERSHEY_TRIPLEX, 0.4, textColor)

        cv2.imshow("depth", depthFrameColor)

        # Quit
        key = cv2.waitKey(1)
        if key == ord('q'):
            break

    Hi IgorMasin,
    You should look at the message syncing demos, which will resolve the issue where one stream lags behind the other 🙂 Let us know if that helps.
    Thanks, Erik
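    For context, those demos match packets from multiple host-side output queues by sequence number. A minimal sketch of that pattern (the queue names and the second stream are hypothetical here, purely to illustrate the idea):

# Hypothetical host-side sync: pair up messages by sequence number
msgs = {}

def add_msg(name, msg):
    seq = msg.getSequenceNum()
    msgs.setdefault(seq, {})[name] = msg
    if len(msgs[seq]) == 2:  # both streams arrived for this sequence number
        return msgs.pop(seq)
    return None

while True:
    for name, queue in [("depth", depthQueue), ("rgb", rgbQueue)]:
        if queue.has():
            synced = add_msg(name, queue.get())
            if synced:
                pass  # process the time-aligned pair here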

    Thank you for your answer. I am not entirely sure I can follow, though. It's my understanding (probably wrong) that message syncing is for syncing multiple cameras, or multiple streams on a single camera, but I am only using one stream (the depth image), and that one is a few seconds behind. So if I hold my hand in front of the camera, it shows up only after a few seconds. Please correct me if I misunderstood.

    Thank you

    Best regards

    Igor


      Hi IgorMasin,
      My bad, thanks for elaborating. I have tried running your code and it works as expected, with maybe a 300 ms delay. Are you using a working USB3 cable?
      Thanks, Erik

      Nope, I am using a PoE network cable...


        Hi IgorMasin,
        That is likely the reason - PoE has much lower throughput, and you are also likely hitting 100% CPU utilization (due to the additional load of streaming via PoE). You can check CPU usage following these docs. If that's the case, I would suggest lowering the resolution/FPS and/or encoding the frames. Thoughts?
        Thanks, Erik
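        As a rough illustration of that suggestion (the values below are only examples, not tuned for this setup), the mono cameras in the pipeline above could be configured like this:

# Illustrative only: lower the mono resolution and FPS to reduce PoE bandwidth
monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
monoLeft.setFps(10)   # down from the default 30
monoRight.setFps(10)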

        Thank you!

        Debugging outputs this:
        [1844301011697B0E00] [169.254.1.222] [61.330] [system] [info] Memory Usage - DDR: 110.49 / 340.93 MiB, CMX: 2.11 / 2.50 MiB, LeonOS Heap: 54.00 / 77.58 MiB, LeonRT Heap: 4.38 / 41.37 MiB
        [1844301011697B0E00] [169.254.1.222] [61.330] [system] [info] Temperatures - Average: 36.30 °C, CSS: 37.71 °C, MSS 35.35 °C, UPA: 36.30 °C, DSS: 35.83 °C
        [1844301011697B0E00] [169.254.1.222] [61.330] [system] [info] Cpu Usage - LeonOS 39.91%, LeonRT: 1.87%

        So there seems to be enough CPU capacity left. Lowering the resolution decreases the lag considerably, but I still cannot figure out one thing: why is there this delay in the first place? I tried replacing the while loop with a button press, so it only grabs one image when I press a button and has plenty of time to transfer it over. It still lags X frames behind. So if I take one image every 10 seconds, for example, by pressing the button, I am presented with images that are 1.5 minutes old. Is there a way to take a single picture and send it over right away?

        Please excuse my stupid questions, but it's the first time I have seen this kind of problem and I have no clue where to start solving it. Also, if it were a transfer-speed problem, it would either drop the FPS or the delay would grow greater and greater until the buffer maxed out. Those "15 frames in between" must be somewhere...

        Could it be that the problem also exists on your end, but the 15-frame delay spans a much shorter time range because of your higher FPS, so you don't notice it?


          Hi IgorMasin,
          It's likely due to the bandwidth limitation of PoE. You could try setting the XLink chunk size to 0 (mentioned here). You could also send a message from the host to the camera (Script node, example here), which would then forward (example here) one depth frame to the host. Thoughts?
          Thanks, Erik
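          A hedged sketch of both ideas combined, building on the pipeline above (the stream and I/O names are made up, and this is untested):

# Sketch only: disable XLink chunking and forward one depth frame on demand
pipeline.setXLinkChunkSize(0)  # send each packet whole instead of in chunks

script = pipeline.create(dai.node.Script)
stereo.depth.link(script.inputs['depth'])
script.inputs['depth'].setBlocking(False)
script.inputs['depth'].setQueueSize(1)  # keep only the newest frame on the device

script.setScript("""
while True:
    node.io['trigger'].get()        # block until the host asks for a frame
    frame = node.io['depth'].get()  # newest available depth frame
    node.io['out'].send(frame)
""")

xinTrigger = pipeline.create(dai.node.XLinkIn)
xinTrigger.setStreamName("trigger")
xinTrigger.out.link(script.inputs['trigger'])

xoutOnDemand = pipeline.create(dai.node.XLinkOut)
xoutOnDemand.setStreamName("depthOnDemand")
script.outputs['out'].link(xoutOnDemand.input)

          On the host, sending any message (for example an empty dai.Buffer()) to the "trigger" input queue should then return one fresh frame on the "depthOnDemand" output queue.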