I'm running the OAK-D Lite on a Raspberry Pi, hoping to sync the depth and RGB channels into the same preview and eventually capture from it. However, I'm finding that there is a lag of around 2 seconds on the RGB channel, while the depth runs close to realtime. Here is the Python code I'm running, if anyone has any suggestions. Could it be a lack of power going to the OAK-D? (It's currently running on 5V power through the Pi.)

from pathlib import Path

import blobconverter
import cv2
import depthai as dai
import numpy as np

# Closer-in minimum depth, disparity range is doubled (from 95 to 190):
extended_disparity = False
# Better accuracy for longer distance, fractional disparity 32-levels:
subpixel = False
# Better handling for occlusions:
lr_check = True


cv2.namedWindow("preview", cv2.WND_PROP_FULLSCREEN)
cv2.setWindowProperty("preview",cv2.WND_PROP_FULLSCREEN,cv2.WINDOW_FULLSCREEN)


# Pipeline tells DepthAI what operations to perform when running - you define all of the resources used and flows here
pipeline = dai.Pipeline()


# Define sources and outputs
monoLeft = pipeline.create(dai.node.MonoCamera)
monoRight = pipeline.create(dai.node.MonoCamera)
depth = pipeline.create(dai.node.StereoDepth)
xout = pipeline.create(dai.node.XLinkOut)


# First, we want the Color camera as the output
cam_rgb = pipeline.create(dai.node.ColorCamera)
cam_rgb.setPreviewSize(400, 400)  # 400x400 will be the preview frame size, available as 'preview' output of the node
cam_rgb.setInterleaved(False)


# XLinkOut is a "way out" from the device. Any data you want to transfer to host need to be send via XLink
xout_rgb = pipeline.createXLinkOut()
# For the rgb camera output, we want the XLink stream to be named "rgb"
xout_rgb.setStreamName("rgb")
# Linking camera preview to XLink input, so that the frames will be sent to host
cam_rgb.preview.link(xout_rgb.input)

# depth pipeline

xout.setStreamName("disparity")

# Properties
monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT)

# Create a node that will produce the depth map (using disparity output as it's easier to visualize depth this way)
depth.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.HIGH_DENSITY)
# Options: MEDIAN_OFF, KERNEL_3x3, KERNEL_5x5, KERNEL_7x7 (default)
depth.initialConfig.setMedianFilter(dai.MedianFilter.KERNEL_7x7)
depth.setLeftRightCheck(lr_check)
depth.setExtendedDisparity(extended_disparity)
depth.setSubpixel(subpixel)

# Linking
monoLeft.out.link(depth.left)
monoRight.out.link(depth.right)
depth.disparity.link(xout.input)

# Pipeline is now finished, and we need to find an available device to run our pipeline
# we are using a context manager here so that the device is disposed of after we stop using it
with dai.Device(pipeline) as device:
    # From this point, the Device will be in "running" mode and will start sending data via XLink

    # To consume the device results, we get two output queues from the device, with stream names we assigned earlier
    q_rgb = device.getOutputQueue("rgb")
    q_depth = device.getOutputQueue(name="disparity", maxSize=4, blocking=False)

    # Here, some default values are defined. frame will hold the latest image from the "rgb" stream and dframe the latest disparity frame
    frame = None
    dframe = None

    # Main host-side application loop
    while True:
        # we try to fetch the data from the rgb queue. tryGet will return either the data packet or None if there isn't any
        in_rgb = q_rgb.tryGet()

        if in_rgb is not None:
            # If the packet from RGB camera is present, we're retrieving the frame in OpenCV format using getCvFrame
            frame = in_rgb.getCvFrame()

        if frame is not None:
            inDepth = q_depth.get()  # blocking call; waits until new data has arrived
            dframe = inDepth.getFrame()
            # Normalization for better visualization
            dframe = (dframe * (255 / depth.initialConfig.getMaxDisparity())).astype(np.uint8)

            # Split screen RGB and depth
            grayImageBGRspace = cv2.cvtColor(dframe, cv2.COLOR_GRAY2BGR)
            depth_cropped_image = grayImageBGRspace[0:400, 100:497]
            rgb_cropped_image = frame[0:400, 22:377]
            imgHor = cv2.hconcat([frame, depth_cropped_image])
            cv2.imshow("preview", imgHor)


        # at any time, you can press "q" and exit the main loop, therefore exiting the program itself

        if cv2.waitKey(1) == ord('q'):
            print(rgb_cropped_image.shape)
            print(depth_cropped_image.shape)
            break

    Hi edinky,
    Looking at the code, your mechanism for "syncing" isn't ideal, and I assume that's where the delay comes from: the default "rgb" queue is blocking, so unread RGB frames pile up while your loop waits on q_depth.get(). Try:

    in_rgb = q_rgb.tryGet()
    if in_rgb is not None:
        # If the packet from RGB camera is present, we're retrieving the frame in OpenCV format using getCvFrame
        frame = in_rgb.getCvFrame()
        cv2.imshow('no_delay', frame)

    So, to sync the two streams (depth and RGB) without too much delay, I would suggest checking these demos; a rough sketch of the idea follows.
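
    Here's a minimal sketch of that idea, assuming the pipeline from your question (streams "rgb" and "disparity"). It makes both queues non-blocking with a small maxSize so frames can't back up, and pairs packets by sequence number, which works when both sensors run at the same FPS (the demos also show timestamp-based matching):

    q_rgb = device.getOutputQueue(name="rgb", maxSize=4, blocking=False)
    q_depth = device.getOutputQueue(name="disparity", maxSize=4, blocking=False)

    rgb_msgs = {}    # sequence number -> unmatched rgb ImgFrame
    depth_msgs = {}  # sequence number -> unmatched disparity ImgFrame

    while True:
        # Drain both queues without blocking, indexing packets by sequence number
        for q, store in ((q_rgb, rgb_msgs), (q_depth, depth_msgs)):
            msg = q.tryGet()
            if msg is not None:
                store[msg.getSequenceNum()] = msg

        # Show the newest pair present in both stores, then drop anything older
        common = sorted(set(rgb_msgs) & set(depth_msgs))
        if common:
            seq = common[-1]
            cv2.imshow("rgb", rgb_msgs[seq].getCvFrame())
            # Normalize the disparity as in your code before displaying, if desired
            cv2.imshow("disparity", depth_msgs[seq].getCvFrame())
            rgb_msgs = {k: v for k, v in rgb_msgs.items() if k > seq}
            depth_msgs = {k: v for k, v in depth_msgs.items() if k > seq}

        if cv2.waitKey(1) == ord('q'):
            break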
    Thanks, Erik