Issue with Decimation Filter

rohan

Hello,

I'm trying to get a depth map using cameras running on an OAK FFC 4P. I want to align the depth frame to the right camera (in socket CAM_C). I was testing the difference between setting the alignment with the inputAlignTo input and with the setDepthAlign method of the StereoDepth node.

I noticed that when I align using inputAlignTo and enable the Decimation filter, the depth map is downscaled but the output frame size is not changed. As a result, the depth map of the whole scene is fitted within 1/4 of the output frame, and the remaining 3/4 of the frame is empty. However, when I use setDepthAlign to set the alignment, the Decimation filter works fine.

I'm sharing an MRE to visualize the difference. Please shed some light on how setDepthAlign and inputAlignTo differ:

Is one more optimized than the other? Do they perform depth alignment on the device or host?
Is there any advantage of using one over the other on RVC2 devices?

import depthai as dai
import cv2
import numpy as np
import time

LEFT_SOCKET = dai.CameraBoardSocket.CAM_B
RIGHT_SOCKET = dai.CameraBoardSocket.CAM_C

def colorizeDepth(frameDepth):
    invalidMask = frameDepth == 0
    # Log the depth, minDepth and maxDepth
    try:
        minDepth = np.percentile(frameDepth[frameDepth != 0], 3)
        maxDepth = np.percentile(frameDepth[frameDepth != 0], 95)
        logDepth = np.zeros_like(frameDepth, dtype=np.float32)
        np.log(frameDepth, where=frameDepth != 0, out=logDepth)
        logMinDepth = np.log(minDepth)
        logMaxDepth = np.log(maxDepth)
        np.nan_to_num(logDepth, copy=False, nan=logMinDepth)
        # Clip the values to be in the 0-255 range
        logDepth = np.clip(logDepth, logMinDepth, logMaxDepth)

        # Interpolate only valid logDepth values, setting the rest based on the mask
        depthFrameColor = np.interp(logDepth, (logMinDepth, logMaxDepth), (0, 255))
        depthFrameColor = np.nan_to_num(depthFrameColor)
        depthFrameColor = depthFrameColor.astype(np.uint8)
        depthFrameColor = cv2.applyColorMap(depthFrameColor, cv2.COLORMAP_JET)
        # Set invalid depth pixels to black
        depthFrameColor[invalidMask] = 0
    except IndexError:
        # Frame is likely empty
        depthFrameColor = np.zeros((frameDepth.shape[0], frameDepth.shape[1], 3), dtype=np.uint8)
    return depthFrameColor


with dai.Pipeline() as pipeline:
    monoLeft = pipeline.create(dai.node.Camera).build(LEFT_SOCKET)
    monoRight = pipeline.create(dai.node.Camera).build(RIGHT_SOCKET)
    stereo = pipeline.create(dai.node.StereoDepth)

    monoLeftOut = monoLeft.requestOutput(size=(1280, 800))
    monoRightOut = monoRight.requestOutput(size=(1280, 800), enableUndistortion=True)

    monoLeftOut.link(stereo.left)
    monoRightOut.link(stereo.right)

    monoRightOut.link(stereo.inputAlignTo)  # Gives incorrect depth output
    # stereo.setDepthAlign(RIGHT_SOCKET)
    stereo.setOutputSize(1280, 800)

    stereo.initialConfig.postProcessing.decimationFilter.decimationFactor = 2
    stereo.initialConfig.postProcessing.decimationFilter.decimationMode = dai.StereoDepthConfig.PostProcessing.DecimationFilter.DecimationMode.PIXEL_SKIPPING

    rightOut = monoRightOut.createOutputQueue()
    stereoOut = stereo.depth.createOutputQueue()
    config_out = stereo.outConfig.createOutputQueue()

    cv2.namedWindow("Right", cv2.WINDOW_NORMAL)
    cv2.resizeWindow("Right", 780, 780)
    cv2.namedWindow("Depth", cv2.WINDOW_NORMAL)
    cv2.resizeWindow("Depth", 780, 780)

    pipeline.start()
    while pipeline.isRunning():
        time.sleep(5 / 1000)
        rightFrame = rightOut.tryGet()
        stereoFrame = stereoOut.tryGet()
        stereo_config = config_out.tryGet()

        if stereo_config is not None:
            print(f"Decimation Factor: [{stereo_config.postProcessing.decimationFilter.decimationFactor}]")
            print(f"Decimation Mode: [{stereo_config.postProcessing.decimationFilter.decimationMode.name}]")
            print()

        if rightFrame is None or stereoFrame is None:
            continue

        right = cv2.rotate(rightFrame.getCvFrame(), cv2.ROTATE_90_CLOCKWISE)
        depth = cv2.rotate(stereoFrame.getFrame(), cv2.ROTATE_90_CLOCKWISE)
        depth = colorizeDepth(depth)

        cv2.imshow("Right", right)
        cv2.imshow("Depth", depth)

        if cv2.waitKey(1) == ord('q'):
            cv2.destroyAllWindows()
            break

    pipeline.stop()

Thank you!

Regards,
Rohan

OskarSonc

Hey rohan - in DepthAI v3, both stereo.setDepthAlign(...) and stereo.inputAlignTo are device-side alignment paths on StereoDepth. The difference is that setDepthAlign(CAM_X) is a static socket-based alignment, while inputAlignTo uses the metadata of the linked ImgFrame, so it is more flexible if you want to align to a specific processed stream.

For your case, if you just want depth aligned to CAM_C, stereo.setDepthAlign(RIGHT_SOCKET) is the simpler option.

The decimation behavior you described, where the depth gets squeezed into part of the frame while the output size stays the same, matches a known StereoDepth issue that afaik was already fixed. So the first thing I’d suggest is checking which depthai version you’re using and upgrading to the latest one.
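A quick way to confirm the installed version before upgrading is to query the package metadata from Python. This is a minimal sketch that assumes the library was installed from the `depthai` distribution on PyPI; the helper name `installed_depthai_version` is made up for illustration, and the thread does not say which release contains the fix.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_depthai_version():
    """Return the installed depthai version string, or None if it is not installed.

    Hypothetical helper for this thread; it only reads package metadata
    and does not import depthai or talk to a device.
    """
    try:
        return version("depthai")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_depthai_version()
    if found is None:
        print("depthai is not installed in this environment")
    else:
        print(f"depthai version: {found}")
```

Upgrading is then the usual `pip install -U depthai` in the same environment.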

Thanks,
Oskar

rohan

Hey @OskarSonc,

Thank you for the clarification. setDepthAlign() will do for my use case.

But to clarify, I'm using DepthAI v3.5.0. I see output like the one below when I use inputAlignTo, so maybe the issue has not been fixed yet.
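For readers without the screenshot, the symptom from the first post can be mimicked on the host with NumPy. This is only a sketch under assumptions: it assumes the decimated depth lands in the top-left corner of the full-size frame (the thread only says it occupies 1/4 of the frame), it says nothing about what the device does internally, and the function names are hypothetical. The `unsqueeze` helper is a possible host-side stopgap, not a fix.

```python
import numpy as np


def simulate_squeezed_depth(depth, factor=2):
    """Mimic the symptom: pixel-skip by `factor`, keep the original frame size.

    Assumption: the small depth map is written to the top-left corner.
    """
    small = depth[::factor, ::factor]
    out = np.zeros_like(depth)
    out[: small.shape[0], : small.shape[1]] = small
    return out


def looks_squeezed(frame, factor=2):
    """True if all valid (non-zero) depth sits inside the top-left 1/factor region."""
    h, w = frame.shape
    sh, sw = -(-h // factor), -(-w // factor)  # ceiling division
    valid_inside = np.count_nonzero(frame[:sh, :sw])
    valid_total = np.count_nonzero(frame)
    return valid_inside > 0 and valid_inside == valid_total


def unsqueeze(frame, factor=2):
    """Stretch the top-left region back to full size by nearest-neighbour repeat."""
    h, w = frame.shape
    sh, sw = -(-h // factor), -(-w // factor)
    small = frame[:sh, :sw]
    full = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    return full[:h, :w]


if __name__ == "__main__":
    # Synthetic 800x1280 depth frame with every pixel valid (values in mm).
    depth = np.full((800, 1280), 1500, dtype=np.uint16)
    squeezed = simulate_squeezed_depth(depth, factor=2)
    print("squeezed frame detected:", looks_squeezed(squeezed, factor=2))
    print("restored shape:", unsqueeze(squeezed, factor=2).shape)
```

A scene whose valid depth genuinely lies only in the top-left quarter would also trip `looks_squeezed`, so the check is a heuristic at best.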

OskarSonc

Thanks rohan - will report to team.