ImageManip node adds padding when rotating images

rohan

I'm using an ImageManip node to rotate an image frame. I'm getting the original image from Camera node, which gives an output of width 1280 and height 800. The documentation says that rotation is possible only when the width is multiple of 16. This is true for the resolution I'm using.

When I try to rotate the frame by 90 degrees, the resultant frame is of width 896 and height 1280. The output frame has 96px padding along the width. Is this a result of optimizing image manipulation at hardware level? I saw this change in output dimensions on OAK-4 devices, but not on OAK-D-Pro or OAK-FFC-4P.

Due to the change in width, I'm getting the following error from StereoDepth node:

[StereoDepth(6)] [error] Node threw exception, stopping the node. Exception message: Input stride '896' not being equal to width '800' is not supported! Please use widths divisible by 128 (1280, 640, ...)

I tried to set the output size to 1280x800 using ImageManipConfig, but it didn't work. I'm sharing a minimal example below. I appreciate any help in resolving this error.

Thank you.

import depthai as dai
import numpy as np
import cv2
from datetime import timedelta

ROTATE_ANGLE = 90.0
DEFAULT_FPS = 15

COLOR_WIDTH = 1920
COLOR_HEIGHT = 1200
COLOR_OUT_TYPE = dai.ImgFrame.Type.NV12

MONO_WIDTH = 1280
MONO_HEIGHT = 800
MONO_OUT_TYPE = dai.ImgFrame.Type.GRAY8

pipeline = dai.Pipeline()

colorNode = pipeline.create(dai.node.Camera).build(boardSocket=dai.CameraBoardSocket.CAM_A)
colorOut = colorNode.requestOutput((COLOR_WIDTH, COLOR_HEIGHT), fps = DEFAULT_FPS, type=COLOR_OUT_TYPE)

colorManipNode = pipeline.create(dai.node.ImageManip)
colorManipNode.setRunOnHost(False)
colorManipNode.setMaxOutputFrameSize(COLOR_WIDTH * COLOR_HEIGHT * 3)
colorManipNode.initialConfig.setFrameType(COLOR_OUT_TYPE)
colorManipNode.initialConfig.addRotateDeg(ROTATE_ANGLE)
colorOut.link(colorManipNode.inputImage)

monoLeft = pipeline.create(dai.node.Camera).build(boardSocket=dai.CameraBoardSocket.CAM_B)
monoLeftOut = monoLeft.requestOutput((MONO_WIDTH, MONO_HEIGHT), fps=DEFAULT_FPS, type=MONO_OUT_TYPE)

monoLeftManip = pipeline.create(dai.node.ImageManip)
monoLeftManip.setRunOnHost(False)
monoLeftManip.setMaxOutputFrameSize(MONO_WIDTH * MONO_HEIGHT * 3)
monoLeftManip.initialConfig.setFrameType(MONO_OUT_TYPE)
monoLeftManip.initialConfig.addRotateDeg(ROTATE_ANGLE)
monoLeftManip.initialConfig.setOutputSize(MONO_HEIGHT, MONO_WIDTH, dai.ImageManipConfig.ResizeMode.CENTER_CROP)
monoLeftOut.link(monoLeftManip.inputImage)

monoRight = pipeline.create(dai.node.Camera).build(boardSocket=dai.CameraBoardSocket.CAM_C)
monoRightOut = monoRight.requestOutput((MONO_WIDTH, MONO_HEIGHT), fps=DEFAULT_FPS, type=MONO_OUT_TYPE)

monoRightManip = pipeline.create(dai.node.ImageManip)
monoRightManip.setRunOnHost(False)
monoRightManip.setMaxOutputFrameSize(MONO_WIDTH * MONO_HEIGHT * 3)
monoRightManip.initialConfig.setFrameType(MONO_OUT_TYPE)
monoRightManip.initialConfig.addRotateDeg(ROTATE_ANGLE)
monoRightManip.initialConfig.setOutputSize(MONO_HEIGHT, MONO_WIDTH, dai.ImageManipConfig.ResizeMode.CENTER_CROP)
monoRightOut.link(monoRightManip.inputImage)

stereoNode = pipeline.create(dai.node.StereoDepth)
monoLeftManip.out.link(stereoNode.left)
monoRightManip.out.link(stereoNode.right)

sync = pipeline.create(dai.node.Sync)
sync.setSyncThreshold(timedelta(milliseconds=80))
colorManipNode.out.link(sync.inputs["color"])
stereoNode.depth.link(sync.inputs["depth"])
syncQueue = sync.out.createOutputQueue()


with pipeline:
    pipeline.start()

    cv2.namedWindow("color", cv2.WINDOW_NORMAL)
    cv2.resizeWindow("color", 640, 400)
    cv2.namedWindow("depth", cv2.WINDOW_NORMAL)
    cv2.resizeWindow("depth", 640, 400)

    while pipeline.isRunning():
        syncGroup = syncQueue.tryGet()
        if syncGroup is None:
            continue
        
        assert isinstance(syncGroup, dai.MessageGroup)
        if syncGroup.getNumMessages() == 0:
            continue

        print(f"MSG Group timestamp: {syncGroup.getTimestamp()}")
        for name, payload in syncGroup:
            assert isinstance(payload, dai.ImgFrame)
            print(f"\t[{name}] size: [{payload.getWidth()} x {payload.getHeight()}]")

            if name == "color":
                cv2.imshow(name, payload.getCvFrame())
            elif name == "depth":
                depthMap = payload.getFrame()
                normalizedMap = (depthMap * (255 / stereoNode.initialConfig.getMaxDisparity())).astype(np.uint8)
                colorizedDepth = cv2.applyColorMap(normalizedMap, cv2.COLORMAP_JET)
                cv2.imshow(name, colorizedDepth)

        key = cv2.waitKey(1)
        if key == ord('q'):
            pipeline.stop()
            break

OskarSonc

Hey @rohan,
OAK‑4 (RVC4) rotates via a hardware block that aligns line stride to 128 bytes. After a 90° rotate, the mono width becomes 800 but the buffer stride becomes 896 (next 128‑byte multiple). StereoDepth on v3 requires stride == width and widths divisible by 128, so the rotated 800×1280 input (width 800) is rejected. On RVC2 (e.g., OAK‑D‑Pro/FFC‑4P) this padding doesn’t occur, which is why you didn’t see it there. Attached is an updated script: StereoDepth receives the native 1280×800 mono frames (valid and calibrated), and the depth image is rotated on the host for visualization.

import depthai as dai
import numpy as np
import cv2
from datetime import timedelta

ROTATE_ANGLE = 90.0
DEFAULT_FPS = 15

COLOR_WIDTH = 1920
COLOR_HEIGHT = 1200
COLOR_OUT_TYPE = dai.ImgFrame.Type.NV12

MONO_WIDTH = 1280
MONO_HEIGHT = 800
MONO_OUT_TYPE = dai.ImgFrame.Type.GRAY8

pipeline = dai.Pipeline()

colorNode = pipeline.create(dai.node.Camera).build(boardSocket=dai.CameraBoardSocket.CAM_A)
colorOut = colorNode.requestOutput((COLOR_WIDTH, COLOR_HEIGHT), fps = DEFAULT_FPS, type=COLOR_OUT_TYPE)

colorManipNode = pipeline.create(dai.node.ImageManip)
colorManipNode.setRunOnHost(False)
colorManipNode.setMaxOutputFrameSize(COLOR_WIDTH * COLOR_HEIGHT * 3)
colorManipNode.initialConfig.setFrameType(COLOR_OUT_TYPE)
colorManipNode.initialConfig.addRotateDeg(ROTATE_ANGLE)
colorOut.link(colorManipNode.inputImage)

monoLeft = pipeline.create(dai.node.Camera).build(boardSocket=dai.CameraBoardSocket.CAM_B)
monoLeftOut = monoLeft.requestOutput((MONO_WIDTH, MONO_HEIGHT), fps=DEFAULT_FPS, type=MONO_OUT_TYPE)


monoRight = pipeline.create(dai.node.Camera).build(boardSocket=dai.CameraBoardSocket.CAM_C)
monoRightOut = monoRight.requestOutput((MONO_WIDTH, MONO_HEIGHT), fps=DEFAULT_FPS, type=MONO_OUT_TYPE)


stereoNode = pipeline.create(dai.node.StereoDepth)
monoLeftOut.link(stereoNode.left)
monoRightOut.link(stereoNode.right)

sync = pipeline.create(dai.node.Sync)
sync.setSyncThreshold(timedelta(milliseconds=80))
colorManipNode.out.link(sync.inputs["color"])
stereoNode.depth.link(sync.inputs["depth"])
syncQueue = sync.out.createOutputQueue()


with pipeline:
    pipeline.start()

    cv2.namedWindow("color", cv2.WINDOW_NORMAL)
    cv2.resizeWindow("color", 640, 400)
    cv2.namedWindow("depth", cv2.WINDOW_NORMAL)
    cv2.resizeWindow("depth", 640, 400)

    while pipeline.isRunning():
        syncGroup = syncQueue.tryGet()
        if syncGroup is None:
            continue
        
        assert isinstance(syncGroup, dai.MessageGroup)
        if syncGroup.getNumMessages() == 0:
            continue

        print(f"MSG Group timestamp: {syncGroup.getTimestamp()}")
        for name, payload in syncGroup:
            assert isinstance(payload, dai.ImgFrame)
            print(f"\t[{name}] size: [{payload.getWidth()} x {payload.getHeight()}]")

            if name == "color":
                cv2.imshow(name, payload.getCvFrame())
            elif name == "depth":
                depthMap = payload.getFrame()
                normalizedMap = (depthMap * (255 / stereoNode.initialConfig.getMaxDisparity())).astype(np.uint8)
                colorizedDepth = cv2.applyColorMap(normalizedMap, cv2.COLORMAP_JET)
                displayDepth = cv2.rotate(colorizedDepth, cv2.ROTATE_90_CLOCKWISE)
                cv2.imshow(name, displayDepth)

        key = cv2.waitKey(1)
        if key == ord('q'):
            pipeline.stop()
            break

rohan

Hey OskarSonc oskarlsson

Thank you for a quick response explaining the cause of this issue. The updated script you provided would help if the rotation is only relevant outside the pipeline. Is there a way to rotate the image within the DepthAI pipeline and work with it?

I want to generalize my code so that it would work on RVC2 and RVC4 devices. I have ImageManip nodes in my pipeline that I need on OAK-FFC-4P devices. The rotation is necessary due design choices of the orientation of the individual cameras on the OAK-FFC-4P modules. In case of OAK-4, although all cameras on the device have same orientation, I want to handle all possible cases in my code for completeness.

OskarSonc

Hmm okay I see ...
Maybe you could try putting it in pipeline, but not before StereoDepth. so:

Feed StereoDepth with native, unrotated mono frames.
Branch rotated copies for everything else that needs portrait orientation.
Rotate the depth (and/or rectified left/right) after StereoDepth for visualization or downstream nodes.

rohan

Hey @OskarSonc
This would work usually, but the issue is that I'm aligning the depth with camera camera in CAM_A slot. I need to make sure that the outputs from color camera and the mono camera pair are oriented in the same way before passing them to the StereoDepth node. If the color cam is oriented for landscape, and the mono cameras are oriented for portrait, I would have to rotate either the color image or the mono images before passing them to the StereoDepth. This is why I must have ImageManip nodes before SteroDepth. I hope I'm not missing anything here…

I tried to set the output size on ImageManip node, but it doesn't seem to work (…not sure why). Maybe adding another ImageMnip node to resize/crop the image would work, but I've haven't tried it yet since these nodes are causing a lag (I mentioned about it in this post). So I'm trying to find a solution without compromising efficiency.

rohan

I just realized something…On RVC4 devices, the alignment is done on ImageAlign node, not the StereoDepth node. So, I can add an ImageManip node after StereoDepth, and pass on the rotated image to ImageAlign for alignment. I think this would solve the problem…

OskarSonc

hmmm you could try that but, afaik ImageAlignexpects depth in the left-camera’s original, calibrated orientation. If you rotate depth before it, alignment is wrong...

rohan

OskarSonc

Yeah, I tried it and it's not working. I got a blank depth map (no depth was estimated anywhere in the frame).

OskarSonc

rohan
that makes sense... I think for now the only way, is the way we disused before...