Hi,

I am trying to use the cast node to display concatenated left and right mono camera images. When I run the "cast_concat.py" example, I can view the concatenated frames output as expected.

But if I just remove the RGB camera image and feed only the two mono images to the PyTorch frame concatenation NN model created using this example, I get the error below:

cv2.imshow("Concated frames", inCast.getCvFrame())

^^^^^^^^^^^^^^^^^^^

RuntimeError: ImgFrame doesn't have enough data to encode specified frame, required 1536000, actual 512000. Maybe metadataOnly transfer was made?
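
For reference, the wiring after removing the RGB camera looks roughly like this (simplified sketch; the exact node setup, blob path, and input names are in the MRE linked below):

import cv2
import depthai as dai

pipeline = dai.Pipeline()

# Two mono cameras at 400p
monoLeft = pipeline.create(dai.node.MonoCamera)
monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)

monoRight = pipeline.create(dai.node.MonoCamera)
monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT)
monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)

# Concatenation model exported from PyTorch
nn = pipeline.create(dai.node.NeuralNetwork)
nn.setBlobPath("concat.blob")
monoLeft.out.link(nn.inputs["img1"])
monoRight.out.link(nn.inputs["img2"])

# Cast node converts the FP16 NN output back into a displayable ImgFrame
cast = pipeline.create(dai.node.Cast)
cast.setOutputFrameType(dai.ImgFrame.Type.BGR888p)
nn.out.link(cast.input)

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("cast")
cast.output.link(xout.input)

with dai.Device(pipeline) as device:
    q = device.getOutputQueue(name="cast", maxSize=4, blocking=False)
    while True:
        inCast = q.get()
        cv2.imshow("Concated frames", inCast.getCvFrame())
        if cv2.waitKey(1) == ord("q"):
            break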

Please see all files for an MRE here. Please could you let me know what I am doing wrong?

Best,
AB


    abhanupr
    Hi,
    I think the issue is with the input shape and the axis you're concatenating on (in pytorch_concat.py):
    - The input shape for the dummy tensor should be (1, 3, 400, 640) (batch size, channels, height, width):
    X = torch.ones((1, 3, 400, 640), dtype=torch.float16)
    - You’re concatenating on the wrong axis. Use axis 3 (width) instead of 1 (channel):
    return torch.cat((img1, img2), 3)

    This should fix the problem!
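    Putting the two changes together, the export script would look something like this (a sketch; the class name, file name, and opset version are just illustrative):

    import torch
    import torch.nn as nn

    class CatImgs(nn.Module):
        def forward(self, img1, img2):
            # Concatenate along axis 3 (width) so the two frames end up side by side
            return torch.cat((img1, img2), 3)

    # Dummy input: (batch, channels, height, width)
    X = torch.ones((1, 3, 400, 640), dtype=torch.float16)

    torch.onnx.export(
        CatImgs(),
        (X, X),
        "concat.onnx",
        opset_version=12,
        input_names=["img1", "img2"],
        output_names=["output"],
    )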

      lovro

      Thanks for the reply. That makes a lot of sense.

      I tried those changes to the PyTorch code. Now I get concatenated images that flicker and have some lines and shadows, as can be seen in the screenshot below. Also, the program runs for a few seconds and then suddenly freezes. Any ideas on how to solve these issues?

      AB

        abhanupr
        Hi,
        Since your left and right cameras are mono cameras, the concatenation should handle grayscale images instead of RGB. To fix this, you need to update two things:

        1. In your PyTorch script, change the channels to 1 for grayscale inputs:
        X = torch.ones((1, 1, 400, 640), dtype=torch.float16)
        2. In your pipeline, set the cast node to output RAW8:
        cast.setOutputFrameType(dai.ImgFrame.Type.RAW8)
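
        If I'm reading the numbers right, the byte counts in your original error also line up with exactly this channel mismatch, something like:

        # two 400x640 mono frames concatenated side by side
        height, width = 400, 2 * 640
        print(height * width * 3)   # 1536000 -> size of a 3-channel (BGR) frame, the "required" value
        print(height * width * 1)   # 512000  -> size of a 1-channel (grayscale) frame, the "actual" value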

        Hope this fixes the issue.