• OAK-D SR simultaneous depth and RGB

Hi team, we have just been playing around with the new OAK-D SR as we love its form factor and feature set.

Our use case is that we would like to be able to run detection over an RGB frame from the camera (with our own model on a host machine - not on the camera at this stage) and then use the depth/disparity frame from the same camera for object localisation (again, on a host machine at this stage).

I have been playing around with the driver demos and examples extracting RGB and depth data individually, but have been experiencing problems trying to retrieve RGB and depth frames simultaneously.

Which led me to a bit of a realisation - the OAK-D SR only has two sensors (each of which can individually be configured for mono or RGB output). Is it actually possible to output both RGB and depth data at the same time from the same camera with the SR?

Cheers, Pete

    Hi pete
    I haven't tested it, but it should work the same as with other OAK devices. Do it like you would with e.g. an OAK-D: create left and right mono cameras, link them through the StereoDepth node, and connect the depth output to an XLinkOut. Also link the left (or right) mono output directly to its own XLinkOut. This should give you both the left mono feed and the depth feed.
    The process for the SR would AFAIK be the same, just with colour cameras instead of mono.
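    A minimal sketch of that wiring (untested; assuming the DepthAI C++ API with mono cameras as on an OAK-D, and hypothetical stream names):

    ```cpp
    #include <depthai/depthai.hpp>

    int main() {
        dai::Pipeline pipeline;

        // Left/right mono cameras feeding the StereoDepth node
        auto mono_left = pipeline.create<dai::node::MonoCamera>();
        auto mono_right = pipeline.create<dai::node::MonoCamera>();
        auto stereo = pipeline.create<dai::node::StereoDepth>();
        mono_left->setBoardSocket(dai::CameraBoardSocket::CAM_B);
        mono_right->setBoardSocket(dai::CameraBoardSocket::CAM_C);
        mono_left->out.link(stereo->left);
        mono_right->out.link(stereo->right);

        // Depth feed to the host over XLink
        auto depth_out = pipeline.create<dai::node::XLinkOut>();
        depth_out->setStreamName("depth");
        stereo->depth.link(depth_out->input);

        // Left mono feed to the host as well
        auto left_out = pipeline.create<dai::node::XLinkOut>();
        left_out->setStreamName("left");
        mono_left->out.link(left_out->input);
        return 0;
    }
    ```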

    Hope this helps,
    Jaka

    Hmm, interesting, maybe I am missing something? The set up I am using is the following:

    Create the left and right RGB camera nodes and the stereo depth node:
    auto rgb_left_pipeline = pipeline.create<dai::node::ColorCamera>();
    auto rgb_right_pipeline = pipeline.create<dai::node::ColorCamera>();
    auto depth_pipeline = pipeline.create<dai::node::StereoDepth>();

    Create the XLinkOut nodes:
    auto rgb_xlink_out = pipeline.create<dai::node::XLinkOut>();
    auto depth_xlink_out = pipeline.create<dai::node::XLinkOut>();

    Set some basic properties, stream name, resolution etc:
    rgb_xlink_out->setStreamName("rgb");
    depth_xlink_out->setStreamName("depth");
    auto cam_res = dai::ColorCameraProperties::SensorResolution::THE_800_P;

    rgb_left_pipeline->setCamera("left"); rgb_left_pipeline->setResolution(cam_res);
    rgb_left_pipeline->setFps(10);
    rgb_right_pipeline->setCamera("right"); rgb_right_pipeline->setResolution(cam_res);
    rgb_right_pipeline->setFps(10);

    …etc

    Link nodes, create device and create queues:
    rgb_left_pipeline->preview.link(depth_pipeline->left);
    rgb_right_pipeline->preview.link(depth_pipeline->right);
    depth_pipeline->disparity.link(depth_xlink_out->input);
    rgb_left_pipeline->preview.link(rgb_xlink_out->input);
    dai::Device device(pipeline, dai::UsbSpeed::SUPER); print_device_info(device);

    auto rgb_queue = device.getOutputQueue("rgb", 4, false);
    auto depth_queue = device.getOutputQueue("depth", 4, false);

    (Note: I used rgb..pipeline->preview here in place of an "out" output, since the ColorCamera nodes don't seem to have an "out" output equivalent?)

    Finally, a really basic extraction of frames for visualisation:

        while(true) 
        {
            auto rgb_frame = rgb_queue->get<dai::ImgFrame>()->getFrame();
            auto depth_frame = depth_queue->get<dai::ImgFrame>()->getFrame();
    
            depth_frame.convertTo(depth_frame, CV_8UC1, 255 / depth_pipeline->initialConfig.getMaxDisparity());
    
            // Retrieve 'bgr' (opencv format) frame
            cv::imshow("rgb", rgb_frame);
            cv::imshow("depth", depth_frame);
    
            int key = cv::waitKey(1);
            if(key == 'q' || key == 'Q') {
                break;
            }
        }
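    As an aside, the 8-bit disparity produced by the convertTo above can be made easier to read with a colour map. A minimal sketch using plain OpenCV, with a synthetic gradient standing in for a real disparity frame:

    ```cpp
    #include <opencv2/opencv.hpp>

    int main() {
        // Stand-in for an 8-bit normalised disparity frame (vertical gradient)
        cv::Mat disparity_8u(400, 640, CV_8UC1);
        for (int r = 0; r < disparity_8u.rows; ++r)
            disparity_8u.row(r).setTo(r * 255 / disparity_8u.rows);

        // Map grey levels to colour so near/far bands are easier to tell apart
        cv::Mat coloured;
        cv::applyColorMap(disparity_8u, coloured, cv::COLORMAP_JET);

        // coloured is a 3-channel BGR image of the same size as the input
        return (coloured.channels() == 3) ? 0 : 1;
    }
    ```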

    Ok, so when running, I get this error:

    [1844301041C04B1200] [3.3.1.1] [3.132] [StereoDepth(2)] [error] Left input image stride ('900') should be equal to its width ('300'). Skipping frame!

    Which does make sense - I am feeding a three-channel (RGB) image matrix into something (the stereo depth node) that is presumably expecting a single-channel image matrix.
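    The arithmetic behind the error, assuming preview emits interleaved 3-byte-per-pixel frames: a 300 px wide BGR row occupies 300 × 3 = 900 bytes, so the reported stride is three times the width a single-channel input would have.

    ```cpp
    #include <cassert>

    int main() {
        const int width_px = 300;        // frame width reported in the error
        const int bgr_bytes_per_px = 3;  // interleaved BGR from preview
        const int mono_bytes_per_px = 1; // single-channel input

        const int actual_stride = width_px * bgr_bytes_per_px;
        const int expected_stride = width_px * mono_bytes_per_px;

        assert(actual_stride == 900);   // the stride the error complains about
        assert(expected_stride == 300); // the stride it expected
        return 0;
    }
    ```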

    However, having gone over the documentation, I am not sure what the proper approach is for this use case - using the preview output of the rgb_left and rgb_right nodes seems a bit sus to me, but the problem appears to be that the stereo depth node expects a specific image format as input.

    Is there some configuration option on the stereo_depth node that I need to set? Or should I be changing the output of the RGB nodes (to somehow produce both RGB and mono matrices)?

    Cheers, Pete

      Hi pete
      Strange behaviour; it should accept RGB.
      Could you try using right.video instead of preview?
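      In code, that change would look something like the sketch below (untested; node and stream names assumed, not taken from a verified SR example). The video output carries NV12-backed frames whose stride matches the width, which would avoid the stride complaint above:

      ```cpp
      #include <depthai/depthai.hpp>

      int main() {
          dai::Pipeline pipeline;
          auto rgb_left = pipeline.create<dai::node::ColorCamera>();
          auto rgb_right = pipeline.create<dai::node::ColorCamera>();
          auto depth = pipeline.create<dai::node::StereoDepth>();
          auto rgb_out = pipeline.create<dai::node::XLinkOut>();
          auto depth_out = pipeline.create<dai::node::XLinkOut>();
          rgb_out->setStreamName("rgb");
          depth_out->setStreamName("depth");

          // video instead of preview feeding StereoDepth
          rgb_left->video.link(depth->left);
          rgb_right->video.link(depth->right);
          depth->disparity.link(depth_out->input);

          // host-side RGB stream can come from video as well
          rgb_left->video.link(rgb_out->input);
          return 0;
      }
      ```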

      Thanks,
      Jaka


        jakaskerl Changing from using the preview output to the video output seemed to do the trick! Thanks so much!!

        Also, for anyone who finds this post in the future: in the loop above I am not accounting for empty reads from the queues. The blocking get call always waits for a frame, so to poll both queues without stalling one on the other you would use the non-blocking tryGet and check for null, like so:

            while(true) 
            {
                // tryGet is non-blocking and returns nullptr when no frame is ready
                auto rgb_img_frame = rgb_queue->tryGet<dai::ImgFrame>();
                if (rgb_img_frame != nullptr)
                {
                    auto rgb_frame = rgb_img_frame->getFrame();
                    cv::imshow("rgb", rgb_frame);
                }
        
                auto depth_img_frame = depth_queue->tryGet<dai::ImgFrame>();
                if (depth_img_frame != nullptr)
                {
                    auto depth_frame = depth_img_frame->getFrame();
                    depth_frame.convertTo(depth_frame, CV_8UC1, 255 / depth_pipeline->initialConfig.getMaxDisparity());
                    cv::imshow("depth", depth_frame);
                }
        
                int key = cv::waitKey(1);
                if(key == 'q' || key == 'Q') 
                {
                    break;
                }
            }