• DepthAI
  • Support for YOLOv8 instance segmentation models along with depth information.

Hi @u111s

import cv2
import numpy as np
import depthai as dai
import time
from YOLOSeg import YOLOSeg

pathYoloBlob = "./yolov8n-seg.blob"

# Create OAK-D pipeline
pipeline = dai.Pipeline()

# Setup color camera
cam_rgb = pipeline.createColorCamera()
cam_rgb.setPreviewSize(640, 640)
cam_rgb.setInterleaved(False)

# Setup depth
stereo = pipeline.createStereoDepth()
left = pipeline.createMonoCamera()
right = pipeline.createMonoCamera()

left.setBoardSocket(dai.CameraBoardSocket.LEFT)
right.setBoardSocket(dai.CameraBoardSocket.RIGHT)
stereo.setConfidenceThreshold(255)
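# Note (assumption): the 640x640 RGB preview and the raw stereo depth output are not
# aligned or equally sized by default, so indexing the depth frame at RGB-preview
# coordinates further below is only approximate. Aligning depth to the RGB camera
# is one way to reduce the mismatch:
stereo.setDepthAlign(dai.CameraBoardSocket.RGB)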

left.out.link(stereo.left)
right.out.link(stereo.right)

# Setup neural network
nn = pipeline.createNeuralNetwork()
nn.setBlobPath(pathYoloBlob)
cam_rgb.preview.link(nn.input)

# Setup output streams
xout_rgb = pipeline.createXLinkOut()
xout_rgb.setStreamName("rgb")
cam_rgb.preview.link(xout_rgb.input)

xout_nn_yolo = pipeline.createXLinkOut()
xout_nn_yolo.setStreamName("nn_yolo")
nn.out.link(xout_nn_yolo.input)

xout_depth = pipeline.createXLinkOut()
xout_depth.setStreamName("depth")
stereo.depth.link(xout_depth.input)

# Start application
with dai.Device(pipeline) as device:

    q_rgb = device.getOutputQueue("rgb")
    q_nn_yolo = device.getOutputQueue("nn_yolo")
    q_depth = device.getOutputQueue("depth", maxSize=4, blocking=False)

    while True:
        in_rgb = q_rgb.tryGet()
        in_nn_yolo = q_nn_yolo.tryGet()
        in_depth = q_depth.tryGet()

        if in_rgb is not None:
            frame = in_rgb.getCvFrame()
            depth_frame = in_depth.getFrame() if in_depth is not None else None

            if in_nn_yolo is not None:
                # Assuming you have the segmented output and depth frame
                # You can now overlay segmentation mask on the depth frame or calculate depth for segmented objects

                # Placeholder for YOLOSeg post-processing
                # (your existing code to obtain combined_img and detected_objects)
                combined_img = frame       # replace with the YOLOSeg mask overlay
                detected_objects = []      # replace with the detections from YOLOSeg

                if depth_frame is not None:
                    # Assuming the depth map and color frames are aligned
                    # You can fetch depth for specific objects here
                    # For example, fetching depth at the center of an object detected by YOLO:
                    for obj in detected_objects:  # Assuming detected_objects are obtained from YOLOSeg
                        x_center = int(obj["x_center"])
                        y_center = int(obj["y_center"])
                        depth = depth_frame[y_center, x_center]
                        print(f"Depth at center of object: {depth} mm")

                cv2.imshow("Output", combined_img)
                
            else:
                print("in_nn_yolo EMPTY")

        else:
            print("in_rgb EMPTY")

        # Exit logic
        if cv2.waitKey(1) == ord('q'):
            break

    Hi! I am trying to run my custom model with two classes, but the segmentation is terrible compared to the original YOLOv8 results before converting the .pt file to .blob. I have followed the steps discussed in this blog and tried different image sizes.

    Any thoughts? Thanks.

    @jakaskerl

    I mean, the model basically doesn't segment the objects accurately (it almost doesn't detect them). I have tried the same model, without the OAK, in .onnx format and it works correctly, so maybe the problem is in the conversion to .blob.

    However, when I follow the same steps with the stock "yolov8n-seg.pt" model, the segmentation gives me no problems.
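    For reference, the conversion I am following looks roughly like this (a minimal sketch; the export flags, shave count, and normalization options are assumptions taken from the blog, not verified):

    from ultralytics import YOLO
    import blobconverter

    # Export the trained two-class segmentation model to ONNX at the input size used on the OAK
    model = YOLO("best.pt")
    model.export(format="onnx", imgsz=640, opset=12)

    # Convert the ONNX file to a .blob for the OAK's Myriad X
    # (shave count is an assumption; normalization/channel-order flags from the blog are omitted here)
    blob_path = blobconverter.from_onnx(
        model="best.onnx",
        data_type="FP16",
        shaves=6,
    )
    print(blob_path)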

      DavidMeiraPliego

      We are looking to add native support for instance segmentation to DepthAI, so we will be able to take a better look at the issue then.

      In the meantime:

      1. Do you follow the same steps to create the blob, including passing exactly the same flags?
      2. If yes, there are many reasons something could go wrong. The first thing I would check is whether the confidence thresholds in PyTorch and in the script you use with the camera match, or whether one is higher than the other (see the sketch below).
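      A minimal sketch of that check (the YOLOSeg constructor arguments are assumptions based on the common ONNX YOLOv8 segmentation wrapper; match the names to your own script):

      # Thresholds used when validating the .pt / .onnx model in PyTorch
      CONF_THRES = 0.25
      IOU_THRES = 0.45

      # The same values should be used by the post-processing on the camera side,
      # e.g. when constructing the YOLOSeg helper (argument names are assumptions)
      yoloseg = YOLOSeg("yolov8n-seg.onnx", conf_thres=CONF_THRES, iou_thres=IOU_THRES)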

        Matija

        Yes, I followed the same steps to create the .blob, and the thresholds match. I've tried modifying them in the script, but the result is the same.

        Do you have an approximate date for the implementation of instance segmentation in DepthAI?

          Have you exported it for the same input shape? Does it help if you reduce the thresholds?

          DavidMeiraPliego