• DepthAI
  • Support for YOLOv8 instance segmentation models along with depth information.

Hi @u111s

import cv2
import numpy as np
import depthai as dai
import time
from YOLOSeg import YOLOSeg

pathYoloBlob = "./yolov8n-seg.blob"

# Create OAK-D pipeline
pipeline = dai.Pipeline()

# Setup color camera
cam_rgb = pipeline.createColorCamera()
cam_rgb.setPreviewSize(640, 640)
cam_rgb.setInterleaved(False)

# Setup depth
stereo = pipeline.createStereoDepth()
left = pipeline.createMonoCamera()
right = pipeline.createMonoCamera()

left.setBoardSocket(dai.CameraBoardSocket.LEFT)
right.setBoardSocket(dai.CameraBoardSocket.RIGHT)
stereo.setConfidenceThreshold(255)
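# Note (assumption): the 640x640 RGB preview and the raw stereo depth output are not
# aligned or equally sized by default, so indexing the depth frame at RGB-preview
# coordinates further below is only approximate. Aligning depth to the RGB camera
# is one way to reduce the mismatch:
stereo.setDepthAlign(dai.CameraBoardSocket.RGB)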

left.out.link(stereo.left)
right.out.link(stereo.right)

# Setup neural network
nn = pipeline.createNeuralNetwork()
nn.setBlobPath(pathYoloBlob)
cam_rgb.preview.link(nn.input)

# Setup output streams
xout_rgb = pipeline.createXLinkOut()
xout_rgb.setStreamName("rgb")
cam_rgb.preview.link(xout_rgb.input)

xout_nn_yolo = pipeline.createXLinkOut()
xout_nn_yolo.setStreamName("nn_yolo")
nn.out.link(xout_nn_yolo.input)

xout_depth = pipeline.createXLinkOut()
xout_depth.setStreamName("depth")
stereo.depth.link(xout_depth.input)

# Start application
with dai.Device(pipeline) as device:

    q_rgb = device.getOutputQueue("rgb")
    q_nn_yolo = device.getOutputQueue("nn_yolo")
    q_depth = device.getOutputQueue("depth", maxSize=4, blocking=False)

    while True:
        in_rgb = q_rgb.tryGet()
        in_nn_yolo = q_nn_yolo.tryGet()
        in_depth = q_depth.tryGet()

        if in_rgb is not None:
            frame = in_rgb.getCvFrame()
            depth_frame = in_depth.getFrame() if in_depth is not None else None

            if in_nn_yolo is not None:
                # Assuming you have the segmented output and depth frame
                # You can now overlay segmentation mask on the depth frame or calculate depth for segmented objects

                # Placeholder for YOLOSeg post-processing
                # (your existing code to obtain combined_img and detected_objects)
                combined_img = frame       # replace with the YOLOSeg mask overlay
                detected_objects = []      # replace with the detections from YOLOSeg

                if depth_frame is not None:
                    # Assuming the depth map and color frames are aligned
                    # You can fetch depth for specific objects here
                    # For example, fetching depth at the center of an object detected by YOLO:
                    for obj in detected_objects:  # Assuming detected_objects are obtained from YOLOSeg
                        x_center = int(obj["x_center"])
                        y_center = int(obj["y_center"])
                        depth = depth_frame[y_center, x_center]
                        print(f"Depth at center of object: {depth} mm")

                cv2.imshow("Output", combined_img)
                
            else:
                print("in_nn_yolo EMPTY")

        else:
            print("in_rgb EMPTY")

        # Exit logic
        if cv2.waitKey(1) == ord('q'):
            break

    Hi! I am trying to run my custom model with two classes, but the segmentation is terrible compared to the original YOLOv8 results before converting the .pt file to .blob. I have followed the steps discussed in this blog and tried different image sizes.

    Any thoughts? Thanks.

    @jakaskerl

    I mean, the model basically doesn't segment the objects accurately (it almost doesn't detect them). I have tried the same model, without the OAK, in .onnx format and it works correctly, so maybe the problem is in the conversion to .blob.

    However, when I follow the same steps with the stock "yolov8n-seg.pt" model, the segmentation gives me no problems.
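    For reference, the conversion I am following looks roughly like this (a minimal sketch; the export flags, shave count, and normalization options are assumptions taken from the blog, not verified):

    from ultralytics import YOLO
    import blobconverter

    # Export the trained two-class segmentation model to ONNX at the input size used on the OAK
    model = YOLO("best.pt")
    model.export(format="onnx", imgsz=640, opset=12)

    # Convert the ONNX file to a .blob for the OAK's Myriad X
    # (shave count is an assumption; normalization/channel-order flags from the blog are omitted here)
    blob_path = blobconverter.from_onnx(
        model="best.onnx",
        data_type="FP16",
        shaves=6,
    )
    print(blob_path)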

      DavidMeiraPliego

      We are looking to add native support for instance segmentation to DepthAI, so we will be able to take a better look at the issue then.

      In the meantime:

      1. Do you follow the same steps to create the blob, including passing exactly the same flags?
      2. If yes, there are many reasons something could go wrong. The first thing I would check is whether the confidence thresholds in PyTorch and in the script you use with the camera match, or whether one is higher than the other (see the sketch below).
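      A minimal sketch of that check (the YOLOSeg constructor arguments are assumptions based on the common ONNX YOLOv8 segmentation wrapper; match the names to your own script):

      # Thresholds used when validating the .pt / .onnx model in PyTorch
      CONF_THRES = 0.25
      IOU_THRES = 0.45

      # The same values should be used by the post-processing on the camera side,
      # e.g. when constructing the YOLOSeg helper (argument names are assumptions)
      yoloseg = YOLOSeg("yolov8n-seg.onnx", conf_thres=CONF_THRES, iou_thres=IOU_THRES)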

        Matija

        Yes, I followed the same steps to create the .blob, and the thresholds match. I've tried modifying them in the script, but the result is the same.

        Do you have an approximate date for the implementation of instance segmentation in DepthAI?

          Have you exported it for the same input shape? Does it help if you reduce the thresholds?

          DavidMeiraPliego