Hello everyone, I'm a beginner with DepthAI. I'm trying to convert the road-segmentation script (part of the depthai-experiments repo, written in Python)
https://github.com/luxonis/depthai-experiments/tree/master/gen2-road-segmentation into a C++ version.
I've tried to put some pieces together; this is how far I've got, and I hope someone can help.
Any suggestion or comment is highly appreciated.
Thanks so much, and I love working with DepthAI!

```cpp
#include <array>
#include <atomic>
#include <chrono>
#include <iomanip>
#include <sstream>

#include "depthai-core/examples/utility/utility.hpp"
#include <depthai/depthai.hpp>
#include "slar.hpp"

using namespace slar;
using namespace std;
using namespace std::chrono;

static std::atomic<bool> syncNN{true};


// Overlay the decoded class colors on top of the camera frame, like
// cv2.addWeighted(frame, 1, data, 0.2, 0) in the Python demo.
// (Declared InputOutputArray since frame is both read and written;
// keep the declaration in slar.hpp in sync.)
void slar_depth_segmentation::draw(cv::InputArray data, cv::InputOutputArray frame) {
    cv::addWeighted(frame, 1.0, data, 0.2, 0.0, frame);
}


//https://jclay.github.io/dev-journal/simple_cpp_argmax_argmin.html
// Decode the raw NN output into a color map, mirroring the NumPy code in the
// Python demo: indices = np.argmax(data, axis=0); np.take(class_colors, indices, axis=0).
cv::Mat slar_depth_segmentation::decode(const std::vector<std::uint8_t> &data) {
    // one BGR color per class (background, road, curb, marking)
    static const std::array<cv::Vec3b, 4> classColors{
            cv::Vec3b{0, 0, 0},
            cv::Vec3b{0, 255, 0},
            cv::Vec3b{255, 0, 0},
            cv::Vec3b{0, 0, 255}};
    // assumed planar output shape (classes, height, width); adjust to the real
    // model output (road-segmentation-adas-0001 produces 4 x 512 x 896), and use
    // getLayerFp16() upstream if the blob outputs FP16 rather than U8
    const int numClasses = 4, height = 300, width = 300;
    const int plane = height * width;

    cv::Mat output(height, width, CV_8UC3);
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            // argmax across the class planes for this pixel
            int best = 0;
            for (int c = 1; c < numClasses; c++) {
                if (data[c * plane + y * width + x] > data[best * plane + y * width + x])
                    best = c;
            }
            output.at<cv::Vec3b>(y, x) = classColors[best];
        }
    }
    return output;
}


void slar_depth_segmentation::segment(int argc, char **argv, dai::Pipeline &pipeline,
                                      cv::Mat frame,
                                      dai::Device *device_unused) {
    // blob model
    std::string nnPath("/Users/alessiograncini/road-segmentation-adas-0001.blob");
    if (argc > 1) {
        nnPath = std::string(argv[1]);
    }
    printf("Using blob at path: %s\n", nnPath.c_str());

    // in
    auto camRgb = pipeline.create<dai::node::ColorCamera>();
    auto imageManip = pipeline.create<dai::node::ImageManip>();
    // a plain NeuralNetwork node: segmentation output cannot be decoded on-device
    // by the detection nodes, so we decode the NNData ourselves
    auto segNN = pipeline.create<dai::node::NeuralNetwork>();
    // out
    auto xoutRgb = pipeline.create<dai::node::XLinkOut>();
    auto nnOut = pipeline.create<dai::node::XLinkOut>();
    auto xoutManip = pipeline.create<dai::node::XLinkOut>();
    // stream names
    xoutRgb->setStreamName("camera");
    xoutManip->setStreamName("manip");
    nnOut->setStreamName("segmentation");
    // note: road-segmentation-adas-0001 expects an 896x512 input, so these
    // sizes may need adjusting to match the model
    imageManip->initialConfig.setResize(300, 300);
    imageManip->initialConfig.setFrameType(dai::ImgFrame::Type::BGR888p);

    // properties
    camRgb->setPreviewSize(300, 300);
    camRgb->setBoardSocket(dai::CameraBoardSocket::RGB);
    camRgb->setResolution(dai::ColorCameraProperties::SensorResolution::THE_1080_P);
    camRgb->setInterleaved(false);
    camRgb->setColorOrder(dai::ColorCameraProperties::ColorOrder::RGB);
    //
    segNN->setBlobPath(nnPath);
    segNN->setNumInferenceThreads(2);
    segNN->input.setBlocking(false);
    // link: camera preview feeds both the display stream and the ImageManip,
    // which in turn feeds the network (the manip input was unlinked before)
    camRgb->preview.link(xoutRgb->input);
    camRgb->preview.link(imageManip->inputImage);
    imageManip->out.link(segNN->input);
    //
    if (syncNN) {
        segNN->passthrough.link(xoutManip->input);
    } else {
        imageManip->out.link(xoutManip->input);
    }
    //
    segNN->out.link(nnOut->input);
    // device
    dai::Device device(pipeline);

    // queues
    auto previewQueue = device.getOutputQueue("camera", 4, false);
    auto detectionNNQueue = device.getOutputQueue("segmentation", 4, false);

    // fps
    auto startTime = steady_clock::now();
    int counter = 0;
    float fps = 0;
    auto color = cv::Scalar(255, 255, 255);

    // main
    while (true) {
        auto inRgb = previewQueue->get<dai::ImgFrame>();
        auto inSeg = detectionNNQueue->get<dai::NNData>();
        // raw tensor bytes; if the blob outputs FP16, use getLayerFp16() instead
        auto segmentations = inSeg->getData();
        //
        counter++;
        auto currentTime = steady_clock::now();
        auto elapsed = duration_cast<duration<float>>(currentTime - startTime);
        if (elapsed > seconds(1)) {
            fps = counter / elapsed.count();
            counter = 0;
            startTime = currentTime;
        }

        // decode the tensor into a color map and overlay it on the preview frame
        frame = inRgb->getCvFrame();
        auto seg = decode(segmentations);
        slar_depth_segmentation::draw(seg, frame);

        std::stringstream fpsStr;
        fpsStr << "NN fps: " << std::fixed << std::setprecision(2) << fps;
        cv::putText(frame, fpsStr.str(), cv::Point(2, frame.rows - 4),
                    cv::FONT_HERSHEY_TRIPLEX, 0.4, color);
        cv::imshow("camera window", frame);


        int key = cv::waitKey(1);
        if (key == 'q' || key == 'Q') {
            break;
        }
    }
}


void slar_depth_segmentation::segment2(int argc, char **argv, dai::Pipeline &pipeline,
                                       cv::Mat frame,
                                       dai::Device *device_unused) {
    using namespace std;
    using namespace std::chrono;
}

```

4 months later

Did you get any further with this? I'm having similar problems getting segmentation to work using C++ so if you managed to get it working I'd be very interested if you're able to share how you did it.

Thanks!

I think ChatGPT (or a similar model) can help with such a conversion as well; I have used it for C++ -> Python conversion and was quite impressed 🙂 It even understands the numpy library.

Awesome! I will take a look and definitely let you know.

Many thanks to both of you!

11 days later

So, having got semantic segmentation working with a pre-trained model, I've been trying to work out how to create a model from a custom dataset.

So far I've only managed to find a method that uses YOLOv5 for instance segmentation. This works, in that it creates a model; however, I don't see how I would use it with DepthAI. Does DepthAI even handle instance segmentation? I have it working with YOLOv5 object detection models and, separately, with an OpenVINO ADAS semantic segmentation model. However, I can't see how to set it up for YOLOv5 instance segmentation.

Does anyone know if Instance Segmentation is actually supported?

Or, alternatively, does anyone know how to train a custom model to do Semantic Segmentation?


    Hi Malcs!
    Any kind of NN inferencing is supported (as long as the layers are supported), but on-device decoding is only available for Yolo/SSD object detection models. So for instance segmentation, you would need to use the NeuralNetwork node and decode the NNData messages in your business logic yourself, just like with the deeplab demo.

    For custom deeplab model, perhaps check this notebook.
    Thanks, Erik
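
    For segmentation models, the "decode it yourself" step Erik describes usually boils down to a per-pixel argmax over the class planes of the tensor, then mapping each winning class index to a color. A minimal sketch of just that step in plain C++ (no DepthAI or OpenCV dependency; the 4-entry palette and the planar `(classes, height, width)` layout are assumptions carried over from the road-segmentation demo, and `decodeSegmentation` is a hypothetical helper name):

    ```cpp
    #include <array>
    #include <cassert>
    #include <cstdint>
    #include <vector>

    // Per-pixel argmax over class planes, assuming a planar
    // (classes, height, width) tensor layout -- the same decoding the Python
    // demo does with np.argmax(data, axis=0) followed by np.take(class_colors, ...).
    std::vector<std::array<std::uint8_t, 3>> decodeSegmentation(
            const std::vector<float> &tensor, int classes, int height, int width) {
        // palette from the road-segmentation example: BG, road, curb, marking
        static const std::array<std::array<std::uint8_t, 3>, 4> palette{{
                {0, 0, 0}, {0, 255, 0}, {255, 0, 0}, {0, 0, 255}}};
        const int plane = height * width;
        std::vector<std::array<std::uint8_t, 3>> colors(plane);
        for (int i = 0; i < plane; i++) {
            int best = 0;  // index of the highest-scoring class at this pixel
            for (int c = 1; c < classes; c++) {
                if (tensor[c * plane + i] > tensor[best * plane + i]) best = c;
            }
            colors[i] = palette[best % palette.size()];
        }
        return colors;
    }

    int main() {
        // tiny 2-class, 1x2 "image": pixel 0 -> class 1, pixel 1 -> class 0
        std::vector<float> tensor{0.1f, 0.8f,   // class 0 plane
                                  0.9f, 0.2f};  // class 1 plane
        auto colors = decodeSegmentation(tensor, 2, 1, 2);
        assert((colors[0] == std::array<std::uint8_t, 3>{0, 255, 0}));
        assert((colors[1] == std::array<std::uint8_t, 3>{0, 0, 0}));
        return 0;
    }
    ```

    In a real pipeline you would feed this with the floats from `NNData` (e.g. `getLayerFp16()`) and blit the resulting colors into an image for overlaying; the layout and class count depend on the specific model, so check them against the model's documentation first.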

    Hi Erik,

    Thanks for the reply. Understood. I have tried that and I can see the NNData being returned. However, I have no idea how to decipher this. I had the same issue with the road segmentation example until Alessio shared his example code.

    Do you happen to know if the format of the NNData for models such as these is ever documented anywhere?

    For reference, I am using a sign language AI model as custom data as shown in this example:
    YOLOv5 Instance Segmentation Tutorial

    I can train the model and generate a blob file. I can run inference using the Colab notebook and the trained model appears to work. I can use the generated blob file with the OAK camera and get NNData back but I have no idea how to decipher it.

    Any pointers would be a great help, many thanks!
