• DepthAI-v2
  • Compatibility of YoloSpatialDetectionNetwork with Yolov8n Models

I'm currently exploring the use of YoloSpatialDetectionNetwork, which I understand supports up to Tiny-Yolo-v4 for on-device decoding of detections and spatial coordinates together. I'm keen on implementing a Yolov8n model, trained with the Ultralytics framework, for higher accuracy and speed.

However, I've encountered an issue where the Yolov8n model's blob file seems incompatible with the YoloSpatialDetectionNetwork node, as direct integration does not function as expected. I was wondering if there are any plans to update the YoloSpatialDetectionNetwork node to support the more advanced Yolov8n models?

If not, has anyone in the community successfully integrated Yolov8n with the node or found a workaround for this compatibility issue? I was able to run the Yolov8n blob file manually with the NeuralNetwork node, but I'm unable to do on-device decoding and fetching of spatial coordinates, which slows down the pipeline.

For the time being, I plan to use the less accurate Tiny-Yolov4 model, but ideally, I'd like to utilize the capabilities of Yolov8 models for improved performance and accuracy. I appreciate any insights or suggestions. Thanks in advance for your help!

    Hi suhailnajeeb
    The V8 detection models are supported by the YoloDetectionNetwork.

    suhailnajeeb but I'm unable to do on-device decoding and fetching of spatial co-ordinates which slows down the pipeline.

    Have you tried only using the YoloDetectionNetwork? How did you convert the model? Make sure you use https://tools.luxonis.com/.
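    A minimal wiring along these lines, as a sketch (the blob path, preview size, and thresholds below are placeholders and must match the model you actually export from tools.luxonis.com; running it additionally requires a connected OAK device):

    ```python
    import depthai as dai

    # Sketch: on-device YOLO decoding with YoloDetectionNetwork (DepthAI v2).
    pipeline = dai.Pipeline()

    cam = pipeline.create(dai.node.ColorCamera)
    cam.setPreviewSize(416, 416)      # must equal the model's input resolution
    cam.setInterleaved(False)

    nn = pipeline.create(dai.node.YoloDetectionNetwork)
    nn.setBlobPath("yolov8n.blob")    # placeholder path to the converted blob
    nn.setNumClasses(80)              # set to your model's class count
    nn.setCoordinateSize(4)
    nn.setConfidenceThreshold(0.5)
    nn.setIouThreshold(0.5)
    cam.preview.link(nn.input)

    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("detections")
    nn.out.link(xout.input)
    ```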

    Thanks
    Jaka


    Hi @jakaskerl

    Thanks for sharing the link to the blob converter. I was previously using the blobconverter Python module, which might have been causing some issues. After a successful conversion with the blob converter link you provided, I am able to run inference with my custom YOLO detector. However, one issue persists.

    My custom object detector uses only 2 classes and was trained and configured with a custom dataset/training pipeline, where I used autodistill for both the dataset and the training. After conversion with the given blob converter, the bounding boxes are decoded properly; however, the detection labels and confidence scores are erroneous. Here is part of my modified detection code and a sample output for reference:

    Changes for the class labels:

    # Custom label texts / class maps
    labelMap = [
        "yellow cone", "blue cone"
    ]

    Custom displayFrame function:

        def displayFrame(name, frame):
            color = (255, 0, 0)
            for detection in detections:
                bbox = frameNorm(frame, (detection.xmin, detection.ymin, detection.xmax, detection.ymax))
                # cv2.putText(frame, labelMap[detection.label], (bbox[0] + 10, bbox[1] + 20), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)                   # Commented out: raises an error (detection.label exceeds labelMap)
                # cv2.putText(frame, f"{int(detection.confidence * 100)}%", (bbox[0] + 10, bbox[1] + 40), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)       # Commented out: raises an error for the same reason
                cv2.rectangle(frame, (bbox[0], bbox[1]), (bbox[2], bbox[3]), color, 2)
                print(f"Detection Label: {detection.label}, Confidence Threshold: {detection.confidence*100} %")
    
            # Show the frame
            cv2.imshow(name, frame)

    Full code here
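    As an aside, the commented-out cv2.putText calls above fail because detection.label can exceed the range of labelMap when decoding is misconfigured. A defensive lookup avoids the crash while still surfacing the bogus indices (a sketch, using the labelMap defined above):

    ```python
    labelMap = ["yellow cone", "blue cone"]

    def label_text(label, label_map=labelMap):
        # Out-of-range indices (a symptom of wrong decoder settings)
        # fall back to the raw index instead of raising IndexError.
        if 0 <= label < len(label_map):
            return label_map[label]
        return str(label)

    # label_text(0)  -> "yellow cone"
    # label_text(77) -> "77" (no crash)
    ```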

    Output Preview:

    Erroneous output:

    (env) ai4r@ai4r:~/cone-localizer$ python rgb_yolo_simple.py
    Detection Label: 2, Confidence Threshold: 552.734375 %
    Detection Label: 2, Confidence Threshold: 543.75 %
    qt.qpa.plugin: Could not find the Qt platform plugin "wayland" in "/home/ai4r/cone-localizer/env/lib/python3.11/site-packages/cv2/qt/plugins"
    Detection Label: 2, Confidence Threshold: 596.09375 %
    Detection Label: 2, Confidence Threshold: 533.203125 %
    Detection Label: 2, Confidence Threshold: 605.078125 %
    Detection Label: 2, Confidence Threshold: 519.921875 %
    Detection Label: 2, Confidence Threshold: 603.90625 %
    Detection Label: 2, Confidence Threshold: 521.484375 %
    Detection Label: 2, Confidence Threshold: 607.03125 %
    Detection Label: 2, Confidence Threshold: 528.90625 %
    Detection Label: 77, Confidence Threshold: 181100.0 %
    Detection Label: 78, Confidence Threshold: 4162.5 %
    Detection Label: 52, Confidence Threshold: 3366400.0 %
    Detection Label: 34, Confidence Threshold: 5008000.0 %
    Detection Label: 57, Confidence Threshold: 13362.5 %
    Detection Label: 40, Confidence Threshold: 816.40625 %
    Detection Label: 15, Confidence Threshold: 777.734375 %
    Detection Label: 32, Confidence Threshold: 892.1875 %
    Detection Label: 7, Confidence Threshold: 821.09375 %
    Detection Label: 32, Confidence Threshold: 890.625 %
    Detection Label: 32, Confidence Threshold: 849.21875 %
    Detection Label: 32, Confidence Threshold: 897.65625 %
    Detection Label: 32, Confidence Threshold: 855.46875 %
    Detection Label: 68, Confidence Threshold: 1005.46875 %
    Detection Label: 32, Confidence Threshold: 850.78125 %
    Detection Label: 32, Confidence Threshold: 1046.875 %
    Detection Label: 32, Confidence Threshold: 1030.46875 %
    Detection Label: 50, Confidence Threshold: 5017600.0 %
    Detection Label: 32, Confidence Threshold: 973.4375 %
    Detection Label: 68, Confidence Threshold: 5840000.0 %

    I was expecting the detection label to be 0 or 1 and the confidence to be in a more reasonable range. The model was detecting properly when I used the NeuralNetwork node with the blob converted by blobconverter, plus decoding on the host computer.

    However, on-device decoding seems to cause these issues when I use the blob from the Luxonis blob converter.

    I have also tried training a yolov8n with the official Ultralytics framework, but that yielded exactly the same results.

    Can you confirm whether the Luxonis blob converter supports custom models with a different number of classes? Do you have any clue why this might be happening, and/or is there a workaround?

    Thanks in advance!

    Hi @suhailnajeeb

    detectionNetwork.setNumClasses(80)
    detectionNetwork.setCoordinateSize(4)
    detectionNetwork.setAnchors([10, 14, 23, 27, 37, 58, 81, 82, 135, 169, 344, 319])
    detectionNetwork.setAnchorMasks({"side26": [1, 2, 3], "side13": [3, 4, 5]})

    You don't need the bottom three for v8, and make sure to set numClasses to 2.
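    In other words, the corrected settings for the 2-class model above would reduce to something like this (a sketch; `detectionNetwork` is the YoloDetectionNetwork node from your pipeline):

    ```python
    detectionNetwork.setNumClasses(2)  # must match the custom model's class count
    # setCoordinateSize / setAnchors / setAnchorMasks can be dropped for YOLOv8.
    ```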

    Thanks,
    Jaka