SpatialDetectionNetwork Breaking Change in 3.2.1: SSD Models

joshbarclay

Issue Summary

After upgrading to depthai 3.2.1, my existing face detection pipeline broke with the error:

RuntimeError: NNArchive should contain exactly one YOLO head. Found 0 YOLO heads.

This pipeline worked fine in previous depthai v3 releases.

Setup Details

depthai version: 3.2.1 (the same pipeline worked in 3.1.0)
Model: face-detection-retail-0004 (SSD-based, not YOLO)
Node type: SpatialDetectionNetwork
Hardware: Oak-D (RVC2)

Code Context

I'm using the model as an RVC2 archive with this configuration:

FACE_ARC = MODULE_DIR / "face-detection-retail-0004.rvc2.tar.xz"

sdn = p.create(dai.node.SpatialDetectionNetwork)
sdn.setNNArchive(dai.NNArchive(FACE_ARC))

My config.json specifies an SSD parser, not YOLO:

{
  "config_version": "1.0",
  "model": {
    "metadata": {
      "name": "face-detection-retail-0004",
      "path": "face-detection-retail-0004.blob",
      "precision": "float32"
    },
    "inputs": [
      {
        "name": "data",
        "dtype": "float32",
        "input_type": "image",
        "shape": [1, 3, 300, 300],
        "layout": "NCHW",
        "preprocessing": {
          "mean": [0.0, 0.0, 0.0],
          "scale": [1.0, 1.0, 1.0],
          "reverse_channels": false,
          "interleaved_to_planar": false
        }
      }
    ],
    "outputs": [
      { "name": "detection_out", "dtype": "float32" }
    ],
    "heads": [
      {
        "parser": "SSD",
        "metadata": {
          "classes": ["background","face"],
          "n_classes": 1,
          "iou_threshold": 0.5,
          "conf_threshold": 0.5,
          "max_det": 200,
          "anchors": null
        },
        "outputs": ["detection_out"]
      }
    ]
  }
}
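
To double-check which parser an archive's config.json actually declares, a small host-side script can tally the heads by parser type. This is only a sketch: `count_heads_by_parser` is a hypothetical helper, not part of depthai, and it does not reproduce the library's own validation. It only reads the `model.heads[*].parser` fields shown in the config above.

```python
import json
from collections import Counter


def count_heads_by_parser(config: dict) -> Counter:
    """Tally the heads in an NNArchive-style config by their declared parser."""
    heads = config.get("model", {}).get("heads") or []
    return Counter(head.get("parser", "<missing>") for head in heads)


# Trimmed-down version of the config.json above (heads section only).
config = json.loads("""
{
  "model": {
    "heads": [
      {"parser": "SSD", "outputs": ["detection_out"]}
    ]
  }
}
""")

counts = count_heads_by_parser(config)
print(dict(counts))  # {'SSD': 1}
```

For this config the tally has one SSD head and no YOLO heads, which matches the "Found 0 YOLO heads" part of the error message.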

Questions

1. Is this an intentional breaking change?

The error message explicitly requires a YOLO head, but I didn't see anything related to this documented in the release notes.

2. How can I use newer models (YuNet, SCRFD) with `SpatialDetectionNetwork`?

I was previously advised to migrate to YuNet or SCRFD for better performance, but I hit a roadblock:

Problem: These models seem to require a NeuralNetwork node, not SpatialDetectionNetwork
Requirement: I need the spatial coordinates (X, Y, Z) that SpatialDetectionNetwork provides, not just 2D bounding boxes

Questions:

Can YuNet or SCRFD be configured to work with SpatialDetectionNetwork?
Do they need to be converted to a specific format (YOLO heads)?
Is there example config.json or conversion guidance for using these models with spatial detection?
Or is there a different approach to get spatial coordinates from a NeuralNetwork node?

aljaz

Hi joshbarclay

Thanks for reporting the bug! In 3.2.0 we expanded the logic of (Spatial)DetectionNetwork, and it looks like we were too strict when checking NNArchive validity. I've already created a fix for it and it will be available with release 3.3.0. Until then, you can use 3.1.0. Regarding your other questions:
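
If a project needs to guard against the affected releases, a version check along these lines could be used. It is a sketch under assumptions: the affected range is taken to be 3.2.0 up to (but excluding) 3.3.0, based only on what is stated in this thread, and `parse_version` and `is_affected` are hypothetical helper names.

```python
import re

# Assumption from this thread: the stricter check arrived in 3.2.0
# and the fix is planned for 3.3.0.
AFFECTED_FROM = (3, 2, 0)
FIXED_IN = (3, 3, 0)


def parse_version(text: str) -> tuple:
    """Extract the leading MAJOR.MINOR.PATCH numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"Unrecognized version string: {text!r}")
    return tuple(int(group) for group in match.groups())


def is_affected(version_text: str) -> bool:
    """True if the version falls inside the assumed affected range."""
    return AFFECTED_FROM <= parse_version(version_text) < FIXED_IN


for candidate in ("3.1.0", "3.2.1", "3.3.0"):
    print(candidate, is_affected(candidate))
```

In a real project the installed version string could be read with `importlib.metadata.version("depthai")` and passed to `is_affected`.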

How can I use newer models (YuNet, SCRFD) with SpatialDetectionNetwork?

With the current implementation of SpatialDetectionNetwork, this is not possible. The only way to do so would be to create a NeuralNetwork node, link it to a custom node that creates the configs, and link that node to a SpatialLocationCalculator node. However, I would advise against this, as we have an update to SpatialLocationCalculator that will allow you to directly link ImgDetections messages to the calculator.

Is there example config.json or conversion guidance for using these models with spatial detection?

Sadly no, as `(Spatial)DetectionNetwork` is designed only for YOLO and SSD models. All other models require you to use `NeuralNetwork` + some parser + custom logic to create configs for `SpatialLocationCalculator`. We are actively working on adding parsers for as many models as possible, and the SpatialLocationCalculator update is already in review and will be in 3.3.0.
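
To make the "custom logic" step more concrete, here is a rough host-side sketch of the underlying idea: shrink a parsed 2D bounding box to a central region of interest, take the median valid depth inside it, and back-project the ROI center with a pinhole camera model. This is an illustration only and not how SpatialLocationCalculator is implemented or configured. The function names, the `shrink` factor, and the intrinsics (`fx`, `fy`, `cx`, `cy`) are all assumptions, and the axis sign convention may differ from the one depthai reports.

```python
from statistics import median


def bbox_to_pixel_roi(bbox, width, height, shrink=0.5):
    """Convert a normalized (xmin, ymin, xmax, ymax) box to a central pixel ROI."""
    xmin, ymin, xmax, ymax = bbox
    center_x, center_y = (xmin + xmax) / 2, (ymin + ymax) / 2
    half_w = (xmax - xmin) * shrink / 2
    half_h = (ymax - ymin) * shrink / 2
    # Clamp so the ROI always covers at least one pixel inside the frame.
    x0 = min(width - 1, max(0, int((center_x - half_w) * width)))
    y0 = min(height - 1, max(0, int((center_y - half_h) * height)))
    x1 = min(width, max(x0 + 1, int((center_x + half_w) * width)))
    y1 = min(height, max(y0 + 1, int((center_y + half_h) * height)))
    return x0, y0, x1, y1


def roi_to_xyz(depth_mm, roi, fx, fy, cx, cy):
    """Median depth in the ROI, back-projected with a pinhole model (mm)."""
    x0, y0, x1, y1 = roi
    values = [
        depth_mm[v][u]
        for v in range(y0, y1)
        for u in range(x0, x1)
        if depth_mm[v][u] > 0  # zero is treated as "no depth measurement"
    ]
    if not values:
        return None
    z = median(values)
    u_center, v_center = (x0 + x1) / 2, (y0 + y1) / 2
    return (u_center - cx) * z / fx, (v_center - cy) * z / fy, z


# Toy 4x4 depth frame, everything 1 m away; intrinsics are made up.
depth = [[1000] * 4 for _ in range(4)]
roi = bbox_to_pixel_roi((0.5, 0.0, 1.0, 1.0), width=4, height=4)
xyz = roi_to_xyz(depth, roi, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
print(roi, xyz)
```

Using the median over a shrunken ROI is one common way to reduce the influence of background pixels at the box edges; other statistics would work as well.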

Or is there a different approach to get spatial coordinates from a NeuralNetwork node?

This also cannot be done directly, as NeuralNetwork outputs the raw model results, which first need to be parsed into detections, segmentation, …

I will keep you updated when 3.3.0 is released with the improved features.

Thanks,

Aljaz

joshbarclay

Hi aljaz

I've already created a fix for it and it will be available with release 3.3.0

Thanks for the quick turnaround!

We have an update to SpatialLocationCalculator that will allow you to directly link ImgDetections messages to the calculator.

Perfect, great to know that you guys are already working on it! This will be immensely helpful.

Thanks for answering all my questions. I'll be keeping an eye out for 3.3.0 🤠

Josh