Issue Summary
After upgrading to depthai 3.2.1, my existing face detection pipeline fails with the following error:
RuntimeError: NNArchive should contain exactly one YOLO head. Found 0 YOLO heads.
The same pipeline ran without problems on earlier depthai v3 releases.
Setup Details
depthai version: 3.2.1 (the pipeline worked on 3.1.0)
Model: face-detection-retail-0004 (SSD-based, not YOLO)
Node type: SpatialDetectionNetwork
Hardware: OAK-D (RVC2)
Code Context
I'm using the model as an RVC2 archive with this configuration:
FACE_ARC = MODULE_DIR / "face-detection-retail-0004.rvc2.tar.xz"
sdn = p.create(dai.node.SpatialDetectionNetwork)
sdn.setNNArchive(dai.NNArchive(FACE_ARC))
My config.json specifies an SSD parser, not YOLO:
{
  "config_version": "1.0",
  "model": {
    "metadata": {
      "name": "face-detection-retail-0004",
      "path": "face-detection-retail-0004.blob",
      "precision": "float32"
    },
    "inputs": [
      {
        "name": "data",
        "dtype": "float32",
        "input_type": "image",
        "shape": [1, 3, 300, 300],
        "layout": "NCHW",
        "preprocessing": {
          "mean": [0.0, 0.0, 0.0],
          "scale": [1.0, 1.0, 1.0],
          "reverse_channels": false,
          "interleaved_to_planar": false
        }
      }
    ],
    "outputs": [
      { "name": "detection_out", "dtype": "float32" }
    ],
    "heads": [
      {
        "parser": "SSD",
        "metadata": {
          "classes": ["background", "face"],
          "n_classes": 1,
          "iou_threshold": 0.5,
          "conf_threshold": 0.5,
          "max_det": 200,
          "anchors": null
        },
        "outputs": ["detection_out"]
      }
    ]
  }
}
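To rule out a packaging mistake on my side, a minimal sketch like the following can confirm which parsers the archive's config.json actually declares. It uses only the Python standard library. The helper names `load_archive_config` and `count_head_parsers` are my own and not part of depthai, and I am assuming that the check behind the error looks at the `parser` field of each entry in `heads`.

```python
import json
import tarfile
from collections import Counter


def load_archive_config(archive_path):
    """Return the parsed config.json stored inside an NN archive (.tar.xz)."""
    # "r:*" lets tarfile detect the compression (xz here) automatically.
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar.getmembers():
            # Accept config.json at the archive root or inside a subfolder.
            if member.isfile() and member.name.split("/")[-1] == "config.json":
                with tar.extractfile(member) as fh:
                    return json.load(fh)
    raise FileNotFoundError("config.json not found in archive")


def count_head_parsers(config):
    """Count heads per parser name, e.g. Counter({'SSD': 1})."""
    heads = config.get("model", {}).get("heads") or []
    return Counter(head.get("parser", "<missing>") for head in heads)
```

Calling `count_head_parsers(load_archive_config(FACE_ARC))` on an archive built from the config above should report a single SSD head and no head whose parser is named YOLO, which would match the "Found 0 YOLO heads" part of the error.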
Questions
1. Is this an intentional breaking change?
The error message explicitly requires a YOLO head, but I didn't see anything related to this documented in the release notes.
2. How can I use newer models (YuNet, SCRFD) with SpatialDetectionNetwork?
I was previously advised to migrate to YuNet or SCRFD for better performance, but I hit a roadblock:
Problem: These models seem to require a NeuralNetwork node, not SpatialDetectionNetwork
Requirement: I need the spatial coordinates (X, Y, Z) that SpatialDetectionNetwork provides, not just 2D bounding boxes
Questions:
Can YuNet or SCRFD be configured to work with SpatialDetectionNetwork?
Do they need to be converted to a specific format (YOLO heads)?
Is there example config.json or conversion guidance for using these models with spatial detection?
Or is there a different approach to get spatial coordinates from a NeuralNetwork node?
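For reference, the fallback I am considering is to compute the coordinates on the host from the 2D boxes and an aligned depth frame. The sketch below is plain Python and does not use any depthai API: the function name `bbox_to_xyz`, the list-of-rows depth format, and the intrinsics parameters are all hypothetical placeholders. It takes the median valid depth inside the box and back-projects the box centre with the pinhole camera model, which is my assumption about what a spatial calculation would do and not a confirmed description of SpatialDetectionNetwork.

```python
from statistics import median


def bbox_to_xyz(depth_mm, bbox, fx, fy, cx, cy):
    """Estimate (X, Y, Z) in mm for a bounding box from an aligned depth map.

    depth_mm: 2D list of rows with depth in millimetres (0 = invalid)
    bbox: (xmin, ymin, xmax, ymax) in pixel coordinates of the depth frame
    fx, fy, cx, cy: intrinsics of the camera the depth frame is aligned to
    """
    xmin, ymin, xmax, ymax = bbox
    # Collect valid depth samples inside the box (0 marks missing depth).
    samples = [
        d
        for row in depth_mm[ymin:ymax]
        for d in row[xmin:xmax]
        if d > 0
    ]
    if not samples:
        return None  # no usable depth inside the box
    z = median(samples)
    # Back-project the box centre with the pinhole camera model.
    u = (xmin + xmax) / 2.0
    v = (ymin + ymax) / 2.0
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)
```

Here Y grows downward because it follows image rows, so the sign convention may differ from what SpatialDetectionNetwork reports. If there is a supported on-device way to do this for NeuralNetwork outputs, I would much prefer that.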