I am reaching out to seek clarification regarding the implementation of the SpatialDetectionNetwork (specifically YoloSpatialDetectionNetwork) in the DepthAI ecosystem. In my current setup, I use a DepthAI pipeline with mono cameras and stereo depth to detect objects and calculate their 3D coordinates, similar to the examples in your documentation. However, I am having trouble understanding how the SpatialDetectionNetwork internally combines bounding box data with depth information to produce accurate x, y, z coordinates for detected objects.
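To illustrate what I am trying to verify, here is a minimal sketch of how I currently assume the spatial coordinates are derived: average the valid depth values inside a (possibly shrunken) region of the bounding box, then back-project the box centre through the pinhole camera model using the calibration intrinsics. The function name, the ROI-shrinking factor, and the use of a mean (rather than median or min) are my assumptions, not confirmed firmware behaviour:

```python
import numpy as np

def spatial_from_bbox(depth_frame, bbox, fx, fy, cx, cy, scale=0.5):
    """Assumed spatial calculation (my guess, not the confirmed firmware logic).

    depth_frame: HxW array of depth values in mm
    bbox: (xmin, ymin, xmax, ymax) in pixels
    fx, fy, cx, cy: camera intrinsics (focal lengths and principal point)
    scale: factor by which the ROI is shrunk around its centre
    """
    xmin, ymin, xmax, ymax = bbox
    w, h = xmax - xmin, ymax - ymin
    u0, v0 = (xmin + xmax) / 2, (ymin + ymax) / 2

    # Shrink the ROI around its centre to reduce background contamination
    xmin = int(u0 - w * scale / 2)
    xmax = int(u0 + w * scale / 2)
    ymin = int(v0 - h * scale / 2)
    ymax = int(v0 + h * scale / 2)

    roi = depth_frame[ymin:ymax, xmin:xmax]
    valid = roi[roi > 0]          # ignore invalid (zero) depth pixels
    if valid.size == 0:
        return None

    z = float(np.mean(valid))     # depth estimate for the object
    x = (u0 - cx) * z / fx        # pinhole back-projection of the box centre
    y = (v0 - cy) * z / fy
    return x, y, z
```

Knowing whether the actual implementation follows this scheme (and, for example, which averaging strategy and ROI scaling it applies) would tell me how closely my own calculations should match the device output.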
Could you please clarify whether the SpatialDetectionNetwork implementation, particularly the decoding and spatial calculation logic, is handled in firmware on the OAK device, or whether it is open source and accessible in the DepthAI Python SDK or another repository? If the implementation is part of the firmware, I would kindly request access to any documentation, code, or other resources that would help me understand, and potentially customize, the spatial calculations for my cotton picker project.