I am using the custom decode function below on the NNData coming from the NeuralNetwork node. The other values seem to come out correctly, but the bounding box values range from small negative numbers to around 16. When these are normalized later in the pipeline, the bounding box ends up outside the bounds of the image, because that step expects values between 0 and 1. The YoloDetectionNetwork node seemed to interpret the bounding boxes correctly, so I assumed the issue was with my decoding function rather than the model itself. I am using the non_max_suppression method from ultralytics. Below the function are some of the box coordinates that were assigned to the detections' xmin, ymin, xmax, ymax fields.
```python
def decode(nn_data: dai.NNData):
    layer = nn_data.getFirstLayerFp16()
    res_np = np.array(layer).reshape((1, 14, -1))
    res = torch.tensor(res_np)
    results = non_max_suppression(res,
                                  conf_thres=0.25,
                                  iou_thres=0.5,
                                  classes=None,
                                  agnostic=False,
                                  multi_label=False,
                                  labels=(),
                                  max_det=300,
                                  nc=9,
                                  max_time_img=0.05,
                                  max_nms=30000,
                                  max_wh=640,
                                  in_place=True,
                                  rotated=False)
    dets = Detections(nn_data)
    r = results[0]
    if r.numel() > 0:
        for result in r:
            x_min = result[0].item()
            y_min = result[1].item()
            x_max = result[2].item()
            y_max = result[3].item()
            conf = result[4].item()
            label = int(result[5].item())
            det = Detection(None, label, conf, x_min, y_min, x_max, y_max)
            dets.detections.append(det)
    return dets
```
[0.49, 1.82, 3.34, 3.10]
[2.07, 1.92, 3.85, 3.17]
[0.46, 0.32, 3.35, 2.65]
[2.02, 0.42, 3.82, 2.69]
[2.05, -1.13, 3.84, 2.14]
[3.47, 0.43, 4.37, 2.77]
[3.50, 2.11, 4.37, 3.38]
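
One thing I have not ruled out is the memory layout of the flat buffer that gets reshaped in decode(). The NumPy-only sketch below is not a confirmed diagnosis. It only shows that if the buffer were stored in (boxes, channels) order, a plain reshape to (1, 14, -1) would mix values from different boxes, whereas reshaping to the stored layout and then transposing keeps each box intact. The fake data, NUM_BOXES, and the (boxes, channels) layout are all assumptions made for illustration, and the real layer may already be laid out as (channels, boxes).

```python
import numpy as np

NUM_CHANNELS = 14  # taken from the reshape in decode()
NUM_BOXES = 5      # hypothetical, kept small for illustration

# Fake output: entry [i, c] = 100 * i + c, so box 0 holds 0..13,
# box 1 holds 100..113, and so on. Stored flat in (boxes, channels)
# order, which is an assumption about the layout.
boxes_first = (100 * np.arange(NUM_BOXES)[:, None]
               + np.arange(NUM_CHANNELS)[None, :]).astype(np.float32)
flat = boxes_first.ravel()

# Plain reshape, as in decode(): reinterprets the buffer as
# (1, channels, boxes) without moving any data.
as_reshaped = flat.reshape((1, NUM_CHANNELS, -1))

# Reshape to the assumed stored layout, then swap axes to get
# (1, channels, boxes) with each box's channels kept together.
as_transposed = flat.reshape((1, -1, NUM_CHANNELS)).transpose(0, 2, 1)

print(as_reshaped[0, :, 0])    # mixes values from several boxes
print(as_transposed[0, :, 0])  # the 14 channels of box 0
```

If the two printed columns differ in the same way on the real data, the layout would be worth checking before looking at non_max_suppression itself.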