Hello,

I have recently trained, converted, and exported the newest iteration of the YOLO family, Gold-YOLO (Nano).
The first step after training was ONNX conversion with an image size of 640x640, a batch size of 1, and otherwise standard inputs.
Onnx Conversion:
!python /content/drive/MyDrive/Efficient-Computing/Detection/Gold-YOLO/deploy/ONNX/export_onnx.py --weights /content/drive/MyDrive/Efficient-Computing/Detection/Gold-YOLO/runs/train/gold_yolo-n/weights/best_ckpt.pt --device 0 --simplify --batch 1

The second step was to run it through the Luxonis blob converter with the following params
BlobConverter params: --data_type=FP16 --mean_values=[127.5,127.5,127.5] --scale_values=[255,255,255] --reverse_input_channels, with the number of shaves set to 6.
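
For reference, the same conversion can also be scripted with the blobconverter Python package; a minimal sketch (the ONNX filename is hypothetical, and the exact argument names are worth double-checking against the blobconverter docs):

import blobconverter

blob_path = blobconverter.from_onnx(
    model="best_ckpt.onnx",  # the ONNX exported above (hypothetical filename)
    data_type="FP16",
    shaves=6,
    optimizer_params=[
        "--mean_values=[127.5,127.5,127.5]",
        "--scale_values=[255,255,255]",
        "--reverse_input_channels",
    ],
)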

Finally, I ran the .blob file through a standard DepthAI script that writes the bounding boxes to a frame, using the following network-specific settings. The anchors were extracted from the Gold-YOLO config.

detectionNetwork.setConfidenceThreshold(0.65)
detectionNetwork.setNumClasses(2)
detectionNetwork.setCoordinateSize(4)
detectionNetwork.setAnchors([10, 13, 19, 19, 33, 23, 30, 61, 59, 59, 59, 119, 116, 90, 185, 185, 373, 326])
detectionNetwork.setAnchorMasks({"side28": [0, 1, 2], "side14": [3, 4, 5], "side7": [6, 7, 8]})
detectionNetwork.setIouThreshold(0.5)
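
For context, those calls sit in an otherwise standard DepthAI pipeline along these lines (a minimal sketch; the blob path is hypothetical):

import depthai as dai

pipeline = dai.Pipeline()

camRgb = pipeline.create(dai.node.ColorCamera)
camRgb.setPreviewSize(640, 640)  # must match the export size
camRgb.setInterleaved(False)

detectionNetwork = pipeline.create(dai.node.YoloDetectionNetwork)
detectionNetwork.setBlobPath("gold_yolo_n.blob")  # hypothetical path
# ... the setConfidenceThreshold / setNumClasses / setAnchors / setAnchorMasks calls above ...

xoutDet = pipeline.create(dai.node.XLinkOut)
xoutDet.setStreamName("detections")

camRgb.preview.link(detectionNetwork.input)
detectionNetwork.out.link(xoutDet.input)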

The output is either nothing (i.e. I run the script and nothing happens), or a frame appears with 50+ bounding boxes and the following error:

[14442C10511593CD00] [20.1.1] [97.899] [XLinkOut(3)] [error] Message has too much metadata (78323B) to serialize. Maximum is 51200B. Dropping message

Just to reiterate, this is the newest YOLO; any help would be much appreciated!

    ChrisCoutureDelValle

    Hey,

    Gold-YOLO is not supported yet. I have not checked the architecture, but if decoding is similar to v5-v8, you could modify your ONNX such that it will have an output node with the name that corresponds to the correct decoding. We output the raw outputs of heads, and then postprocess (decode and perform NMS) on device inside DepthAI. You can see how we set the exports for each version here: https://github.com/luxonis/tools.

    If decoding differs, you should use the classic NeuralNetwork node and perform any required post-processing on the host.

      Matija From the Gold-YOLO paper: "Implementation details. We followed the setup of YOLOv6-3.0 [29], using the same structure (except for neck) and training configurations. The backbone of the network was implemented with the EfficientRep Backbone, while the head utilized the Efficient Decoupled Head."

      Given the structure is the same except for the neck, I should be able to set output_names=['output1_yolov6r2', 'output2_yolov6r2', 'output3_yolov6r2']?

      Are there any available examples leveraging a NeuralNetwork node for OD or YOLO implementations?

      Thanks for your response, much appreciated!!

      -Chris

        ChrisCoutureDelValle

        There were a few slightly different versions of YoloV6, so I'm not sure which head exactly is being referenced. We support all of them in tools.luxonis.com. But yeah, in general, the flow should be:

        • Generate ONNX from .pt (we do replace the heads to only keep the outputs)
        • Set the output names to "output1_yolov6r2", …
        • Generate OpenVINO .xml and .bin and use blobconverter to convert to blob

        You can see an example for the latest YoloV6 version here and for older versions here (note the repo structure is a bit split up to easily avoid import issues).
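
        A minimal sketch of the renaming step from the second bullet, applied to an existing ONNX (this assumes the exported graph already ends in the three head outputs; filenames are hypothetical):

        import onnx

        model = onnx.load("gold_yolo_n.onnx")
        new_names = ["output1_yolov6r2", "output2_yolov6r2", "output3_yolov6r2"]
        for out, new_name in zip(model.graph.output, new_names):
            # re-point every node that wrote to the old output name
            for node in model.graph.node:
                node.output[:] = [new_name if o == out.name else o for o in node.output]
            out.name = new_name
        onnx.save(model, "gold_yolo_n_renamed.onnx")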

        Are there any available examples leveraging a NeuralNetwork node for OD or YOLO implementations?

        An example of on-host decoding is here for V5. This was done before NMS was supported in the node itself, so it's a bit of legacy code. In general, the post-processing varies depending on how the model is exported. If you export it with NMS itself, there likely won't be much need for post-processing. If you export it without NMS but include the decoding, you might only have to do it like in the example. If you prune it sooner, you would also need to do the decoding yourself. So, it varies slightly.
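
        For the "prune it sooner" case, the decoding for a YoloV5-style head looks roughly like this (a sketch of the standard v5 formulas, not the exact code from the example):

        import numpy as np

        def decode_v5_head(raw, anchors, stride, num_classes):
            # raw: (num_anchors, grid_h, grid_w, 5 + num_classes) raw head output
            # anchors: (num_anchors, 2) anchor sizes in pixels for this head
            na, gh, gw, _ = raw.shape
            yv, xv = np.meshgrid(np.arange(gh), np.arange(gw), indexing="ij")
            grid = np.stack((xv, yv), axis=-1)  # (gh, gw, 2) cell offsets
            out = 1.0 / (1.0 + np.exp(-raw))  # sigmoid over all channels
            out[..., 0:2] = (out[..., 0:2] * 2 - 0.5 + grid) * stride  # box centers
            out[..., 2:4] = (out[..., 2:4] * 2) ** 2 * anchors[:, None, None, :]  # box sizes
            return out.reshape(-1, 5 + num_classes)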

          Matija Thanks for your quick response! This is the current ONNX export for Gold-YOLO. Do I simply replace the output names, or do I have to replace the heads to keep only the outputs? If so, could you point me to where to do that? I should be able to figure it out from there!

          torch.onnx.export(model, img, f, verbose=False, opset_version=13,
                            training=torch.onnx.TrainingMode.EVAL,
                            do_constant_folding=True,
                            input_names=['images'],
                            output_names=['num_dets', 'det_boxes', 'det_scores', 'det_classes']
                                         if args.end2end else ['outputs'],
                            dynamic_axes=dynamic_axes)

            ChrisCoutureDelValle

            Given that you export it with the following nodes:

            'num_dets', 'det_boxes', 'det_scores', 'det_classes'

            I would say that NMS is already encoded in the ONNX. So you can keep those names and just reshape the vectors to the expected shapes after they come out of the OAK. If I were to guess, num_dets will contain the number of detected boxes, det_boxes the actual boxes, det_scores the confidences, and det_classes the class IDs.
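
            On the host that would look something like this (a sketch; the shapes are my guess for a batch-1 export, and I'm assuming the NNData message is in_nn as in a typical DepthAI host loop):

            import numpy as np

            num_dets = int(np.array(in_nn.getLayerInt32("num_dets"))[0])
            det_boxes = np.array(in_nn.getLayerFp16("det_boxes")).reshape(-1, 4)[:num_dets]
            det_scores = np.array(in_nn.getLayerFp16("det_scores"))[:num_dets]
            det_classes = np.array(in_nn.getLayerInt32("det_classes"))[:num_dets]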

              Matija Not sure I follow at which point to reshape the vectors. Are you suggesting somewhere in the output of the pipeline when leveraging NeuralNetwork()?

              import numpy as np

              layers = in_nn.getAllLayers()
              for layer in layers:
                  print("Layer name:", layer.name)

              # get the "outputs" layer and reshape the flat vector to (8400, 7)
              output = np.array(in_nn.getLayerFp16('outputs'))
              output = np.reshape(output, (8400, 7))
              print(output)

              boxes = output[:, :4]  # x, y, w, h
              obj_scores = output[:, 4]  # objectness score
              class_scores = output[:, 5]  # class score
              class_ids = output[:, 6].astype(int)  # class id

              boxes_xyxy = xywh2xyxy(boxes)

              # Combine [x1, y1, x2, y2, obj_score, class_id] into one array
              detections = np.hstack((boxes_xyxy, obj_scores[:, np.newaxis], class_ids[:, np.newaxis]))
              detections[:, 4] = detections[:, 4] * class_scores

              total_classes = 2

              boxes = non_max_suppression(detections, conf_thres=0.1, iou_thres=0.4)
              boxes = np.array(boxes[0])

              if boxes is not None:
                  frame = draw_boxes(frame, boxes, total_classes)
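
              For context, xywh2xyxy here is the usual center-to-corner conversion:

              def xywh2xyxy(xywh):
                  # [center_x, center_y, w, h] -> [x1, y1, x2, y2]
                  xyxy = xywh.copy()
                  xyxy[:, 0] = xywh[:, 0] - xywh[:, 2] / 2
                  xyxy[:, 1] = xywh[:, 1] - xywh[:, 3] / 2
                  xyxy[:, 2] = xywh[:, 0] + xywh[:, 2] / 2
                  xyxy[:, 3] = xywh[:, 1] + xywh[:, 3] / 2
                  return xyxy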

                Matija I can't use the end2end flag, given:
                RuntimeError: Check 'unknown_operators.empty()' failed at frontends/onnx/frontend/src/core/graph.cpp:131: OpenVINO does not support the following ONNX operations: TRT.EfficientNMS_TRT

                So when I export and run on device, I extract the boxes, scores, and IDs from the output layer, which yields no solid results. In addition, I have to use a script for NMS since it isn't supported by the OpenVINO conversion.

                  ChrisCoutureDelValle

                  Yes, so you should be getting these outputs: ['num_dets', 'det_boxes', 'det_scores', 'det_classes'], correct?

                  In that case, you'll need to run the det_boxes together with the scores and classes through an NMS script, which you'd have to run on the host. Let me know if that works for you.

                    Matija No, the output is ['outputs'] since I am not able to use end2end. It's a [1, 58800] array that I then reshape to [8400, 7]. I do run the boxes and scores through the NMS script, but they get filtered out completely every time and yield no results (i.e. no bounding boxes on the device).

                      ChrisCoutureDelValle

                      Ah sorry, my bad. I looked too quickly over the code.

                      Let's go step by step then. For the blobconverter params, let's use the --mean_values=[0,0,0] --scale_values=[255,255,255] --reverse_input_channels --data_type=FP16 flags. This will ensure that the image is properly normalized. In YoloV6, the expected input is an RGB image with values in the range [0, 1]. If we use default settings, the input images on OAK will be BGR with values in the range [0, 255]. The above flags reverse the channels and normalize the values to [0, 1].
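
                      With those flags, the device-side preprocessing is equivalent to this host-side math (a sketch, not the actual firmware code):

                      import numpy as np

                      # bgr_frame: HxWx3 uint8 frame, as the OAK color camera produces it
                      rgb = bgr_frame[:, :, ::-1].astype(np.float32)  # --reverse_input_channels
                      normalized = (rgb - 0.0) / 255.0  # mean 0, scale 255 -> values in [0, 1]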

                      From a quick look, the actual output layer will then contain this. So you need to decode it properly to be in line with that. If you print out obj_scores, class_scores, and class_ids, do they make sense?

                      We are looking to add official support for this model to our tools, but is there any reason why you wouldn't want to use something like YoloV6 in the meantime?

                        Matija
                        The outputs make sense but are seemingly being filtered out in NMS. What is the timeline for official support to be added? I do use YOLOv6 etc., but I would like to stay on top of the newest models upon their release to see how they compare to other models in the YOLO family.

                          ChrisCoutureDelValle

                          Hey,

                          No ETA on official support as we have some other priorities at the moment. However, there is a notebook which lets you convert the model to blob: https://colab.research.google.com/drive/1GbmdTz7a2zdaZM2Owo8cooL6yZIqM0M_?usp=sharing.

                          There's a cell for uploading .pt files, but you'll have to change the following line if yours is named differently:

                          model = load_checkpoint("Gold_n_dist.pt", map_location=device)

                          The produced blob will work on the device; please refer to the .json in the Colab. You can try using main_api.py with the blob and the .json.

                          Let me know if that works. Tagging also @JanCuhel who prepared the Colab.

                          Hello,

                          It works, however the detection results seem low. Looking forward to the official release!
                          I really appreciate you guys working through this with me.

                          Thanks,
                          Chris

                          6 days later

                          @ChrisCoutureDelValle

                          Hi,

                          I looked at the Google Colab and updated it. Now, the exported models should be faster. Could you please try it to see if it is better for you?

                          Thanks,
                          Jan