Hi all,

I am trying to export the weights of a yolov4-tiny model that I trained using darknet (https://github.com/hank-ai/darknet) so I can run it on an OAK-D-type camera (actually a custom baseboard using the SoM).

Attached is my cfg file.

After training, I used the yolo2openvino repo (https://github.com/luxonis/yolo2openvino) to convert my model to TF model weights with the following command:

python ~/repos/yolo2openvino/convert_weights_pb.py \
--yolo 4 \
--weights_file ~/nn/test_project/test_project_best.weights \
--class_names ~/nn/test_project/test_project.names \
--output ~/nn/test_project/test_project.pb \
--tiny \
-h 160 \
-w 160 \
-a 4,18,9,35,15,61,27,100,48,138,95,151

Next, I converted to the OpenVINO IR format using the model optimizer (v2022.1), json config attached:

mo \
--input_model ~/nn/test_project/test_project.pb \
--tensorflow_use_custom_operations_config ~/nn/test_project/test_project.json \
--batch 1 \
--data_type FP16 \
--reverse_input_channels \
--model_name test_project \
--output_dir ~/nn/test_project/

Finally, I converted to a blobfile using blobconverter:

python -m blobconverter \
-ox nn/test_project/test_project.xml \
-ob nn/test_project/test_project.bin \
-sh 6 \
-v 2022.1 \
-o nn/test_project

Uploading this blob to the device results in the following errors:

[14442C1071F33FD700] [1.1] [1.803] [SpatialDetectionNetwork(8)] [warning] Input image (160x160) does not match NN (3x160)
[14442C1071F33FD700] [1.1] [1.813] [SpatialDetectionNetwork(8)] [error] Mask is not defined for output layer with width '5'. Define at pipeline build time using: 'setAnchorMasks' for 'side5'.
[14442C1071F33FD700] [1.1] [1.813] [SpatialDetectionNetwork(8)] [error] Mask is not defined for output layer with width '10'. Define at pipeline build time using: 'setAnchorMasks' for 'side10'.

Here is a snippet from my YoloSpatialDetectionNetwork configuration:

spatial_nn.setNumClasses(1)
spatial_nn.setCoordinateSize(4)
spatial_nn.setAnchors([4, 18, 9, 35, 15, 61, 27, 100, 48, 138, 95, 151])
spatial_nn.setAnchorMasks({"side26": [0, 1, 2], "side13": [3, 4, 5]})
spatial_nn.setIouThreshold(0.5)

It is clear that somewhere along the line I've messed up the layers during the conversion.

Any tips on where I can start looking for errors?

I am using Python 3.7.17


I wasn't able to upload the config or json files, so here goes:

test_project.cfg:

# DarkMark v1.9.5-1 output for Darknet
# Project .... /home/adriaan/nn/test_project
# Config ..... /home/adriaan/nn/test_project/test_project.cfg
# Template ... /opt/darknet/cfg/yolov4-tiny-custom.cfg
# Username ... adriaan@ZOBLZO21X5GY3
# Timestamp .. Mon 2024-11-18 16:53:42 SAST
#
# WARNING: If you re-generate the darknet files for this project you'll
# lose any customizations you are about to make in this file!
#

[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=1
width=160
height=160
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation=0.000000
exposure=0.000000
hue=0.000000
learning_rate=0.002610
burn_in=1000
max_batches=1000
policy=steps
steps=800,900
scales=.1,.1
cutmix=0
flip=0
max_chart_loss=6.000000
mixup=0
mosaic=0
use_cuda_graph=0

[convolutional]
batch_normalize=1
filters=32
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[route]
layers=-1
groups=2
group_id=1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[route]
layers = -1,-2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -6,-1

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[route]
layers=-1
groups=2
group_id=1

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[route]
layers = -1,-2

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -6,-1

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[route]
layers=-1
groups=2
group_id=1

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[route]
layers = -1,-2

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -6,-1

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

##################################

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

[yolo]
mask = 3,4,5
anchors=4, 18, 9, 35, 15, 61, 27, 100, 48, 138, 95, 151
classes=1
num=6
jitter=.3
scale_x_y = 1.05
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
ignore_thresh = .7
truth_thresh = 1
random=0
resize=1.5
nms_kind=greedynms
beta_nms=0.6

[route]
layers = -4

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 23

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

[yolo]
mask = 0,1,2
anchors=4, 18, 9, 35, 15, 61, 27, 100, 48, 138, 95, 151
classes=1
num=6
jitter=.3
scale_x_y = 1.05
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
ignore_thresh = .7
truth_thresh = 1
random=0
resize=1.5
nms_kind=greedynms
beta_nms=0.6
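A quick consistency check on a cfg like the one above: the filters= value of the [convolutional] layer directly before each [yolo] block must equal (classes + coords + 1) multiplied by the number of masks in that head. A small sketch (yolo_pre_filters is a hypothetical helper name, not a darknet function):

```python
def yolo_pre_filters(classes, masks_per_head=3, coords=4):
    # Each anchor (mask) predicts `coords` box values + 1 objectness
    # score + `classes` class scores; multiply by anchors per head.
    return (classes + coords + 1) * masks_per_head

print(yolo_pre_filters(1))  # 18
```

With classes=1, coords=4, and 3 masks per head, this gives the filters=18 used before both [yolo] blocks.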

test_project.json:

[
  {
    "id": "TFYOLOV3",
    "match_kind": "general",
    "custom_attributes": {
      "classes": 1,
      "anchors": [4, 18, 9, 35, 15, 61, 27, 100, 48, 138, 95, 151],
      "coords": 4,
      "num": 6,
      "masks": [[3, 4, 5], [0, 1, 2]],
      "entry_points": ["detector/yolo-v4-tiny/Reshape", "detector/yolo-v4-tiny/Reshape_4"]
    }
  }
]

    jakaskerl, this is a yolov4-tiny model though? That tool only seems to support YOLOv5 and onwards. I am trying to avoid the Ultralytics licensing costs while we commercialize our product, and at the same time learn more about the models and their limitations. If we reach a point where we cannot push the yolov4-tiny models any further and there are measurable improvements with the Ultralytics models, then we will look at getting the required licensing.

      MernokAdriaan
      Uh, right; we have v4 models available, so it should be possible (maybe even through the tools, but I'm not sure) - cc @JanCuhel on what the differences are.

      Thanks,
      Jaka

      Hi @MernokAdriaan,

      yes, you're correct. Tools supports conversion of YOLOv5 and later (that includes YOLOv6 and YOLOv7, both under the GPL-3.0 license, neither of which is originally part of the Ultralytics ecosystem).

      Regarding these errors:
      [14442C1071F33FD700] [1.1] [1.813] [SpatialDetectionNetwork(8)] [error] Mask is not defined for output layer with width '5'. Define at pipeline build time using: 'setAnchorMasks' for 'side5'.
      [14442C1071F33FD700] [1.1] [1.813] [SpatialDetectionNetwork(8)] [error] Mask is not defined for output layer with width '10'. Define at pipeline build time using: 'setAnchorMasks' for 'side10'.

      You need to update the keys of the dictionary in here:
      spatial_nn.setAnchorMasks({"side26": [0, 1, 2], "side13": [3, 4, 5]})
      to this (following the formula sideX and sideY, where X = width/16 and Y = width/32):
      spatial_nn.setAnchorMasks({"side10": [0, 1, 2], "side5": [3, 4, 5]})
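      Following that formula, the keys can also be derived programmatically (yolo_mask_keys is a hypothetical helper, not part of the DepthAI API):

```python
def yolo_mask_keys(input_width, strides=(16, 32)):
    # DepthAI names each YOLO output head "side<N>", where N is the grid
    # size of that head: the NN input width divided by the head's stride.
    # yolov4-tiny has two heads, with strides 16 and 32.
    return ["side%d" % (input_width // s) for s in strides]

print(yolo_mask_keys(160))  # ['side10', 'side5']
print(yolo_mask_keys(416))  # ['side26', 'side13']
```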

      Regarding this warning:
      [14442C1071F33FD700] [1.1] [1.803] [SpatialDetectionNetwork(8)] [warning] Input image (160x160) does not match NN (3x160)
      Please make sure that your model expects 3x160x160 images. You can use netron.app: upload the generated .xml file there, and you should see that the input has the shape 1x3x160x160.

      Best,
      Jan

        Hi JanCuhel,

        Thanks so much for the prompt response and the explanation on where the sideX and sideY comes from!
        That seems to have solved the error.

        W.r.t. the shape, here is a screenshot from the intermediary .pb file:

        And then the .XML file:

        So something is definitely going wrong here. I'm curious why the shape of the .pb file is [?,160,160,3].
        Sorry if these are stupid questions; I'm not familiar with all the weight file formats, etc.
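        If I understand the layouts correctly, the [?,160,160,3] shape in the .pb may simply be TensorFlow's default NHWC layout (batch, height, width, channels); OpenVINO IR uses NCHW, and the Model Optimizer transposes the input during conversion. The permutation, sketched in NumPy:

```python
import numpy as np

# TF .pb input: NHWC with a dynamic batch dimension -> [?, 160, 160, 3]
nhwc = np.zeros((1, 160, 160, 3), dtype=np.float16)
# OpenVINO IR input: NCHW -> [1, 3, 160, 160]
nchw = nhwc.transpose(0, 3, 1, 2)
print(nchw.shape)  # (1, 3, 160, 160)
```

        (So a .pb reporting [?,160,160,3] and an .xml reporting [1,3,160,160] would both be consistent with a correct conversion.)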

        Another nudge in the right direction would be greatly appreciated!

        As a side note: I'm aware that v6 and v7 do not fall under the Ultralytics ecosystem, but the GPL-3.0 license is still a problem if I plan to use the models in closed-source code, no?

        The legal jargon is very confusing to me.

        PS: the licensing of the output feels really strange, as many text editors are licensed similarly, yet the works created with them do not fall under the same umbrella - I'm sure you've heard this rant a million times before.

        I've marked this question as answered even though I am still struggling to get the model converted properly (the input shape seems to get messed up along the way). Even the pretrained yolov3, yolov3-tiny, yolov4, and yolov4-tiny conversions do not seem to be working with the yolo2openvino repository at this stage.

        After some alterations to the requirements.txt file to fix the numerous compatibility warnings and errors, I am still having issues.


        Python version (using pyenv):

        Python 3.7.17


        Updated requirements.txt:

        tensorflow==1.14.0
        numpy~=1.15.0
        blobconverter==1.2.7
        protobuf<=3.20
        Pillow==9.5.0
        gast~=0.2.0


        Operating system: Ubuntu 22.04.5 LTS


        Conversion command (example provided in the repo):

        python convert_weights_pb.py \
        --yolo 3 \
        --class_names coco.names \
        --output yolov3.pb \
        --weights_file yolov3.weights \
        --size 416


        Traceback (most recent call last):
          File "convert_weights_pb.py", line 94, in <module>
            tf.app.run()
          File "/home/adriaan/.pyenv/versions/yolo2openvino_lux/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 40, in run
            _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
          File "/home/adriaan/.pyenv/versions/yolo2openvino_lux/lib/python3.7/site-packages/absl/app.py", line 308, in run
            _run_main(main, args)
          File "/home/adriaan/.pyenv/versions/yolo2openvino_lux/lib/python3.7/site-packages/absl/app.py", line 254, in _run_main
            sys.exit(main(argv))
          File "convert_weights_pb.py", line 78, in main
            load_ops = load_weights(tf.global_variables(scope='detector'), FLAGS.weights_file)
          File "/mnt/d/Github/External/luxonis/yolo2openvino/utils/utils.py", line 113, in load_weights
            (shape[3], shape[2], shape[0], shape[1]))
        ValueError: cannot reshape array of size 5628441 into shape (5604,1024,1,1)
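        For context on that ValueError: a darknet .weights file is one flat float32 buffer that the loader slices sequentially into per-layer tensors, so a mismatch between the cfg-derived layer shapes and the file's actual contents (for instance, weights written by a different darknet fork) surfaces exactly like this. A minimal NumPy reproduction with the same numbers:

```python
import numpy as np

# 5604*1024*1*1 = 5,738,496 values are needed for the layer, but only
# 5,628,441 remain in the buffer, so the reshape must fail.
leftover = np.zeros(5628441, dtype=np.float32)
try:
    leftover.reshape(5604, 1024, 1, 1)
except ValueError as e:
    print(e)  # cannot reshape array of size 5628441 into shape (5604,1024,1,1)
```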


        I have forked the repo here and will continue trying to update the tool for darknet yolov4-tiny conversion, for those who wish to convert models trained with the hank-ai/darknet repository.

        Wish me luck!

        Hi @MernokAdriaan,

        apologies for the delay in my response.

        As a side note: I'm aware that v6 and v7 do not fall under the Ultralytics ecosystem, but the GPL-3.0 license is still a problem if I plan to use the models in closed source code no?

        Yes, I believe so.

        I've marked this question as answered even though I am still struggling to get the model converted properly (the input shape seems to get messed up along the way). Even the pretrained yolov3, yolov3-tiny, yolov4, and yolov4-tiny conversions do not seem to be working with the yolo2openvino repository at this stage.

        Did you try to use the pre-trained weights from the AlexeyAB/darknet repository or from hank-ai/darknet? If you used weights from hank-ai/darknet, there may be differences in the model implementations causing the errors (please note that this is just a wild guess), so trying to convert weights from AlexeyAB/darknet might be worth a try.

        I wish you all the luck you need and I hope you'll make it work!

        Best,
        Jan

        4 days later

        To anyone experiencing similar issues: it seems the OpenVINO version was the culprit all along. Strange, since I remember explicitly installing the version specified in the repo, but alas, after some struggling and recreating the venv a few times, I got a working model converted.

        I will update the forked repo in due time and perhaps update all of the tooling and libraries to the latest supported versions (I'm thinking of renaming it to something like "darknet2intelblob": doing the conversion from the darknet model, then outputting all stages, including the TF .pb, the OpenVINO IR .xml and .bin, and finally the .blob files).

        Maybe I'll even containerize it all as a learning exercise and spare future users the library-compatibility hell that is Python.

        Thanks @JanCuhel for your support and guidance as well as MR.KILLZ over on the darknet discord channel.