Hello, I am trying to convert my YOLOv6 .pt file to .blob to get the accompanying .bin, .json, .xml, .blob, and .onnx files that tools.luxonis.com generates.

I tried using YOLOv6 (latest), and I keep getting this error:

I also tried YOLOv6 (R1) and YOLOv6 (R2, R3), and those just told me to use YOLOv6 (latest).

Here is a link to my .pt file:
Thanks

    jakaskerl Never mind, these seem to be the same. The only difference in the one you linked is that it uses older pip libraries, and that seems to make it unusable because the library versions conflict with each other. Even after resolving the version conflicts, the older YOLOv6 still doesn't work seamlessly.

    Try this:
    DepthAI Tools

    I had those errors previously, but this worked fine for me. I'm like 101% sure something is broken on the other website.

    Let me know how you go!

      TheOracle Thanks for your suggestion, but unfortunately that's the website I'm already using. I think you're right, though; the AxiosError seems to point to something wrong with the tool :/ If anyone knows an alternative way of converting my .pt into the set of .bin, .json, .xml, .blob, and .onnx files, that would be much appreciated!

      I just tried re-training my YOLOv6-N model for the image sizes 640x352 and 1280x704. Neither works on tools.luxonis.com still :/

      Evaluation and inference are working perfectly fine too, so I have a feeling this is related to the tool.

      @jakaskerl is there an alternative way of generating the .bin, .json, .xml, .blob, and .onnx files?

      Here is the Google Drive with both the 640x352 and 1280x704 models:
      https://drive.google.com/drive/folders/13aqDMZVpvj7-dGLVUIszfU0Cv-jmoNYB?usp=sharing

        Hi charb,

        I'm sorry for the inconvenience you're experiencing. I will investigate the issue and update you as soon as I have more information.

        Best regards,
        Jan

          JanCuhel Hey Jan, sorry to bother you, but my project is on a tight schedule. Any updates?

            Hey charb,

            no worries, I understand. I've located the issue and am currently testing a fix for it. If everything goes well, I'll deploy the solution very soon.

            Best,
            Jan

              Hi charb,

              I apologize for my delayed response; fixing this issue has turned out to be more cumbersome than expected. So, in the meantime, I've prepared a temporary solution for you using our other libraries (tools-cli and blobconverter). I've also exported all of your models for you. You can find everything here. Please note that I've exported the 640_352.pt model using an input image shape of 640x320, as the model requires the input dimensions to be a multiple of the maximum stride, which for this particular model is 64.
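
              For reference, this is roughly the path the temporary solution takes: export the .pt to ONNX with YOLOv6's export script (deploy/ONNX/export_onnx.py in the YOLOv6 repo), then compile the ONNX with blobconverter. A minimal sketch, with the file name and 6-shave setting as placeholders matching your model:

              import blobconverter

              # Compile an exported ONNX model into a .blob (FP16, 6 shaves).
              # "640_352.onnx" is a placeholder; export it first with an input
              # shape of 640x320, since the input must be a multiple of this
              # model's maximum stride (64).
              blob_path = blobconverter.from_onnx(
                  model="640_352.onnx",
                  data_type="FP16",
                  shaves=6,
              )
              print(f"Compiled blob saved to {blob_path}")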

              Kind regards,
              Jan

                JanCuhel Hi Jan, thanks a lot for your help. However, here is my script:

                from depthai_sdk import OakCamera, ArgsParser
                import argparse
                
                latest_z_value = None  # Store the most recent Z distance
                
                def process_detections(packet):
                    global latest_z_value
                
                    if packet is None or not hasattr(packet, 'detections'):
                        print("No detections found.")
                        return  
                
                    for det in packet.detections:
                        if hasattr(det, 'img_detection') and hasattr(det.img_detection, 'spatialCoordinates'):
                            # spatialCoordinates are in millimetres; convert to metres
                            latest_z_value = det.img_detection.spatialCoordinates.z / 1000
                            print(f"Drone detected {latest_z_value:.2f}m away.")
                        else:
                            print("No spatial data available.")
                
                
                def main():
                    parser = argparse.ArgumentParser()
                    parser.add_argument("-conf", "--config", help="Trained YOLO json config path", default='models/v1/16shaves/droneDetection_v1.json', type=str)
                    args = ArgsParser.parseArgs(parser)
                
                    with OakCamera(args=args) as oak:
                        color = oak.create_camera('color')
                        nn = oak.create_nn(args['config'], color, nn_type='yolo', spatial=True)
                        nn.config_nn(resize_mode='stretch')
                        
                        oak.callback(nn, process_detections)
                        oak.visualize(nn, fps=True)
                        oak.start(blocking=True)
                
                if __name__ == "__main__":
                    main()

                As you can see, I am passing in the .json file, not the .blob. This is because the tools.luxonis.com tool would output a .bin, .json, .xml, .blob, and a .onnx file. Is there any chance you could help with generating those?

                Thanks!

                  Hi charb,

                  I've added a Jupyter Notebook containing the code for generating the JSON configuration to the archive. I successfully ran the 1280_704.pt model on a device. However, the 640_384.pt and 640_352.pt models are more complex since both have an additional branch and require a mask definition—something standard YOLOv6 models don’t need (being anchor-less, the tools don’t generate masks for them). I've marked the section in the Notebook that needs updating (alternatively, you could directly update the generated configuration files). For inspiration, please review how we define masks in YOLOv5.
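
                  To give you an idea, the configuration can also be written by hand. A rough sketch of what it could look like (the anchors and anchor_masks values below are placeholders, not the real values for your model; the "side*" keys name the output grid sizes, YOLOv5-style):

                  import json

                  # Illustrative DepthAI-style YOLO config; the anchor values are
                  # placeholders and must come from your model.
                  config = {
                      "nn_config": {
                          "output_format": "detection",
                          "NN_family": "YOLO",
                          "input_size": "640x352",
                          "NN_specific_metadata": {
                              "classes": 1,
                              "coordinates": 4,
                              "anchors": [10, 13, 16, 30, 33, 23,
                                          30, 61, 62, 45, 59, 119,
                                          116, 90, 156, 198, 373, 326],
                              "anchor_masks": {
                                  "side80": [0, 1, 2],  # stride-8 output (640 / 8 = 80)
                                  "side40": [3, 4, 5],  # stride-16 output
                                  "side20": [6, 7, 8],  # stride-32 output
                              },
                              "iou_threshold": 0.45,
                              "confidence_threshold": 0.5,
                          },
                      },
                      "mappings": {"labels": ["drone"]},
                  }

                  with open("droneDetection_v1.json", "w") as f:
                      json.dump(config, f, indent=4)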

                  If you need further assistance, please let me know. In that case, I’d appreciate knowing which specific YOLOv6 version you used as the pre-trained model for the 640_384.pt and 640_352.pt variants.

                  Kind regards,
                  Jan

                    JanCuhel Hi Jan, super helpful.

                    I downloaded the archive files and ran them with my script, and noticed that 640_352 and 640_384 are not the best at detecting drones at a distance, but they run at 26-30 FPS. However, 1280_704 detects objects at a distance, which is why I trained on a larger resolution in the first place, but its FPS is only 4-6 :/

                    Do you know why there is a significant performance reduction? Is this related to the # of shaves?

                      charb

                      If you're using the models I generated, the issue likely isn't the number of shaves, since all the converted models use 6 shaves. While optimizing the number of shaves might yield a slight improvement, the main performance factors are the model's size and the input image resolution: a 1280x704 input has exactly four times as many pixels as 640x352, so a roughly fourfold drop in throughput is expected. It's always a trade-off. You might consider trying an intermediate resolution, such as 960x544 or 1088x608.
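
                      If you want to experiment, a tiny helper like this (my own sketch; it assumes the usual maximum stride of 32 for standard YOLOv6, whereas your 640_352 variant needed 64) snaps any target resolution to a valid input shape:

                      def nearest_valid_shape(width, height, stride=32):
                          # Round each dimension to the nearest multiple of the stride,
                          # since the input must be divisible by the model's max stride.
                          def snap(v):
                              return max(stride, round(v / stride) * stride)
                          return snap(width), snap(height)

                      # Snapping 16:9-ish targets between your two trained sizes:
                      print(nearest_valid_shape(960, 540))   # -> (960, 544)
                      print(nearest_valid_shape(1100, 600))  # -> (1088, 608)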

                      Best,
                      Jan