Hi Team,
I’m reaching out for assistance with deploying a custom MobileNet-SSD object detection model onto the OAK-D-CM4 device. While I have successfully trained the model, I’ve been running into challenges during deployment: most of the documentation and community examples I’ve found seem to be deprecated or result in errors.
Here’s a brief summary of my current setup:

  • Model Architecture: MobileNet-SSD (trained using TensorFlow 1)

  • Exported Format: Converted to TFLite, then to OpenVINO IR using mo.py (with --add_postprocessing_op=false)

  • Device: OAK-D-CM4

  • Deployment Toolchain: DepthAI (Python API)

My understanding is that MobileNet-SSD models trained using TensorFlow 2 cannot be deployed on OAK devices. Please correct me if I’m mistaken.

Also, when attempting conversion with --add_postprocessing_op=true, I ran into model conversion errors, so I proceeded without the postprocessing op. Please let me know if there's a better approach here.

I understand that models without built-in postprocessing (i.e., converted with --add_postprocessing_op=false) require host-side decoding of detection outputs. If there are up-to-date, working examples or recommendations for implementing the required postprocessing logic on the host, I would greatly appreciate that.
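
For context, the rough host-side structure I have in mind looks like the sketch below; the output layer names and the decoding step are placeholders, since that decoding is exactly the part I'm unsure about:

```python
import depthai as dai
import numpy as np

pipeline = dai.Pipeline()

cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(300, 300)
cam.setInterleaved(False)

# Plain NeuralNetwork node, since the blob has no built-in SSD postprocessing
nn = pipeline.create(dai.node.NeuralNetwork)
nn.setBlobPath("mobilenet_ssd.blob")
cam.preview.link(nn.input)

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("nn")
nn.out.link(xout.input)

with dai.Device(pipeline) as device:
    q = device.getOutputQueue("nn", maxSize=4, blocking=False)
    while True:
        msg = q.get()
        # Layer names depend on the exported model (placeholders here)
        box_encodings = np.array(msg.getLayerFp16("box_encodings"))
        class_scores = np.array(msg.getLayerFp16("class_predictions"))
        # TODO: decode box_encodings against the SSD anchor priors,
        # apply softmax/sigmoid to class_scores and run NMS on the host
```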

Additionally, if MobileNet-SSD is no longer ideal, are there other lightweight, non-YOLO models you would recommend for real-time object detection on OAK devices?

Looking forward to any guidance or updated references you can provide.

Best regards,
Nileena

Hi @nileena_thomas ,
you are correct that the references for MobileNet-SSD can be a bit outdated. This is mostly because the architecture is quite old (more than 5 years) and we don't really recommend it for object detection anymore, since YOLO models achieve better accuracy and faster inference times.
We have one older notebook here which is also deprecated and won't work out of the box for the data loading and training part, but you can check out the sections from the export step onward (modifying the graph, using blobconverter, and deployment). Again, note that we don't actively check that it still works.
May I ask why you specifically want a non-YOLO model? If it is because of the license, you might be interested in LuxonisTrain (our open-source training library), which has a predefined detection model based on the YOLO architecture but released under the more permissive Apache 2.0 license, and thus free to use in your application. You can check out a training tutorial for it here.
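
For reference, the core LuxonisTrain workflow is only a few lines. A rough sketch (the config file name is a placeholder; the tutorial has the full details):

```python
from luxonis_train import LuxonisModel

# Config that uses the predefined detection model (see the training tutorial)
model = LuxonisModel("detection_config.yaml")

model.train()   # train on the dataset referenced in the config
model.export()  # export to ONNX, ready for blob conversion and deployment
```
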
Hope this helps,
Klemen

6 days later

Hi @KlemenSkrlj ,
Thank you for your inputs. I reviewed the links you shared and followed the same tutorial to train a custom model on my can defect dataset, using the provided configuration settings.

I archived the model using the code mentioned in the notebook and converted the ONNX model to a blob using blobconverter so it can run on the OAK-D-CM4, which supports only DepthAI v2. However, I noticed that with a low confidence threshold the outputs are noisy, and with a higher threshold no detections are returned.
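
For reference, the conversion step looked roughly like this (a simplified sketch; the exact shave count and paths I used may differ):

```python
import blobconverter

# Convert the exported ONNX model to a .blob for the OAK-D-CM4 (DepthAI v2)
blob_path = blobconverter.from_onnx(
    model="model.onnx",
    data_type="FP16",
    shaves=6,
)
print(blob_path)
```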

I plan to retrain the model with more epochs, but I wanted to check if there are any other factors I should focus on to improve performance. Also, based on your experience, how many epochs would you recommend for a dataset like this? And what would be a good loss value to aim for before stopping training?

I would suggest opening the TensorBoard tracker (you'll see the command under the Train section) in a separate terminal (if training locally) and keeping an eye mainly on the validation loss and metrics. You want to keep training while the validation loss is still dropping (or the metrics are still improving) and stop when the trend reverses, which could indicate overfitting to the training set. By default we save only the top 3 best checkpoints, so you don't need to stop at exactly the right moment manually; but if you see that the curve still has potential to go down (for loss, or up for metrics), that would be a sign to resume training for more epochs (you can call .train(weights=<path to current best weights>)). Alternatively, you can set the number of epochs to something very large and use the EarlyStopping callback (add it to the config similarly to how EMACallback is already there), which will stop training automatically when overfitting is detected.
The TensorBoard tracker also shows visualizations of the model's predictions at specific points, so you can visually track how well the model is predicting.
Generally if you have the compute available I would go with something like 400 epochs.
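
For example, resuming from the current best checkpoint would look roughly like this (the path is a placeholder):

```python
from luxonis_train import LuxonisModel

model = LuxonisModel("config.yaml")

# Continue training from the best checkpoint saved so far
model.train(weights="path/to/current_best.ckpt")
```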

A question regarding the dataset:
In this image it looks like you have more images in the train split than the total number of images in the dataset. Do you get any warnings about that? Could you run the luxonis_ml data health <name of the dataset> CLI command to check whether there are maybe any duplicates in your dataset? Generally you want to avoid such cases.

Thanks, @KlemenSkrlj , for your response and helpful inputs. I’ll keep you posted on how training progresses with more epochs, and I’ll continue monitoring the metrics on TensorBoard.

Regarding your question about data health — the issue is due to duplicate UUIDs being generated during dataset creation. While the images themselves are not duplicates (they’re augmented versions with different filenames), the UUIDs assigned are the same, which is also triggering a warning. I’d appreciate any suggestions you might have on addressing this.

While the images themselves are not duplicates (they’re augmented versions with different filenames), the UUIDs assigned are the same, which is also triggering a warning

Are you sure the images are not the same? The UUID is generated from the image content, so identical UUIDs suggest that you actually do have duplicates, which I would advise against. Rather, use augmentations through the config, which will augment your images at runtime.

Hi @KlemenSkrlj, I think I’ve identified what might be causing the duplicate UUID warnings:

  1. I’m using images labeled in YOLO format instead of the XML format shown in the tutorial.

  2. Each image has multiple annotations (i.e., multiple objects), and I suspect this is leading to the same UUID being assigned multiple times.

Here's a simplified version of the script I'm using to parse the dataset:

```python
import cv2
from luxonis_ml.data import DatasetIterator
from pathlib import Path

CLASS_NAMES = ["a", "b", "c"]  # Example class names


def process_dir(dir_path: Path) -> tuple[DatasetIterator, list[str]]:
    images = [str(i.absolute().resolve()) for i in dir_path.glob("*.jpg")]

    def generator() -> DatasetIterator:
        for img_path in images:
            img_path = Path(img_path)
            txt_path = img_path.with_suffix(".txt")

            # Skip images that have no YOLO label file next to them
            if not txt_path.exists():
                continue

            # Read image size (not strictly needed, since YOLO labels are already normalized)
            height, width, _ = cv2.imread(str(img_path)).shape

            with open(txt_path, "r") as f:
                for line in f:
                    parts = line.strip().split()
                    if len(parts) != 5:
                        continue

                    class_id, x_center, y_center, w, h = map(float, parts)
                    class_name = (
                        CLASS_NAMES[int(class_id)]
                        if int(class_id) < len(CLASS_NAMES)
                        else str(class_id)
                    )

                    # Convert YOLO center coordinates to top-left corner x, y
                    x = x_center - w / 2
                    y = y_center - h / 2

                    yield {
                        "file": str(img_path),
                        "annotation": {
                            "class": class_name,
                            "boundingbox": {"x": x, "y": y, "w": w, "h": h},
                        },
                    }

    return generator(), images
```

Let me know if this interpretation makes sense. If not, I’m considering cleaning up the dataset by removing augmentations and retrying with a simpler version to confirm. Would appreciate your thoughts on how best to handle this.

When you yield something in the generator, we do group those entries by file, yes. So if you have multiple files with the same content (the same actual picture), you would be adding duplicates to the dataset. At the moment we don't have an automatic method to clean up those duplicates, so you would need to handle them manually in the generator. What I advise is to run the health command, which prints out the file names of duplicated UUIDs, and then make sure in the generator that only one file name per group (if you group by the same UUID) is added.

But you can add multiple annotations for the same file; there is no issue with that (e.g. multiple bounding boxes per image).
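
A rough sketch of such a manual guard inside the generator (assuming the duplicated files are byte-identical copies of each other):

```python
import hashlib
from pathlib import Path

seen_hashes: set[str] = set()

def is_duplicate(img_path: Path) -> bool:
    """Skip files whose raw bytes were already seen; identical content
    would be assigned the same UUID by the dataset."""
    digest = hashlib.md5(img_path.read_bytes()).hexdigest()
    if digest in seen_hashes:
        return True
    seen_hashes.add(digest)
    return False

# Inside the generator:
#     if is_duplicate(img_path):
#         continue
```
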

Hi @KlemenSkrlj ,

I followed your suggestion and was able to resolve the error—thank you! The model has now been trained, but we're seeing many false positives, especially on empty conveyor backgrounds. To address this, we added empty background images with corresponding empty label files (i.e., files with no annotations) to the dataset.

However, I’m now encountering this warning:

WARNING: BBox annotation has values outside of [0, 1] range. Clipping them to [0, 1].

I’ve double-checked the annotation files, and none of them seem to contain values outside the [0, 1] range. Any suggestions on what might be causing this or how to resolve it would be appreciated.

This check is performed for each yield of bounding boxes (here). Would you be able to add a check on your side inside the generator, before the yield? It might be a bug on our end, although we can't reproduce it, so if you have an MRE it would be greatly appreciated; you can also post it as an issue in the luxonis-ml repository.
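
Something along these lines inside your generator, right before the yield, should catch it (just a sketch):

```python
def check_and_clamp(img_path, x, y, w, h):
    """Log any coordinate outside [0, 1] (even by a tiny epsilon) and clamp it."""
    clamped = []
    for name, value in zip(("x", "y", "w", "h"), (x, y, w, h)):
        if not 0.0 <= value <= 1.0:
            print(f"{img_path}: '{name}' out of range: {value!r}")
        clamped.append(max(0.0, min(1.0, value)))
    return tuple(clamped)

# In the generator, before building the annotation dict:
#     x, y, w, h = check_and_clamp(img_path, x, y, w, h)
```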

Hi @KlemenSkrlj

Thank you for your feedback. I initially checked for values outside the [0, 1] range and found none. However, after printing the values immediately before the generator, I noticed very small negative numbers, for example 0.2637475 -4.999999999588667e-07 0.469173 0.719787, where the second coordinate is slightly below zero. Could such tiny deviations be causing the warning?

I’m wondering why these don’t appear in my label file checks. Could it be that the formatting precision in my checks (e.g., rounding with .15f) masks these near-zero negative values?
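
For example, this quick check illustrates how easily such a value can hide behind fixed-precision formatting (just an illustration, not my actual check):

```python
v = -4.999999999588667e-07  # second coordinate from the example above

print(f"{v:.6f}")   # prints -0.000000, easy to read as an in-range zero
print(f"{v:.15f}")  # only at higher precision does the magnitude show up
print(v < 0.0)      # True, so it still trips the [0, 1] range check
```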

I see. You don't really need to worry about this though, because, as the warning says, the values are automatically clipped to the [0, 1] range, so the LuxonisDataset ends up with valid annotations. But yes, it might be that this didn't show up in your previous checks because of the precision.

14 days later

Hi @KlemenSkrlj ,

Thank you so much for your support—it’s been really helpful!

Quick question: how can I use a Luxonis-trained model to generate inference results in label format? I’d like to use these as pre-labels for uploading the next batch of images into annotation software.

Thanks again!

You can check out the newly added annotate feature (or model.annotate() as a Python API, here). It takes a path to a directory of images, runs inference on each image, and creates a new LuxonisDataset as output.
If you then want to use the pre-annotated dataset outside of our stack, I would say the easiest approach is to export the dataset to the LDF NATIVE type (command: luxonis_ml data export <dataset_name>) and parse it into whatever format you need. Parsing the NATIVE export should be pretty straightforward; you can also look here for references.
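
Roughly, the whole flow would look something like this (the annotate arguments below are my shorthand; check the linked docs for the exact signature):

```python
from luxonis_train import LuxonisModel

model = LuxonisModel("config.yaml")

# Run inference on a directory of new images and store the predictions
# as a new LuxonisDataset (argument names are assumptions, see the docs)
model.annotate(dir_path="path/to/new_batch", dataset_name="prelabels")

# Then export that dataset to the NATIVE format and parse it into your
# annotation tool's label format:
#     luxonis_ml data export prelabels
```
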
Hope this helps