Hello,

I have gone through the RGB MobileNet-SSD example and found that it can detect only around 20 classes.
I am looking for a model trained on the COCO dataset (all 80 classes).
I found the EfficientDet and YOLO examples, but the MobileNet-SSD was much faster in terms of frame rate.

After a full day of searching through the Luxonis examples, GitHub, and OpenVINO, I unfortunately came up empty-handed.

Could anyone please guide me where to find a mobilenet ssd trained on the entire COCO dataset?

Much thanks,

    Hey hussain_allawati ,

    we have a YoloV4-tiny model trained on the whole COCO dataset. You can get the blob path by using:

    import blobconverter

    model_path = blobconverter.from_zoo(
        name="yolov4_tiny_coco_416x416",
        zoo_type="depthai",
        shaves=6,
        use_cache=True,
        version=blobconverter.Versions.v2021_4,
    )
    model_path = str(model_path)

    This will download the blob and you can use it with our YoloDetectionNetwork node. But you can just use the example at the link below, as it uses this network by default:
    https://github.com/luxonis/depthai-experiments/tree/master/gen2-yolo/device-decoding

    Best,
    Matija

      6 days later

      Hello Matija,
      Thank you for your reply.
      I have already tried the YoloV4-tiny model; however, I would prefer the SSD MobileNet, since it would be much faster on the OAK-1.

      Thanks,

        Hey hussain_allawati

        I compiled the model from TF2 Object Detection API. You can find the license of the model in the linked repository.

        Here is the model. You can put it in depthai/resources/nn and run python3 depthai_demo.py -cnn mobilenet_coco. I have also provided the XML and BIN files, which I'll be uploading to our model zoo soon, so you'll also be able to get it using blobconverter.

        I haven't tested the FPS compared to yolov4-tiny or yolov3-tiny. If you do, please share the comparison here 🙂

        Best, Matija

          6 months later

          Hello Matija,

          When running mobilenet_coco I get around 10 FPS. I also converted it in the blobconverter with 8 shaves, and that gives a couple more FPS (at most 5 more). I can also see that the confidence of the detections is a bit lower compared to DepthAI's MobileNet that is not trained on COCO.

          Any idea why that might be?

            12 days later

            Hi JoeJonshon ,

            What do you mean by DepthAI's MobileNet? If you mean the model in the DepthAI demo, it is pre-trained on the PASCAL VOC dataset. Since the model is lightweight, it could better learn the features of fewer classes and could thus have higher confidence for those compared to mobilenet_coco, which is trained on the 80 classes of the COCO dataset. Different training techniques and other parameters used during training could also affect the final predictions.

            Regarding the FPS, there are several factors that can impact that as well. A higher input shape means more operations, which decreases the FPS. More classes also result in more parameters and more operations, which similarly affects the FPS. Furthermore, the final FPS also depends on your pipeline. If you use multiple nodes and stereo in your pipeline, some of the shaves will be used by those nodes, so fewer shaves are available to the NN, which can further decrease the FPS. If you use exactly the same pipeline and have only changed the model, I'd have to dig a bit deeper into the models to see why one might be slower. But if the FPS difference between the two is relatively small, the reason is explained in the first two sentences of this paragraph.
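            If you do compare the models, it helps to measure FPS the same way for both. A minimal host-side counter sketch in plain Python (no DepthAI required; the class name and window size are just illustrative):

```python
# Minimal host-side FPS counter: call tick() once per received frame,
# fps() returns the rate averaged over the last `window` frames.
import time
from collections import deque

class FPSCounter:
    def __init__(self, window=30):
        # Timestamps of the most recent `window` frames.
        self.timestamps = deque(maxlen=window)

    def tick(self):
        self.timestamps.append(time.monotonic())

    def fps(self):
        if len(self.timestamps) < 2:
            return 0.0
        span = self.timestamps[-1] - self.timestamps[0]
        # (n - 1) intervals between n timestamps.
        return (len(self.timestamps) - 1) / span if span > 0 else 0.0
```

            In your frame loop you would call tick() right after getting each detection/frame from the output queue and print fps() periodically, so both models are timed at the same point in the pipeline.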

            Best,
            Matija