How to apply a model whose input size is bigger than the OAK-D's resolution
Hi franva,
I'm sorry to say that it is not possible to run a model with a larger input than the OAK-D's maximum sensor resolution. What you can do, however, is run the RGB sensor at 4K (3840 × 2160) resolution instead of the default 1080p (1920x1080) resolution.
An example of doing so is here:
https://docs.luxonis.com/projects/api/en/latest/samples/rgb_mobilenet_4k/#rgb-mobilenetssd-4k
That said, what model are you trying to run? Generally, neural networks at such high resolutions take more DDR memory than the OAK-D has. But perhaps you have a specialized model that is very high resolution but low overall compute? That possibility exists, in which case you could run the sensor at 4K (3840 × 2160) instead.
Thoughts?
Thanks,
Brandon
This was already discussed further on our Discord server. Solution: set the sensor resolution to 4K and the preview size to the desired size, e.g.:
camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_4_K)
camRgb.setPreviewSize(2048, 1024)
And for reference, franva tried to run semantic-segmentation-adas-0001, which has an input size of 2048x1024.
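For reference, a minimal (untested) sketch of a full pipeline built that way is below; the blob path is a placeholder and the exact node-creation calls may differ slightly between depthai versions:

# Sketch: run the RGB sensor at 4K and feed a 2048x1024 preview into a NeuralNetwork node.
# "road_segmentation.blob" is only a placeholder for your compiled model.
import depthai as dai

pipeline = dai.Pipeline()

camRgb = pipeline.create(dai.node.ColorCamera)
camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_4_K)
camRgb.setPreviewSize(2048, 1024)   # must match the model's input size
camRgb.setInterleaved(False)

nn = pipeline.create(dai.node.NeuralNetwork)
nn.setBlobPath("road_segmentation.blob")  # placeholder path

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("nn")

camRgb.preview.link(nn.input)
nn.out.link(xout.input)

with dai.Device(pipeline) as device:
    qNn = device.getOutputQueue(name="nn", maxSize=4, blocking=False)
    while True:
        inNn = qNn.get()  # one inference result (NNData)
        # post-process inNn.getFirstLayerFp16() here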
Understood, I guess it is not possible to use the models from the Model Zoo, simply because those models require a very high resolution which the OAK-D does not have.
Thanks for the direct answer.
Then the Model Zoo is not that helpful. I guess I will have to re-train models with a smaller input size in order to use them on the OAK-D.
Am I right??
Hi @franva,
Understood, I guess it is not possible to use the models from the Model Zoo, simply because those models require a very high resolution which the OAK-D does not have.
Thanks for the direct answer.
Yes, that's correct. Some of the models in the model zoo are meant for use in server farms, where thousands of TOPS are the minimum amount of processing power.
So the AI landscape in general has a huge variance in processing power. Many of the larger cloud-AI companies (Google, for example) likely have millions or billions of TOPS available for neural inference.
In the case of OAK-D, we have 1.4 TOPS for neural inference. So models that are intended for cloud deployment either will not run, or will run too slowly to be useful for real-time applications.
Then the Model Zoo is not that helpful. I guess I will have to re-train models with a smaller input size in order to use them on the OAK-D.
I wouldn't say it's not useful. We use it quite a bit. And here is a list of 12 models that we use fairly often from the model zoo:
https://docs.luxonis.com/en/latest/pages/tutorials/pretrained_openvino/#trying-other-models
I think the more accurate thing to say is that the model zoo is very broad: it has models intended for cloud-AI applications (where 1,000 TOPS is the minimum compute), for telco-edge (where 100s of TOPS are the minimum), and for device-edge (1 TOPS or so).
OAK-D is actually on the embedded side, so device-edge or smaller. So the models in the model zoo that are intended for cloud AI or telco-edge are not as applicable to OAK-D. But the device-edge (or embedded) ones (like those linked above) are quite relevant.
Thoughts?
Thanks,
Brandon
Brandon Thanks for your timely reply.
Thanks for the explanation.
So I guess if we need to do something outside the 7 pretrained models, then we will have to train the model ourselves, right?
And I assume this is the notebook for training models:
https://colab.research.google.com/github/superannotateai/model-deployment-tutorials/blob/main/OAK/SuperAnnotate_OAK_Deeplabv3%2B_Deployment.ipynb
Right?
franva We have many more models running in our depthai-experiments repository, I would say at least 20 additional ones. They are just not supported by the depthai-demo (yet). For ML training tutorials, we also have a set of our own at depthai-ml-training, but we do recommend using Roboflow for training custom object detection models; here's more information on that.
Thanks, Erik
Thanks erik, appreciate you putting up the models.
I have had a look at Roboflow and found that we need to pay for it.
Do we have any open-source alternatives?
I found 2:
https://github.com/qubvel/segmentation_models.pytorch
https://github.com/Tramac/awesome-semantic-segmentation-pytorch
They seem very popular.
In particular, the first one has a CamVid car segmentation notebook.
I hope I can convert it into a road segmentation model that runs smoothly on the OAK-D/OAK-1 (I did notice that Intel OpenVINO already has a road segmentation model, however it's not usable on the OAK-D with its limited resources).
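To give an idea of what I mean, here is a rough, untested sketch of how I imagine using the first library for a binary road/not-road model; the encoder, input size and loss below are my own guesses, not taken from their notebook:

import torch
import segmentation_models_pytorch as smp

# Guess: a lightweight U-Net, hoping it stays within the OAK-D's compute budget.
model = smp.Unet(
    encoder_name="mobilenet_v2",   # small encoder (my assumption)
    encoder_weights="imagenet",
    in_channels=3,
    classes=1,                     # single "road" mask
)

loss_fn = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# images: (N, 3, 256, 512) float tensors, masks: (N, 1, 256, 512) floats in {0, 1}
def train_step(images, masks):
    optimizer.zero_grad()
    logits = model(images)
    loss = loss_fn(logits, masks)
    loss.backward()
    optimizer.step()
    return loss.item()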
Please let me know if you have a better option, because I thought this (re-using an existing model or training your own) would be an easy procedure, but it has turned out to be problematic.
Cheers,
Winston
Hello franva,
I am not sure, but there are plenty of open-source Colab notebooks for ML training. Before converting these models, you should just make sure that they aren't too computationally expensive (as was the case with the other pretrained OpenVINO models that you tried). Training AI models without the help of platforms (e.g. Roboflow) isn't easy for a number of reasons; unfortunately I don't have any other suggestions besides the ones I already mentioned.
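For reference, whichever training route you take, the last step of getting a model onto the OAK-D looks roughly the same: export it to ONNX and compile it to a .blob, for example with the blobconverter package. A rough, untested sketch (the model, input size and paths below are placeholders):

import torch
import segmentation_models_pytorch as smp
import blobconverter  # pip install blobconverter

# Placeholder model - in practice you would load your trained weights here.
model = smp.Unet(encoder_name="mobilenet_v2", encoder_weights=None, in_channels=3, classes=1)
model.eval()

# Export to ONNX with a fixed input size (placeholder: 256x512).
dummy = torch.randn(1, 3, 256, 512)
torch.onnx.export(model, dummy, "road_seg.onnx", opset_version=11)

# Compile ONNX -> MyriadX .blob (goes through OpenVINO under the hood).
blob_path = blobconverter.from_onnx(
    model="road_seg.onnx",
    data_type="FP16",
    shaves=6,
)
print("Blob written to:", blob_path)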
Thanks, Erik