Hi, I am using an OAK-D with an IMX378 as the RGB sensor. I'm using a MobileNetSpatialDetectionNetwork for detecting humans and calculating their spatial coordinates. I would like to pass the whole frame into the NN, but the network only accepts an input size of 300x300. I have downscaled the full frame to a single 300x300 frame, but I notice that the detection range is poor. I have used the ImageManip node to create multiple tiles, but this resulted in multiple NN nodes, which seems inefficient. I have also tried creating the tiles in the main function (without the ImageManip node) and passing each tile into a single NN, but this results in a lot of latency. Does anyone have any recommendations on how to ensure good detection range across the whole frame without these drawbacks?
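For context, the tiling I tried looks roughly like this (a plain-Python sketch, not DepthAI-specific; the tile size, overlap, and helper names are just illustrative — it assumes the frame is at least one tile wide and tall):

```python
def tile_origins(frame_w, frame_h, tile=300, overlap=50):
    """Top-left corners of overlapping tiles covering the frame.

    Assumes frame_w >= tile and frame_h >= tile.
    """
    step = tile - overlap
    xs = list(range(0, frame_w - tile + 1, step))
    ys = list(range(0, frame_h - tile + 1, step))
    # Make sure the right and bottom edges are always covered.
    if xs[-1] != frame_w - tile:
        xs.append(frame_w - tile)
    if ys[-1] != frame_h - tile:
        ys.append(frame_h - tile)
    return [(x, y) for y in ys for x in xs]

def to_full_frame(bbox, origin):
    """Map a tile-local (xmin, ymin, xmax, ymax) box to full-frame coords."""
    ox, oy = origin
    xmin, ymin, xmax, ymax = bbox
    return (xmin + ox, ymin + oy, xmax + ox, ymax + oy)
```

Each tile goes through the NN at full pixel density, and detections are mapped back with `to_full_frame` — but feeding the tiles sequentially through one NN is where the latency comes from.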

    KarlGilmartin I have downscaled the full frame to a single 300x300 frame but I notice that the range of detection is poor.

    Could you explain that a bit? Usually 300x300 is enough to confidently detect people in the frame, but it might show some inaccuracies when people are far away - essentially taking up fewer pixels.
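    To put numbers on that (assuming the IMX378's full 4056x3040 resolution as the capture size, and 270 px as an example person height - both are illustrative figures, not from the original post):

    ```python
    full_w, full_h = 4056, 3040   # IMX378 full resolution (12 MP)
    nn_w, nn_h = 300, 300         # MobileNet-SSD input size

    scale_y = nn_h / full_h       # vertical shrink factor, ~0.099
    person_px_full = 270          # example person height in the full frame
    person_px_nn = person_px_full * scale_y
    print(round(person_px_nn))    # roughly 27 pixels left for the detector
    ```

    So a distant person that already covers few pixels in the full frame shrinks by another ~10x vertically before the network ever sees them.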

    Easiest thing you could do is just use a model with a different input layer size. There are a bunch of models available in the OpenVINO Model Zoo - look for person-detection and person-detection-retail.
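    As a rough sketch of what that swap could look like with the DepthAI v2 API, using person-detection-retail-0013 (544x320 input) as an example - the blob path and threshold are placeholders, and you'd still need to compile the model to a blob for the Myriad X:

    ```python
    import depthai as dai

    pipeline = dai.Pipeline()

    # RGB preview sized to the model's input layer instead of 300x300
    cam = pipeline.create(dai.node.ColorCamera)
    cam.setPreviewSize(544, 320)
    cam.setInterleaved(False)

    # Stereo pair for the spatial calculations
    mono_left = pipeline.create(dai.node.MonoCamera)
    mono_right = pipeline.create(dai.node.MonoCamera)
    mono_left.setBoardSocket(dai.CameraBoardSocket.LEFT)
    mono_right.setBoardSocket(dai.CameraBoardSocket.RIGHT)

    stereo = pipeline.create(dai.node.StereoDepth)
    mono_left.out.link(stereo.left)
    mono_right.out.link(stereo.right)

    nn = pipeline.create(dai.node.MobileNetSpatialDetectionNetwork)
    nn.setBlobPath("person-detection-retail-0013.blob")  # placeholder path
    nn.setConfidenceThreshold(0.5)
    cam.preview.link(nn.input)
    stereo.depth.link(nn.inputDepth)

    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("detections")
    nn.out.link(xout.input)
    ```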

    Keep in mind that the larger the input size, the lower the end FPS will be.

    Thanks,
    Jaka