Hi dhaanpaa,
One option would be to use a 16:9 input size for YOLO, as a few models here do: depthai-model-zoo (see the 640x352 models at the bottom). IMO this would be the best solution: it should give the best throughput and use the least memory.
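In case the 640x352 size looks odd for 16:9: a rough sketch of where it comes from, assuming the model needs input dimensions that are multiples of 32 (a common YOLO constraint, but an assumption here; check the specific model). The function name is made up for illustration.

```python
def yolo_height_for_16_9(width, stride=32):
    """Largest height <= the exact 16:9 height that is a multiple of `stride`.

    Assumption: the network requires dimensions divisible by `stride`.
    """
    ideal = width * 9 // 16           # exact 16:9 height, e.g. 360 for width 640
    return (ideal // stride) * stride  # round down to a multiple of stride


print(yolo_height_for_16_9(640))
```

So 640x352 is only slightly wider than true 16:9, which means very little of the frame has to be cropped or squeezed.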
The other option would be to use a Script node to do what you are describing above. You could set the camera FPS to a low value so the NN can still process all frames without blocking, or keep track of all imgs sent to the NN and all NN outputs (linked back to the Script node) to measure throughput and avoid sending too many imgs. You could then set the sequence number of each frame sent to the NN (imgFrame.setSequenceNum(123)) so you can later sync the NN results with the original high-res img.
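To make the bookkeeping a bit more concrete, here is a minimal sketch of the idea in plain Python: limit how many frames are in flight to the NN, and pair each NN result with the high-res frame that has the same sequence number. This is not DepthAI API; the SeqSync class, its method names and the limits are all hypothetical, and the frames/results are just placeholder objects.

```python
from collections import OrderedDict


class SeqSync:
    """Pairs NN results with high-res frames by sequence number (hypothetical helper)."""

    def __init__(self, max_in_flight=2, max_buffered=8):
        self.max_in_flight = max_in_flight  # frames allowed at the NN at once
        self.max_buffered = max_buffered    # high-res frames kept while waiting
        self.in_flight = 0
        self.frames = OrderedDict()         # seq -> high-res frame, oldest first

    def can_send(self):
        # Only forward a new frame if the NN is not already saturated.
        return self.in_flight < self.max_in_flight

    def on_frame_sent(self, seq, high_res_frame):
        # Remember the high-res frame under the sequence number given to the NN input.
        self.frames[seq] = high_res_frame
        self.in_flight += 1
        # Drop the oldest frames if results never came back for them.
        while len(self.frames) > self.max_buffered:
            self.frames.popitem(last=False)

    def on_nn_result(self, seq, result):
        # A result came back, so one slot at the NN is free again.
        self.in_flight = max(0, self.in_flight - 1)
        frame = self.frames.pop(seq, None)
        if frame is None:
            return None  # frame was already evicted or never recorded
        return frame, result


sync = SeqSync(max_in_flight=2)
if sync.can_send():
    sync.on_frame_sent(123, "high-res frame 123")
print(sync.on_nn_result(123, "detections for 123"))
```

In a real pipeline the sequence number would be whatever you set with setSequenceNum on the frame sent to the NN, and you would read it back from the NN output message (presumably via getSequenceNum(), please verify for your depthai version). Frames for which can_send() is False would simply be skipped.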
Thoughts?