Hi @Thor,
The input image shape argument in tools specifies the resolution of the images accepted by a specific YOLO model exported for our devices.
The reason you can run inference with a YOLO model at different image resolutions is that YOLO models are fully convolutional. Unlike fully connected layers, where we need to specify the exact number of neurons in each layer, convolutional layers are defined by the size and number of kernel filters, the stride, and so on. This makes fully convolutional networks independent of the input image resolution (though you can't use just any resolution: for YOLO models, the width and height usually need to be divisible by 32).
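To illustrate the idea, here is a minimal sketch (a toy fully convolutional stack in PyTorch, not an actual YOLO network) showing that the same weights accept different input resolutions, with only the output grid size changing:

```python
# Toy fully convolutional stack: the weights do not depend on the input resolution.
import torch
import torch.nn as nn

fcn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),   # total stride 2
    nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),  # total stride 4
    nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),  # total stride 8
)

# The same model handles several resolutions; only the output grid changes.
for size in (320, 416, 640):              # all divisible by 32
    x = torch.randn(1, 3, size, size)     # dummy input image
    y = fcn(x)
    print(size, "->", tuple(y.shape))     # e.g. 320 -> (1, 64, 40, 40)
```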
In terms of performance, a larger input resolution carries more visual information (a higher level of detail) that the model can use during inference. The model should therefore perform slightly better on higher-resolution images, since they are sharper and less blurry, and fine details are better preserved. On the other hand, a higher input resolution also means slower inference: the model has more pixels to process, so each forward pass naturally takes longer.
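You can see the speed side of the trade-off with a rough timing sketch (reusing the toy stack from above, re-defined so the snippet runs on its own; the absolute numbers are hypothetical and will differ on your device, but the trend holds):

```python
# Rough timing comparison: more input pixels -> longer inference time.
import time
import torch
import torch.nn as nn

fcn = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1),
    nn.Conv2d(16, 32, 3, stride=2, padding=1),
    nn.Conv2d(32, 64, 3, stride=2, padding=1),
)

with torch.no_grad():
    for size in (320, 640):                   # 640x640 has 4x the pixels of 320x320
        x = torch.randn(1, 3, size, size)
        fcn(x)                                # warm-up run
        start = time.perf_counter()
        for _ in range(20):
            fcn(x)
        elapsed = (time.perf_counter() - start) / 20
        print(f"{size}x{size}: {elapsed * 1000:.1f} ms per image")
```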
I hope this answers your question. If anything is unclear or you have additional questions, please don't hesitate to ask!
Best,
Jan