Do you have some example images of the distant people you're trying to detect? Mainly to see what you're targeting.
While heavier models could help, I believe the problem here is that the resolution is not high enough, so distant people are just too small for the model to detect them well. There are two ways to address that. The first is a much higher input resolution, which is slow. I am not sure how fast it would be on a Jetson. My assumption is that it should be faster since it has more TOPS, but since the number of operations increases by a lot, it's hard to say what the FPS would be.
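To make the resolution trade-off concrete, here is a rough back-of-the-envelope sketch. The numbers (a 30 px tall person in a 1920x1080 frame) are made up for illustration, and the cost estimate only assumes that convolutional compute grows roughly with the number of input pixels, so treat it as an approximation and not a benchmark.

```python
def letterbox_scale(src_w, src_h, input_size):
    """Scale factor when fitting a frame into a square model input
    while keeping the aspect ratio (letterbox-style resize)."""
    return min(input_size / src_w, input_size / src_h)


def resized_height(obj_h_px, src_w, src_h, input_size):
    """Approximate height in pixels of an object after the resize."""
    return obj_h_px * letterbox_scale(src_w, src_h, input_size)


def relative_cost(new_size, base_size):
    """Rough compute ratio between two square input sizes,
    assuming cost scales with the pixel count."""
    return (new_size / base_size) ** 2


if __name__ == "__main__":
    # Hypothetical example: a person 30 px tall in a 1920x1080 frame.
    for size in (640, 1280):
        h = resized_height(30, 1920, 1080, size)
        cost = relative_cost(size, 640)
        print(f"input {size}: person ~{h:.1f} px tall, ~{cost:.0f}x compute vs 640")
```

Doubling the input side doubles the person's height in pixels but roughly quadruples the work per frame, which is why the resulting FPS is hard to predict without measuring.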
Another option is that, instead of using the default yolov8.yaml when fine-tuning/transfer learning, you use yolov8-p2.yaml. You can see in the config at L40 that it says xsmall P2, and that P2 is then passed to the Detect head in the last line. This means that the model will be trained to detect much smaller objects, while the throughput should remain more or less the same. So you can try fine-tuning with that head.
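For reference, fine-tuning with the P2 head could look roughly like the sketch below, using the Ultralytics Python API. The function name `finetune_p2`, the dataset path, the `n` model scale, and the epoch count are placeholders I picked for illustration, so adjust them to your setup. The P2 config adds a stride-4 detection level on top of the usual stride 8/16/32 levels, so only the layers that match the pretrained checkpoint get their weights transferred.

```python
def finetune_p2(data_yaml, epochs=100, imgsz=640):
    """Sketch: fine-tune a YOLOv8 model that uses the extra P2 detection level.

    data_yaml is the path to your own dataset config (placeholder here).
    """
    # Imported lazily so the sketch can be read without ultralytics installed.
    from ultralytics import YOLO

    # Build the architecture from the P2 config; the "n" in the file name
    # selects the nano scale (an assumption, pick the scale you need).
    model = YOLO("yolov8n-p2.yaml")

    # Transfer weights from the standard pretrained checkpoint where
    # layer shapes match; the new P2 branch starts from scratch.
    model = model.load("yolov8n.pt")

    # Train on your dataset.
    return model.train(data=data_yaml, epochs=epochs, imgsz=imgsz)


if __name__ == "__main__":
    # "my_dataset.yaml" is a hypothetical path, replace it with your own.
    finetune_p2("my_dataset.yaml")
```

After training, the resulting weights can be exported the same way as for a model trained with the default config.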
@JanCuhel, can we check why the converted model linked above is not detecting anything unless the mentioned fix is applied? And can we try exporting a model trained with the yolov8-p2.yaml config using tools.luxonis.com?
@krishnashravan What DepthAI version are you using?