If I squeeze or letterbox my image (as described here) and then feed it to YoloSpatialDetectionNetwork, should I expect the depth to work? It seems like there's no way for the node to know how to align the disparity map with the squeezed or letterboxed frame. I've tried to figure this out experimentally, but the results have been ambiguous. Is there a correct way to get the full FOV and still get correct depth out of a Detection Network?
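For concreteness, here is roughly the pipeline I'm asking about — a minimal sketch using the depthai v2 Python API. The blob path, input size, and Yolo parameters are placeholders, not my actual config:

```python
import depthai as dai

pipeline = dai.Pipeline()

# Color camera: keepAspectRatio=False "squeezes" the full sensor FOV
# into the NN input size (alternatively, an ImageManip node with
# setResizeThumbnail() would letterbox instead of squeeze).
camRgb = pipeline.create(dai.node.ColorCamera)
camRgb.setPreviewSize(416, 416)           # placeholder NN input size
camRgb.setInterleaved(False)
camRgb.setPreviewKeepAspectRatio(False)   # squeeze: full FOV, distorted

# Stereo pair feeding the depth computation
monoLeft = pipeline.create(dai.node.MonoCamera)
monoRight = pipeline.create(dai.node.MonoCamera)
monoLeft.setBoardSocket(dai.CameraBoardSocket.CAM_B)
monoRight.setBoardSocket(dai.CameraBoardSocket.CAM_C)

stereo = pipeline.create(dai.node.StereoDepth)
stereo.setDepthAlign(dai.CameraBoardSocket.CAM_A)  # align depth to RGB
monoLeft.out.link(stereo.left)
monoRight.out.link(stereo.right)

# Spatial detection network: averages depth inside each (scaled) bbox.
# My question is here: the bboxes are in squeezed/letterboxed preview
# coordinates, while the depth frame is aligned to the unsqueezed RGB,
# so I don't see how the node reconciles the two.
nn = pipeline.create(dai.node.YoloSpatialDetectionNetwork)
nn.setBlobPath("model.blob")              # placeholder
nn.setConfidenceThreshold(0.5)
nn.setNumClasses(80)                      # placeholder Yolo config
nn.setCoordinateSize(4)
nn.setIouThreshold(0.5)
nn.setBoundingBoxScaleFactor(0.5)

camRgb.preview.link(nn.input)
stereo.depth.link(nn.inputDepth)

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("detections")
nn.out.link(xout.input)
```

With a setup like this I do get detections and spatial coordinates back, but I can't tell whether the Z values are actually sampled from the right region of the depth map.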