It is very similar to the use cases described in the aforementioned discussions/issues.
The use case is tracking an object using inference done on the edge plus extra processing on the host. The host holds an improved estimate of the object that needs to be communicated back to the edge. Because the object can move fast, the update rate needs to be high (60+ Hz), and the overall system's accuracy and robustness improve when the cropped image precisely contains the object.
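To make the host side of this concrete, here is a rough sketch of how the improved estimate could be turned into a crop rectangle before it is sent to the edge. It is plain Python and does not use any DepthAI calls. The `Estimate` class, the `crop_from_estimate` function, the constant-velocity extrapolation, and the `margin` and `latency_s` parameters are all my own assumptions for illustration, not something taken from the other discussions/issues.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    # Hypothetical host-side track state, in pixels and pixels/second.
    cx: float
    cy: float
    w: float
    h: float
    vx: float = 0.0
    vy: float = 0.0


def crop_from_estimate(est, frame_w, frame_h, latency_s=0.0, margin=0.1):
    """Return an (x, y, w, h) crop in pixels, clamped to the frame."""
    # Extrapolate the centre to compensate for host-to-edge latency
    # (constant-velocity assumption).
    cx = est.cx + est.vx * latency_s
    cy = est.cy + est.vy * latency_s
    # Pad the box on every side, but never exceed the frame size.
    w = min(est.w * (1.0 + 2.0 * margin), frame_w)
    h = min(est.h * (1.0 + 2.0 * margin), frame_h)
    # Shift the box back inside the frame rather than shrinking it.
    x = min(max(cx - w / 2.0, 0.0), frame_w - w)
    y = min(max(cy - h / 2.0, 0.0), frame_h - h)
    return (round(x), round(y), round(w), round(h))


# Example: object moving right at 600 px/s, one 60 Hz frame of latency.
est = Estimate(cx=100, cy=100, w=50, h=40, vx=600, vy=0)
crop = crop_from_estimate(est, frame_w=640, frame_h=480, latency_s=1 / 60)
```

How the resulting rectangle actually reaches the device at 60+ Hz is the open question here and is deliberately left out of the sketch.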
I will test the scripts provided in the other discussions/issues on an OAK4 and revisit this once I have something more concrete to share.