Hi glitchyordis, the explanation below is copied from GPT-4. I hope it helps.
The code is converting the Region of Interest (ROI) from the 300x300 coordinate space to the actual sensor coordinate space.
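The line roi = roi * self.resolution[1] // 300
scales the ROI coordinates from the 300-pixel range up to the sensor's height (self.resolution[1]) using integer division. Both x and y are multiplied by the same factor, so the aspect ratio is preserved.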
The line roi[0] += (self.resolution[0] - self.resolution[1]) // 2
adds the x offset for the device crop. This is necessary because the input image is cropped to maintain the aspect ratio, and the cropped region is centered horizontally in the full sensor frame. The crop is as wide as the sensor is tall, so it starts half of (width - height) pixels in from the left edge. Adding that amount to the x-coordinate shifts the ROI from crop coordinates to full-sensor coordinates. No y offset is needed because the crop spans the full height.
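
To make the arithmetic concrete, here is a minimal standalone sketch of the same two steps. The function name roi_to_sensor, the nn_size parameter and the 1920x1080 resolution are my own choices for illustration, and I am assuming roi[0] holds the x value(s), roi[1] the y value(s), and resolution is (width, height). The original code may lay these out differently.

```python
import numpy as np

def roi_to_sensor(roi, resolution, nn_size=300):
    """Map an ROI from the nn_size x nn_size space to sensor pixels.

    Assumptions (not confirmed by the original code): roi[0] holds the
    x value(s), roi[1] the y value(s), and resolution is (width, height)
    with width >= height.
    """
    width, height = resolution
    # Step 1: scale both axes by height / nn_size using integer division.
    scaled = np.asarray(roi, dtype=np.int64) * height // nn_size
    # Step 2: shift x by the left margin of the horizontally centered crop.
    scaled[0] += (width - height) // 2
    return scaled

# The center of the 300x300 space lands on the center of a 1920x1080 frame.
print(roi_to_sensor([150, 150], (1920, 1080)))  # [960 540]
```

With a 1920x1080 sensor the scale factor is 1080 / 300 = 3.6 and the x offset is (1920 - 1080) // 2 = 420, so x = 150 becomes 540 + 420 = 960 while y = 150 becomes 540.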