I'm curious about the algorithm that performs RGB depth alignment in StereoDepth. Could you please describe it in detail?
If I had to guess, the algorithm would look roughly like this:
- Compute the depth map for the left (or right) mono image using the normal stereo matching process
- For each pixel in the depth map, compute the disparity relative to the RGB frame using the depth at that pixel and the baseline to the RGB camera (disparity = f_x * baseline_to_RGB / depth). Note that the baseline in this equation is the one from the mono camera to the RGB camera, not the one between the two mono cameras that was used during stereo matching, and it may need to be negative depending on which side of the mono camera the RGB camera sits.
- Use the computed disparity to find the location in the RGB depth map that corresponds to the current pixel in the mono depth map
- Put the depth at the current pixel of the mono depth map into the corresponding pixel of the RGB depth map
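To make my guess concrete, here is a minimal Python/NumPy sketch of the steps above. It is only my assumption of what might happen, not the actual StereoDepth implementation: the function name `shift_depth_to_rgb` and its parameters are hypothetical, and it assumes the RGB camera is already rectified against the mono camera (same focal length, purely horizontal offset), with `baseline_to_rgb` positive when the RGB camera sits to the right of the mono camera. When several mono pixels land on the same RGB pixel it keeps the nearest depth, which is also a guess on my part.

```python
import numpy as np


def shift_depth_to_rgb(depth, fx, baseline_to_rgb):
    """Forward-warp a mono depth map into a horizontally offset RGB view.

    Hypothetical sketch: assumes both views are rectified and share fx.
    depth: 2D array, 0 marks invalid pixels.
    fx: focal length in pixels.
    baseline_to_rgb: signed mono-to-RGB offset, same unit as depth.
    """
    depth = np.asarray(depth, dtype=np.float64)
    width = depth.shape[1]

    # Only valid (non-zero) depth pixels can be reprojected.
    ys, xs = np.nonzero(depth > 0)
    z = depth[ys, xs]

    # disparity = f_x * baseline_to_RGB / depth, as in the list above.
    disparity = fx * baseline_to_rgb / z

    # Column of each pixel in the RGB view, rounded to the nearest pixel.
    xt = np.rint(xs - disparity).astype(np.int64)

    # Drop pixels that fall outside the RGB image.
    inside = (xt >= 0) & (xt < width)
    ys, xt, z = ys[inside], xt[inside], z[inside]

    # Scatter depths; on collisions keep the smallest (nearest) depth.
    out = np.full(depth.shape, np.inf)
    np.minimum.at(out, (ys, xt), z)

    # Pixels that received no depth stay invalid (0).
    out[np.isinf(out)] = 0.0
    return out
```

For example, with `fx = 4`, `baseline_to_rgb = 0.5` and a depth of 2 everywhere, every valid pixel would move one column to the left. This sketch leaves holes where no mono pixel lands, so a real implementation would presumably need some form of hole filling as well.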
Is that the idea? Can you provide more detail? Does rectification of the RGB image play a role, and if so, would the final depth map produced by StereoDepth only be valid for the rectified RGB image?