Depth calculation method

Wilbur · May 3, 2024

The OAK-D outputs two images along with pixel (x, y) coordinates.
The pixel coordinates mark the position of an object.

How do I calculate world coordinates from the pixel coordinates in the two images?
Do I need to consider the intrinsic and extrinsic parameters of the OAK-D cameras?

jakaskerl · May 6, 2024

Wilbur
This is quite a lengthy discussion so I will post a shortened GPT response:

Calculating world coordinates from pixel coordinates in stereo camera images, such as those produced by the OAK-D cameras, requires understanding both intrinsic and extrinsic parameters of the cameras. Here’s how you can approach this task using stereo vision techniques and principles:

1. Camera Calibration

Calibrate each camera to determine its intrinsic parameters, crucial for precise measurement and distortion correction.

Focal Lengths: f_x, f_y
Optical Center: c_x, c_y
Distortion Coefficients
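
To make the role of the intrinsics concrete, here is a minimal sketch of the ideal pinhole model they describe. It ignores lens distortion, and the numeric values are illustrative placeholders, not parameters read from a real OAK-D; the function name `project_point` is made up for this example.

```python
def project_point(X, Y, Z, fx, fy, cx, cy):
    """Project a 3D point in the camera frame (Z pointing forward)
    to pixel coordinates with an ideal, distortion-free pinhole model."""
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    u = fx * X / Z + cx  # horizontal pixel coordinate
    v = fy * Y / Z + cy  # vertical pixel coordinate
    return u, v


# Illustrative intrinsics (assumed values, in pixels).
fx, fy, cx, cy = 800.0, 800.0, 640.0, 400.0

# A point 2 m in front of the camera, 10 cm to the right and 5 cm up
# (image y grows downward, so "up" is negative Y).
u, v = project_point(0.1, -0.05, 2.0, fx, fy, cx, cy)
print(u, v)
```

A point on the optical axis lands exactly on the optical center (c_x, c_y), which is a quick sanity check for any set of intrinsics.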

2. Stereo Calibration

Calibrate the stereo pair by capturing images of a known pattern to determine relative positions and orientations.

Translation and Rotation (the extrinsic parameters): Define the displacement and angular difference between the two cameras.
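
As a rough illustration of what the rotation and translation do: one common convention (the one OpenCV's `stereoCalibrate` uses) maps a point from the first camera's frame into the second camera's frame as p2 = R * p1 + T. The sketch below assumes perfectly parallel cameras and an illustrative 7.5 cm spacing; `to_second_camera` is a hypothetical helper name.

```python
def to_second_camera(point, R, T):
    """Apply p2 = R * p1 + T, with R as a 3x3 nested list and T as a 3-tuple."""
    return tuple(
        sum(R[i][j] * point[j] for j in range(3)) + T[i]
        for i in range(3)
    )


# Assumed extrinsics: no rotation, second (right) camera 7.5 cm to the
# right of the first (left) camera, so points shift by -0.075 m in x.
R = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
T = (-0.075, 0.0, 0.0)

p_left = (0.1, -0.05, 2.0)   # point in the left camera frame, metres
p_right = to_second_camera(p_left, R, T)
print(p_right)
```

With real calibration data R is rarely an exact identity, which is why rectification is needed before disparities can be read off directly.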

3. Stereo Rectification

Warp the images from the two cameras so that corresponding epipolar lines are horizontal and fall on the same image row. This reduces the correspondence search to a single row.
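
A small sketch of what rectification buys you, under the assumption of an ideal rectified pair (identical focal length and optical center, right camera shifted by the baseline along x). All values are illustrative and `project_rectified_pair` is a made-up name.

```python
def project_rectified_pair(X, Y, Z, f, cx, cy, baseline):
    """Project a point given in the left camera frame into both images
    of an ideal rectified stereo pair."""
    x_left = f * X / Z + cx
    # In the right camera's frame the point sits `baseline` further left.
    x_right = f * (X - baseline) / Z + cx
    y = f * Y / Z + cy  # identical in both images after rectification
    return (x_left, y), (x_right, y)


left_px, right_px = project_rectified_pair(
    X=0.1, Y=-0.05, Z=2.0, f=800.0, cx=640.0, cy=400.0, baseline=0.075
)
disparity = left_px[0] - right_px[0]
print(left_px, right_px, disparity)
```

Both projections share the same row, and the horizontal offset between them equals f * B / Z, which is the relation the next step inverts.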

4. Depth Calculation (Triangulation)

Calculate depth from disparity between corresponding points in rectified images:

Disparity: Horizontal offset, in pixels, between corresponding features in the left and right rectified images.
Depth Calculation:
- Z = (f * B) / d
Where:
- f = focal length in pixels (assuming equal for x and y)
- B = baseline (distance between the two cameras); Z comes out in the same unit as B
- d = disparity in pixels (must be greater than zero)
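
A worked example of the formula, as a minimal sketch. The focal length and baseline are illustrative numbers, not values read from a device, and `depth_from_disparity` is a hypothetical helper.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Return depth Z = f * B / d in metres for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (zero means infinitely far)")
    return focal_px * baseline_m / disparity_px


focal_px = 800.0     # assumed focal length in pixels
baseline_m = 0.075   # assumed baseline in metres

z_40 = depth_from_disparity(40.0, focal_px, baseline_m)  # about 1.5 m
z_39 = depth_from_disparity(39.0, focal_px, baseline_m)  # one pixel less
print(z_40, z_39, z_39 - z_40)
```

Because Z is inversely proportional to d, a one-pixel matching error costs only a few centimetres at this range but grows quickly for small disparities (distant objects).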

5. World Coordinates Calculation

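Convert depth and pixel coordinates to 3D coordinates. These are expressed in the reference (rectified) camera's frame; mapping them into a separate world frame additionally requires that camera's pose in the world.

X Coordinate:
- X = ((x - c_x) * Z) / f
Y Coordinate:
- Y = ((y - c_y) * Z) / f

Here, (x, y) are the pixel coordinates in the rectified image, and f is the focal length in pixels.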
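
The back-projection can be sketched as below. This version keeps separate f_x and f_y instead of a single f, which reduces to the formulas above when the two are equal. The intrinsics are illustrative and `pixel_to_camera_xyz` is a made-up name.

```python
def pixel_to_camera_xyz(x, y, Z, fx, fy, cx, cy):
    """Back-project a rectified pixel (x, y) with known depth Z into
    3D coordinates in the camera frame."""
    X = (x - cx) * Z / fx
    Y = (y - cy) * Z / fy
    return X, Y, Z


# Assumed intrinsics in pixels and a depth of 2 m.
point = pixel_to_camera_xyz(680.0, 380.0, 2.0, fx=800.0, fy=800.0, cx=640.0, cy=400.0)
print(point)
```

The output is in the same unit as Z, with X to the right and Y downward in the usual image-aligned camera convention.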

Implementation Steps

Follow these steps using the OpenCV library:

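Calibrate Each Camera: Determine intrinsic and distortion parameters from calibration images (e.g. cv2.calibrateCamera).
Stereo Calibrate Cameras: Capture calibration images simultaneously from both cameras and estimate the rotation and translation between them (e.g. cv2.stereoCalibrate).
Rectify Images: Compute rectification transforms from the calibration data and remap both images (e.g. cv2.stereoRectify, cv2.initUndistortRectifyMap, cv2.remap).
Compute Disparity: Apply a stereo matching algorithm to generate a disparity map (e.g. cv2.StereoBM_create or cv2.StereoSGBM_create).
Convert Disparity to Depth: Calculate depth from the disparity values using Z = (f * B) / d.
Convert to World Coordinates: Translate depth and image coordinates into 3D space (e.g. cv2.reprojectImageTo3D with the Q matrix from cv2.stereoRectify).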
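
For a single matched object rather than a full disparity map, the last two steps can be done by hand. The sketch below takes one pixel in each rectified image and reprojects it with a 4x4 matrix laid out like the Q matrix OpenCV documents for stereo rectification. It assumes both rectified cameras share the same optical center, and the layout should be treated as an assumption to check against your OpenCV version. `make_q` and `reproject` are hypothetical helpers and all numbers are illustrative.

```python
def make_q(f, cx, cy, baseline):
    """Build a disparity-to-depth matrix for an ideal rectified pair."""
    tx = -baseline  # x-translation of the second camera (negative by convention)
    return [
        [1.0, 0.0, 0.0, -cx],
        [0.0, 1.0, 0.0, -cy],
        [0.0, 0.0, 0.0, f],
        [0.0, 0.0, -1.0 / tx, 0.0],
    ]


def reproject(x, y, disparity, Q):
    """Map (x, y, disparity) to (X, Y, Z) via homogeneous coordinates."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    vec = (x, y, disparity, 1.0)
    Xh, Yh, Zh, W = [sum(Q[i][j] * vec[j] for j in range(4)) for i in range(4)]
    return Xh / W, Yh / W, Zh / W


# The same object seen at x=680 in the left image and x=650 in the right
# image, on the same row y=380 (values assumed for illustration).
x_left, x_right, y = 680.0, 650.0, 380.0
Q = make_q(f=800.0, cx=640.0, cy=400.0, baseline=0.075)
xyz = reproject(x_left, y, x_left - x_right, Q)
print(xyz)
```

This gives the same result as computing Z = (f * B) / d first and then back-projecting, so either route works for a handful of points.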