Calculating the real-world dimensions of specific pixel values

AbbyM · Jul 31, 2024

I have located the four corners of a box using depthai and opencv, and I have the pixel coordinates of each corner and the depth associated with it. I need to return these four points with their x, y, and z coordinates in meters instead of pixels. From the research I've done, I know I need the focal length and optical center in pixels to calculate this. I used the values from the intrinsic matrix in my camera's calibration data. I tried the left, right, and rgb calibration values for the focal length and optical center, but I am still having trouble getting accurate x and y coordinates.

The experiment I am doing is calculating the x, y, and z for one corner of a box. Then, without moving the bottom box, I stack another box on top of it and calculate the x, y, and z values for the same corner of the top box. From what I understand, although the pixel coordinates of the corner change because the depth changes, the x and y values should stay the same because I did not change where the box is sitting. I am wondering if I am missing something or if there are other values for focal length and optical center I should be using. These are two images of what I am seeing. The first one is one box, and the second one is two boxes stacked on top of each other. If you zoom in you can see the x, y, and z coordinates in red.
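To make the expectation in this experiment concrete, here is a minimal sketch of the pinhole projection, not code from the thread. The intrinsics (`fx`, `fy`, `cx`, `cy`) and the corner position are made-up values, and it assumes the camera looks straight down so that stacking a box changes only z in the camera frame. If the camera is tilted, x and y in the camera frame would change as well.

```python
def project(x, y, z, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point (meters) to pixel coordinates."""
    u = fx * x / z + cx
    v = fy * y / z + cy
    return u, v


# Hypothetical intrinsics in pixels (not taken from any real calibration).
fx = fy = 800.0
cx, cy = 640.0, 360.0

# The same corner (x = 0.2 m, y = 0.1 m) seen at two depths:
# 1.0 m for the bottom box, 0.8 m once a second box is stacked on top.
u_bottom, v_bottom = project(0.2, 0.1, 1.0, fx, fy, cx, cy)
u_top, v_top = project(0.2, 0.1, 0.8, fx, fy, cx, cy)

print(f"bottom box corner: ({u_bottom:.1f}, {v_bottom:.1f})")
print(f"top box corner:    ({u_top:.1f}, {v_top:.1f})")
```

With these numbers the corner moves away from the optical center as it gets closer to the camera, even though x and y in meters are unchanged.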

erik · Aug 1, 2024

Hi @AbbyM,
that would only work if you know the distance to the object, which you do not. So I'd suggest looking at our box dimensioning demo, which uses depth to estimate these dimensions (and we could also calculate coordinates of corners): https://discuss.luxonis.com/blog/5021-box-measurement-with-ai-and-tof

AbbyM · Aug 1, 2024

Thank you. By distance, do you mean the Euclidean distance between the box and the camera? I was under the impression that I could get the x and y coordinates if I knew the depth, pixel location, the focal length, and optical center. Also, I took a look at the link above and the example uses a ToF sensor. Is there a way of accomplishing a task similar to mine without a ToF sensor, or is one needed to get the real-world coordinate values?

erik · Aug 1, 2024

@AbbyM you can also use a stereo camera (which we have previously used):
luxonis/depthai-experiments/tree/master/gen2-box_measurement/api

x and y coordinates if I knew the depth, pixel location, the focal length, and optical center?

You can, example here:
luxonis/depthai-experiments/tree/master/gen2-calc-spatials-on-host
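For reference, a minimal sketch of the calculation described in the question, written from the pinhole model rather than copied from the linked example. The function name `pixel_to_xyz` and the intrinsics are hypothetical. It assumes the depth value is the z distance along the optical axis in millimeters (not the Euclidean distance to the point), and that `fx`, `fy`, `cx`, `cy` belong to the camera and resolution the depth map is aligned to.

```python
def pixel_to_xyz(u, v, depth_mm, fx, fy, cx, cy):
    """Back-project a pixel with known depth to camera-frame coordinates in meters.

    Assumes depth_mm is the z distance along the optical axis, in millimeters.
    """
    z = depth_mm / 1000.0
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z


# Hypothetical intrinsics in pixels; real values come from the calibration
# data for the camera and resolution the depth frame is aligned to.
fx = fy = 800.0
cx, cy = 640.0, 360.0

# Same physical corner at two depths (single box, then two stacked boxes).
corner_bottom = pixel_to_xyz(800, 440, 1000, fx, fy, cx, cy)
corner_top = pixel_to_xyz(840, 460, 800, fx, fy, cx, cy)

print("bottom:", tuple(round(c, 3) for c in corner_bottom))
print("top:   ", tuple(round(c, 3) for c in corner_top))
```

In this made-up case both corners come out at roughly x = 0.2 m and y = 0.1 m with different z. If x and y drift between the two measurements in practice, two things worth checking are whether the intrinsics match the resolution of the depth frame and whether the depth is really the z distance.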

Luuk0312 · Oct 11, 2024

@erik, do you have a C++ equivalent for calculating the depth value at individual pixels?

erik · Oct 11, 2024

Hi @Luuk0312,
we do not, but the C++ and Python APIs are 1:1, so you can use GPT to translate it to C++.