I have located the four corners of a box using depthai and OpenCV, and I have the pixel coordinates of each corner along with its depth. I need to return these four points with their x, y, and z coordinates in meters instead of pixels. From the research I've done, I know I need the focal length and optical center, both in pixels, to calculate this. I took these values from the intrinsic matrix in my camera's calibration data. I tried the left, right, and RGB calibration values for the focal length and optical center, but I am still not getting accurate x and y coordinates.
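For reference, here is a minimal sketch of the conversion as I understand it, assuming a plain pinhole model with no lens distortion. It also assumes that the depth is the distance along the optical axis (Z) in meters, and that the intrinsics belong to the same camera and resolution as the frame the pixel coordinates come from. The function name `pixel_to_meters` and the intrinsic values are made up for illustration; they are not from my calibration data.

```python
def pixel_to_meters(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth depth_m into camera-frame meters.

    Assumes an undistorted pinhole camera and that depth_m is measured
    along the optical axis. Convert first if the depth map is in millimeters.
    """
    x = (u - cx) * depth_m / fx  # horizontal offset from the optical axis
    y = (v - cy) * depth_m / fy  # vertical offset from the optical axis
    return x, y, depth_m


# Hypothetical intrinsics in pixels (illustrative values only).
fx, fy = 800.0, 800.0
cx, cy = 640.0, 360.0

# One corner at pixel (840, 460) with a measured depth of 1.2 m.
corner = pixel_to_meters(840.0, 460.0, 1.2, fx, fy, cx, cy)
print(corner)
```

The x and y results are relative to the optical axis of whichever camera the intrinsics describe, so they only make sense if the depth frame is aligned to that same camera.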
The experiment I am doing is to calculate x, y, and z for one corner of a box. Then I stack a second box on top of the first without moving the bottom box, and calculate x, y, and z for the same corner of the top box. From what I understand, although the pixel coordinates of the corner change because the depth changes, the x and y values should stay the same because I did not change where the box is sitting. I am wondering whether I am missing something, or whether there are other values for the focal length and optical center that I should be using. These are two images of what I am seeing. The first shows one box, and the second shows two boxes stacked on top of each other. If you zoom in, you can see the x, y, and z coordinates in red.
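To make the expectation concrete, here is a small simulation of the stacking experiment under idealized assumptions: a pinhole camera with no distortion, looking straight down so that stacking a box changes only the depth of the corner, and depth measured along the optical axis. The helper names `project` and `back_project`, the intrinsics, and the corner position are all hypothetical.

```python
def project(x, y, z, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point (meters) to pixel coordinates."""
    return fx * x / z + cx, fy * y / z + cy


def back_project(u, v, z, fx, fy, cx, cy):
    """Inverse of project() when the depth z along the optical axis is known."""
    return (u - cx) * z / fx, (v - cy) * z / fy, z


# Hypothetical intrinsics in pixels (illustrative values only).
FX, FY, CX, CY = 800.0, 800.0, 640.0, 360.0

# The corner keeps the same x and y; only its depth changes when a box is stacked.
corner_x, corner_y = 0.30, 0.15
results = []
for z in (1.2, 0.9):  # one box, then two boxes (top corner is closer to the camera)
    u, v = project(corner_x, corner_y, z, FX, FY, CX, CY)
    x, y, _ = back_project(u, v, z, FX, FY, CX, CY)
    results.append(((u, v), (x, y)))
    print(f"z={z:.2f} m  pixel=({u:.1f}, {v:.1f})  x={x:.3f} m  y={y:.3f} m")
```

In this idealized case the pixel coordinates move while x and y come back unchanged. If real measurements drift, the assumptions above are the first things to check: intrinsics scale with image resolution, the depth frame has to be aligned to the camera whose intrinsics are used, and a camera that is tilted rather than looking straight down would change x or y in the camera frame even though the box did not move.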