I am using an OAK-D, pre-ordered and shipped during the mass shipment in early January.
I am attempting to obtain the distance from the camera (zdepth) of yolo-detected objects using a few different methods, and have been unsuccessful so far.
Method 1: based on example 22_tiny_yolo_v3_device_side_decoding.py
Using this example, I've made two changes.
I downloaded the blob file from https://artifacts.luxonis.com/artifactory/luxonis-depthai-data-local/network/tiny-yolo-v3_openvino_2021.2_6shave.blob, as suggested on the documentation page, and then changed tiny_yolo_v3_path as follows:
tiny_yolo_v3_path = str((Path(__file__).parent / Path('models/tiny-yolo-v3_openvino_2021.2_6shave.blob')).resolve().absolute())
I added a new line at the end (immediately before the call to rectangle()) to display the depth:
cv2.putText(frame, "{:.2f}".format(bbox.zdepth), (x1 + 10, y1 + 60), cv2.FONT_HERSHEY_TRIPLEX, 0.5, color)
This always displays depth 0.
I have also printed the values of xdepth, ydepth, and zdepth to the console with the following line of code; all three are zero:
print(f"x={bbox.xdepth} y={bbox.ydepth} z={bbox.zdepth}")
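For context, here is a minimal, self-contained sketch of how those two additions behave. The Detection namedtuple is just a stand-in for the detection object the example hands back (it is not part of the depthai API), and frame, x1, y1, and color mirror the names in the example's drawing loop:

import collections
import cv2
import numpy as np

# Stand-in for the detection object from the example (hypothetical, for illustration only)
Detection = collections.namedtuple("Detection", ["xdepth", "ydepth", "zdepth"])
bbox = Detection(xdepth=0.0, ydepth=0.0, zdepth=0.0)  # this is what I actually observe

frame = np.zeros((416, 416, 3), dtype=np.uint8)  # dummy 416x416 preview frame
x1, y1, color = 50, 50, (255, 255, 255)

# My added overlay: always renders "0.00"
cv2.putText(frame, "{:.2f}".format(bbox.zdepth), (x1 + 10, y1 + 60),
            cv2.FONT_HERSHEY_TRIPLEX, 0.5, color)

# My added console output: always prints x=0.0 y=0.0 z=0.0
print(f"x={bbox.xdepth} y={bbox.ydepth} z={bbox.zdepth}")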
Method 2: use disparity data
I am able to pull disparity data by adding the two mono cameras and StereoDepth to the pipeline. The problem is that this yolo model requires a 416x416 rgb preview, and I have been unable to get the disparity data at a comparable resolution. When I display both, the rgb preview appears to crop off data from the two sides to accommodate the 416x416 dimensions. There is probably some way of matching / scaling / mapping the center of each detection onto the 720p or 400p disparity data (roughly as sketched below), but this seems like the wrong way to go about it. (Method 1 is preferable, and if I need to use Method 2, it would be nice to avoid that mapping.)
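To make the mapping I'm trying to avoid concrete, here is a rough sketch of what I think it would involve. It assumes the 416x416 preview is the full rgb frame scaled down and center-cropped, and it ignores the FOV difference between the rgb and mono sensors; the function and variable names are mine, not from the depthai API:

def preview_to_disparity_coords(px, py,
                                preview_size=(416, 416),
                                rgb_size=(1920, 1080),
                                disp_size=(640, 400)):
    """Map a pixel in the 416x416 rgb preview to a pixel in the disparity frame.

    Assumes the preview is the full rgb frame scaled down to the preview height
    and then center-cropped to the preview width; ignores the rgb/mono FOV
    mismatch, so this is only an approximation.
    """
    pw, ph = preview_size
    rw, rh = rgb_size
    dw, dh = disp_size

    # Undo the scale-to-height and center crop to get full-rgb coordinates
    scale = rh / ph                      # e.g. 1080 / 416
    crop_x = (rw - pw * scale) / 2.0     # pixels cut off each side of the rgb frame
    full_x = px * scale + crop_x
    full_y = py * scale

    # Proportionally map full-rgb coordinates onto the disparity frame
    dx = int(full_x * dw / rw)
    dy = int(full_y * dh / rh)
    return dx, dy

# e.g. the center of a detection at (208, 208) in the preview
print(preview_to_disparity_coords(208, 208))   # -> (320, 200), the center of a 640x400 disparity frame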
Without any NN involved, I was able to calculate disparity-based distances and it worked nicely: point the camera so that an object sits behind a small rectangle drawn in the center of the image, then display the distance to that object based on the disparity data within the rectangle's boundaries (a rough sketch of that calculation is below).
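For completeness, a minimal sketch of how that center-rectangle measurement worked. The focal length and baseline here are placeholders to be replaced with the device's actual calibration values (the OAK-D baseline is roughly 7.5 cm), and distance_in_roi is my own helper, not a depthai call:

import numpy as np

# Placeholder values; substitute the device's real calibration data
FOCAL_LENGTH_PX = 450.0   # mono camera focal length in pixels (placeholder)
BASELINE_M = 0.075        # OAK-D left/right baseline, roughly 7.5 cm

def distance_in_roi(disparity_frame, roi_half=10):
    """Median-disparity distance (in meters) inside a small centered rectangle."""
    h, w = disparity_frame.shape[:2]
    cy, cx = h // 2, w // 2
    roi = disparity_frame[cy - roi_half:cy + roi_half, cx - roi_half:cx + roi_half]
    valid = roi[roi > 0]              # disparity 0 means "no measurement"
    if valid.size == 0:
        return None
    disparity_px = np.median(valid)
    return FOCAL_LENGTH_PX * BASELINE_M / disparity_px

# Example with a synthetic 640x400 disparity frame whose center reads 45 px of disparity
fake_disp = np.zeros((400, 640), dtype=np.uint8)
fake_disp[190:210, 310:330] = 45
print(distance_in_roi(fake_disp))     # -> 0.75 m with the placeholder values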
Suggestions welcome!
Thanks,
Derrell