Question on trying to get the Oak-D pro to run Yolov6

dvdhwrd094

Hello, I am using the Oak-d pro, and am getting an error that the input image resolution is not matching with the neural net expected resolution. I used the luxonics .blob conversion tool for the oak-d which only allows me to have the image shape be square and divisible by 32 or 64. As a result I cannot convert the .blob to recognize 1920x1080 which the camera defaults to even if I resize the image in the code to what the neural net is expecting. The Yolo model I am using is yolo6l.pt

jakaskerl

Hi dvdhwrd094
The input to a neural network should alway be RGB - which is only produced by preview output of the color camera. The size of that preview is set using setPreviewSize(W, H).

Thanks,
Jaka

erik

Also note that we recommend using https://tools.luxonis.com together with yolo device decoding.

dvdhwrd094

import cv2
import depthai as dai
import numpy as np

def getFrame(queue):
# Get frame from queue
frame = queue.get()
# Convert frame to OpenCV format and return
return frame.getCvFrame()

# Start defining a pipeline
pipeline = dai.Pipeline()
print("depthai pipline defined")

# Set up RGB camera
rgb = pipeline.createColorCamera()
print("RGB pipeline created")
rgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
print("resolution set")
rgb.setIspScale(1920//1056, 1080//1056)
print("downscaling complete")

# Load the YOLO model
nn = pipeline.createNeuralNetwork()
print("neural net pipeline created")
nn.setBlobPath("../Yolov8CamColor/NNBlob/yolov6l_openvino_2022.1_6shave.blob")
print("found blob path")

# Link the RGB camera to the neural network
rgb.video.link(nn.input)
print("neural net linked to RGB")

# Create an output queue for the neural network
nnOut = pipeline.createXLinkOut()
nnOut.setStreamName("nn")
nn.out.link(nnOut.input)
print("completed output queue for nn")

# Create an output queue for the RGB camera
xoutRgb = pipeline.createXLinkOut()
xoutRgb.setStreamName("rgb")
rgb.video.link(xoutRgb.input)
print("completed output queue for RGB camera")

# Pipeline is defined
with dai.Device(pipeline) as device:
# Output queues will be used to get the rgb frames and nn data from the outputs defined above
rgbQueue = device.getOutputQueue(name="rgb", maxSize=1, blocking=False)
nnQueue = device.getOutputQueue(name="nn", maxSize=1, blocking=False)
frame_width = 1056
frame_height = 1056

while True:
# Get the RGB frame
print("getting RGB frames")
frame = rgbQueue.get().getCvFrame()

# Get the detections
print("getting detections")
detections = nnQueue.get().getFirstLayerFp16()

# Only process detections if length is greater than 7
if len(detections) > 7:
# Loop over the detections and add them to your images
for i in range(0, len(detections), 7):
class_id = int(detections[i + 1])
confidence = detections[i + 2]
x_min = int(detections[i + 3] * frame_width)
y_min = int(detections[i + 4] * frame_height)
x_max = int(detections[i + 5] * frame_width)
y_max = int(detections[i + 6] * frame_height)

# Print out the size and content of detections for debugging
print(f"Size of detections: {len(detections)}")
print(f"Content of detections: {detections}")

# Loop over the detections and add them to images
for i in range(0, len(detections), 7):
class_id = int(detections[i + 1])
confidence = detections[i + 2]
x_min = int(detections[i + 3] * frame_width)
y_min = int(detections[i + 4] * frame_height)
x_max = int(detections[i + 5] * frame_width)
y_max = int(detections[i + 6] * frame_height)

# Draw a rectangle around the detected object and add a label
cv2.rectangle(frame, (x_min, y_min), (x_max, y_max), (0, 255, 0), 2)
cv2.putText(frame, str(class_id), (x_min, y_min - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (36,255,12), 2)

# Display the frame
cv2.imshow("RGB", frame)

# Check for keyboard input
key = cv2.waitKey(1)
if key == ord('q'):
# Quit when q is pressed
break

This is what I am attempting to run, I have the camera set to use RGB with pipeline.createcolorcamera() and pass it to the NN. I did use the tool you recommended however I am getting the error [1944301041B4771300] [1.4.1] [3.173] [NeuralNetwork(1)] [warning] Input image (1920x1080) does not match NN (1056x1056). I also attempted to resize the image but this has not fixed it either.

dvdhwrd094

I was able to link the setpreviewsize rather than the video to the NN and that did fix the issue with the sizes not matching but I am getting this error now [1944301041B4771300] [1.4.1] [2.295] [NeuralNetwork(1)] [error] Tried to allocate '86980608'B out of '74571263'B available.

[1944301041B4771300] [1.4.1] [2.303] [NeuralNetwork(1)] [error] Neural network executor '0' out of '2' error: OUT_OF_MEMORY

[1944301041B4771300] [1.4.1] [3.156] [NeuralNetwork(1)] [warning] Input image is interleaved (HWC), NN specifies planar (CHW) data order

[1944301041B4771300] [1.4.1] [6.111] [NeuralNetwork(1)] [critical] Fatal error in openvino 'universal'. Likely because the model was compiled for different openvino version. If you want to select an explicit openvino version use: setOpenVINOVersion while creating pipeline. If error persists please report to developers. Log: 'softMaxNClasses' '157' 'CMX memory is not enough!'.

[1944301041B4771300] [1.4.1] [9.174] [system] [critical] Fatal error. Please report to developers. Log: 'Fatal error on MSS CPU: trap: 00, address: 00000000' '0'

jakaskerl

Hi dvdhwrd094
rgb.video.link(nn.input) should be rgb.preview.link(nn.input).
Also rgb.setPreviewSize(1056, 1056), rgb.setInterleaved(False).

Thanks,
Jaka