Hello, I am using the OAK-D Pro and am getting an error that the input image resolution does not match the resolution the neural net expects. I used the Luxonis .blob conversion tool for the OAK-D, which only allows the input shape to be square and divisible by 32 or 64. As a result I cannot convert the .blob to accept the 1920x1080 frames the camera defaults to, even if I resize the image in code to what the neural net is expecting. The YOLO model I am using is yolov6l.pt.

    Hi dvdhwrd094
    The input to a neural network should always be RGB - which is only produced by the preview output of the color camera. The size of that preview is set using setPreviewSize(W, H).
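
    For reference, a minimal sketch of that setup (the 1056x1056 size matches the blob discussed in this thread; the blob path is a placeholder):

    import depthai as dai

    pipeline = dai.Pipeline()

    rgb = pipeline.createColorCamera()
    rgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
    rgb.setPreviewSize(1056, 1056)   # RGB preview sized to the NN's expected input

    nn = pipeline.createNeuralNetwork()
    nn.setBlobPath("model.blob")     # placeholder blob path
    rgb.preview.link(nn.input)       # feed the preview, not the video output, to the NN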

    Thanks,
    Jaka

    import cv2
    import depthai as dai
    import numpy as np

    def getFrame(queue):
        # Get frame from queue
        frame = queue.get()
        # Convert frame to OpenCV format and return
        return frame.getCvFrame()

    # Start defining a pipeline
    pipeline = dai.Pipeline()
    print("depthai pipline defined")

    # Set up RGB camera
    rgb = pipeline.createColorCamera()
    print("RGB pipeline created")
    rgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
    print("resolution set")
    rgb.setIspScale(1920//1056, 1080//1056)
    print("downscaling complete")

    # Load the YOLO model
    nn = pipeline.createNeuralNetwork()
    print("neural net pipeline created")
    nn.setBlobPath("../Yolov8CamColor/NNBlob/yolov6l_openvino_2022.1_6shave.blob")
    print("found blob path")

    # Link the RGB camera to the neural network
    rgb.video.link(nn.input)
    print("neural net linked to RGB")

    # Create an output queue for the neural network
    nnOut = pipeline.createXLinkOut()
    nnOut.setStreamName("nn")
    nn.out.link(nnOut.input)
    print("completed output queue for nn")

    # Create an output queue for the RGB camera
    xoutRgb = pipeline.createXLinkOut()
    xoutRgb.setStreamName("rgb")
    rgb.video.link(xoutRgb.input)
    print("completed output queue for RGB camera")

    # Pipeline is defined
    with dai.Device(pipeline) as device:
    # Output queues will be used to get the rgb frames and nn data from the outputs defined above
    rgbQueue = device.getOutputQueue(name="rgb", maxSize=1, blocking=False)
    nnQueue = device.getOutputQueue(name="nn", maxSize=1, blocking=False)
    frame_width = 1056
    frame_height = 1056

    while True:
    # Get the RGB frame
    print("getting RGB frames")
    frame = rgbQueue.get().getCvFrame()

    # Get the detections
    print("getting detections")
    detections = nnQueue.get().getFirstLayerFp16()

            # Only process detections if length is greater than 7
            if len(detections) > 7:
                # Print out the size and content of detections for debugging
                print(f"Size of detections: {len(detections)}")
                print(f"Content of detections: {detections}")

                # Loop over the detections (7 values each) and draw them on the frame
                for i in range(0, len(detections), 7):
                    class_id = int(detections[i + 1])
                    confidence = detections[i + 2]
                    x_min = int(detections[i + 3] * frame_width)
                    y_min = int(detections[i + 4] * frame_height)
                    x_max = int(detections[i + 5] * frame_width)
                    y_max = int(detections[i + 6] * frame_height)

                    # Draw a rectangle around the detected object and add a label
                    cv2.rectangle(frame, (x_min, y_min), (x_max, y_max), (0, 255, 0), 2)
                    cv2.putText(frame, str(class_id), (x_min, y_min - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (36, 255, 12), 2)

            # Display the frame
            cv2.imshow("RGB", frame)

            # Check for keyboard input
            key = cv2.waitKey(1)
            if key == ord('q'):
                # Quit when q is pressed
                break

    This is what I am attempting to run. I have the camera set to use RGB with pipeline.createColorCamera() and pass it to the NN. I did use the tool you recommended; however, I am getting the error [1944301041B4771300] [1.4.1] [3.173] [NeuralNetwork(1)] [warning] Input image (1920x1080) does not match NN (1056x1056). I also attempted to resize the image, but that has not fixed it either.

    I was able to link the preview (with setPreviewSize) to the NN rather than the video output, and that fixed the size mismatch, but now I am getting this error: [1944301041B4771300] [1.4.1] [2.295] [NeuralNetwork(1)] [error] Tried to allocate '86980608'B out of '74571263'B available.

    [1944301041B4771300] [1.4.1] [2.303] [NeuralNetwork(1)] [error] Neural network executor '0' out of '2' error: OUT_OF_MEMORY

    [1944301041B4771300] [1.4.1] [3.156] [NeuralNetwork(1)] [warning] Input image is interleaved (HWC), NN specifies planar (CHW) data order

    [1944301041B4771300] [1.4.1] [6.111] [NeuralNetwork(1)] [critical] Fatal error in openvino 'universal'. Likely because the model was compiled for different openvino version. If you want to select an explicit openvino version use: setOpenVINOVersion while creating pipeline. If error persists please report to developers. Log: 'softMaxNClasses' '157' 'CMX memory is not enough!'.

    [1944301041B4771300] [1.4.1] [9.174] [system] [critical] Fatal error. Please report to developers. Log: 'Fatal error on MSS CPU: trap: 00, address: 00000000' '0'
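
    The OpenVINO log line above suggests pinning the pipeline to the version the blob was compiled for. A minimal sketch, assuming the blob really was exported for OpenVINO 2022.1, as its filename suggests:

    # Pin the pipeline to the OpenVINO release used to compile the blob.
    # This addresses the version mismatch warning; the CMX out-of-memory
    # error may still require a smaller input size or a lighter model.
    pipeline.setOpenVINOVersion(dai.OpenVINO.Version.VERSION_2022_1)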

      Hi dvdhwrd094
      rgb.video.link(nn.input) should be rgb.preview.link(nn.input).
      Also set rgb.setPreviewSize(1056, 1056) and rgb.setInterleaved(False).
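
      Applied to the pipeline posted above, a sketch of the corrected camera and NN setup (size and blob path taken from this thread):

      rgb = pipeline.createColorCamera()
      rgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
      rgb.setPreviewSize(1056, 1056)   # preview size must match the NN input exactly
      rgb.setInterleaved(False)        # NN expects planar (CHW), not interleaved (HWC)

      nn = pipeline.createNeuralNetwork()
      nn.setBlobPath("../Yolov8CamColor/NNBlob/yolov6l_openvino_2022.1_6shave.blob")

      # Link the RGB preview, not the full-resolution video output, to the NN
      rgb.preview.link(nn.input)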

      Thanks,
      Jaka