Hello, I am using the OAK-D Pro and am getting an error that the input image resolution does not match the resolution the neural net expects. I used the Luxonis .blob conversion tool for the OAK-D, which only allows the input shape to be square and divisible by 32 or 64. As a result I cannot convert the .blob to accept the 1920x1080 frames the camera defaults to, even if I resize the image in code to what the neural net is expecting. The YOLO model I am using is yolov6l.pt.

    Hi dvdhwrd094
    The input to a neural network should always be RGB - which is only produced by the preview output of the color camera. The size of that preview is set using setPreviewSize(W, H).
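
    For reference, a minimal sketch of that setup (the 1056x1056 size matches the blob discussed in this thread; the blob path is a placeholder):

    import depthai as dai

    pipeline = dai.Pipeline()

    rgb = pipeline.createColorCamera()
    rgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
    rgb.setPreviewSize(1056, 1056)   # RGB preview sized to the NN's expected input

    nn = pipeline.createNeuralNetwork()
    nn.setBlobPath("model.blob")     # placeholder blob path
    rgb.preview.link(nn.input)       # feed the preview, not the video output, to the NN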

    Thanks,
    Jaka

    import cv2
    import depthai as dai
    import numpy as np

    def getFrame(queue):
        # Get frame from queue
        frame = queue.get()
        # Convert frame to OpenCV format and return
        return frame.getCvFrame()

    # Start defining a pipeline
    pipeline = dai.Pipeline()
    print("depthai pipline defined")

    # Set up RGB camera
    rgb = pipeline.createColorCamera()
    print("RGB pipeline created")
    rgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
    print("resolution set")
    rgb.setIspScale(1920//1056, 1080//1056)
    print("downscaling complete")

    # Load the YOLO model
    nn = pipeline.createNeuralNetwork()
    print("neural net pipeline created")
    nn.setBlobPath("../Yolov8CamColor/NNBlob/yolov6l_openvino_2022.1_6shave.blob")
    print("found blob path")

    # Link the RGB camera to the neural network
    rgb.video.link(nn.input)
    print("neural net linked to RGB")

    # Create an output queue for the neural network
    nnOut = pipeline.createXLinkOut()
    nnOut.setStreamName("nn")
    nn.out.link(nnOut.input)
    print("completed output queue for nn")

    # Create an output queue for the RGB camera
    xoutRgb = pipeline.createXLinkOut()
    xoutRgb.setStreamName("rgb")
    rgb.video.link(xoutRgb.input)
    print("completed output queue for RGB camera")

    # Pipeline is defined
    with dai.Device(pipeline) as device:
    # Output queues will be used to get the rgb frames and nn data from the outputs defined above
    rgbQueue = device.getOutputQueue(name="rgb", maxSize=1, blocking=False)
    nnQueue = device.getOutputQueue(name="nn", maxSize=1, blocking=False)
    frame_width = 1056
    frame_height = 1056

    while True:
    # Get the RGB frame
    print("getting RGB frames")
    frame = rgbQueue.get().getCvFrame()

    # Get the detections
    print("getting detections")
    detections = nnQueue.get().getFirstLayerFp16()

            # Only process detections if length is greater than 7
            if len(detections) > 7:
                # Print out the size and content of detections for debugging
                print(f"Size of detections: {len(detections)}")
                print(f"Content of detections: {detections}")

                # Loop over the detections (7 values each) and draw them on the frame
                for i in range(0, len(detections), 7):
                    class_id = int(detections[i + 1])
                    confidence = detections[i + 2]
                    x_min = int(detections[i + 3] * frame_width)
                    y_min = int(detections[i + 4] * frame_height)
                    x_max = int(detections[i + 5] * frame_width)
                    y_max = int(detections[i + 6] * frame_height)

                    # Draw a rectangle around the detected object and add a label
                    cv2.rectangle(frame, (x_min, y_min), (x_max, y_max), (0, 255, 0), 2)
                    cv2.putText(frame, str(class_id), (x_min, y_min - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (36, 255, 12), 2)

            # Display the frame
            cv2.imshow("RGB", frame)

            # Check for keyboard input
            key = cv2.waitKey(1)
            if key == ord('q'):
                # Quit when q is pressed
                break

    This is what I am attempting to run. I have the camera set to use RGB with pipeline.createColorCamera() and pass it to the NN. I did use the tool you recommended; however, I am getting the error [1944301041B4771300] [1.4.1] [3.173] [NeuralNetwork(1)] [warning] Input image (1920x1080) does not match NN (1056x1056). I also attempted to resize the image, but that has not fixed it either.

    I was able to link the preview (with setPreviewSize) to the NN rather than the video output, and that fixed the size mismatch, but now I am getting this error: [1944301041B4771300] [1.4.1] [2.295] [NeuralNetwork(1)] [error] Tried to allocate '86980608'B out of '74571263'B available.

    [1944301041B4771300] [1.4.1] [2.303] [NeuralNetwork(1)] [error] Neural network executor '0' out of '2' error: OUT_OF_MEMORY

    [1944301041B4771300] [1.4.1] [3.156] [NeuralNetwork(1)] [warning] Input image is interleaved (HWC), NN specifies planar (CHW) data order

    [1944301041B4771300] [1.4.1] [6.111] [NeuralNetwork(1)] [critical] Fatal error in openvino 'universal'. Likely because the model was compiled for different openvino version. If you want to select an explicit openvino version use: setOpenVINOVersion while creating pipeline. If error persists please report to developers. Log: 'softMaxNClasses' '157' 'CMX memory is not enough!'.

    [1944301041B4771300] [1.4.1] [9.174] [system] [critical] Fatal error. Please report to developers. Log: 'Fatal error on MSS CPU: trap: 00, address: 00000000' '0'
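
    The OpenVINO log line above suggests pinning the pipeline to the version the blob was compiled for. A minimal sketch, assuming the blob really was exported for OpenVINO 2022.1, as its filename suggests:

    # Pin the pipeline to the OpenVINO release used to compile the blob.
    # This addresses the version mismatch warning; the CMX out-of-memory
    # error may still require a smaller input size or a lighter model.
    pipeline.setOpenVINOVersion(dai.OpenVINO.Version.VERSION_2022_1)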

      Hi dvdhwrd094
      rgb.video.link(nn.input) should be rgb.preview.link(nn.input).
      Also set rgb.setPreviewSize(1056, 1056) and rgb.setInterleaved(False).
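
      Applied to the pipeline posted above, a sketch of the corrected camera and NN setup (size and blob path taken from this thread):

      rgb = pipeline.createColorCamera()
      rgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
      rgb.setPreviewSize(1056, 1056)   # preview size must match the NN input exactly
      rgb.setInterleaved(False)        # NN expects planar (CHW), not interleaved (HWC)

      nn = pipeline.createNeuralNetwork()
      nn.setBlobPath("../Yolov8CamColor/NNBlob/yolov6l_openvino_2022.1_6shave.blob")

      # Link the RGB preview, not the full-resolution video output, to the NN
      rgb.preview.link(nn.input)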

      Thanks,
      Jaka