Hi @jakaskerl
Please find the MRE below.
I am using our own custom-trained YOLOv6n model. If you can share your official email ID, I can give you access to the model blob and JSON file so you can reproduce the issue.
I also tried to reproduce it using the pretrained yolov4_tiny car-detection model provided by DepthAI under depthai-experiments; however, I don't see any issue when using that model.
You will require the following libraries to run the script:
- OpenCV Python (opencv-python)
- DepthAI
- blobconverter
- numpy
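If any of these are missing, they can be installed from PyPI (assuming the standard package names):

pip install opencv-python depthai blobconverter numpy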
Also, set the model and JSON file paths as the defaults in the argument parser lines; the script will pick up those defaults when you run it from VS Code.
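Alternatively, you can pass the paths explicitly on the command line; a hypothetical invocation (the script name mre.py is just an assumption) would be:

python mre.py -m models/6_shaves_model/best_ckpt_openvino_2022.1_6shave.blob -c models/6_shaves_model/best_ckpt.json

Here is the full script: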
from pathlib import Path
import cv2
import depthai as dai
import numpy as np
import time
import argparse
import json
import blobconverter
# parse arguments
parser = argparse.ArgumentParser()
parser.add_argument("-m", "--model", help="Provide model name or model path for inference",
default='models/6_shaves_model/best_ckpt_openvino_2022.1_6shave.blob', type=str)
parser.add_argument("-c", "--config", help="Provide config path for inference",
default='models/6_shaves_model/best_ckpt.json', type=str)
args = parser.parse_args()
# parse config
configPath = Path(args.config)
if not configPath.exists():
    raise ValueError("Path {} does not exist!".format(configPath))
with configPath.open() as f:
    config = json.load(f)
nnConfig = config.get("nn_config", {})
# parse input shape
if "input_size" in nnConfig:
W, H = tuple(map(int, nnConfig.get("input_size").split('x')))
# extract metadata
metadata = nnConfig.get("NN_specific_metadata", {})
classes = metadata.get("classes", {})
coordinates = metadata.get("coordinates", {})
anchors = metadata.get("anchors", {})
anchorMasks = metadata.get("anchor_masks", {})
iouThreshold = metadata.get("iou_threshold", {})
confidenceThreshold = metadata.get("confidence_threshold", {})
print(metadata)
# parse labels
nnMappings = config.get("mappings", {})
labels = nnMappings.get("labels", {})
print("Labels: ", labels)
# get model path
nnPath = args.model
if not Path(nnPath).exists():
    print("No blob found at {}. Looking into DepthAI model zoo.".format(nnPath))
    nnPath = str(blobconverter.from_zoo(args.model, shaves=6, zoo_type="depthai", use_cache=True))
# sync outputs
syncNN = True
# Create pipeline
pipeline = dai.Pipeline()
# Define sources and outputs
camRgb = pipeline.create(dai.node.ColorCamera)
detectionNetwork = pipeline.create(dai.node.YoloDetectionNetwork)
#xoutRgb = pipeline.create(dai.node.XLinkOut)
nnOut = pipeline.create(dai.node.XLinkOut)
# By Yishu
xoutISP = pipeline.create(dai.node.XLinkOut)
manip = pipeline.create(dai.node.ImageManip)
xoutManip = pipeline.create(dai.node.XLinkOut)
# Passthrough to debug
# Send passthrough frames to the host, so frames are in sync with bounding boxes
passthroughOut = pipeline.create(dai.node.XLinkOut)
passthroughOut.setStreamName("pass")
detectionNetwork.passthrough.link(passthroughOut.input)
#xoutRgb.setStreamName("rgb")
xoutISP.setStreamName("ISP")
nnOut.setStreamName("nn")
xoutManip.setStreamName("Manip")
# Properties
camRgb.setPreviewSize(W, H)
camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_12_MP) #THE_1080_P
camRgb.setInterleaved(False)
camRgb.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)
camRgb.setFps(25) #40
# By Yishu
manip.initialConfig.setKeepAspectRatio(False) #True
manip.initialConfig.setResize(W, H)
# Set the output frame type to BGR888p - Yishu
manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)
# setMaxOutputFrameSize to avoid image bigger than max frame size error - Yishu
manip.setMaxOutputFrameSize(1228800)
# By Yishu
nnOut.input.setBlocking(False)
xoutISP.input.setBlocking(False)
xoutManip.input.setBlocking(False)
# By Yishu
nnOut.input.setQueueSize(10)
xoutISP.input.setQueueSize(10)
xoutManip.input.setQueueSize(10)
detectionNetwork.input.setQueueSize(10)
# Network specific settings
detectionNetwork.setConfidenceThreshold(confidenceThreshold)
detectionNetwork.setNumClasses(classes)
detectionNetwork.setCoordinateSize(coordinates)
detectionNetwork.setAnchors(anchors)
detectionNetwork.setAnchorMasks(anchorMasks)
detectionNetwork.setIouThreshold(iouThreshold)
detectionNetwork.setBlobPath(nnPath)
detectionNetwork.setNumInferenceThreads(2)
detectionNetwork.input.setBlocking(False)
camRgb.isp.link(manip.inputImage)
manip.out.link(detectionNetwork.input)
# By Yishu
manip.out.link(xoutManip.input)
detectionNetwork.out.link(nnOut.input)
camRgb.isp.link(xoutISP.input)
device_info = dai.DeviceInfo("192.168.220.10")
# Connect to device and start pipeline
with dai.Device(pipeline, device_info) as device:
    # Output queues will be used to get the rgb frames and nn data from the outputs defined above
    qDet = device.getOutputQueue("nn", 4, blocking=False)  # device.getOutputQueue("nn", 1, blocking=False)
    qISP = device.getOutputQueue("ISP", 4, blocking=False)
    qManip = device.getOutputQueue("Manip", 4, blocking=False)
    # Passthrough to debug
    qPass = device.getOutputQueue(name="pass")

    frame = None
    detections = []
    startTime = time.monotonic()
    counter = 0
    color2 = (255, 255, 255)
    j = 1

    ### Create the folder that will contain the captures
    folder_name = "D:\\Cameras_Live\\PayOff"
    path = Path(folder_name)

    while True:
        ### Create the folder that will contain the captures
        path.mkdir(parents=True, exist_ok=True)

        # By Yishu
        inISP = qISP.tryGet()  # ISP QUEUE IS ALWAYS NONE WHEN SETTING THE RESOLUTION TO 12MP
        inManip = qManip.tryGet()
        print('1')
        frame_pass = qPass.get()  # .getCvFrame()
        print('2')
        inDet = qDet.get()
        print('3')

        if inISP is not None:
            frame = inISP.getCvFrame()
            nn_fps = counter / (time.monotonic() - startTime)
            print("nn_fps: ", nn_fps)
            cv2.putText(frame, "NN fps: {:.2f}".format(nn_fps),
                        (2, frame.shape[0] - 4), cv2.FONT_HERSHEY_TRIPLEX, 0.4, color2)

        if inDet is not None:
            detections = inDet.detections
            counter += 1

        if cv2.waitKey(1) == ord('q'):
            break

        j = j + 1
        if j == 5:
            j = 1
Thanks,
Yishu