yishu_corpex

  • 6 days ago
  • Joined Apr 24, 2024
  • 0 best answers
  • @yishu_corpex

    1. Can you add some minimal reproducible code? It's hard to say what is going on here.
    2. YUV420 is stored as an array of Y (luma) values first, followed by the subsampled U and V (chroma) values. It doesn't have channels. If you do a getData() on a 1080p image you will see it has a size of 2,073,600 (1920*1080 Y) + 518,400 (960*540 U) + 518,400 (960*540 V) = 3,110,400 values, all in a single channel. The shape of that buffer would be (1620, 1920).
      When converting to BGR via getCvFrame(), the U and V planes are upsampled to the Y dimensions and combined to form BGR (3-channel data); see the sketch below.
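      A minimal sketch of that layout (the buffer here is synthetic; with a real device you would use imgFrame.getData() and getWidth()/getHeight() instead, and the conversion assumes the default NV12 frame type):

      import cv2
      import numpy as np

      w, h = 1920, 1080
      # Single-channel YUV420 buffer: 2,073,600 Y + 518,400 U + 518,400 V = 3,110,400 values
      yuv = np.zeros(w * h * 3 // 2, dtype=np.uint8)
      yuv_plane = yuv.reshape(h * 3 // 2, w)                 # shape (1620, 1920)
      # Roughly what getCvFrame() does: upsample the chroma planes and combine into 3-channel BGR
      bgr = cv2.cvtColor(yuv_plane, cv2.COLOR_YUV2BGR_NV12)
      print(bgr.shape)                                       # (1080, 1920, 3)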

    Thanks,
    Jaka

  • Hi @yishu_corpex ,
    OAK-D-SR does not have a ToF sensor - only the OAK-D-SR-POE has one.

  • @yishu_corpex yes, the Python script runs within a Script node on the device, and it handles the HTTP server. Yes, it should be 1080P by default if the sensor supports it.

  • Hi @yishu_corpex ,
    H264/H265 is much more complex than MJPEG, so you can't just create a GET server and expect the browser to display an H264 stream out of the box (as it does for MJPEG). Perhaps consult GPT-4 on how to achieve that (both from the serving and the displaying perspective).

    • Hi @yishu_corpex
      I tried with 12MP and the pipeline doesn't crash. Are you using a POE device by any chance (or a low bandwidth USB)? You can check in the pipeline graph how the FPS slowly starts to decline because the XLink nodes can't send frames fast enough.

      Thanks,
      Jaka

    • Hi @yishu_corpex
      Seems like the XLink was stressed out, so I removed the unsynced outputs and changed all inputs to non-blocking. I can get about 10 FPS overall.

      PRO-TIP: use the pipeline_graph tool; it comes in handy in case of pipeline failures.

      
      import cv2
      import depthai as dai
      import numpy as np
      import argparse
      import json
      import time
      import sys
      from pathlib import Path
      import blobconverter
      from datetime import timedelta
      
      
      def getDisparityFrame(frame, cvColorMap):
          maxDisp = stereo.initialConfig.getMaxDisparity()
          disp = (frame * (255.0 / maxDisp)).astype(np.uint8)
          disp = cv2.applyColorMap(disp, cvColorMap)
      
          return disp
      
      # parse arguments 
      parser = argparse.ArgumentParser()
      parser.add_argument("-m", "--model", help="Provide model name or model path for inference",
                          default='models/best_ckpt_openvino_2022.1_6shave.blob', type=str)
      parser.add_argument("-c", "--config", help="Provide config path for inference",
                          default='models/best_ckpt.json', type=str)
      args = parser.parse_args()
      
      
      # parse config
      configPath = Path(args.config)
      if not configPath.exists():
          raise ValueError("Path {} does not exist!".format(configPath))
      
      with configPath.open() as f:
          config = json.load(f)
      nnConfig = config.get("nn_config", {})
      
      # parse input shape
      if "input_size" in nnConfig:
          W, H = tuple(map(int, nnConfig.get("input_size").split('x')))
      
      # extract metadata
      metadata = nnConfig.get("NN_specific_metadata", {})
      classes = metadata.get("classes", {})
      coordinates = metadata.get("coordinates", {})
      anchors = metadata.get("anchors", {})
      anchorMasks = metadata.get("anchor_masks", {})
      iouThreshold = metadata.get("iou_threshold", {})
      confidenceThreshold = metadata.get("confidence_threshold", {})
      
      #print(metadata)
      
      # parse labels
      nnMappings = config.get("mappings", {})
      labels = nnMappings.get("labels", {})
      #print("Labels: ", labels)
      
      # get model path
      nnPath = args.model
      if not Path(nnPath).exists():
          print("No blob found at {}. Looking into DepthAI model zoo.".format(nnPath))
          nnPath = str(blobconverter.from_zoo(args.model, shaves = 6, zoo_type = "depthai", use_cache=True))
      
      # sync outputs
      syncNN = True
      
      # Depth and Disparity
      outRectified = False  # Output and display rectified streams
      lrcheck = True  # Better handling for occlusions
      extended = False  # Closer-in minimum depth, disparity range is doubled
      subpixel = True  # Better accuracy for longer distance, fractional disparity 32-levels
      depth = True  # Display depth frames
      swap_left_right = False #Swap left right frames
      alpha = None
      resolution = dai.MonoCameraProperties.SensorResolution.THE_400_P #THE_800_P
      
      median = dai.StereoDepthProperties.MedianFilter.KERNEL_7x7 # 7x7
      
      # Create pipeline
      pipeline = dai.Pipeline()
      
      # Define sources and outputs
      camRgb = pipeline.create(dai.node.ColorCamera)
      spatialdetectionNetwork = pipeline.create(dai.node.YoloSpatialDetectionNetwork)
      #xoutRgb = pipeline.create(dai.node.XLinkOut)
      nnOut = pipeline.create(dai.node.XLinkOut)
      
      
      
      manip = pipeline.create(dai.node.ImageManip)
      
      # Depth and Disparity
      camLeft = pipeline.create(dai.node.MonoCamera)
      camRight = pipeline.create(dai.node.MonoCamera)
      stereo = pipeline.create(dai.node.StereoDepth)
      
      
      
      # Sync node
      sync = pipeline.create(dai.node.Sync)
      xoutGrp = pipeline.create(dai.node.XLinkOut)
      xoutGrp.setStreamName("xout")
      
      #xoutRgb.setStreamName("rgb")
      
      nnOut.setStreamName("nn")
      
      # Properties
      camRgb.setPreviewSize(W, H)
      
      camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_4_K) #THE_1080_P
      camRgb.setInterleaved(False)
      camRgb.setColorOrder(dai.ColorCameraProperties.ColorOrder.BGR) # BGR color channels rather than RGB; experiments showed fewer false positives, in accordance with model training.
      camRgb.setFps(15) #40
      
      
      # By Yishu
      manip.initialConfig.setKeepAspectRatio(False) #True
      manip.initialConfig.setResize(W, H)
      
      # Change to RGB image than BGR - Yishu
      manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p) #dai.ImgFrame.Type.RGB888p
      
      # setMaxOutputFrameSize to avoid image bigger than max frame size error - Yishu
      manip.setMaxOutputFrameSize(1228800)
      
      # Sync node properties
      sync.setSyncThreshold(timedelta(milliseconds=100))
      
      # By Yishu
      nnOut.input.setBlocking(False)
      
      
      # Network specific settings
      spatialdetectionNetwork.setConfidenceThreshold(confidenceThreshold)
      spatialdetectionNetwork.setNumClasses(classes)
      spatialdetectionNetwork.setCoordinateSize(coordinates)
      spatialdetectionNetwork.setAnchors(anchors)
      spatialdetectionNetwork.setAnchorMasks(anchorMasks)
      spatialdetectionNetwork.setIouThreshold(iouThreshold)
      spatialdetectionNetwork.setBlobPath(nnPath)
      spatialdetectionNetwork.setNumInferenceThreads(2)
      spatialdetectionNetwork.input.setBlocking(False)
      
      # SpatialYolo specific parameters
      spatialdetectionNetwork.setBoundingBoxScaleFactor(0.5)
      spatialdetectionNetwork.setDepthLowerThreshold(100)
      spatialdetectionNetwork.setDepthUpperThreshold(5000)
      
      # Depth and Disparity Properties
      if swap_left_right:
          camLeft.setCamera("right")
          camRight.setCamera("left")
      else:
          camLeft.setCamera("left")
          camRight.setCamera("right")
      
      for monoCam in (camLeft, camRight):  # Common config
          monoCam.setResolution(resolution)
          monoCam.setFps(15)
      
      stereo.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.HIGH_DENSITY)
      stereo.initialConfig.setMedianFilter(median)  # KERNEL_7x7 default
      stereo.setRectifyEdgeFillColor(0)  # Black, to better see the cutout
      stereo.setLeftRightCheck(lrcheck)
      stereo.setExtendedDisparity(extended)
      stereo.setSubpixel(subpixel)
      if alpha is not None:
          stereo.setAlphaScaling(alpha)
          config = stereo.initialConfig.get()
          config.postProcessing.brightnessFilter.minBrightness = 0
          stereo.initialConfig.set(config)
      
      # Alignment with RGB - Ahmad
      stereo.setDepthAlign(dai.CameraBoardSocket.RGB)
      
      # # 4056x3040
      stereo.setOutputSize(1248, 936)
      
      # Linking
      camRgb.isp.link(manip.inputImage)
      manip.inputImage.setBlocking(False)
      manip.inputImage.setQueueSize(10)
      manip.out.link(spatialdetectionNetwork.input)
      
      # By Yishu
      
      # Linking stereo depth as an input to the spatial detection network
      stereo.depth.link(spatialdetectionNetwork.inputDepth)
      
      spatialdetectionNetwork.out.link(nnOut.input)
      
      # Syncing NN and ISP
      spatialdetectionNetwork.out.link(sync.inputs["NN_Sync"])
      camRgb.isp.link(sync.inputs["ISP_Sync"])
      
      sync.inputs["NN_Sync"].setBlocking(False)
      sync.inputs["ISP_Sync"].setBlocking(False)
      
      sync.out.link(xoutGrp.input)
      xoutGrp.input.setBlocking(False)
      
      # Depth and Disparity Linking
      camLeft.out.link(stereo.left)
      camRight.out.link(stereo.right)
      
      
      
      
      streams = ["left", "right", "ISP","nn"] #Change by ahmad
      if outRectified:
          streams.extend(["rectifiedLeft", "rectifiedRight"])
      streams.append("disparity")
      if depth:
          streams.append("depth")
      
      cvColorMap = cv2.applyColorMap(np.arange(256, dtype=np.uint8), cv2.COLORMAP_JET)
      cvColorMap[0] = [0, 0, 0]
      
      # TakeUp Camera Static IP address
      device_info = dai.DeviceInfo("")
      
      try:
          # Connect to device and start pipeline
          with dai.Device(pipeline, device_info) as device:
              # Output queues will be used to get the rgb frames and nn data from the outputs defined above
              qDet = device.getOutputQueue("nn", 4, blocking=False)
      
              qSync = device.getOutputQueue("xout", 4, False)
              msgGrp = None
      
              frame = None
              m_frame = None
      
              Disparityframe = None
              Depthframe = None
              LeftFrame = None
              RightFrame = None
      
              detections = []
              startTime = time.monotonic()
              counter = 0
              color2 = (255, 255, 255)
      
              # nn data, being the bounding box locations, are in <0..1> range - they need to be normalized with frame width/height
              def frameNorm(frame, bbox):
                  normVals = np.full(len(bbox), frame.shape[0])
                  normVals[::2] = frame.shape[1]
                  print(np.clip(np.array(bbox), 0, 1))
                  print(normVals)
                  print((np.clip(np.array(bbox), 0, 1) * normVals).astype(int))
                  return (np.clip(np.array(bbox), 0, 1) * normVals).astype(int)
      
              def displayFrame(name, frame, detections, i):
                  color_spool = (255, 0, 0)
                  color_person = (0, 255, 0)
                  color = ''
                  
                  for detection in detections:
                      bbox = frameNorm(frame, (detection.xmin, detection.ymin, detection.xmax, detection.ymax))
                      if labels[detection.label] == 'person':
                          color = color_person
                      else:
                          color = color_spool
                      cv2.putText(frame, labels[detection.label], (bbox[0] + 10, bbox[1] + 20), cv2.FONT_HERSHEY_TRIPLEX, 0.5, (255,255,255))
                      cv2.putText(frame, f"{int(detection.confidence * 100)}%", (bbox[0] + 10, bbox[1] + 40), cv2.FONT_HERSHEY_TRIPLEX, 0.5, (255,255,255))
                      cv2.rectangle(frame, (bbox[0], bbox[1]), (bbox[2], bbox[3]), color, 2)
                      
                  cv2.namedWindow(name, cv2.WINDOW_NORMAL)
                  cv2.imshow(name, frame)
                  print("show complete")
                  
      
              
              j=1
              ###Create the folder that will contain capturing
      
              syncframe = None
              sync_detections = None
      
              while True:
                  # Delay to match DA reading fps
                  #time.sleep(4)
                  print("0")
                  msgGrp = qSync.get()
                  print("1")
                  inDet = qDet.get()
                  print("4: ", inDet)
                  
      
                  if msgGrp is not None:
                      print('msgGrp: ', msgGrp)
                      for name, msg in msgGrp:
                          if name == "ISP_Sync":
                              syncframe = msg.getCvFrame()
                              print("syncframe: ")
                          if name == "NN_Sync":
                              sync_detections = msg.detections
                              print("sync_detections: ")
                          if syncframe is not None and sync_detections is not None:
                              print("Both sync not none")
                              displayFrame("sync_isp", syncframe, sync_detections, j)
                              print("display_complete")
                          
                  
                  print("9")
                  print("12")
                  if cv2.waitKey(1) == ord('q'):
                      break
      
                  j=j+1
                  '''if j==5:
                      j=1'''
              
      except Exception as e:
              print(f"An error occurred: {e}")
              sys.exit(0)

      Thanks,
      Jaka

    • Hi @yishu_corpex
      I removed the Sync node since it was causing issues; I didn't change anything else, but the RGB and NN outputs still seem to be in sync:

      
      import cv2
      import depthai as dai
      import numpy as np
      import argparse
      import json
      import time
      import sys
      from pathlib import Path
      import blobconverter
      from datetime import timedelta
      
      
      
      
      def getDisparityFrame(frame, cvColorMap):
          maxDisp = stereo.initialConfig.getMaxDisparity()
          disp = (frame * (255.0 / maxDisp)).astype(np.uint8)
          disp = cv2.applyColorMap(disp, cvColorMap)
      
          return disp
      
      # parse arguments 
      parser = argparse.ArgumentParser()
      parser.add_argument("-m", "--model", help="Provide model name or model path for inference",
                          default='models/best_ckpt_openvino_2022.1_6shave.blob', type=str)
      parser.add_argument("-c", "--config", help="Provide config path for inference",
                          default='models/best_ckpt.json', type=str)
      args = parser.parse_args()
      
      
      # parse config
      configPath = Path(args.config)
      if not configPath.exists():
          raise ValueError("Path {} does not exist!".format(configPath))
      
      with configPath.open() as f:
          config = json.load(f)
      nnConfig = config.get("nn_config", {})
      
      # parse input shape
      if "input_size" in nnConfig:
          W, H = tuple(map(int, nnConfig.get("input_size").split('x')))
      
      # extract metadata
      metadata = nnConfig.get("NN_specific_metadata", {})
      classes = metadata.get("classes", {})
      coordinates = metadata.get("coordinates", {})
      anchors = metadata.get("anchors", {})
      anchorMasks = metadata.get("anchor_masks", {})
      iouThreshold = metadata.get("iou_threshold", {})
      confidenceThreshold = metadata.get("confidence_threshold", {})
      
      #print(metadata)
      
      # parse labels
      nnMappings = config.get("mappings", {})
      labels = nnMappings.get("labels", {})
      #print("Labels: ", labels)
      
      # get model path
      nnPath = args.model
      if not Path(nnPath).exists():
          print("No blob found at {}. Looking into DepthAI model zoo.".format(nnPath))
          nnPath = str(blobconverter.from_zoo(args.model, shaves = 6, zoo_type = "depthai", use_cache=True))
      
      # sync outputs
      syncNN = True
      
      # Depth and Disparity
      outRectified = False  # Output and display rectified streams
      lrcheck = True  # Better handling for occlusions
      extended = False  # Closer-in minimum depth, disparity range is doubled
      subpixel = True  # Better accuracy for longer distance, fractional disparity 32-levels
      depth = True  # Display depth frames
      swap_left_right = False #Swap left right frames
      alpha = None
      resolution = dai.MonoCameraProperties.SensorResolution.THE_400_P #THE_800_P
      
      median = dai.StereoDepthProperties.MedianFilter.KERNEL_7x7 # 7x7
      
      # Create pipeline
      pipeline = dai.Pipeline()
      
      # Define sources and outputs
      camRgb = pipeline.create(dai.node.ColorCamera)
      spatialdetectionNetwork = pipeline.create(dai.node.YoloSpatialDetectionNetwork)
      #xoutRgb = pipeline.create(dai.node.XLinkOut)
      nnOut = pipeline.create(dai.node.XLinkOut)
      
      
      # By Yishu
      xoutISP = pipeline.create(dai.node.XLinkOut)
      manip = pipeline.create(dai.node.ImageManip)
      xoutManip = pipeline.create(dai.node.XLinkOut)
      
      # Depth and Disparity
      camLeft = pipeline.create(dai.node.MonoCamera)
      camRight = pipeline.create(dai.node.MonoCamera)
      stereo = pipeline.create(dai.node.StereoDepth)
      xoutLeft = pipeline.create(dai.node.XLinkOut)
      xoutRight = pipeline.create(dai.node.XLinkOut)
      xoutDepth = pipeline.create(dai.node.XLinkOut)
      xoutRectifLeft = pipeline.create(dai.node.XLinkOut)
      xoutRectifRight = pipeline.create(dai.node.XLinkOut)
      
      # Sync node
      
      
      #xoutRgb.setStreamName("rgb")
      xoutISP.setStreamName("ISP")
      nnOut.setStreamName("nn")
      xoutManip.setStreamName("Manip")
      
      xoutLeft.setStreamName("left")
      xoutRight.setStreamName("right")
      xoutDepth.setStreamName("depth")
      xoutRectifLeft.setStreamName("rectifiedLeft")
      xoutRectifRight.setStreamName("rectifiedRight")
      
      # Properties
      camRgb.setPreviewSize(W, H)
      
      camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P) #THE_1080_P
      camRgb.setInterleaved(False)
      camRgb.setColorOrder(dai.ColorCameraProperties.ColorOrder.BGR) # BGR color channels rather than RGB; experiments showed fewer false positives, in accordance with model training.
      camRgb.setFps(30) #40
      
      camRgb.setIspNumFramesPool(10)
      
      # By Yishu
      manip.initialConfig.setKeepAspectRatio(False) #True
      manip.initialConfig.setResize(W, H)
      
      # Change to RGB image than BGR - Yishu
      manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p) #dai.ImgFrame.Type.RGB888p
      
      # setMaxOutputFrameSize to avoid image bigger than max frame size error - Yishu
      manip.setMaxOutputFrameSize(1228800)
      
      
      # By Yishu
      nnOut.input.setBlocking(False)
      xoutISP.input.setBlocking(False)
      xoutManip.input.setBlocking(False)
      
      # By Yishu
      nnOut.input.setQueueSize(10)
      xoutISP.input.setQueueSize(10)
      xoutManip.input.setQueueSize(10)
      spatialdetectionNetwork.input.setQueueSize(10)
      
      # Network specific settings
      spatialdetectionNetwork.setConfidenceThreshold(confidenceThreshold)
      spatialdetectionNetwork.setNumClasses(classes)
      spatialdetectionNetwork.setCoordinateSize(coordinates)
      spatialdetectionNetwork.setAnchors(anchors)
      spatialdetectionNetwork.setAnchorMasks(anchorMasks)
      spatialdetectionNetwork.setIouThreshold(iouThreshold)
      spatialdetectionNetwork.setBlobPath(nnPath)
      spatialdetectionNetwork.setNumInferenceThreads(2)
      spatialdetectionNetwork.input.setBlocking(False)
      
      # SpatialYolo specific parameters
      spatialdetectionNetwork.setBoundingBoxScaleFactor(0.5)
      spatialdetectionNetwork.setDepthLowerThreshold(100)
      spatialdetectionNetwork.setDepthUpperThreshold(5000)
      
      # Depth and Disparity Properties
      if swap_left_right:
          camLeft.setCamera("right")
          camRight.setCamera("left")
      else:
          camLeft.setCamera("left")
          camRight.setCamera("right")
      
      for monoCam in (camLeft, camRight):  # Common config
          monoCam.setResolution(resolution)
          monoCam.setFps(2)
      
      stereo.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.HIGH_DENSITY)
      stereo.initialConfig.setMedianFilter(median)  # KERNEL_7x7 default
      stereo.setRectifyEdgeFillColor(0)  # Black, to better see the cutout
      stereo.setLeftRightCheck(lrcheck)
      stereo.setExtendedDisparity(extended)
      stereo.setSubpixel(subpixel)
      if alpha is not None:
          stereo.setAlphaScaling(alpha)
          config = stereo.initialConfig.get()
          config.postProcessing.brightnessFilter.minBrightness = 0
          stereo.initialConfig.set(config)
      
      # Alignment with RGB - Ahmad
      stereo.setDepthAlign(dai.CameraBoardSocket.RGB)
      
      # # 4056x3040
      stereo.setOutputSize(1248, 936)
      
      # Linking
      camRgb.video.link(manip.inputImage)
      manip.out.link(spatialdetectionNetwork.input)
      
      # By Yishu
      manip.out.link(xoutManip.input)
      
      # Linking stereo depth as an input to the spatial detection network
      stereo.depth.link(spatialdetectionNetwork.inputDepth)
      
      spatialdetectionNetwork.out.link(nnOut.input)
      camRgb.isp.link(xoutISP.input)
      
      # Depth and Disparity Linking
      camLeft.out.link(stereo.left)
      camRight.out.link(stereo.right)
      stereo.syncedLeft.link(xoutLeft.input)
      stereo.syncedRight.link(xoutRight.input)
      if depth:
          stereo.depth.link(xoutDepth.input)
      if outRectified:
          stereo.rectifiedLeft.link(xoutRectifLeft.input)
          stereo.rectifiedRight.link(xoutRectifRight.input)
      
      # nn data, being the bounding box locations, are in <0..1> range - they need to be normalized with frame width/height
      def frameNorm(frame, bbox):
          normVals = np.full(len(bbox), frame.shape[0])
          normVals[::2] = frame.shape[1]
          print(np.clip(np.array(bbox), 0, 1))
          print(normVals)
          print((np.clip(np.array(bbox), 0, 1) * normVals).astype(int))
          return (np.clip(np.array(bbox), 0, 1) * normVals).astype(int)
      
      def displayFrame(name, frame, detections, i):
          color_spool = (255, 0, 0)
          color_person = (0, 255, 0)
          color = ''
          
          #print("Detections: ", [[d.label, d.confidence *100, d.xmin, d.ymin, d.xmax, d.ymax] for d in detections])
          #text_file = open(text_output_path, 'w')
          for detection in detections:
              bbox = frameNorm(frame, (detection.xmin, detection.ymin, detection.xmax, detection.ymax))
              if labels[detection.label] == 'person':
                  color = color_person
              else:
                  color = color_spool
              cv2.putText(frame, labels[detection.label], (bbox[0] + 10, bbox[1] + 20), cv2.FONT_HERSHEY_TRIPLEX, 0.5, (255,255,255))
              cv2.putText(frame, f"{int(detection.confidence * 100)}%", (bbox[0] + 10, bbox[1] + 40), cv2.FONT_HERSHEY_TRIPLEX, 0.5, (255,255,255))
              cv2.rectangle(frame, (bbox[0], bbox[1]), (bbox[2], bbox[3]), color, 2)
              
          
          cv2.imwrite("E:\\Cameras_Live\\Test_writing_imgs\\Image{}.bmp".format(j), frame)
          cv2.namedWindow(name, cv2.WINDOW_NORMAL)
          cv2.imshow(name, frame)
          print("show complete")
      
      
      streams = ["left", "right", "ISP","nn"] #Change by ahmad
      if outRectified:
          streams.extend(["rectifiedLeft", "rectifiedRight"])
      if depth:
          streams.append("depth")
      
      cvColorMap = cv2.applyColorMap(np.arange(256, dtype=np.uint8), cv2.COLORMAP_JET)
      cvColorMap[0] = [0, 0, 0]
      
      # TakeUp Camera Static IP address
      device_info = dai.DeviceInfo()
      
      try:
          # Connect to device and start pipeline
          with dai.Device(pipeline, device_info) as device:
              # Output queues will be used to get the rgb frames and nn data from the outputs defined above
              qDet = device.getOutputQueue("nn", 4, blocking=False)
              qISP = device.getOutputQueue("ISP", 4, blocking=False)
              qManip = device.getOutputQueue("Manip", 4, blocking=False)
      
              qDepth = device.getOutputQueue("depth", 1, blocking=False)
              qLeft = device.getOutputQueue("left", 1, blocking=False)
              qRight = device.getOutputQueue("right", 1, blocking=False)
      
              frame = None
              m_frame = None
              Depthframe = None
              LeftFrame = None
              RightFrame = None
      
              detections = []
              startTime = time.monotonic()
              counter = 0
              color2 = (255, 255, 255)
      
              # nn data, being the bounding box locations, are in <0..1> range - they need to be normalized with frame width/height
              def frameNorm(frame, bbox):
                  normVals = np.full(len(bbox), frame.shape[0])
                  normVals[::2] = frame.shape[1]
                  print(np.clip(np.array(bbox), 0, 1))
                  print(normVals)
                  print((np.clip(np.array(bbox), 0, 1) * normVals).astype(int))
                  return (np.clip(np.array(bbox), 0, 1) * normVals).astype(int)
      
              def displayFrame(name, frame, detections, i):
                  color_spool = (255, 0, 0)
                  color_person = (0, 255, 0)
                  color = ''
                  color_blank_img = (255, 255, 255)
                  blank_image = np.zeros((3040,4056,1), np.uint8)
                  text_output_path = "D:\\Cameras_Live\\TakeUp\\" + "Label{}".format(j) + ".txt"
                  #print("Detections: ", [[d.label, d.confidence *100, d.xmin, d.ymin, d.xmax, d.ymax] for d in detections])
                  #text_file = open(text_output_path, 'w')
                  for detection in detections:
                      bbox = frameNorm(frame, (detection.xmin, detection.ymin, detection.xmax, detection.ymax))
                      if labels[detection.label] == 'person':
                          color = color_person
                      else:
                          color = color_spool
                      cv2.putText(frame, labels[detection.label], (bbox[0] + 10, bbox[1] + 20), cv2.FONT_HERSHEY_TRIPLEX, 0.5, (255,255,255))
                      cv2.putText(frame, f"{int(detection.confidence * 100)}%", (bbox[0] + 10, bbox[1] + 40), cv2.FONT_HERSHEY_TRIPLEX, 0.5, (255,255,255))
                      cv2.rectangle(frame, (bbox[0], bbox[1]), (bbox[2], bbox[3]), color, 2)
                      
                  
                  cv2.imwrite("E:\\Cameras_Live\\Test_writing_imgs\\Image{}.bmp".format(j), frame)
                  cv2.namedWindow(name, cv2.WINDOW_NORMAL)
                  cv2.imshow(name, frame)
                  print("show complete")
                  
      
              
              j=1
              ###Create the folder that will contain capturing
              folder_name = "D:\\Cameras_Live\\TakeUp"
              path = Path(folder_name)
      
              syncframe = None
              sync_detections = None
      
              while True:
      
                  ###Create the folder that will contain capturing
                  
      
                  # Delay to match DA reading fps
                  #time.sleep(4)
                  import time
      
                  start_time = time.time()
      
                  # Print and reset timing
                  end_time = time.time()
                  print("1 - Time taken: {:.6f} seconds".format(end_time - start_time))
                  start_time = end_time
      
                  # By Yishu
                  inISP = qISP.get()
      
                  end_time = time.time()
                  print("2: ", inISP, " - Time taken: {:.6f} seconds".format(end_time - start_time))
                  start_time = end_time
      
                  inManip = qManip.get()
      
                  end_time = time.time()
                  print("3: ", inManip, " - Time taken: {:.6f} seconds".format(end_time - start_time))
                  start_time = end_time
      
                  inDet = qDet.get()
      
                  end_time = time.time()
                  print("4: ", inDet, " - Time taken: {:.6f} seconds".format(end_time - start_time))
                  start_time = end_time
      
                  inLeft = qLeft.get()
      
                  end_time = time.time()
                  print("5 - Time taken: {:.6f} seconds".format(end_time - start_time))
                  start_time = end_time
      
                  inRight = qRight.get()
      
                  end_time = time.time()
                  print("6 - Time taken: {:.6f} seconds".format(end_time - start_time))
                  start_time = end_time
      
                  inDepth = qDepth.get()
      
                  end_time = time.time()
                  print("7 - Time taken: {:.6f} seconds".format(end_time - start_time))
                  start_time = end_time
      
                  displayFrame("Color", inISP.getCvFrame(), inDet.detections, j)
      
          
      
                  end_time = time.time()
                  print("8 - Time taken: {:.6f} seconds".format(end_time - start_time))
      
                  
                  if inManip is not None:
                      m_frame = inManip.getCvFrame()
                  
                  print("9")
                  
                  if inLeft is not None:
                      LeftFrame = inLeft.getCvFrame()
                      #cv2.imwrite("D:\\Cameras_Live\\TakeUp\\Left{}.bmp".format(j),LeftFrame)
                  
                  print("10")
                  if inRight is not None:
                      RightFrame = inRight.getCvFrame()
                      #cv2.imwrite("D:\\Cameras_Live\\TakeUp\\Right{}.bmp".format(j),RightFrame)
      
                  print("11")
                  if inDepth is not None:
                      Depthframe = inDepth.getCvFrame()
                      Depthframe = Depthframe.astype(np.uint16)
                      #cv2.imwrite("D:\\Cameras_Live\\TakeUp\\Depth{}.png".format(j),Depthframe)
      
                  print("12")
                  if cv2.waitKey(1) == ord('q'):
                      break
      
                  j=j+1
                  '''if j==5:
                      j=1'''
              
      except Exception as e:
              print(f"An error occurred: {e}")
              sys.exit(0)
    • Hi @yishu_corpex

      yishu_corpex Let me know how this can be resolved, and if you want I can share with you the MRE for this issue so you can test it at your end as well.

      Yes, this would be best so I can quickly just alter the code.

      Thanks,
      Jaka

    • Hi @yishu_corpex

      config = dai.SpatialLocationCalculatorConfigData()
      config.depthThresholds.lowerThreshold = 100
      config.depthThresholds.upperThreshold = 10000
      config.calculationAlgorithm = dai.SpatialLocationCalculatorAlgorithm.MIN
      config.roi = dai.Rect(topLeft, bottomRight)
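      To actually apply it, pass the config to the node. A short sketch, assuming a SpatialLocationCalculator node named spatialCalc and its config input queue spatialCalcConfigInQueue (names are illustrative):

      # At pipeline build time:
      spatialCalc.initialConfig.addROI(config)

      # Or at runtime, through the node's inputConfig:
      cfg = dai.SpatialLocationCalculatorConfig()
      cfg.addROI(config)
      spatialCalcConfigInQueue.send(cfg)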

      Thanks,
      Jaka

    • Hi @yishu_corpex

      yishu_corpex I wanted to check with you if you can provide any update on the resolution of this issue.

      Can't say for certain, our FW devs are working on it.

      yishu_corpex Please let me know as it is becoming a major bottleneck for many of our projects.

      Why not just use the preview? Do you absolutely need the full FOV? You can do something like this to still preserve the FOV:

      # Properties
      camRgb.setBoardSocket(dai.CameraBoardSocket.CAM_A)
      camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_12_MP)
      camRgb.setIspScale(1, 4)
      
      w, h = camRgb.getIspSize()
      camRgb.setPreviewSize(640, 400)
      camRgb.setPreviewKeepAspectRatio(False)
      
      xoutVideo.input.setBlocking(False)
      xoutVideo.input.setQueueSize(1)
      
      # Linking
      camRgb.preview.link(xoutVideo.input)

      Thanks,
      Jaka

    • Hi @yishu_corpex
      In the case of the SpatialLocationCalculator, the BBOX you see in the examples is the group of points taken into account when calculating the depth. The depth is given by an averaging method (usually MEDIAN, but it can also be MIN, MAX, AVERAGE, ...).

      The same goes for SpatialDetectionNetworks: the detected bounding box of an object is first scaled by the bounding box scale factor (setBoundingBoxScaleFactor), then the points inside that new BBOX are used to calculate the depth across those pixels. X and Y are taken from the centroid of that BBOX.
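      In code this roughly corresponds to the following sketch (setSpatialCalculationAlgorithm is assumed to be available in your depthai version):

      # Shrink each detected BBOX to 50% of its size before sampling depth pixels inside it
      spatialdetectionNetwork.setBoundingBoxScaleFactor(0.5)
      # Averaging method applied over those depth pixels (MEDIAN by default; MIN/MAX/AVERAGE also possible)
      spatialdetectionNetwork.setSpatialCalculationAlgorithm(dai.SpatialLocationCalculatorAlgorithm.MIN)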

      Thanks,
      Jaka

    • Hi @yishu_corpex
      Are you sure no frames come through? Most loop cycles will show None because tryGet() returns None when it doesn't have any frames waiting.
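      For reference, a typical non-blocking read looks like this (qISP being whichever output queue you poll):

      inISP = qISP.tryGet()  # returns None immediately if no frame is waiting
      if inISP is not None:
          frame = inISP.getCvFrame()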

      yishu_corpex
      while True:
          ### Create the folder that will contain capturing
          folder_name = "D:\Camera_Setup_Caliberation_Data"
          path = Path(folder_name)
          path.mkdir(parents=True, exist_ok=True)

      This is bad practice; move it outside the while loop, as sketched below.
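      For example, a sketch of the same idea with the directory creation hoisted out of the loop:

      from pathlib import Path

      # Create the output folder once, before entering the capture loop
      folder_name = "D:\\Camera_Setup_Caliberation_Data"
      Path(folder_name).mkdir(parents=True, exist_ok=True)

      while True:
          # capture / processing work only; no per-iteration filesystem calls
          ...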

      Thanks,
      Jaka

    • Hi @yishu_corpex
      There is a problem with the schema dump generated by depthai: the SpatialDetectionNetwork node does not have an outNetwork output, so the graph fails. This has to be added in the FW.

      Thanks,
      Jaka

    • Hi @yishu_corpex

      yishu_corpex I have other scripts where I use the same model and same configurations with the Manip node.

      Seems like a scaling issue from 12MP to W/H when using keepAspectRatio(False).

      yishu_corpex Also, how can we track the progress of the resolution of this issue?

      It's internal so you can't track it, but you can check the bugfixes section under new releases: luxonis/depthai-python/releases/tag/v2.25.1.0

      Thanks,
      Jaka

    • Hi @yishu_corpex
      It seems to be a depthai FW issue, since the ImageManip can't process 12MP frames in certain W/H configurations. I have forwarded it to the dev team.

      Thanks,
      Jaka