ToF 12MP Person Segmentation problem

Dear Luxonis Community,

As a generalist I am not a very strong programmer, so I would kindly like to ask for your help.
I want to replace the Kinect Azure and the Kinect V2 and support the really great OAK cameras for low-budget art
purposes.
Unfortunately I am the only person doing all the video design/3D etc. as well as the coding, so my coding knowledge is "ok", but sometimes I have gaps where I don't know how to go further.

I combined my modified ToF PoE sensor with the 12MP camera and the segmentation example, and tried to replace the depth camera with the ToF camera.
My goal is to send the following out of my PoE sensor:
- a color image at the highest possible resolution,
- an alpha matte (a black-and-white cutout of one or multiple person silhouettes),
- and probably a combined cutout (color + alpha) of one or multiple person silhouettes, as clean as possible - or the whole raw ToF depth channel (for further work).
Those 2 or 3 video streams would ideally go out over NDI right from program start, in as high an image quality as possible while still not lagging in FPS over the network.

I more or less managed to start the program, but my knowledge stops here in terms of the next steps, and I somehow did something very wrong.
The image is now smaller (because it had to be cropped for blob tracking??); I wish it could stay as uncropped as possible (to use most of the sensor area). The canvas and the body recognition kind of work (not very well, but they work), and it does not seem to use the ToF depth channel, maybe because I did something wrong with my color channels.
Also my FPS is at around 9, and it does not seem to be a network problem.
I was hoping that combining the ToF and color channels would give me a much cleaner segmentation, also in
low-light or no-light situations.

Here is the code I tried to write/combine:
Link to code ToF_Depth_Segmentation

I would be very, very thankful for any help, and sorry for asking beginner questions; I am really slow
at reading and understanding code snippets.
I hope that if anybody has time to help me, I can implement your corrections without asking too many additional questions.

Thank you again very much for any help.

With kind regards
Bonko

(My video is upside down because I have a custom build of the ToF sensor. I could of course rotate it 180 degrees afterwards or beforehand, but I don't know if this messes up the calibration?? My custom build has a slightly rotated ToF picture.)
Screenshot_Try_ToF_DepthSegmentation

    BonkoKaradjov

    1. Upscale the segmentation results to the ISP image, not the other way around, since you are currently cutting out large portions of the FOV. Since the aspect ratios are different, keep that in mind when scaling (use the height scaling factor for the width as well). The input to the NN should incorporate the whole RGB image FOV for this to work.
    2. Normalize the second image to UINT8 (a rough host-side sketch of both steps follows below).
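
    A minimal host-side sketch of both steps (my interpretation, untested on the device); `mask` and `ispFrame` are placeholder names for the NN segmentation output and the full-FOV ISP frame:

    import cv2
    import numpy as np

    def upscale_mask_to_isp(mask, ispFrame):
        # Scale the NN mask up to the ISP frame, using the height factor for the
        # width as well, since the aspect ratios of the two images differ.
        ispH, ispW = ispFrame.shape[:2]
        scale = ispH / mask.shape[0]
        newW = int(round(mask.shape[1] * scale))
        resized = cv2.resize(mask, (newW, ispH), interpolation=cv2.INTER_NEAREST)
        # Center the resized mask on a canvas matching the ISP frame width.
        canvas = np.zeros((ispH, ispW), dtype=resized.dtype)
        xOff = max((ispW - newW) // 2, 0)
        w = min(newW, ispW)
        canvas[:, xOff:xOff + w] = resized[:, :w]
        return canvas

    def normalize_to_uint8(frame, maxValue=None):
        # Normalize e.g. a 16-bit depth frame into the 0..255 range.
        maxValue = maxValue if maxValue is not None else int(frame.max()) or 1
        return np.interp(frame, (0, maxValue), (0, 255)).astype(np.uint8)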

    Thanks,
    Jaka

    9 days later

    Dear Jaka, thank you very much for your help. Unfortunately I am not very good at coding, or I need more time to work myself into DepthAI to follow your advice. I tried another approach (below).

    But thank you very much for your help and response 🙂

    My goal is to work out two video streams:
    one color and one ToF, sent over the network (NDI) with low latency and synchronized.
    Color should be as high quality as the network allows, and the ToF depth should carry enough information to
    be usable for color-ramping out regions in TouchDesigner.

    I find it comfortable to change the settings via different key presses (later on I will bind this into some kind of Tkinter interface and save the initial data into a JSON file to load saved configurations; a small sketch of that is below).
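
    A small sketch of how that JSON save/load part could look (hypothetical helper names, not part of the script yet):

    import json

    def save_config(path, settings):
        # Persist the keyboard-tuned settings, e.g. {"lensPos": 150, "expTime": 20000}.
        with open(path, "w") as f:
            json.dump(settings, f, indent=2)

    def load_config(path, defaults):
        # Load a saved configuration, falling back to defaults for missing keys/files.
        try:
            with open(path) as f:
                return {**defaults, **json.load(f)}
        except FileNotFoundError:
            return dict(defaults)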

    My code below is asynchronous and not very fast over the network (and I am aware that combining two scripts like this is not ideal), but if there is any step-by-step tutorial out there, or any other help, I would be very thankful.
    I still struggle to read simple things, such as what drives the FPS of the images, where exactly the image creation ends, where to combine the streams, etc.

    I had synchronized approaches, but they had too much delay when going through the network, and for art purposes I need them nearly real-time and in good quality.
    I also know about the video encoder solution, but I do not really understand how it works when only streaming data, not recording, etc.
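
    From the DepthAI examples, as far as I understand it, streaming without recording would look roughly like this (untested; the sizes and FPS below are just guesses): the VideoEncoder node produces a compressed bitstream that is sent over XLink and decoded on the host, which uses far less bandwidth than raw frames.

    import depthai as dai
    import cv2

    pipeline = dai.Pipeline()

    cam = pipeline.create(dai.node.ColorCamera)
    cam.setBoardSocket(dai.CameraBoardSocket.CAM_C)
    cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_12_MP)
    cam.setIspScale(1, 3)
    cam.setVideoSize(1280, 720)  # keep the video output within the scaled ISP size

    # Encode the color stream (MJPEG here; H.264/H.265 are also possible).
    enc = pipeline.create(dai.node.VideoEncoder)
    enc.setDefaultProfilePreset(30, dai.VideoEncoderProperties.Profile.MJPEG)
    cam.video.link(enc.input)

    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("encoded")
    enc.bitstream.link(xout.input)

    with dai.Device(pipeline) as device:
        q = device.getOutputQueue("encoded", maxSize=4, blocking=False)
        while True:
            pkt = q.get()
            # Each packet is one compressed frame; decode it on the host
            # (or forward the bitstream) instead of writing it to a file.
            frame = cv2.imdecode(pkt.getData(), cv2.IMREAD_COLOR)
            cv2.imshow("decoded", frame)
            if cv2.waitKey(1) == ord("q"):
                break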

    Thanks to everyone for any help

    import time
    import depthai as dai
    import cv2
    from itertools import cycle
    import numpy as np

    #Step size ('W','A','S','D' controls)
    STEP_SIZE = 8
    #Manual exposure/focus/white-balance set step
    EXP_STEP = 500 # us
    ISO_STEP = 50
    LENS_STEP = 3
    WB_STEP = 200

    #===> Tof Color Map
    cvColorMap = cv2.applyColorMap(np.arange(256, dtype=np.uint8), cv2.COLORMAP_JET)
    cvColorMap[0] = [0, 0, 0]
    #<===

    def clamp(num, v0, v1):
        return max(v0, min(num, v1))

    #Create pipeline for Rgb and ToF
    pipeline = dai.Pipeline()

    #Define sources and outputs
    camRgb = pipeline.create(dai.node.ColorCamera)
    camRgb.setBoardSocket(dai.CameraBoardSocket.CAM_C)
    camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_12_MP)
    camRgb.setImageOrientation(dai.CameraImageOrientation.ROTATE_180_DEG)
    camRgb.setIspScale(1,2) # 12MP (4056x3040) -> 2028x1520
    stillEncoder = pipeline.create(dai.node.VideoEncoder)

    controlIn = pipeline.create(dai.node.XLinkIn)
    configIn = pipeline.create(dai.node.XLinkIn)
    ispOut = pipeline.create(dai.node.XLinkOut)
    videoOut = pipeline.create(dai.node.XLinkOut)
    stillMjpegOut = pipeline.create(dai.node.XLinkOut)

    controlIn.setStreamName('control')
    configIn.setStreamName('config')
    ispOut.setStreamName('isp')
    videoOut.setStreamName('video')
    stillMjpegOut.setStreamName('still')

    #Properties
    #Small Video cropped
    camRgb.setVideoSize(640,360)
    stillEncoder.setDefaultProfilePreset(1, dai.VideoEncoderProperties.Profile.MJPEG)

    #Linking
    camRgb.isp.link(ispOut.input)
    camRgb.still.link(stillEncoder.input)
    camRgb.video.link(videoOut.input)
    controlIn.out.link(camRgb.inputControl)
    configIn.out.link(camRgb.inputConfig)
    stillEncoder.bitstream.link(stillMjpegOut.input)

    #=========> ToF Pipeline Begin

    tof = pipeline.create(dai.node.ToF)

    #Configure the ToF node
    tofConfig = tof.initialConfig.get()

    #Optional. Best accuracy, but adds motion blur.
    #see ToF node docs on how to reduce/eliminate motion blur.
    tofConfig.enableOpticalCorrection = True
    tofConfig.enablePhaseShuffleTemporalFilter = True
    tofConfig.phaseUnwrappingLevel = 1
    tofConfig.phaseUnwrapErrorThreshold = 25

    tofConfig.enableTemperatureCorrection = False # Not yet supported

    xinTofConfig = pipeline.create(dai.node.XLinkIn)
    xinTofConfig.setStreamName("tofConfig")
    xinTofConfig.out.link(tof.inputConfig)

    tof.initialConfig.set(tofConfig)

    cam_tof = pipeline.create(dai.node.Camera)
    cam_tof.setFps(120) # ToF node will produce depth frames at /2 of this rate
    cam_tof.setBoardSocket(dai.CameraBoardSocket.CAM_A)
    cam_tof.raw.link(tof.input)
    cam_tof.setImageOrientation(dai.CameraImageOrientation.ROTATE_180_DEG)

    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("depth")
    tof.depth.link(xout.input)

    tofConfig = tof.initialConfig.get()

    #<================== End ToF Pipeline

    #Connect to device and start pipeline
    with dai.Device(pipeline) as device:

    #Get data queues
    controlQueue = device.getInputQueue('control')
    configQueue = device.getInputQueue('config')
    ispQueue = device.getOutputQueue('isp')
    videoQueue = device.getOutputQueue('video')
    stillQueue = device.getOutputQueue('still')
    
    # Max cropX & cropY
    maxCropX = (camRgb.getIspWidth() - camRgb.getVideoWidth()) / camRgb.getIspWidth()
    maxCropY = (camRgb.getIspHeight() - camRgb.getVideoHeight()) / camRgb.getIspHeight()
    print(maxCropX, maxCropY, camRgb.getIspWidth(), camRgb.getVideoHeight())
    
    # Default crop
    cropX = 0
    cropY = 0
    sendCamConfig = True
    
    # Defaults and limits for manual focus/exposure controls
    lensPos = 150
    expTime = 20000
    sensIso = 800
    wbManual = 4000
    ae_comp = 0
    ae_lock = False
    awb_lock = False
    saturation = 0
    contrast = 0
    brightness = 0
    sharpness = 0
    luma_denoise = 0
    chroma_denoise = 0
    control = 'none'
    show = False
    
    awb_mode = cycle([item for name, item in vars(dai.CameraControl.AutoWhiteBalanceMode).items() if name.isupper()])
    anti_banding_mode = cycle([item for name, item in vars(dai.CameraControl.AntiBandingMode).items() if name.isupper()])
    effect_mode = cycle([item for name, item in vars(dai.CameraControl.EffectMode).items() if name.isupper()])
    
    
    #=====> ToF Data get
    qDepth = device.getOutputQueue(name="depth")
    
    tofConfigInQueue = device.getInputQueue("tofConfig")
    
    counter = 0
    #<===== ToF Data get END
    while True:
    
        #====> ToF Camera Features
        start = time.time()
        key = cv2.waitKey(1)
        if key == ord('f'):
            tofConfig.enableFPPNCorrection = not tofConfig.enableFPPNCorrection
            tofConfigInQueue.send(tofConfig)
        elif key == ord('o'):
            tofConfig.enableOpticalCorrection = not tofConfig.enableOpticalCorrection
            tofConfigInQueue.send(tofConfig)
        elif key == ord('w'):
            tofConfig.enableWiggleCorrection = not tofConfig.enableWiggleCorrection
            tofConfigInQueue.send(tofConfig)
        elif key == ord('t'):
            tofConfig.enableTemperatureCorrection = not tofConfig.enableTemperatureCorrection
            tofConfigInQueue.send(tofConfig)
        elif key == ord('q'):
            break
        elif key == ord('0'):
            tofConfig.enablePhaseUnwrapping = False
            tofConfig.phaseUnwrappingLevel = 0
            tofConfigInQueue.send(tofConfig)
        elif key == ord('1'):
            tofConfig.enablePhaseUnwrapping = True
            tofConfig.phaseUnwrappingLevel = 1
            tofConfigInQueue.send(tofConfig)
        elif key == ord('2'):
            tofConfig.enablePhaseUnwrapping = True
            tofConfig.phaseUnwrappingLevel = 2
            tofConfigInQueue.send(tofConfig)
        elif key == ord('3'):
            tofConfig.enablePhaseUnwrapping = True
            tofConfig.phaseUnwrappingLevel = 3
            tofConfigInQueue.send(tofConfig)
        elif key == ord('4'):
            tofConfig.enablePhaseUnwrapping = True
            tofConfig.phaseUnwrappingLevel = 4
            tofConfigInQueue.send(tofConfig)
        elif key == ord('5'):
            tofConfig.enablePhaseUnwrapping = True
            tofConfig.phaseUnwrappingLevel = 5
            tofConfigInQueue.send(tofConfig)
        elif key == ord('z'):
            medianSettings = [dai.MedianFilter.MEDIAN_OFF, dai.MedianFilter.KERNEL_3x3, dai.MedianFilter.KERNEL_5x5,
                              dai.MedianFilter.KERNEL_7x7]
            currentMedian = tofConfig.median
            nextMedian = medianSettings[(medianSettings.index(currentMedian) + 1) % len(medianSettings)]
            print(f"Changing median to {nextMedian.name} from {currentMedian.name}")
            tofConfig.median = nextMedian
            tofConfigInQueue.send(tofConfig)
    
        imgFrame = qDepth.get()  # blocking call, will wait until a new data has arrived
        depth_map = imgFrame.getFrame()
        max_depth = (tofConfig.phaseUnwrappingLevel + 1) * 1500  # 100MHz modulation freq.
        depth_colorized = np.interp(depth_map, (0, max_depth), (0, 255)).astype(np.uint8)
        depth_colorized = cv2.applyColorMap(depth_colorized, cvColorMap)
    
        cv2.imshow("Colorized depth", depth_colorized)
        counter += 1
        # <==== ToF Camera Features and Show END
    
        vidFrames = videoQueue.tryGetAll()
        for vidFrame in vidFrames:
            # Showing CROPPED Frame Window
            cv2.imshow('video', vidFrame.getCvFrame())
    
        ispFrames = ispQueue.tryGetAll()
        for ispFrame in ispFrames:
            if show:
                txt = f"[{ispFrame.getSequenceNum()}] "
                txt += f"Exposure: {ispFrame.getExposureTime().total_seconds()*1000:.3f} ms, "
                txt += f"ISO: {ispFrame.getSensitivity()}, "
                txt += f"Lens position: {ispFrame.getLensPosition()}, "
                txt += f"Color temp: {ispFrame.getColorTemperature()} K"
                print(txt)
            #Showing original color image
            cv2.imshow('isp', ispFrame.getCvFrame())
    
            # Send new cfg to camera
            if sendCamConfig:
                cfg = dai.ImageManipConfig()
                cfg.setCropRect(cropX, cropY, 0, 0)
                configQueue.send(cfg)
                print('Sending new crop - x: ', cropX, ' y: ', cropY)
                sendCamConfig = False
    
        stillFrames = stillQueue.tryGetAll()
        for stillFrame in stillFrames:
            # Decode JPEG
            frame = cv2.imdecode(stillFrame.getData(), cv2.IMREAD_UNCHANGED)
            # Display
            cv2.imshow('still', frame)
    
        # Update screen (1 ms polling rate)
        key = cv2.waitKey(1)
        if key == ord('q'):
            break
        elif key == ord('/'):
            show = not show
            if not show: print("Printing camera settings: OFF")
        elif key == ord('c'):
            ctrl = dai.CameraControl()
            ctrl.setCaptureStill(True)
            controlQueue.send(ctrl)
        elif key == ord('t'):
            print("Autofocus trigger (and disable continuous)")
            ctrl = dai.CameraControl()
            ctrl.setAutoFocusMode(dai.CameraControl.AutoFocusMode.AUTO)
            ctrl.setAutoFocusTrigger()
            controlQueue.send(ctrl)
        elif key == ord('f'):
            print("Autofocus enable, continuous")
            ctrl = dai.CameraControl()
            ctrl.setAutoFocusMode(dai.CameraControl.AutoFocusMode.CONTINUOUS_VIDEO)
            controlQueue.send(ctrl)
        elif key == ord('e'):
            print("Autoexposure enable")
            ctrl = dai.CameraControl()
            ctrl.setAutoExposureEnable()
            controlQueue.send(ctrl)
        elif key == ord('b'):
            print("Auto white-balance enable")
            ctrl = dai.CameraControl()
            ctrl.setAutoWhiteBalanceMode(dai.CameraControl.AutoWhiteBalanceMode.AUTO)
            controlQueue.send(ctrl)
        elif key in [ord(','), ord('.')]:
            if key == ord(','): lensPos -= LENS_STEP
            if key == ord('.'): lensPos += LENS_STEP
            lensPos = clamp(lensPos, 0, 255)
            print("Setting manual focus, lens position: ", lensPos)
            ctrl = dai.CameraControl()
            ctrl.setManualFocus(lensPos)
            controlQueue.send(ctrl)
        elif key in [ord('i'), ord('o'), ord('k'), ord('l')]:
            if key == ord('i'): expTime -= EXP_STEP
            if key == ord('o'): expTime += EXP_STEP
            if key == ord('k'): sensIso -= ISO_STEP
            if key == ord('l'): sensIso += ISO_STEP
            expTime = clamp(expTime, 1, 33000)
            sensIso = clamp(sensIso, 100, 1600)
            print("Setting manual exposure, time: ", expTime, "iso: ", sensIso)
            ctrl = dai.CameraControl()
            ctrl.setManualExposure(expTime, sensIso)
            controlQueue.send(ctrl)
        elif key in [ord('n'), ord('m')]:
            if key == ord('n'): wbManual -= WB_STEP
            if key == ord('m'): wbManual += WB_STEP
            wbManual = clamp(wbManual, 1000, 12000)
            print("Setting manual white balance, temperature: ", wbManual, "K")
            ctrl = dai.CameraControl()
            ctrl.setManualWhiteBalance(wbManual)
            controlQueue.send(ctrl)
        elif key in [ord('w'), ord('a'), ord('s'), ord('d')]:
            if key == ord('a'):
                cropX = cropX - (maxCropX / camRgb.getResolutionWidth()) * STEP_SIZE
                if cropX < 0: cropX = 0
            elif key == ord('d'):
                cropX = cropX + (maxCropX / camRgb.getResolutionWidth()) * STEP_SIZE
                if cropX > maxCropX: cropX = maxCropX
            elif key == ord('w'):
                cropY = cropY - (maxCropY / camRgb.getResolutionHeight()) * STEP_SIZE
                if cropY < 0: cropY = 0
            elif key == ord('s'):
                cropY = cropY + (maxCropY / camRgb.getResolutionHeight()) * STEP_SIZE
                if cropY > maxCropY: cropY = maxCropY
            sendCamConfig = True
        elif key == ord('1'):
            awb_lock = not awb_lock
            print("Auto white balance lock:", awb_lock)
            ctrl = dai.CameraControl()
            ctrl.setAutoWhiteBalanceLock(awb_lock)
            controlQueue.send(ctrl)
        elif key == ord('2'):
            ae_lock = not ae_lock
            print("Auto exposure lock:", ae_lock)
            ctrl = dai.CameraControl()
            ctrl.setAutoExposureLock(ae_lock)
            controlQueue.send(ctrl)
        elif key >= 0 and chr(key) in '34567890[]':
            if   key == ord('3'): control = 'awb_mode'
            elif key == ord('4'): control = 'ae_comp'
            elif key == ord('5'): control = 'anti_banding_mode'
            elif key == ord('6'): control = 'effect_mode'
            elif key == ord('7'): control = 'brightness'
            elif key == ord('8'): control = 'contrast'
            elif key == ord('9'): control = 'saturation'
            elif key == ord('0'): control = 'sharpness'
            elif key == ord('['): control = 'luma_denoise'
            elif key == ord(']'): control = 'chroma_denoise'
            print("Selected control:", control)
        elif key in [ord('-'), ord('_'), ord('+'), ord('=')]:
            change = 0
            if key in [ord('-'), ord('_')]: change = -1
            if key in [ord('+'), ord('=')]: change = 1
            ctrl = dai.CameraControl()
            if control == 'none':
                print("Please select a control first using keys 3..9 0 [ ]")
            elif control == 'ae_comp':
                ae_comp = clamp(ae_comp + change, -9, 9)
                print("Auto exposure compensation:", ae_comp)
                ctrl.setAutoExposureCompensation(ae_comp)
            elif control == 'anti_banding_mode':
                abm = next(anti_banding_mode)
                print("Anti-banding mode:", abm)
                ctrl.setAntiBandingMode(abm)
            elif control == 'awb_mode':
                awb = next(awb_mode)
                print("Auto white balance mode:", awb)
                ctrl.setAutoWhiteBalanceMode(awb)
            elif control == 'effect_mode':
                eff = next(effect_mode)
                print("Effect mode:", eff)
                ctrl.setEffectMode(eff)
            elif control == 'brightness':
                brightness = clamp(brightness + change, -10, 10)
                print("Brightness:", brightness)
                ctrl.setBrightness(brightness)
            elif control == 'contrast':
                contrast = clamp(contrast + change, -10, 10)
                print("Contrast:", contrast)
                ctrl.setContrast(contrast)
            elif control == 'saturation':
                saturation = clamp(saturation + change, -10, 10)
                print("Saturation:", saturation)
                ctrl.setSaturation(saturation)
            elif control == 'sharpness':
                sharpness = clamp(sharpness + change, 0, 4)
                print("Sharpness:", sharpness)
                ctrl.setSharpness(sharpness)
            elif control == 'luma_denoise':
                luma_denoise = clamp(luma_denoise + change, 0, 4)
                print("Luma denoise:", luma_denoise)
                ctrl.setLumaDenoise(luma_denoise)
            elif control == 'chroma_denoise':
                chroma_denoise = clamp(chroma_denoise + change, 0, 4)
                print("Chroma denoise:", chroma_denoise)
                ctrl.setChromaDenoise(chroma_denoise)
            controlQueue.send(ctrl)
    # ===> ToF Additional device close
    device.close()
    # <=== ToF Additional device close

    @jakaskerl

    Here is my attempt at the code with your changes. It kind of works, but PoE has a big latency, and I completely fail at wiring the video encoder into the process for both the depth map and the RGB video, which promises to be sent much faster through the network.

    import numpy as np
    import cv2
    import depthai as dai
    
    from datetime import timedelta
    from numba import jit, prange
    
    import NDIlib as ndi
    
    ##Globals
    #offsetDepth Values
    
    def offsetImageFind(img,rows,cols):
        print('X Value: ')
        offsetX = int(input())
        print('Y Value: ')
        offsetY = int(input())
    
        M = np.float32([[1, 0, offsetX], [0, 1, offsetY]])
        img = cv2.warpAffine(img, M, (cols, rows))
        return img
    
    def offsetImageDo(img,rows,cols, offsetX,offsetY):
    
        M = np.float32([[1, 0, offsetX], [0, 1, offsetY]])
        img = cv2.warpAffine(img, M, (cols, rows))
        return img
    
    
    
    @jit(nopython=True, parallel=True)
    def reprojection(depth_image, depth_camera_intrinsics, camera_extrinsics, color_camera_intrinsics, depth_image_show = None):
        height = len(depth_image)
        width = len(depth_image[0])
        if depth_image_show is not None:
            image = np.zeros((height, width), np.uint8)
        else:
            image = np.zeros((height, width), np.uint16)
        if(camera_extrinsics[0][3] > 0):
            sign = 1
        else:
            sign = -1
        for i in prange(0, height):
            for j in prange(0, width):
                if sign == 1:
                    #Reverse the order of the pixels
                    j = width - j - 1
                d = depth_image[i][j]
                if(d == 0):
                    continue
                #Convert pixel to 3d point
                x = (j - depth_camera_intrinsics[0][2]) * d / depth_camera_intrinsics[0][0]
                y = (i - depth_camera_intrinsics[1][2]) * d / depth_camera_intrinsics[1][1]
                z = d
    
                #Move the point to the camera frame
                x1 = camera_extrinsics[0][0] * x + camera_extrinsics[0][1] * y + camera_extrinsics[0][2] * z + camera_extrinsics[0][3]
                y1 = camera_extrinsics[1][0] * x + camera_extrinsics[1][1] * y + camera_extrinsics[1][2] * z + camera_extrinsics[1][3]
                z1 = camera_extrinsics[2][0] * x + camera_extrinsics[2][1] * y + camera_extrinsics[2][2] * z + camera_extrinsics[2][3]
    
                u = color_camera_intrinsics[0][0] * (x1  / z1) + color_camera_intrinsics[0][2]
                v = color_camera_intrinsics[1][1] * (y1  / z1) + color_camera_intrinsics[1][2]
                int_u = round(u)
                int_v = round(v)
                #if(int_v != i):
                #    print(f'v -> {v} and i -> {i}') # This should never be printed
                if int_u >= 0 and int_u < (len(image[0]) - 1) and int_v >= 0 and int_v < len(image):
                    if depth_image_show is not None:
                        image[int_v][int_u] = depth_image_show[i][j][0]
                        image[int_v][int_u + sign] = depth_image_show[i][j][0]
                    else:
                        image[int_v][int_u] = z1
                        image[int_v][int_u + sign] = z1
        return image
    FPS = 30
    
    RGB_SOCKET = dai.CameraBoardSocket.CAM_C
    LEFT_SOCKET = dai.CameraBoardSocket.CAM_A
    
    ###NDI Globals
    ndi_name_1 = 'BkNDI_Color'
    ndi_name_2 = 'BkNDI_TOF'
    
    COLOR_RESOLUTION = dai.ColorCameraProperties.SensorResolution.THE_12_MP
    tofSize = (640, 480)
    rgbSize = (1352, 1012)
    
    FPPN_ENABLE = True
    WIGGLE_ENABLE = True
    TEMPERATURE_ENABLE = False
    OPTICAL_ENABLE = True
    
    #Used for colorizing the depth map
    MIN_DEPTH = 500  # mm
    MAX_DEPTH = 10000  # mm
    
    
    device = dai.Device()
    
    try:
        calibData = device.readCalibration2()
        M1 = np.array(calibData.getCameraIntrinsics(LEFT_SOCKET, *tofSize))
        D1 = np.array(calibData.getDistortionCoefficients(LEFT_SOCKET))
        M2 = np.array(calibData.getCameraIntrinsics(RGB_SOCKET, *rgbSize))
        D2 = np.array(calibData.getDistortionCoefficients(RGB_SOCKET))
    
        T = (
            np.array(calibData.getCameraTranslationVector(LEFT_SOCKET, RGB_SOCKET, False))
            * 10
        )  # to mm for matching the depth
        R = np.array(calibData.getCameraExtrinsics(LEFT_SOCKET, RGB_SOCKET, False))[
            0:3, 0:3
        ]
        TARGET_MATRIX = M1
    
        lensPosition = calibData.getLensPosition(RGB_SOCKET)
    except:
        raise
    
    
    pipeline = dai.Pipeline()
    
    #Define sources and outputs
    camRgb = pipeline.create(dai.node.ColorCamera)
    tofIn = pipeline.create(dai.node.Camera)
    tof = pipeline.create(dai.node.ToF)
    sync = pipeline.create(dai.node.Sync)
    out = pipeline.create(dai.node.XLinkOut)
    
    
    tofIn.setBoardSocket(LEFT_SOCKET)
    tofIn.properties.numFramesPoolRaw = 5
    tofIn.setFps(FPS)
    tofIn.setImageOrientation(dai.CameraImageOrientation.ROTATE_180_DEG)
    
    tofConfig = tof.initialConfig.get()
    #tofConfig.depthParams.freqModUsed = dai.RawToFConfig.DepthParams.TypeFMod.MIN
    #tofConfig.depthParams.freqModUsed = dai.RawToFConfig.DepthParams.TypeFMod.MAX
    #tofConfig.depthParams.avgPhaseShuffle = False
    #tofConfig.depthParams.minimumAmplitude = 3.0
    tofConfig.enableFPPNCorrection = FPPN_ENABLE
    tofConfig.enableOpticalCorrection = OPTICAL_ENABLE
    tofConfig.enableWiggleCorrection = WIGGLE_ENABLE
    tofConfig.enableTemperatureCorrection = TEMPERATURE_ENABLE
    #tofConfig.median = dai.MedianFilter.KERNEL_5x5
    
    tofConfig.enablePhaseUnwrapping = True
    tofConfig.phaseUnwrappingLevel = 4
    tof.initialConfig.set(tofConfig)
    
    camRgb.setBoardSocket(RGB_SOCKET)
    camRgb.setResolution(COLOR_RESOLUTION)
    camRgb.setFps(FPS)
    camRgb.setIspScale(1,3)
    #camRgb.setImageOrientation(dai.CameraImageOrientation.ROTATE_180_DEG)
    
    out.setStreamName("out")
    
    sync.setSyncThreshold(timedelta(milliseconds=20))
    
    #Linking
    camRgb.isp.link(sync.inputs["rgb"])
    tofIn.raw.link(tof.input)
    tof.depth.link(sync.inputs["depth"])
    sync.out.link(out.input)
    
    #######################################
    ###NDI Config BEGIN
    #if not ndi.initialize():
    #return 0
    
    #Try NDI Send 1
    send_settings1 = ndi.SendCreate()
    send_settings1.ndi_name = ndi_name_1
    ndi_send_1 = ndi.send_create(send_settings1)
    ndi.SendCreate('Name')
    video_frame_1 = ndi.VideoFrameV2()
    
    #Try NDI Send 2
    send_settings2 = ndi.SendCreate()
    send_settings2.ndi_name = ndi_name_2
    ndi_send_2 = ndi.send_create(send_settings2)
    ndi.SendCreate('Name')
    video_frame_2 = ndi.VideoFrameV2()
    
    ###NDI Config END
    #######################################
    
    
    def colorizeDepth(frameDepth, minDepth=MIN_DEPTH, maxDepth=MAX_DEPTH):
        invalidMask = frameDepth == 0
        depthFrameColor = np.interp(frameDepth, (minDepth, maxDepth), (0, 255)).astype(
            np.uint8
        )
        depthFrameColor = cv2.applyColorMap(depthFrameColor, cv2.COLORMAP_JET)
        #Set invalid depth pixels to black
        depthFrameColor[invalidMask] = 0
        return depthFrameColor
    
    
    def getAlignedDepth(frameDepth):
        R1, R2, _, _, _, _, _ = cv2.stereoRectify(M1, D1, M2, D2, (100, 100), R, T)  # The (100,100) doesn't matter as it is not used for calculating the rotation matrices
        leftMapX, leftMapY = cv2.initUndistortRectifyMap(M1, None, R1, TARGET_MATRIX, tofSize, cv2.CV_32FC1)
        depthRect = cv2.remap(frameDepth, leftMapX, leftMapY, cv2.INTER_NEAREST)
        newR = np.dot(R2, np.dot(R, R1.T))  # Should be very close to identity
        newT = np.dot(R2, T)
        combinedExtrinsics = np.eye(4)
        combinedExtrinsics[0:3, 0:3] = newR
        combinedExtrinsics[0:3, 3] = newT
        depthAligned = reprojection(depthRect, TARGET_MATRIX, combinedExtrinsics, TARGET_MATRIX)
        #Rotate the depth to the RGB frame
        R_back = R2.T
        mapX, mapY = cv2.initUndistortRectifyMap(TARGET_MATRIX, None, R_back, M2, rgbSize, cv2.CV_32FC1)
        outputAligned = cv2.remap(depthAligned, mapX, mapY, cv2.INTER_NEAREST)
        return outputAligned
    
    
    rgbWeight = 0.5
    depthWeight = 0.5
    
    
    def updateBlendWeights(percent_rgb):
        """
        Update the rgb and depth weights used to blend depth/rgb image
        @param[in] percent_rgb The rgb weight expressed as a percentage (0..100)
        """
        global depthWeight
        global rgbWeight
        rgbWeight = float(percent_rgb) / 100.0
        depthWeight = 1.0 - rgbWeight
    
    
    #Connect to device and start pipeline
    with device:
        device.startPipeline(pipeline)
        queue = device.getOutputQueue("out", 8, False)
    
        # Configure windows; trackbar adjusts blending ratio of rgb/depth
        rgb_depth_window_name = "rgb-depth"
    
        cv2.namedWindow(rgb_depth_window_name)
        cv2.createTrackbar(
            "RGB Weight %",
            rgb_depth_window_name,
            int(rgbWeight * 100),
            100,
            updateBlendWeights,
        )
        while True:
            messageGroup: dai.MessageGroup = queue.get()
            frameRgb: dai.ImgFrame = messageGroup["rgb"]
            frameDepth: dai.ImgFrame = messageGroup["depth"]
            # Blend when both received
            if frameRgb is not None and frameDepth is not None:
                frameRgb = frameRgb.getCvFrame()
    
                cv2.imshow("rgb", frameRgb)
                cv2.imshow("depth", colorizeDepth(frameDepth.getCvFrame()))
    
                alignedDepth = getAlignedDepth(frameDepth.getFrame())
                # visualise Point Cloud
                rgb = cv2.cvtColor(frameRgb, cv2.COLOR_BGR2RGB)
    
                # Colorize the aligned depth
                alignedDepth = colorizeDepth(alignedDepth)
    
                # Offset Depth Find Out
                #alignedDepth = offsetImageFind(alignedDepth,1080,1920)
                #alignedDepth = offsetImageDo(alignedDepth,1080,1920,40,-33)
    
    
                # Undistort the RGB frame
                mapX, mapY = cv2.initUndistortRectifyMap(
                    M2, D2, None, M2, rgbSize, cv2.CV_32FC1
                )
                frameRgb = cv2.remap(frameRgb, mapX, mapY, cv2.INTER_LINEAR)
    
                # rotate the RGB frame and downscale both frames before blending
                scaler = 0.7
                frameRgb = cv2.rotate(frameRgb, cv2.ROTATE_180)
                frameRgb = cv2.resize(frameRgb, (0, 0), fx=scaler, fy=scaler)
                alignedDepth = cv2.resize(alignedDepth, (0, 0), fx=scaler, fy=scaler)
                blended = cv2.addWeighted(frameRgb, rgbWeight, alignedDepth, depthWeight, 0)
    
    
                cv2.imshow(rgb_depth_window_name, blended)
    
                ### Ndi Send 1 --> BEGIN
    
                stream_frame_1 = frameRgb
                stream_frame_1 = cv2.cvtColor(stream_frame_1, cv2.COLOR_BGR2BGRA)
    
                video_frame_1.data = stream_frame_1
                video_frame_1.FourCC = ndi.FOURCC_VIDEO_TYPE_BGRX
    
                ndi.send_send_video_v2(ndi_send_1, video_frame_1)
    
                ### NDI Send 1 --> END
    
                ### Ndi Send 2 --> BEGIN
                alignedDepth = cv2.cvtColor(alignedDepth, cv2.COLOR_BGR2BGRA)
    
                video_frame_2.data = alignedDepth
                video_frame_2.FourCC = ndi.FOURCC_VIDEO_TYPE_BGRX
    
                ndi.send_send_video_v2(ndi_send_2, video_frame_2)
    
                ### NDI Send 2 --> END
    
            key = cv2.waitKey(1)
            if key == ord("q"):
                break

    Hi @BonkoKaradjov ,
    Good to hear you had success. Just note that:

    • Video encoding can only be done on the color stream at the moment, as encoding doesn't support depth (it is 16-bit, which the video-encoding hardware doesn't support)
    • A faster switch (e.g. 2.5/5/10 Gbps) won't necessarily help, as the camera chip can only do 1 Gbps, so you'd want to either reduce resolution or FPS to use less bandwidth (see the rough numbers below)
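
    As a rough back-of-the-envelope check (approximate numbers, assuming unencoded NV12 frames at 1.5 bytes per pixel):

    def raw_stream_mbps(width, height, fps, bytes_per_pixel=1.5):
        # Approximate bandwidth of an unencoded NV12 stream in Mbit/s.
        return width * height * bytes_per_pixel * fps * 8 / 1e6

    print(raw_stream_mbps(1352, 1012, 30))  # ~493 Mbit/s - already half of the 1 Gbps link
    print(raw_stream_mbps(4056, 3040, 30))  # ~4439 Mbit/s - far more than the link can carry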

    Thanks, Erik

      Dear Erik, thank you very much for your help 🙂

      Good to know about the switch. Actually my last script with setIspScale(1,3) works pretty OK,
      and I will delete unnecessary code after some fixes I want to add and see how much I can reduce the lag when
      slightly increasing the resolution.

      • Unfortunately THE_1352X1012 does not make the sensor start, so I have to scale down from THE_12_MP instead, but it is still very good 🙂

      • If I understood correctly, video encoding is already running in this script, and I am very happy about the progress and the sensor itself.

      • The next step will be to map the depth values uniformly into one color range (a grayscale ramp) from near to far, like white to black or the opposite (see the sketch below).
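
      A possible sketch for that step (with assumed min/max distances):

      import numpy as np

      def depth_to_gray(depth_mm, min_mm=500, max_mm=10000, invert=False):
          # Map depth uniformly onto a grayscale ramp: near = dark, far = bright;
          # pass invert=True for white (near) to black (far).
          d = np.clip(depth_mm, min_mm, max_mm)
          gray = ((d - min_mm) / (max_mm - min_mm) * 255).astype(np.uint8)
          if invert:
              gray = 255 - gray
          gray[depth_mm == 0] = 0  # keep invalid pixels black
          return gray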

      Thank you again for your help, and sorry for asking so many questions in the past.
      It took me some time to understand DepthAI; now things are starting to get fun.

      16 days later

      Urgent request for help with pixel-perfect alignment.

      I worked out a TouchDesigner-to-OAK workflow for the ToF 12MP (custom
      build from Luxonis) and am still not able to fix the alignment in code between the ToF and the 12MP sensor.

      My goal is still to send out 2 streams:

      • ToF with 640x480
      • RGB with 1035x1024
        but aligned together, in sync and with matching pixels
        (in TouchDesigner I would then upscale the ToF image so they match, and the network video stays nearly real-time without big lags or latency)

      Here is my work in a nutshell until now.
      Link to folder

      My TouchDesigner group says that if the sensor's cameras have been aligned, the raw data should somehow match when it is sent out.
      I am trying to recommend your sensors to the group, where lots of artists are looking for sensors.

      The production in which I would use the sensor starts on January 2nd, 2025. I will for sure have 1 or 2 weeks to work this issue out, but I was hoping to start the artistic process as soon as possible so I don't run out of time.

      I would be very thankful for any help; I am still not that good at coding with DepthAI and need more time to understand how to combine different scripts.

      Thank you very much for any help 🙂

        5 days later

        Hi BonkoKaradjov,
        Apologies for the late response.
        Does the alignment work outside TouchDesigner? Best to keep it Python-only until everything works as expected.
        I see you are performing the alignment on the host. The code looks fine at first glance; perhaps the intrinsics/extrinsics are incorrect?
        Can you try using the ImageAlign node to check the alignment? https://docs.luxonis.com/software/depthai/examples/tof_align/ - example (a rough sketch follows below)
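
        Roughly, following that example (exact node/port names may differ between depthai versions), the device-side alignment would look like this:

        import depthai as dai
        from datetime import timedelta

        pipeline = dai.Pipeline()

        camRgb = pipeline.create(dai.node.ColorCamera)
        camRgb.setBoardSocket(dai.CameraBoardSocket.CAM_C)
        camRgb.setIspScale(1, 2)

        tofCam = pipeline.create(dai.node.Camera)
        tofCam.setBoardSocket(dai.CameraBoardSocket.CAM_A)
        tof = pipeline.create(dai.node.ToF)
        tofCam.raw.link(tof.input)

        # Align the ToF depth to the RGB frame on the device.
        align = pipeline.create(dai.node.ImageAlign)
        tof.depth.link(align.input)
        camRgb.isp.link(align.inputAlignTo)

        # Sync RGB and aligned depth so they arrive as one message group.
        sync = pipeline.create(dai.node.Sync)
        sync.setSyncThreshold(timedelta(milliseconds=20))
        camRgb.isp.link(sync.inputs["rgb"])
        align.outputAligned.link(sync.inputs["depth_aligned"])

        xout = pipeline.create(dai.node.XLinkOut)
        xout.setStreamName("out")
        sync.out.link(xout.input)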

        Thanks,
        Jaka

          Dear jakaskerl,

          thank you very much for your help 🙂

          My approach of trying out the alignment externally in TouchDesigner comes from the fact that I have been trying for 1.5 months to achieve the following:

          (as mentioned above, my ToF sensor is a custom build from Luxonis with a 12MP color camera)

          • RGB camera stream through NDI or Syphon (minimum resolution 1352x1024)

          • ToF camera stream (if possible 32-bit) through NDI or Syphon (minimum resolution 640x480)

          • both streams should be aligned as perfectly as possible and would be layered one over the other in other software (TouchDesigner), only by scaling up the ToF stream

          • color and ToF with as little latency as possible and as high a quality as possible for the color camera 🙂


          I remember that I had many tryouts with the example you sent me.
          I tested it again now and could somehow align the streams with

          #rotate depth
          alignedDepthColorized = cv2.rotate(alignedDepthColorized, cv2.ROTATE_180)
          #transform depth
          M = np.float32([[1, 0, getXOffset], [0, 1, getYOffset]])
          alignedDepthColorized = cv2.warpAffine(alignedDepthColorized, M, (cols, rows))

          but the depth is only aligned in one spot, not when I come closer to the camera or move left or right.
          Here a video showing the alignment:
          https://gyazo.com/5ea2deadf471d4a6fb38e890e562f9b5

          Also, if alignment works with this script, the streams I would get from color and ToF would only be 960x540, so the reason I paid a lot of money, almost double the price, for a ToF custom build with a 12MP camera would not make sense for me.

          Also, I remember that with this alignment script, sending out a higher-resolution color stream somehow introduced unnecessary latency over PoE.

          I was really hoping that I could get an alignment between depth and the 12MP camera like the one the Orbbec Femto Mega has.
          If so, the OAK ToF with the 12MP camera would beat the Femto Mega, because it would also have a much more flexible depth map.

          In my script above, my video streams go through the network pretty fluidly, almost without latency, but I could not manage to send the ToF video as a 32-bit stream through Syphon (only as 16-bit over NDI).
          As in that script, I would also implement camera control over OSC (as already implemented) to handle different situations such as changing focus.

          Here my new script that tries out Syphon:
          https://www.dropbox.com/scl/fi/whzgpadq1mnsb0rf63o3n/RgbTof_newSyphon_WORKING_04_01c.py?rlkey=4n1yvoibmhl0eff9vdigjvt2q&dl=0

          Thank you again very much for your help 🙂