Now that I am able to use the tof sensor on the oak-d-sr-poe camera, I am now able to capture tof depth. I am struggling on the 2nd end of my task. Whenever I view the point cloud in rerun, MeshLab, open3d or any more tool. They are still unaligned, with unwanted artifacts. You can identify the object but it oblivions that it is not well aligned due to the colors and surrounding objects. I am sure this is not the camera fault, as I have calibrated it by following the documentation. ICP and Global registration seems to work, but since there are unwanted voxels or artifacts within the point cloud. The registration has a hard time picking the correct points to stitch together. I have tried other methods. Any suggestions would be helpful. The PCD's primarily look good, I just want to objects to be identifiable. Meaning that the objects are fully formed.

@"gdeanrexroth"#p26257 How are you aligning the pointclouds? If using o3d, what settings are you applying to refine registration? Thanks, Jaka

@"jakaskerl"#p26265 Yes I am using open3d settings to align my point cloud. Here is an example of how I am applying it. import open3d as o3d import numpy as np import copy def draw_registration_result(source, target, transformation): ``` source_temp = copy.deepcopy(source) target_temp = copy.deepcopy(target) source_temp.paint_uniform_color([1, 0.706, 0]) # Yellow target_temp.paint_uniform_color([0, 0.651, 0.929]) # Blue source_temp.transform(transformation) o3d.visualization.draw_geometries([source_temp, target_temp], zoom=0.4559, front=[0.6452, -0.3036, -0.7011], lookat=[1.9892, 2.0208, 1.8945], up=[-0.2779, -0.9482, 0.1556]) ``` \# Load your point clouds source = o3d.io.read_point_cloud(r'*file_path_to_pcd_1.pcd*') target = o3d.io.read_point*cloud(r'file_path_to_pcd_2.pcd*') \# Visualize the initial alignment print("Initial alignment") draw_registration_result(source, target, np.identity(4)) \# Perform ICP registration threshold = 0.07 # Distance threshold, you may need to adjust this trans_init = np.identity(4) # Initial transformation reg_p2p = o3d.pipelines.registration.registration_icp( ``` source, target, threshold, trans_init, o3d.pipelines.registration.TransformationEstimationPointToPoint()) ``` print("Point-to-point ICP registration result:") print(reg_p2p) print("Transformation is:") print(reg_p2p.transformation) \# Visualize the result after ICP print("ICP result") draw_registration_result(source, target, reg_p2p.transformation) **I am using open3d with global and icp registration as well. I ran the both of my pcds in meshlab as well but the result was the same. Most of the items I am capturing do have a glare or shine in it. Based off some research I see that glare, plastic and some items that are reflective may conflict with my alignment of my point cloud. Correct me if I am wrong. Thanks :)**

@"gdeanrexroth"#p26270 for example this a screenshot of a point cloud that i cropped with using open3d. The unaligned points/voxels are evident in the screenshot. ICP and Global both struggle to refine or align two pcds because of the unalignment(my assumption). From above/straight view, everything looks on the point cloud. However as we see in the other screenshot, the sides are wrapped with the appropriate skin of the bottle. Again I have two scripts that does the work here. I have a script that captures the tof depth and raw color feed from the camera that does some calculations and convert the images into a .pcd or .ply file. Then I have the second script that is reading in the file path of both point clouds to do pre processing work along with alignment. [upl-image-preview url=https://discuss.luxonis.com/assets/files/2024-10-03/1727978320-22979-screenshot-2024-10-03-135809.png] [upl-image-preview url=https://discuss.luxonis.com/assets/files/2024-10-03/1727978320-140489-screenshot-2024-10-03-135733.png]

[ToF Depth Perception (luxonis.com)](https://docs.luxonis.com/hardware/platform/depth/ToF-depth/) As this article states the potential tof issues with reflective surfaces/transparent surfaces and the multiple path reflections. Would those issues affect alignment or registration? Most, if not all the material I am capturing will have a glare on it. But i am trying to create an environment that doesn't allow too much light where the light bounces off the material and the surface that it is on. However when I took depth of the a object with no glare, such as cardboard box(like how you guys did the box measurement example). The box point cloud looked very good. So I am curious if this is my main issue.

@"gdeanrexroth"#p26298 Yes, the reflective and transparent surfaces are indeed a problem for ToF. > @"gdeanrexroth"#p26298 But i am trying to create an environment that doesn't allow too much light where the light bounces off the material and the surface that it is on. That doesn't help. ToF emits it's own light. The spray bottle you are imaging looks like it has problems due to glare and transparency. if you try to wrap it in something non reflective, does it help? Thanks, Jaka

@"jakaskerl"#p26311 Yes, non reflective objects such as the cardboard box comes out pretty well. [upl-image-preview url=https://discuss.luxonis.com/assets/files/2024-10-04/1728071979-908783-screenshot-2024-10-04-155544.png] See the example above. Still a little bit off but it is formed and also defined. Most of all the points are aligned, except for a few. I am unable to wrap the bottle, so i went with the next available thing to me. Now that I know about the issues that reflective objects/surfaces cause. Most of the material/objects will have a glare or some sort of reflection. Any suggestions on how I can bypass this. I will continue to use refinement and other alignment methods on non reflective objects to see if it combines accurately.

@"gdeanrexroth"#p26315 Can't say, we are still figuring it out ourselves.. The refinement will need to be done on our side as well since the borders of the pointcloud are sometimes off by a few pixels (can be seen in your image as well).. Thanks, Jaka

@"jakaskerl"#p26325 Thank you for response. With that being said, I assume the Luxonis team will work on solving this issue. But as for me, my project with the camera ends this month. I am sure that whenever this issue is fixed by you guys, you will update us. Does this affect the alignment issues that I have going on right now? Can I also assume that ICP and Global registration would not work in my case due to the material I capturing or it will work with some issues?

> @"gdeanrexroth"#p26351 Does this affect the alignment issues that I have going on right now? [upl-image-preview url=https://discuss.luxonis.com/assets/files/2024-10-08/1728379797-265869-image.png] These are the issues (borders are not correctly aligned. > @"gdeanrexroth"#p26351 Can I also assume that ICP and Global registration would not work in my case due to the material I capturing or it will work with some issues? The issue is that ICP and global registration use distances between pointclouds to compute the fused pointcloud. As far as I understand you would need custom processing that would allow you to give larger weight to depth pointcloud if there is high reflectivity since stereo generally works better in those circumstances and is more accurate. Thanks Jaka

@"jakaskerl"#p26361 Okay then I will try stereo depth to see if i can get better results.

Point Cloud Alignment

jakaskerl

apirrone Likely with depthai V3

gdeanrexroth Yup.

gdeanrexroth Does the intrinsic or extrinsic value need to be in the script that captures the point cloud(pcd or ply file)? Or does it need to be implemented into the script that does the icp or global registration?

In the pointcloud generation. The idea is to create a colorized pointcloud -- depth needs to be aligned to RGB. This can only be done by knowing the extrinsics from RGB to stereo, and intrinsics of rgb and the rectified stereo frames.

Thanks,
Jaka

gdeanrexroth

jakaskerl

Nice, thank you for the update. Overall until you guys fix the tof-rgb alignment and release the depthai V3. I can assume that I will not be able to make any more progress on my current project utilizing the ToF senor to generate point cloud. Until the Luxonis team fixes this issue, correct? I've tried stereo depth to generate point cloud but it gave me similar results in regard of misaligned point cloud, but I will try again.

I believe that you placed a stereo depth point cloud in one the discussion threads that I have created. It does works however it generates the point cloud in a narrow shape. So I am still reading documentation on the different settings and parameters that are appropriate for me use case. Most of my objects will be 20cm to 50 cm away from the camera.

apirrone

jakaskerl Likely with depthai V3

Ok, and any timeline on the release of this version ?

I'm a little confused. If there is a bug in the ImageAlign node in firmware, how did you get such a perfectly aligned point cloud here ?

jakaskerl

gdeanrexroth

gdeanrexroth Overall until you guys fix the tof-rgb alignment and release the depthai V3. I can assume that I will not be able to make any more progress on my current project utilizing the ToF senor to generate point cloud. Until the Luxonis team fixes this issue, correct?

The new script I sent aims to fix this issue. We will probably just change the example (not sure what the current idea is for depthai). The only thing different seems to be the RGB undistortion. You can continue to develop the project just make sure you undistort the RGB camera.

apirrone I'm a little confused. If there is a bug in the ImageAlign node in firmware, how did you get such a perfectly aligned point cloud here ?

This was done here iirc.

Thanks,
Jaka

gdeanrexroth

jakaskerl
So with the file that you sent me, I was able to run it. And it successfully displayed the tof and rgb color camera window. I then processed to apply a method for me to save both rgb-color camera and tof depth. Below are the examples. there is still misalignment(which i assume is expected right now), but it does seem to pick up objects that are behind my purple bottle. Along with the space between the back wall and the front of the cardboard box

gdeanrexroth

gdeanrexroth The correct angle displays perfect alignment, but as a i rotate the .ply file you can see the gaps and holes. Even with pre and post processing filtering, it winds up the same. Better but still misaligned. I can and will continue my project however it heavily relies on displaying align material and parts point cloud. That's a huge part of the project with the camera. The expected results should look like the ToF Demo video.

@jakaskerl Here is somewhat of a better example.

jakaskerl

gdeanrexroth The correct angle displays perfect alignment, but as a i rotate the .ply file you can see the gaps and holes.

That is expected and is a result of occlusion.. This can't be fixed at all since information is missing.

gdeanrexroth So I am still unsure if I can make much more progress on my side since the issue is being fixed internally by you and your team.

The only fix on our part is include the RGB undistortion that was missed in the original release..

Thanks,
Jaka

gdeanrexroth

@jakaskerl
More examples of the new script point cloud, some improvements are noticeable. But from the side again everything is not fully aligned.

gdeanrexroth

@jakaskerl
With the pictures I attached above somewhat explains my side of the project as well. Without the fully, formed point cloud my team and I will not be able to view our material/parts in a 3d(point cloud) format. This results are with me undistorting the RGB camera. So I am still unsure if I can make much more progress on my side since the issue is being fixed internally by you and your team.

apirrone

jakaskerl

Undistorting the rgb doesn't help for me

Without rgb undistort :

With rgb undistort:

It's probably because I'm using fisheye lenses.
Using cv2.fisheye.initUndistortRectifyMap(...) doesn't help

jakaskerl

apirrone
Yes, most likely. The image is still not undistorted properly. Not sure if this was even tested.

cc @CenekAlbl

Thanks,
Jaka

gdeanrexroth

jakaskerl
With undistorting not working properly are there any steps you recommend us to take? Or do we have to wait until depthai v3?

I doubt that this is my issue but can incorrect calibration of misalignment? I have tried undistorting but it seemed to not work. I am new to the computer vision world so I am unaware of few terms. This is my result of using undistorting

jakaskerl

gdeanrexroth With undistorting not working properly are there any steps you recommend us to take? Or do we have to wait until depthai v3?

Should be working properly, just on on fisheye lenses.

gdeanrexroth This is my result of using undistorting

Did you use the ImageAlign node to align the tof and RGB? The RGB looks ok at first glance, so either extrinsics/intrinsics are incorrect or there is a mistake in aligning.

Thanks,
Jaka

gdeanrexroth

jakaskerl
Yes I did use the image aligned node within my script. I used the tof-rgb alignment script that you provided, I added a key function that will allow me to capture and save the tof-rgb depth as a ply file. There are other ways that I can approach this however, I simply wanted to see how the rgb would look.
import numpy as np

import cv2

import depthai as dai

import time

from datetime import timedelta

# This example is intended to run unchanged on an OAK-D-SR-PoE camera

FPS = 30.0

RGB_SOCKET = dai.CameraBoardSocket.CAM_C

TOF_SOCKET = dai.CameraBoardSocket.CAM_A

ALIGN_SOCKET = RGB_SOCKET

# Define intrinsic parameters

RGB_INTRINSICS = np.array([

[494.3519287109375, 0.0, 321.8478088378906],

[0.0, 499.4835205078125, 258.3044128417969],

[0.0, 0.0, 1.0]

])

TOF_INTRINSICS = np.array([

[842.6837768554688, 0.0, 673.1340942382812],

[0.0, 851.867431640625, 412.4818115234375],

[0.0, 0.0, 1.0]

])

# Define camera resolutions

RGB_WIDTH, RGB_HEIGHT = 640, 480

TOF_WIDTH, TOF_HEIGHT = 1280, 800

class FPSCounter:

def __init__(self):

    self.frameTimes = []

def tick(self):

    # record the current time for the FPS calculation

    now = time.time()

    self.frameTimes.append(now)

    self.frameTimes = self.frameTimes[-100:]

def getFps(self):

    if len(self.frameTimes) <= 1:

        return 0

    # Calculate the FPS based on the recorded frame times

    return (len(self.frameTimes) - 1) / (self.frameTimes[-1] - self.frameTimes[0])

def save_ply(rgb_image, depth_image, filename):

# Get dimensions of the RGB image

height, width, _ = rgb_image.shape

# Save as PLY file

with open(filename, 'w') as ply_file:

    #write ply header

    ply_file.write('ply\\n')

    ply_file.write('format ascii 1.0\\n')

    ply_file.write(f'element vertex {height \* width}\\n')

    ply_file.write('property float x\\n')

    ply_file.write('property float y\\n')

    ply_file.write('property float z\\n')

    ply_file.write('property uchar red\\n')

    ply_file.write('property uchar green\\n')

    ply_file.write('property uchar blue\\n')

    ply_file.write('end_header\\n')

    

    # Convert depth image to point cloud using ToF intrinsics

    fx = TOF_INTRINSICS[0, 0]

    fy = TOF_INTRINSICS[1, 1]

    cx = TOF_INTRINSICS[0, 2]

    cy = TOF_INTRINSICS[1, 2]

    

    for v in range(height):

        for u in range(width):

            z = depth_image[v, u]  # Depth value

            if z > 0:  # Ignore zero 

                # calculate 3d coordinates

                x = (u - cx) \* z / fx

                y = (v - cy) \* z / fy

                r, g, b = rgb_image[v, u]  # RGB color

                #write the vertex data to PLY file

                ply_file.write(f'{x} {y} {z} {r} {g} {b}\\n')

pipeline = dai.Pipeline()

# Define sources and outputs

camRgb = pipeline.create(dai.node.ColorCamera)

tof = pipeline.create(dai.node.ToF)

camTof = pipeline.create(dai.node.Camera)

sync = pipeline.create(dai.node.Sync)

align = pipeline.create(dai.node.ImageAlign)

out = pipeline.create(dai.node.XLinkOut)

# ToF settings

camTof.setFps(FPS)

camTof.setImageOrientation(dai.CameraImageOrientation.ROTATE_180_DEG)

camTof.setBoardSocket(TOF_SOCKET)

# rgb settings

camRgb.setBoardSocket(RGB_SOCKET)

camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_800_P)

camRgb.setFps(FPS)

camRgb.setIspScale(1, 1)

# defines image sizes

depthSize = (TOF_WIDTH, TOF_HEIGHT)

rgbSize = camRgb.getIspSize()

# set output stream name

out.setStreamName("out")

# configure synchronization threshold

sync.setSyncThreshold(timedelta(seconds=0.5 / FPS))

# Linking

camRgb.isp.link(sync.inputs["rgb"]) #link RGB camera output to sync node

camTof.raw.link(tof.input) #link TOF camera raw output to TOF node

tof.depth.link(align.input) #link TOF depth output to align node

align.outputAligned.link(sync.inputs["depth_aligned"]) #link aligned depth to sync node

sync.inputs["rgb"].setBlocking(False) #set rgb input as non-blocking

camRgb.isp.link(align.inputAlignTo) #link sync output to XLinkOut node

sync.out.link(out.input)

def colorizeDepth(frameDepth):

invalidMask = frameDepth == 0

try:

    minDepth = np.percentile(frameDepth[frameDepth != 0], 3)

    maxDepth = np.percentile(frameDepth[frameDepth != 0], 95)

    logDepth = np.log(frameDepth, where=frameDepth != 0)

    logMinDepth = np.log(minDepth)

    logMaxDepth = np.log(maxDepth)

    np.nan_to_num(logDepth, copy=False, nan=logMinDepth)

    logDepth = np.clip(logDepth, logMinDepth, logMaxDepth)

    depthFrameColor = np.interp(logDepth, (logMinDepth, logMaxDepth), (0, 255))

    depthFrameColor = np.nan_to_num(depthFrameColor)

    depthFrameColor = depthFrameColor.astype(np.uint8)

    depthFrameColor = cv2.applyColorMap(depthFrameColor, cv2.COLORMAP_JET)

    depthFrameColor[invalidMask] = 0

except IndexError:

    depthFrameColor = np.zeros((frameDepth.shape[0], frameDepth.shape[1], 3), dtype=np.uint8)

except Exception as e:

    raise e

return depthFrameColor

rgbWeight = 0.4

depthWeight = 0.6

def updateBlendWeights(percentRgb):

global depthWeight

global rgbWeight

rgbWeight = float(percentRgb) / 100.0

depthWeight = 1.0 - rgbWeight

# Connect to device and start pipeline

remapping = True

save_ply_flag = False

with dai.Device(pipeline) as device:

queue = device.getOutputQueue("out", 8, False)

rgbDepthWindowName = "rgb-depth"

cv2.namedWindow(rgbDepthWindowName)

cv2.createTrackbar(

    "RGB Weight %",

    rgbDepthWindowName,

    int(rgbWeight \* 100),

    100,

    updateBlendWeights,

)



try:

    calibData = device.readCalibration2()

    M1 = np.array(calibData.getCameraIntrinsics(ALIGN_SOCKET, \*depthSize))

    D1 = np.array(calibData.getDistortionCoefficients(ALIGN_SOCKET))

    M2 = np.array(calibData.getCameraIntrinsics(RGB_SOCKET, \*rgbSize))

    D2 = np.array(calibData.getDistortionCoefficients(RGB_SOCKET))

    # Use the predefined intrinsics if available, otherwise use the calibration data

    if TOF_INTRINSICS is not None:

        M1 = TOF_INTRINSICS

    if RGB_INTRINSICS is not None:

        M2 = RGB_INTRINSICS

    try:

        T = np.array(calibData.getCameraTranslationVector(ALIGN_SOCKET, RGB_SOCKET, False)) \* 10

    except RuntimeError:

        T = np.array([0.0, 0.0, 0.001])

    try:

        R = np.array(calibData.getCameraExtrinsics(ALIGN_SOCKET, RGB_SOCKET, False))[0:3, 0:3]

    except RuntimeError:

        R = np.eye(3)

    TARGET_MATRIX = M1

    lensPosition = calibData.getLensPosition(RGB_SOCKET)

except:

    raise



fpsCounter = FPSCounter()

while True:

    messageGroup = queue.get()

    fpsCounter.tick()

    assert isinstance(messageGroup, dai.MessageGroup)

    frameRgb = messageGroup["rgb"]

    assert isinstance(frameRgb, dai.ImgFrame)

    frameDepth = messageGroup["depth_aligned"]

    assert isinstance(frameDepth, dai.ImgFrame)

    sizeRgb = frameRgb.getData().size

    sizeDepth = frameDepth.getData().size

    

    if frameDepth is not None:

        rgbFrame = frameRgb.getCvFrame()

        alignedDepthColorized = colorizeDepth(frameDepth.getFrame())

        cv2.putText(

            alignedDepthColorized,

            f"FPS: {fpsCounter.getFps():.2f}",

            (10, 30),

            cv2.FONT_HERSHEY_SIMPLEX,

            1,

            (255, 255, 255),

            2,

        )

        cv2.imshow("depth", alignedDepthColorized)

        key = cv2.waitKey(1)

        if key == ord("m"):

            remapping = not remapping

            print(f"Remap turned {'ON' if remapping else 'OFF'}.")

        elif key == ord('s'):

            save_ply_flag = True

        if remapping:

            mapX, mapY = cv2.initUndistortRectifyMap(

                M2, D2, None, M2, rgbSize, cv2.CV_32FC1

            )

            rgbFrame = cv2.remap(rgbFrame, mapX, mapY, cv2.INTER_LINEAR)

        blended = cv2.addWeighted(

            rgbFrame, rgbWeight, alignedDepthColorized, depthWeight, 0

        )

        cv2.imshow(rgbDepthWindowName, blended)

    if save_ply_flag:

        rgb_frame = cv2.cvtColor(rgbFrame, cv2.COLOR_BGR2RGB)

        save_ply(rgb_frame, frameDepth.getFrame(), 'george_rgb1.ply')

        print("PLY file saved as 'george_rgb.ply'")

        save_ply_flag = False

    if key == ord("q"):

        break

cv2.destroyAllWindows()

I ran this script :https://docs.luxonis.com/software/depthai/examples/calibration_reader/
Here are the results from it, however if its extrinsic. Then I may have to recalibrate the camera with the method you recommended to me :
RGB Camera Default intrinsics...

[[494.3519287109375, 0.0, 321.8478088378906], [0.0, 499.4835205078125, 258.3044128417969], [0.0, 0.0, 1.0]]

640

480

RGB Camera Default intrinsics...

[[494.3519287109375, 0.0, 321.8478088378906], [0.0, 499.4835205078125, 258.3044128417969], [0.0, 0.0, 1.0]]

640

480

RGB Camera resized intrinsics... 3840 x 2160

[[2.96611157e+03 0.00000000e+00 1.93108691e+03]

[0.00000000e+00 2.99690112e+03 1.18982642e+03]

[0.00000000e+00 0.00000000e+00 1.00000000e+00]]

RGB Camera resized intrinsics... 4056 x 3040

[[3.13295532e+03 0.00000000e+00 2.03971057e+03]

[0.00000000e+00 3.16547681e+03 1.63600427e+03]

[0.00000000e+00 0.00000000e+00 1.00000000e+00]]

LEFT Camera Default intrinsics...

[[842.6837768554688, 0.0, 673.1340942382812], [0.0, 851.867431640625, 412.4818115234375], [0.0, 0.0, 1.0]]

1280

800

LEFT Camera resized intrinsics... 1280 x 720

[[842.68377686 0. 673.13409424]

[ 0. 851.86743164 372.48181152]

[ 0. 0. 1. ]]

RIGHT Camera resized intrinsics... 1280 x 720

[[836.24615479 0. 656.42828369]

[ 0. 845.62658691 399.05911255]

[ 0. 0. 1. ]]

LEFT Distortion Coefficients...

k1: -9.116254806518555

k2: 262.5550842285156

p1: 0.007134947460144758

p2: -0.0009857615223154426

k3: 1347.274169921875

k4: -9.195904731750488

k5: 260.98687744140625

k6: 1308.9786376953125

s1: 0.0

s2: 0.0

s3: 0.0

s4: 0.0

τx: 0.0

τy: 0.0

RIGHT Distortion Coefficients...

k1: -5.861973762512207

k2: 5.3061065673828125

p1: 0.005871884059160948

p2: 0.00142634566873312

k3: 88.46317291259766

k4: -6.072614669799805

k5: 7.037742614746094

k6: 83.25321960449219

s1: 0.0

s2: 0.0

s3: 0.0

s4: 0.0

τx: 0.0

τy: 0.0

RGB FOV 71.86000061035156, Mono FOV 71.86000061035156

LEFT Camera stereo rectification matrix...

[[ 9.94211665e-01 8.86633717e-03 -1.81476751e+01]

[-4.90145546e-03 9.94452611e-01 2.86947005e+01]

[ 2.85166444e-06 4.52051103e-06 9.96386328e-01]]

RIGHT Camera stereo rectification matrix...

[[ 1.00186534e+00 8.93177200e-03 -6.82440986e+00]

[-4.93918803e-03 1.00179181e+00 -7.21055399e-01]

[ 2.87361723e-06 4.55387305e-06 9.96286100e-01]]

Transformation matrix of where left Camera is W.R.T right Camera's optical center

[[ 9.99597728e-01 -7.52320397e-04 -2.83513945e-02 -3.98795390e+00]

[ 5.31902653e-04 9.99969602e-01 -7.78123224e-03 -2.61692833e-02]

[ 2.83563845e-02 7.76302209e-03 9.99567747e-01 -1.03633568e-01]

[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00]]

Transformation matrix of where left Camera is W.R.T RGB Camera's optical center

[[ 9.99843597e-01 -8.23507272e-03 1.56521089e-02 -7.51402760e+00]

[ 8.25367495e-03 9.99965310e-01 -1.12427305e-03 -1.49354547e-01]

[-1.56423096e-02 1.25328498e-03 9.99876857e-01 4.59043831e-01]

[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00]]

gdeanrexroth

@jakaskerl
Alignment is off some and this with the updated script that you provided me a few days ago.

gdeanrexroth

@jakaskerl

Screenshots of the same point cloud but from different angles. This is using the updated script that you gave me, along with the using the image align node.

jakaskerl

gdeanrexroth
Try this:

import os
import time
import json
import cv2
import depthai as dai
import numpy as np
from datetime import timedelta
print(dai.__version__)
from numba import jit, prange

@jit(nopython=True, parallel=True)
def reprojection(depth_image, depth_camera_intrinsics, camera_extrinsics, color_camera_intrinsics, depth_image_show = None):
    height = len(depth_image)
    width = len(depth_image[0])
    if depth_image_show is not None:
        image = np.zeros((height, width), np.uint8)
    else:
        image = np.zeros((height, width), np.uint16)
    if(camera_extrinsics[0][3] > 0):
        sign = 1
    else:
        sign = -1
    for i in prange(0, height):
        for j in prange(0, width):
            if sign == 1:
                # Reverse the order of the pixels
                j = width - j - 1
            d = depth_image[i][j]
            if(d == 0):
                continue
            # Convert pixel to 3d point
            x = (j - depth_camera_intrinsics[0][2]) * d / depth_camera_intrinsics[0][0]
            y = (i - depth_camera_intrinsics[1][2]) * d / depth_camera_intrinsics[1][1]
            z = d

            # Move the point to the camera frame
            x1 = camera_extrinsics[0][0] * x + camera_extrinsics[0][1] * y + camera_extrinsics[0][2] * z + camera_extrinsics[0][3]
            y1 = camera_extrinsics[1][0] * x + camera_extrinsics[1][1] * y + camera_extrinsics[1][2] * z + camera_extrinsics[1][3]
            z1 = camera_extrinsics[2][0] * x + camera_extrinsics[2][1] * y + camera_extrinsics[2][2] * z + camera_extrinsics[2][3]

            u = color_camera_intrinsics[0][0] * (x1  / z1) + color_camera_intrinsics[0][2]
            v = color_camera_intrinsics[1][1] * (y1  / z1) + color_camera_intrinsics[1][2]
            int_u = round(u)
            int_v = round(v)
            if(int_v != i):
                print(f'v -> {v} and i -> {i}') # This should never be printed
            if int_u >= 0 and int_u < (len(image[0]) - 1) and int_v >= 0 and int_v < len(image):
                if depth_image_show is not None:
                    image[int_v][int_u] = depth_image_show[i][j][0]
                    image[int_v][int_u + sign] = depth_image_show[i][j][0]
                else:
                    image[int_v][int_u] = z1
                    image[int_v][int_u + sign] = z1
    return image

def create_pipeline(ALIGN_SOCKET):
    pipeline = dai.Pipeline()

    # Create ToF node
    tof = pipeline.create(dai.node.ToF)
    tof.setNumShaves(4)

    # Configure the ToF node
    tofConfig = tof.initialConfig.get()
    tofConfig.enableFPPNCorrection = True
    tofConfig.enableOpticalCorrection = True
    tofConfig.enableWiggleCorrection = True
    tofConfig.enableTemperatureCorrection = True
    tofConfig.phaseUnwrappingLevel = 4
    tof.initialConfig.set(tofConfig)

    # Input for ToF configuration
    xinTofConfig = pipeline.create(dai.node.XLinkIn)
    xinTofConfig.setStreamName("tofConfig")
    xinTofConfig.out.link(tof.inputConfig)

    # Create ToF camera node
    cam_tof = pipeline.create(dai.node.Camera)
    cam_tof.setFps(20)
    cam_tof.setImageOrientation(dai.CameraImageOrientation.ROTATE_180_DEG)
    cam_tof.setBoardSocket(dai.CameraBoardSocket.CAM_A)
    cam_tof.raw.link(tof.input)

    # Create ColorCamera nodes for stereo pair
    colorLeft = pipeline.create(dai.node.ColorCamera)
    colorRight = pipeline.create(dai.node.ColorCamera)


    # Configure ColorCameras
    colorLeft.setBoardSocket(dai.CameraBoardSocket.CAM_B)
    colorRight.setBoardSocket(dai.CameraBoardSocket.CAM_C)
    colorLeft.setResolution(dai.ColorCameraProperties.SensorResolution.THE_800_P)
    colorRight.setResolution(dai.ColorCameraProperties.SensorResolution.THE_800_P)
    colorLeft.setFps(20)
    colorRight.setFps(20)
    colorLeft.setInterleaved(False)
    colorRight.setInterleaved(False)
    colorLeft.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)
    colorRight.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)
    #colorLeft.setImageOrientation(dai.CameraImageOrientation.NORMAL)
    #colorRight.setImageOrientation(dai.CameraImageOrientation.NORMAL)

    #colorLeft.setIspScale(2, 2)  # Corrected line
    #colorRight.setIspScale(2, 1)  # Corrected line


    # Create StereoDepth node
    stereo = pipeline.create(dai.node.StereoDepth)
    stereo.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.HIGH_DENSITY)
    stereo.initialConfig.setMedianFilter(dai.MedianFilter.MEDIAN_OFF)
    stereo.setLeftRightCheck(True)
    stereo.setExtendedDisparity(False)
    stereo.setSubpixel(False)
    stereo.setDepthAlign(ALIGN_SOCKET)

    # Link the RAW outputs of the ColorCameras to the StereoDepth node
    colorLeft.isp.link(stereo.left)
    colorRight.isp.link(stereo.right)


    # Create Sync node
    sync = pipeline.create(dai.node.Sync)
    sync.setSyncThreshold(timedelta(milliseconds=50))

    # Link outputs to Sync node with specified input names
    tof.depth.link(sync.inputs["depth_tof"])
    stereo.depth.link(sync.inputs["depth_stereo"])
    stereo.rectifiedLeft.link(sync.inputs["left_img"])
    stereo.rectifiedRight.link(sync.inputs["right_img"])
    colorLeft.isp.link(sync.inputs["rgb_img"])  # Corrected line

    # Create XLinkOut node
    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("sync_out")
    sync.out.link(xout.input)

    return pipeline

def get_calib(RGB_SOCKET, ALIGN_SOCKET, depthSize, rgbSize):
    calibData = device.readCalibration2()
    M1 = np.array(calibData.getCameraIntrinsics(ALIGN_SOCKET, *depthSize))
    D1 = np.array(calibData.getDistortionCoefficients(ALIGN_SOCKET))
    M2 = np.array(calibData.getCameraIntrinsics(RGB_SOCKET, *rgbSize))
    D2 = np.array(calibData.getDistortionCoefficients(RGB_SOCKET))
    try:
        T = (
            np.array(calibData.getCameraTranslationVector(ALIGN_SOCKET, RGB_SOCKET, False))
            * 10
        )  # to mm for matching the depth
    except:
        T = np.array([0.0, 0.0, 0.001])
    try:
        R = np.array(calibData.getCameraExtrinsics(ALIGN_SOCKET, RGB_SOCKET, False))[
        0:3, 0:3
        ]
    except:
        R = np.eye(3)
    TARGET_MATRIX = M1
    lensPosition = calibData.getLensPosition(RGB_SOCKET)
    return M1, D1, M2, D2, T, R, TARGET_MATRIX

def getAlignedDepth(frameDepth, M1, D1, M2, D2, T, R, TARGET_MATRIX, depthSize,rgbSize):
    R1, R2, _, _, _, _, _ = cv2.stereoRectify(M1, D1, M2, D2, (100, 100), R, T)  # The (100,100) doesn't matter as it is not used for calculating the rotation matrices
    leftMapX, leftMapY = cv2.initUndistortRectifyMap(M1, None, R1, TARGET_MATRIX, depthSize, cv2.CV_32FC1)
    depthRect = cv2.remap(frameDepth, leftMapX, leftMapY, cv2.INTER_NEAREST)
    newR = np.dot(R2, np.dot(R, R1.T))  # Should be very close to identity
    newT = np.dot(R2, T)
    combinedExtrinsics = np.eye(4)
    combinedExtrinsics[0:3, 0:3] = newR
    combinedExtrinsics[0:3, 3] = newT
    depthAligned = reprojection(depthRect, TARGET_MATRIX, combinedExtrinsics, TARGET_MATRIX)
    # Rotate the depth to the RGB frame
    R_back = R2.T
    mapX, mapY = cv2.initUndistortRectifyMap(TARGET_MATRIX, None, R_back, M2, rgbSize, cv2.CV_32FC1)
    outputAligned = cv2.remap(depthAligned, mapX, mapY, cv2.INTER_NEAREST)
    return outputAligned

MIN_DEPTH = 500  # mm
MAX_DEPTH = 10000  # mm
def colorizeDepth(frameDepth, minDepth=MIN_DEPTH, maxDepth=MAX_DEPTH):
    invalidMask = frameDepth == 0
    # Log the depth, minDepth and maxDepth
    logDepth = np.log(frameDepth, where=frameDepth != 0)
    logMinDepth = np.log(minDepth)
    logMaxDepth = np.log(maxDepth)
    depthFrameColor = np.interp(logDepth, (logMinDepth, logMaxDepth), (0, 255)).astype(
        np.uint8
    )
    depthFrameColor = cv2.applyColorMap(depthFrameColor, cv2.COLORMAP_JET)
    # Set invalid depth pixels to black
    depthFrameColor[invalidMask] = 0
    return depthFrameColor

RGB_SOCKET = dai.CameraBoardSocket.RGB
TOF_SOCKET = dai.CameraBoardSocket.CAM_A
LEFT_SOCKET = dai.CameraBoardSocket.LEFT
RIGHT_SOCKET = dai.CameraBoardSocket.RIGHT
ALIGN_SOCKET = RIGHT_SOCKET

COLOR_RESOLUTION = dai.ColorCameraProperties.SensorResolution.THE_1080_P
LEFT_RIGHT_RESOLUTION = dai.MonoCameraProperties.SensorResolution.THE_800_P
toFSize =  (640, 480)
rgbSize = (1280, 800)
rgbWeight = 0.4
depthWeight = 0.6

def updateBlendWeights(percent_rgb):
    """
    Update the rgb and depth weights used to blend depth/rgb image

    @param[in] percent_rgb The rgb weight expressed as a percentage (0..100)
    """
    global depthWeight
    global rgbWeight
    rgbWeight = float(percent_rgb) / 100.0
    depthWeight = 1.0 - rgbWeight

if __name__ == '__main__':
    pipeline = create_pipeline(ALIGN_SOCKET)
    rgb_depth_window_name = "rgb-depth"

    with dai.Device(pipeline) as device:
        cv2.namedWindow(rgb_depth_window_name)
        cv2.createTrackbar(
        "RGB Weight %",
        rgb_depth_window_name,
        int(rgbWeight * 100),
        100,
        updateBlendWeights,
        )
        # Create output queue
        q_sync = device.getOutputQueue(name="sync_out", maxSize=4, blocking=False)
        try:
            M1, D1, M2, D2, T, R, TARGET_MATRIX = get_calib(RGB_SOCKET, ALIGN_SOCKET, toFSize, rgbSize)
        except:
            raise
        # Read calibration data
        calibration_handler = device.readCalibration()
        camera_names = {
            dai.CameraBoardSocket.CAM_B: 'cam_b',
            dai.CameraBoardSocket.CAM_C: 'cam_c',
            dai.CameraBoardSocket.CAM_A: 'tof'
        }

        # Create timestamped directory
        timestamp = time.strftime("%Y%m%d_%H%M%S")
        save_dir = os.path.join("data", timestamp)
        os.makedirs(save_dir, exist_ok=True)

        extrinsics_coeffs_tof_cam_b = calibration_handler.getCameraExtrinsics(dai.CameraBoardSocket.CAM_A, dai.CameraBoardSocket.CAM_B)
        extrinsics_coeffs_cam_b_tof = calibration_handler.getCameraExtrinsics(dai.CameraBoardSocket.CAM_B, dai.CameraBoardSocket.CAM_A)
        extrinsics_coeffs_tof_cam_c = calibration_handler.getCameraExtrinsics(dai.CameraBoardSocket.CAM_A, dai.CameraBoardSocket.CAM_C)

        # Save calibration data into the folder
        calib_data = {}
        for camera_socket, camera_name in camera_names.items():
            intrinsics = calibration_handler.getCameraIntrinsics(camera_socket)
            dist_coeffs = calibration_handler.getDistortionCoefficients(camera_socket)
            calib_data[camera_name] = {
                'intrinsics': intrinsics,
                'distortion_coefficients': dist_coeffs
            }

        calib_data['extrinsics'] = {
            'tof_cam_b': extrinsics_coeffs_tof_cam_b,
            'cam_b_tof': extrinsics_coeffs_cam_b_tof,
            'tof_cam_c': extrinsics_coeffs_tof_cam_c

        }
        calibration_file = os.path.join(save_dir, "calibration.json")
        with open(calibration_file, 'w') as f:
            json.dump(calib_data, f, indent=4)

        frame_counter = 0
        while True:
            # Get synchronized messages
            msgGrp = q_sync.get()

            frames = {}
            for name, msg in msgGrp:
                frames[name] = msg.getCvFrame()

            if len(frames) == 5:
                # Process the frames
                depth_tof_frame = frames['depth_tof']
                depth_stereo_frame = frames['depth_stereo']
                left_frame = frames['left_img']
                right_frame = frames['right_img']
                rgb_frame = frames['rgb_img']

                # Save the frames
                depth_tof_filename = os.path.join(save_dir, f"depth_tof_{frame_counter:06d}.npy")
                depth_stereo_filename = os.path.join(save_dir, f"depth_stereo_{frame_counter:06d}.npy")
                left_filename = os.path.join(save_dir, f"left_img_{frame_counter:06d}.png")
                right_filename = os.path.join(save_dir, f"right_img_{frame_counter:06d}.png")
                rgb_filename = os.path.join(save_dir, f"rgb_img_{frame_counter:06d}.png")

                np.save(depth_tof_filename, depth_tof_frame)
                np.save(depth_stereo_filename, depth_stereo_frame)
                cv2.imwrite(left_filename, left_frame)
                cv2.imwrite(right_filename, right_frame)
                cv2.imwrite(rgb_filename, rgb_frame)

                # Optional: Display the images and depth maps
                # Normalize and colorize depth maps for visualization
                depth_tof_display = cv2.normalize(depth_tof_frame, None, 0, 255, cv2.NORM_MINMAX)
                depth_tof_display = np.uint8(depth_tof_display)
                depth_tof_display = cv2.applyColorMap(depth_tof_display, cv2.COLORMAP_JET)
                cv2.imshow("Depth ToF", depth_tof_display)

                depth_stereo_display = cv2.normalize(depth_stereo_frame, None, 0, 255, cv2.NORM_MINMAX)
                depth_stereo_display = np.uint8(depth_stereo_display)
                depth_stereo_display = cv2.applyColorMap(depth_stereo_display, cv2.COLORMAP_JET)
                cv2.imshow("Depth Stereo", depth_stereo_display)

                cv2.imshow("Left Image", left_frame)
                cv2.imshow("Right Image", right_frame)
                cv2.imshow("RGB Image", rgb_frame)
                h, w = depth_stereo_frame.shape[:2]
                M1_r, D1_r, M2_r, D2_r, T_r, R_r, TARGET_MATRIX_r = get_calib(RIGHT_SOCKET, TOF_SOCKET, toFSize, (w, h))
                alignedFrame = depth_stereo_frame
                alignedDepth = getAlignedDepth(depth_tof_frame, M1_r, D1_r, M2_r, D2_r, T_r, R_r, TARGET_MATRIX_r, toFSize,(w, h))
                frame_counter += 1

                alignedDepthColorized = colorizeDepth(alignedDepth)
                alignedFrame = colorizeDepth(alignedFrame)
                #mapX, mapY = cv2.initUndistortRectifyMap(
                #    M2_r, D2_r, None, M2_r, (w,h), cv2.CV_32FC1
                #)
                #alignedFrame = cv2.remap(alignedFrame, mapX, mapY, cv2.INTER_LINEAR)
                cv2.imshow("Aligned Image", alignedFrame)
                cv2.imshow("Aligned depth Image", alignedDepthColorized)
                #cv2.waitKey(0)

                blended = cv2.addWeighted(alignedFrame, rgbWeight, alignedDepthColorized, depthWeight, 0)
                cv2.imshow(rgb_depth_window_name, blended)
                # Exit condition
                if cv2.waitKey(1) == ord('q'):
                    break

        device.close()
        print('Data collection complete.')

The script aims to align TOF to depth.

LMK if it works. If it doesn't, either intrinsics or extrinsics are bad.

Thanks,
Jaka

gdeanrexroth

jakaskerl
This is the error I ran into:
line 219, in <module>

**pipeline = create_pipeline(ALIGN_SOCKET)**

           **^^^^^^^^^^^^^^^^^^^^^^^^^^^^^**

File "s:\DEPT\SVM4\Shared\Crossfunctional_Work\Projects\DepthCameras\LuxonisDepthAI\test_run\jimmy.py", line 61, in create_pipeline

**tof.setNumShaves(4)**

**^^^^^^^^^^^^^^^^**

AttributeError: 'depthai.node.ToF' object has no attribute 'setNumShaves'

gdeanrexroth

jakaskerl
So I was able to only stream the rgb-depth align window and this was the results. I mentioned before that I was going to attempt to add in the point cloud node like you suggested. I am still working on it.

For the results below. I commented out all the things that are in the screenshot below. Can you or your team try this method? I think the alignment was at a still state due to lag. The multiple streams slowed down the live stream, since it had to process a lot.

jakaskerl

Hi gdeanrexroth
But the first script is sent worked for me so whatever you changed in the script caused the issue.
The first script from jakaskerl worked right?

Thanks,
Jaka

jakaskerl

gdeanrexroth
Update depthai version

gdeanrexroth

jakaskerl
Thank you for your response. I updated depthai within my conda environment, i was on version 2.25.1.0 and now I am at 2.28.0.0. I then proceed to run the code that you gave me and it successfully ran. However it was extremely lagging, I can only assume that the behavior occurred due to the multiple streams that were outputed…left,right image, rgb, depth tof, stereo and etc.

This is the results of what were outputted. Is this expected? Or do I still have to recalibrate my camera? In my opinion the rgb depth seems to improved once again, but since the multiple streams caused lag. I was unable to fully conclude much. Is it possible for me to edit this script to capture the rgb-depth ,stereo depth and tof depth and save them as .ply files to view them as a point cloud? I proved the real image so you can understand the environment that I am in and also see the objects that I am capturing. If I have said anything that is incorrect, please correct me 🙂.

jakaskerl

gdeanrexroth
Alignment looks good! The multiple streams are the cause of lag (displaying them doesn't help either). No need to recalibrate 🙂.

gdeanrexroth With the new script that you gave me, do I have to add in my intrinsic and extrinsic parameters as I did before in a format like this? I asked a similar question before and I remember you saying that I should add it to the/a script that captures the pcd.

The script that creates the pointcloud. If you wish to convert the depth to pointcloud, you need intrinsics. You can also just use the pointcloud node and this will all be done on device.

The layering your image shows, is because your depth is limited to 95 disparity values. If you turn on subpixel, this should be much better since you now have 754(don't quote me on the value) values which results in smoother transition.

Thanks,
Jaka

« Previous Page Next Page »