Now that I am able to use the ToF sensor on the OAK-D-SR-PoE camera, I can capture ToF depth. I am struggling with the second half of my task. Whenever I view the point clouds in Rerun, MeshLab, Open3D or any other tool, they are still unaligned and contain unwanted artifacts. You can identify the object, but it is obvious that it is not well aligned, judging by the colors and the surrounding objects. I am sure this is not the camera's fault, as I have calibrated it by following the documentation. ICP and global registration seem to work, but since there are unwanted voxels and artifacts within the point clouds, the registration has a hard time picking the correct points to stitch together. I have tried other methods as well. Any suggestions would be helpful. The PCDs mostly look good; I just want the objects to be identifiable, meaning fully formed.

    jakaskerl

    Yes, I am using Open3D to align my point clouds. Here is an example of how I am applying it.

    import open3d as o3d
    import numpy as np
    import copy

    def draw_registration_result(source, target, transformation):
        source_temp = copy.deepcopy(source)
        target_temp = copy.deepcopy(target)
        source_temp.paint_uniform_color([1, 0.706, 0])  # Yellow
        target_temp.paint_uniform_color([0, 0.651, 0.929])  # Blue
        source_temp.transform(transformation)
        o3d.visualization.draw_geometries([source_temp, target_temp],
                                          zoom=0.4559,
                                          front=[0.6452, -0.3036, -0.7011],
                                          lookat=[1.9892, 2.0208, 1.8945],
                                          up=[-0.2779, -0.9482, 0.1556])

    # Load your point clouds
    source = o3d.io.read_point_cloud(r'file_path_to_pcd_1.pcd')
    target = o3d.io.read_point_cloud(r'file_path_to_pcd_2.pcd')

    # Visualize the initial alignment
    print("Initial alignment")
    draw_registration_result(source, target, np.identity(4))

    # Perform ICP registration
    threshold = 0.07  # Distance threshold, you may need to adjust this
    trans_init = np.identity(4)  # Initial transformation
    reg_p2p = o3d.pipelines.registration.registration_icp(
        source, target, threshold, trans_init,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())

    print("Point-to-point ICP registration result:")
    print(reg_p2p)
    print("Transformation is:")
    print(reg_p2p.transformation)

    # Visualize the result after ICP
    print("ICP result")
    draw_registration_result(source, target, reg_p2p.transformation)

    I am using Open3D with both global and ICP registration. I ran both of my PCDs through MeshLab as well, but the result was the same. Most of the items I am capturing have a glare or shine to them. Based on some research, I see that glare, plastic, and other reflective items may interfere with the alignment of my point clouds. Correct me if I am wrong.
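    For completeness, the global-registration step I am calling is roughly the standard Open3D FPFH + RANSAC recipe below, run on the same source/target clouds as in the snippet above. The voxel size and thresholds here are placeholders, not my exact values, and would need tuning for the ToF data.

    # Rough sketch of the global registration step (standard Open3D recipe);
    # voxel_size and the derived thresholds are placeholder values.
    def preprocess(pcd, voxel_size):
        pcd_down = pcd.voxel_down_sample(voxel_size)
        pcd_down.estimate_normals(
            o3d.geometry.KDTreeSearchParamHybrid(radius=voxel_size * 2, max_nn=30))
        fpfh = o3d.pipelines.registration.compute_fpfh_feature(
            pcd_down,
            o3d.geometry.KDTreeSearchParamHybrid(radius=voxel_size * 5, max_nn=100))
        return pcd_down, fpfh

    voxel_size = 0.01  # assumed to be meters; adjust to the scale of the clouds
    source_down, source_fpfh = preprocess(source, voxel_size)
    target_down, target_fpfh = preprocess(target, voxel_size)

    result_ransac = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
        source_down, target_down, source_fpfh, target_fpfh, True,
        voxel_size * 1.5,
        o3d.pipelines.registration.TransformationEstimationPointToPoint(False),
        3,
        [o3d.pipelines.registration.CorrespondenceCheckerBasedOnEdgeLength(0.9),
         o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(voxel_size * 1.5)],
        o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))

    # The RANSAC transformation is then used as trans_init for the ICP refinement.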

    Thanks 🙂

      gdeanrexroth
      For example, here is a screenshot of a point cloud that I cropped using Open3D. The unaligned points/voxels are evident in the screenshot. ICP and global registration both struggle to refine or align the two PCDs because of this misalignment (my assumption). From an overhead/straight-on view, everything on the point cloud looks fine. However, as the other screenshot shows, the sides end up wrapped with the skin of the bottle.

      Again, I have two scripts doing the work here. One script captures the ToF depth and raw color feed from the camera, does some calculations, and converts the frames into a .pcd or .ply file. The second script reads in the file paths of both point clouds and does the preprocessing work along with the alignment.

      ToF Depth Perception (luxonis.com)
      As this article states, ToF has potential issues with reflective/transparent surfaces and multi-path reflections. Would those issues affect alignment or registration? Most, if not all, of the material I am capturing will have a glare on it, but I am trying to create an environment without too much light, so that light doesn't bounce off the material and the surface it sits on. However, when I captured depth of an object with no glare, such as a cardboard box (like in your box measurement example), the box point cloud looked very good. So I am curious whether this is my main issue.

        gdeanrexroth
        Yes, the reflective and transparent surfaces are indeed a problem for ToF.

        gdeanrexroth But I am trying to create an environment without too much light, so that light doesn't bounce off the material and the surface it sits on.

        That doesn't help. ToF emits its own light.

        The spray bottle you are imaging looks like it has problems due to glare and transparency. If you try to wrap it in something non-reflective, does it help?

        Thanks,
        Jaka

          jakaskerl
          Yes, non-reflective objects such as the cardboard box come out pretty well.

          See the example above. It is still a little off, but the object is formed and defined. Most of the points are aligned, except for a few. I am unable to wrap the bottle, so I went with the next best thing available to me.

          Now that I know about the issues that reflective objects/surfaces cause: most of my materials/objects will have a glare or some sort of reflection. Any suggestions on how I can work around this? I will continue to use refinement and other alignment methods on non-reflective objects to see if they combine accurately.

            gdeanrexroth
            Can't say, we are still figuring it out ourselves. The refinement will need to be done on our side as well, since the borders of the pointcloud are sometimes off by a few pixels (this can be seen in your image as well).

            Thanks,
            Jaka

              jakaskerl
              Thank you for the response. With that being said, I assume the Luxonis team will work on solving this issue, but as for me, my project with the camera ends this month. I am sure that whenever this issue is fixed, you will update us. Does this affect the alignment issues that I have going on right now? Can I also assume that ICP and global registration would not work in my case due to the material I am capturing, or will they work with some issues?

                gdeanrexroth Does this affect the alignment issues that I have going on right now?

                These are the issues (the borders are not correctly aligned).

                gdeanrexroth Can I also assume that ICP and global registration would not work in my case due to the material I am capturing, or will they work with some issues?

                The issue is that ICP and global registration use distances between the pointclouds to compute the fused pointcloud. As far as I understand, you would need custom processing that gives a larger weight to the stereo depth pointcloud where there is high reflectivity, since stereo generally works better in those circumstances and is more accurate.
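                As a rough, untested illustration of the kind of pre-processing that can make the distance-based matching less sensitive to those artifacts (plain Open3D calls on the source/target clouds from your ICP snippet; all parameter values are guesses):

                # Sketch: remove isolated artifact points before registration so the
                # distance-based matching has fewer bad correspondences to latch onto.
                def clean_cloud(pcd, voxel_size=0.005):
                    pcd = pcd.voxel_down_sample(voxel_size)  # even out point density
                    # drop points whose neighbor distances are statistically abnormal
                    pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
                    # drop floating speckles with very few neighbors in a given radius
                    pcd, _ = pcd.remove_radius_outlier(nb_points=16, radius=voxel_size * 4)
                    return pcd

                source_clean = clean_cloud(source)
                target_clean = clean_cloud(target)
                # then run global registration / ICP on the cleaned clouds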

                Thanks
                Jaka

                  @jakaskerl
                  Stereo depth was not better. ToF did improve once I removed the extra lighting that was exposed to the camera. I know the documentation mentions some of the issues that both stereo and ToF may face. A controlled indoor environment may be the short-term solution for me. As you stated, you guys will be testing out the alignment issues, so I should still expect some misalignment. But as of today, it is expected to be this way, correct (even if the object is not reflective, shiny, or transparent)?

                    gdeanrexroth
                    Some efforts have been made to improve the ToF-RGB alignment.

                    The script below does some additional undistortion of the RGB; the results were allegedly better. It seems to be an issue with the alignment node in FW.

                    import numpy as np
                    import cv2
                    import depthai as dai
                    import time
                    from datetime import timedelta
                    
                    # This example is intended to run unchanged on an OAK-D-SR-PoE camera
                    FPS = 30.0
                    
                    RGB_SOCKET = dai.CameraBoardSocket.CAM_C
                    TOF_SOCKET = dai.CameraBoardSocket.CAM_A
                    ALIGN_SOCKET = RGB_SOCKET
                    
                    class FPSCounter:
                        def __init__(self):
                            self.frameTimes = []
                    
                        def tick(self):
                            now = time.time()
                            self.frameTimes.append(now)
                            self.frameTimes = self.frameTimes[-100:]
                    
                        def getFps(self):
                            if len(self.frameTimes) <= 1:
                                return 0
                            # Calculate the FPS
                            return (len(self.frameTimes) - 1) / (self.frameTimes[-1] - self.frameTimes[0])
                    
                    
                    pipeline = dai.Pipeline()
                    # Define sources and outputs
                    camRgb = pipeline.create(dai.node.ColorCamera)
                    tof = pipeline.create(dai.node.ToF)
                    camTof = pipeline.create(dai.node.Camera)
                    sync = pipeline.create(dai.node.Sync)
                    align = pipeline.create(dai.node.ImageAlign)
                    out = pipeline.create(dai.node.XLinkOut)
                    
                    # ToF settings
                    camTof.setFps(FPS)
                    camTof.setImageOrientation(dai.CameraImageOrientation.ROTATE_180_DEG)
                    camTof.setBoardSocket(TOF_SOCKET)
                    
                    # rgb settings
                    camRgb.setBoardSocket(RGB_SOCKET)
                    camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_800_P)
                    camRgb.setFps(FPS)
                    camRgb.setIspScale(1, 2)
                    
                    depthSize = (1280,800) #PLEASE SET TO BE SIZE OF THE TOF STREAM
                    rgbSize = camRgb.getIspSize()
                    
                    out.setStreamName("out")
                    
                    sync.setSyncThreshold(timedelta(seconds=0.5 / FPS))
                    rgbSize = camRgb.getIspSize()
                    
                    # Linking
                    camRgb.isp.link(sync.inputs["rgb"])
                    camTof.raw.link(tof.input)
                    tof.depth.link(align.input)
                    align.outputAligned.link(sync.inputs["depth_aligned"])
                    sync.inputs["rgb"].setBlocking(False)
                    camRgb.isp.link(align.inputAlignTo)
                    sync.out.link(out.input)
                    
                    def colorizeDepth(frameDepth):
                        invalidMask = frameDepth == 0
                        # Log the depth, minDepth and maxDepth
                        try:
                            minDepth = np.percentile(frameDepth[frameDepth != 0], 3)
                            maxDepth = np.percentile(frameDepth[frameDepth != 0], 95)
                            logDepth = np.log(frameDepth, where=frameDepth != 0)
                            logMinDepth = np.log(minDepth)
                            logMaxDepth = np.log(maxDepth)
                            np.nan_to_num(logDepth, copy=False, nan=logMinDepth)
                            # Clip the values to be in the 0-255 range
                            logDepth = np.clip(logDepth, logMinDepth, logMaxDepth)
                    
                            # Interpolate only valid logDepth values, setting the rest based on the mask
                            depthFrameColor = np.interp(logDepth, (logMinDepth, logMaxDepth), (0, 255))
                            depthFrameColor = np.nan_to_num(depthFrameColor)
                            depthFrameColor = depthFrameColor.astype(np.uint8)
                            depthFrameColor = cv2.applyColorMap(depthFrameColor, cv2.COLORMAP_JET)
                            # Set invalid depth pixels to black
                            depthFrameColor[invalidMask] = 0
                        except IndexError:
                            # Frame is likely empty
                            depthFrameColor = np.zeros((frameDepth.shape[0], frameDepth.shape[1], 3), dtype=np.uint8)
                        except Exception as e:
                            raise e
                        return depthFrameColor
                    
                    
                    rgbWeight = 0.4
                    depthWeight = 0.6
                    
                    
                    def updateBlendWeights(percentRgb):
                        """
                        Update the rgb and depth weights used to blend depth/rgb image
                    
                        @param[in] percent_rgb The rgb weight expressed as a percentage (0..100)
                        """
                        global depthWeight
                        global rgbWeight
                        rgbWeight = float(percentRgb) / 100.0
                        depthWeight = 1.0 - rgbWeight
                    
                    
                    
                    # Connect to device and start pipeline
                    remapping = True
                    with dai.Device(pipeline) as device:
                        queue = device.getOutputQueue("out", 8, False)
                    
                        # Configure windows; trackbar adjusts blending ratio of rgb/depth
                        rgbDepthWindowName = "rgb-depth"
                    
                        cv2.namedWindow(rgbDepthWindowName)
                        cv2.createTrackbar(
                            "RGB Weight %",
                            rgbDepthWindowName,
                            int(rgbWeight * 100),
                            100,
                            updateBlendWeights,
                        )
                        try:
                            calibData = device.readCalibration2()
                            M1 = np.array(calibData.getCameraIntrinsics(ALIGN_SOCKET, *depthSize))
                            D1 = np.array(calibData.getDistortionCoefficients(ALIGN_SOCKET))
                            M2 = np.array(calibData.getCameraIntrinsics(RGB_SOCKET, *rgbSize))
                            D2 = np.array(calibData.getDistortionCoefficients(RGB_SOCKET))
                    
                            try:
                                T = (
                                    np.array(calibData.getCameraTranslationVector(ALIGN_SOCKET, RGB_SOCKET, False))
                                    * 10
                                )  # to mm for matching the depth
                            except RuntimeError:
                                T = np.array([0.0, 0.0, 0.001])
                            try:
                                R = np.array(calibData.getCameraExtrinsics(ALIGN_SOCKET, RGB_SOCKET, False))[
                                    0:3, 0:3
                                ]
                            except RuntimeError:
                                R = np.eye(3)
                            TARGET_MATRIX = M1
                    
                            lensPosition = calibData.getLensPosition(RGB_SOCKET)
                        except:
                            raise
                        fpsCounter = FPSCounter()
                        while True:
                            messageGroup = queue.get()
                            fpsCounter.tick()
                            assert isinstance(messageGroup, dai.MessageGroup)
                            frameRgb = messageGroup["rgb"]
                            assert isinstance(frameRgb, dai.ImgFrame)
                            frameDepth = messageGroup["depth_aligned"]
                            assert isinstance(frameDepth, dai.ImgFrame)
                    
                            sizeRgb = frameRgb.getData().size
                            sizeDepth = frameDepth.getData().size
                            # Blend when both received
                            if frameDepth is not None:
                                rgbFrame = frameRgb.getCvFrame()
                                # Colorize the aligned depth
                                alignedDepthColorized = colorizeDepth(frameDepth.getFrame())
                                # Resize depth to match the rgb frame
                                cv2.putText(
                                    alignedDepthColorized,
                                    f"FPS: {fpsCounter.getFps():.2f}",
                                    (10, 30),
                                    cv2.FONT_HERSHEY_SIMPLEX,
                                    1,
                                    (255, 255, 255),
                                    2,
                                )
                                cv2.imshow("depth", alignedDepthColorized)
                                key = cv2.waitKey(1)
                                if key == ord("m"):
                                    if remapping:
                                        print("Remap turned OFF.")
                                        remapping = False
                                    else:
                                        print("Remap turned ON.")
                                        remapping = True
                    
                                if remapping:
                                    mapX, mapY = cv2.initUndistortRectifyMap(
                                        M2, D2, None, M2, rgbSize, cv2.CV_32FC1
                                    )
                                    rgbFrame = cv2.remap(rgbFrame, mapX, mapY, cv2.INTER_LINEAR)
                    
                                blended = cv2.addWeighted(
                                    rgbFrame, rgbWeight, alignedDepthColorized, depthWeight, 0
                                )
                                cv2.imshow(rgbDepthWindowName, blended)
                    
                            key = cv2.waitKey(1)
                            if key == ord("q"):
                                break

                    Thanks,
                    Jaka

                      jakaskerl
                      Nice!
                      Thank you, I will look at this updated file and try to implement it into my ToF-depth point cloud script. However, will you guys create or update the ToF-depth script to align with the new changes?

                        jakaskerl

                        With the adjustments being made to the ToF-RGB alignment script, does this affect the ToF-PointCloud script that your team has created?

                          @jakaskerl
                          My last question for now is about the camera intrinsic and extrinsic values. I know that the extrinsic values are retrieved from calibration and that they do not affect the focal length or FOV; they describe the camera's location and orientation. In contrast, the intrinsic values can be retrieved with a Python script that you guys have created. The intrinsic parameters of a camera depend on how it captures images: parameters such as focal length, aperture, field of view, resolution, etc. make up the intrinsic matrix of a camera model. Do the intrinsic or extrinsic values need to be in the script that captures the point cloud (the .pcd or .ply file)? Or do they need to be implemented in the script that does the ICP or global registration?

                          I assume the ToF-RGB alignment is not affected by where I put these values; however, I am unsure whether they affect how the camera captures the point cloud or the depth.

                          Python script from here:
                          For testing purposes I have added my intrinsic values and a pinhole intrinsic (see the intrinsic block in the script below) to see if my point cloud would look any different.

                          import depthai as dai
                          import numpy as np
                          import cv2
                          import time
                          from datetime import timedelta
                          import datetime
                          import os
                          import sys

                          try:
                              import open3d as o3d
                          except ImportError:
                              sys.exit("Critical dependency missing: Open3D. Please install it using the command: '{} -m pip install open3d' and then rerun the script.".format(sys.executable))

                          FPS = 30

                          RGB_SOCKET = dai.CameraBoardSocket.CAM_C
                          TOF_SOCKET = dai.CameraBoardSocket.CAM_A
                          ALIGN_SOCKET = RGB_SOCKET

                          pipeline = dai.Pipeline()

                          # Define sources and outputs
                          camRgb = pipeline.create(dai.node.ColorCamera)
                          tof = pipeline.create(dai.node.ToF)
                          camTof = pipeline.create(dai.node.Camera)
                          sync = pipeline.create(dai.node.Sync)
                          align = pipeline.create(dai.node.ImageAlign)
                          out = pipeline.create(dai.node.XLinkOut)
                          pointcloud = pipeline.create(dai.node.PointCloud)

                          # Camera intrinsic parameters (ensure I am using the correct calibration values)
                          fx = 494.35192765  # Update with my calibrated value
                          fy = 499.48351759  # Update with my calibrated value
                          cx = 321.84779556  # Update with my calibrated value
                          cy = 218.30442303  # Update with my calibrated value
                          intrinsic = o3d.camera.PinholeCameraIntrinsic(width=640, height=480, fx=fx, fy=fy, cx=cx, cy=cy)

                          # ToF settings
                          camTof.setFps(FPS)
                          camTof.setImageOrientation(dai.CameraImageOrientation.ROTATE_180_DEG)
                          camTof.setBoardSocket(TOF_SOCKET)

                          tofConfig = tof.initialConfig.get()
                          # choose a median filter or use none - using the median filter improves the pointcloud but causes discretization of the data
                          tofConfig.median = dai.MedianFilter.KERNEL_7x7
                          # tofConfig.median = dai.MedianFilter.KERNEL_5x5
                          # tofConfig.median = dai.MedianFilter.KERNEL_7x7
                          tof.initialConfig.set(tofConfig)

                          # rgb settings
                          camRgb.setBoardSocket(RGB_SOCKET)
                          camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_800_P)
                          camRgb.setFps(FPS)
                          camRgb.setIspScale(3, 4)

                          depthSize = (1280, 800)  # PLEASE SET TO BE SIZE OF THE TOF STREAM
                          rgbSize = camRgb.getIspSize()

                          out.setStreamName("out")

                          sync.setSyncThreshold(timedelta(seconds=(0.5 / FPS)))

                          # Linking
                          camRgb.isp.link(sync.inputs["rgb"])
                          camTof.raw.link(tof.input)
                          tof.depth.link(align.input)
                          # align.outputAligned.link(sync.inputs["depth_aligned"])
                          align.outputAligned.link(pointcloud.inputDepth)
                          sync.inputs["rgb"].setBlocking(False)
                          camRgb.isp.link(align.inputAlignTo)
                          pointcloud.outputPointCloud.link(sync.inputs["pcl"])
                          sync.out.link(out.input)

                          def colorizeDepth(frameDepth):
                              invalidMask = frameDepth == 0
                              try:
                                  minDepth = np.percentile(frameDepth[frameDepth != 0], 3)
                                  maxDepth = np.percentile(frameDepth[frameDepth != 0], 95)
                                  logDepth = np.log(frameDepth, where=frameDepth != 0)
                                  logMinDepth = np.log(minDepth)
                                  logMaxDepth = np.log(maxDepth)
                                  np.nan_to_num(logDepth, copy=False, nan=logMinDepth)
                                  logDepth = np.clip(logDepth, logMinDepth, logMaxDepth)
                                  depthFrameColor = np.interp(logDepth, (logMinDepth, logMaxDepth), (0, 255))
                                  depthFrameColor = np.nan_to_num(depthFrameColor)
                                  depthFrameColor = depthFrameColor.astype(np.uint8)
                                  depthFrameColor = cv2.applyColorMap(depthFrameColor, cv2.COLORMAP_JET)
                                  depthFrameColor[invalidMask] = 0
                              except IndexError:
                                  depthFrameColor = np.zeros((frameDepth.shape[0], frameDepth.shape[1], 3), dtype=np.uint8)
                              except Exception as e:
                                  raise e
                              return depthFrameColor

                          rgbWeight = 0.4
                          depthWeight = 0.6

                          def updateBlendWeights(percentRgb):
                              global depthWeight
                              global rgbWeight
                              rgbWeight = float(percentRgb) / 100.0
                              depthWeight = 1.0 - rgbWeight

                          with dai.Device(pipeline) as device:
                              isRunning = True
                              q = device.getOutputQueue(name="out", maxSize=4, blocking=False)
                              vis = o3d.visualization.VisualizerWithKeyCallback()
                              vis.create_window()
                              pcd = o3d.geometry.PointCloud()
                              coordinateFrame = o3d.geometry.TriangleMesh.create_coordinate_frame(size=1000, origin=[0, 0, 0])
                              vis.add_geometry(coordinateFrame)
                              first = True
                              view_control = vis.get_view_control()

                              while isRunning:
                                  inMessage = q.get()
                                  inColor = inMessage["rgb"]
                                  inPointCloud = inMessage["pcl"]
                                  cvColorFrame = inColor.getCvFrame()
                                  cvRGBFrame = cv2.cvtColor(cvColorFrame, cv2.COLOR_BGR2RGB)
                                  cv2.imshow("color", cvColorFrame)
                                  key = cv2.waitKey(1)
                                  if key == ord('q'):
                                      break
                                  if key == ord('c'):
                                      print("saving...")
                                      current_time = datetime.datetime.now()
                                      formatted_time = current_time.strftime("%Y_%m_%d_%H_%M_%S")
                                      new_output = formatted_time
                                      os.mkdir(new_output)
                                      o3d.io.write_point_cloud(os.path.join(new_output, "tof_pointcloud.ply"), pcd)
                                      cv2.imwrite(os.path.join(new_output, "Image_of_material.png"), cvColorFrame)
                                      print(f"RGB point cloud saved to folder {new_output}")
                                  if inPointCloud:
                                      points = inPointCloud.getPoints().astype(np.float64)
                                      points[:, 1] = -points[:, 1]  # Invert Y axis
                                      pcd.points = o3d.utility.Vector3dVector(points)
                                      colors = (cvRGBFrame.reshape(-1, 3) / 255.0).astype(np.float64)
                                      pcd.colors = o3d.utility.Vector3dVector(colors)
                                      if first:
                                          vis.add_geometry(pcd)
                                          first = False
                                      else:
                                          vis.update_geometry(pcd)
                                  vis.poll_events()
                                  vis.update_renderer()

                              vis.destroy_window()

                            apirrone Likely with depthai V3

                            gdeanrexroth Yup.

                            gdeanrexroth Do the intrinsic or extrinsic values need to be in the script that captures the point cloud (the .pcd or .ply file)? Or do they need to be implemented in the script that does the ICP or global registration?

                            They belong in the pointcloud generation. The idea is to create a colorized pointcloud -- the depth needs to be aligned to the RGB. This can only be done by knowing the extrinsics from RGB to stereo, and the intrinsics of the RGB and the rectified stereo frames.
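                            To illustrate where the intrinsics come in, here is a minimal sketch (not the exact FW implementation) of back-projecting an RGB-aligned depth frame into a colored pointcloud with Open3D; it assumes the depth is uint16 in millimeters and already aligned/resized to the RGB frame:

                            # Sketch: depth + RGB + intrinsics -> colored pointcloud.
                            import numpy as np
                            import open3d as o3d

                            def depth_to_colored_pcd(rgb, depth_mm, fx, fy, cx, cy):
                                h, w = depth_mm.shape
                                intrinsic = o3d.camera.PinholeCameraIntrinsic(w, h, fx, fy, cx, cy)
                                rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
                                    o3d.geometry.Image(np.ascontiguousarray(rgb)),
                                    o3d.geometry.Image(depth_mm),
                                    depth_scale=1000.0,  # mm -> m
                                    depth_trunc=1.0,     # drop points beyond 1 m
                                    convert_rgb_to_intensity=False)
                                return o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intrinsic)

                            The registration script then only needs the saved .pcd/.ply files; by that point the intrinsics are already baked into the 3D coordinates.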

                            Thanks,
                            Jaka

                              jakaskerl

                              Nice, thank you for the update. Overall, until you guys fix the ToF-RGB alignment and release DepthAI V3, I can assume that I will not be able to make any more progress on my current project using the ToF sensor to generate point clouds, correct? I've tried stereo depth to generate a point cloud, but it gave me similar results with regard to the misalignment, though I will try again.

                              I believe you posted a stereo depth point cloud example in one of the discussion threads that I created. It does work; however, it generates the point cloud in a narrow shape. So I am still reading the documentation on the different settings and parameters that are appropriate for my use case. Most of my objects will be 20 cm to 50 cm away from the camera.
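                              Since I am already cropping the clouds in Open3D, I will probably also crop to that 20-50 cm working range before registration, roughly like this (assuming the saved coordinates are in millimeters, which depends on how my capture script writes them):

                              # Sketch: keep only points inside the ~200-500 mm working range (with margin)
                              import open3d as o3d

                              pcd = o3d.io.read_point_cloud("tof_pointcloud.ply")
                              box = o3d.geometry.AxisAlignedBoundingBox(
                                  min_bound=(-500.0, -500.0, 150.0),  # x, y, z minimums in mm
                                  max_bound=(500.0, 500.0, 550.0))    # x, y, z maximums in mm
                              pcd_cropped = pcd.crop(box)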