Impact of Latency on Practical Applications

Hi YaoHui,
I haven't been able to repro this issue - OpenVINO is a PITA to work with - so I haven't come to any conclusion. But your depthai tests seem to be what I would expect as well, and the OpenVINO results are a bit far-fetched, at least for the send-to-receive result (that is, the time from when you send an image to the device until the NN results are returned to the host machine).
Thanks, Erik

    Hi erik,

    Since we need to calculate world coordinates, we wanted to make sure that performing depth estimation on the OAK-D would not affect the overall speed.
    We simulated this by fixing the depth distance (Z = 600 mm) and calculating the world coordinates on the host side (without using the stereo lenses).
    From this experiment, we found that skipping depth estimation on the OAK-D effectively reduces the overall latency.
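
    For reference, the host-side conversion is just a pinhole back-projection at a fixed depth. A minimal sketch (the 640x400 resolution and the RIGHT socket are assumptions, not necessarily what we use):

    import depthai as dai

    # Read the intrinsics from the device calibration
    with dai.Device() as device:
        calib = device.readCalibration()
        M = calib.getCameraIntrinsics(dai.CameraBoardSocket.RIGHT, 640, 400)
    fx, fy = M[0][0], M[1][1]
    cx, cy = M[0][2], M[1][2]

    Z = 600  # fixed depth distance in mm

    def to_world(u, v):
        # Pinhole back-projection of pixel (u, v) at the fixed depth Z (mm)
        X = (u - cx) * Z / fx
        Y = (v - cy) * Z / fy
        return X, Y, Z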


      Hi YaoHui,
      If I understand correctly: instead of computing the 3D coordinates on-device (with a Spatial Detection Network), you are just performing object detection on-device, streaming both the results and the depth to the host, and calculating the spatial coordinates on the host to reduce latency?
      What stereo depth modes/postprocessing filters do you have enabled? They usually add additional delay.
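
      For reference, the optional postprocessing filters can be toggled on the StereoDepth node's initial config; a rough sketch (assuming a StereoDepth node named stereo, as in the script further down):

      # Each of these filters adds some processing delay when enabled
      config = stereo.initialConfig.get()
      config.postProcessing.speckleFilter.enable = False
      config.postProcessing.temporalFilter.enable = False
      config.postProcessing.spatialFilter.enable = False
      stereo.initialConfig.set(config)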
      Thanks, Erik

        Hi erik,

        If I understand correctly: instead of computing the 3D coordinates on-device (with a Spatial Detection Network), you are just performing object detection on-device, streaming both the results and the depth to the host, and calculating the spatial coordinates on the host to reduce latency?

        The method proposed before was to complete the whole process on the OAK-D and output only world coordinates.
        The method proposed now is to complete the whole process on the OAK-D, output only image coordinates, and convert them to world coordinates (with a fixed depth distance Z) on the host side.

        Neither of the above methods outputs image data.

        What stereo depth modes/postprocessing filters do you have enabled? They usually add additional delay.

        Subpixel + LR-check. But according to the documentation, 96 FPS can be maintained with these two modes enabled.


          Hi YaoHui,
          I have tried this locally as well; these are my results for 400P with LR-check + Subpixel:

          FPS: 80.0, Latency: 24.03 ms, Average latency: 23.87 ms, Std: 0.85
          FPS: 80.0, Latency: 23.96 ms, Average latency: 23.87 ms, Std: 0.85
          FPS: 80.0, Latency: 23.76 ms, Average latency: 23.87 ms, Std: 0.85

          I have used this script:

          import depthai as dai
          import numpy as np
          from depthai_sdk import FPSHandler
          
          # Create pipeline
          pipeline = dai.Pipeline()
          # This might reduce the latency on some systems
          pipeline.setXLinkChunkSize(0)
          
          FPS = 80
          # Define source and output
          monoLeft = pipeline.create(dai.node.MonoCamera)
          monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
          monoLeft.setFps(FPS)
          monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
          
          monoRight = pipeline.create(dai.node.MonoCamera)
          monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
          monoRight.setFps(FPS)
          monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT)
          
          stereo = pipeline.create(dai.node.StereoDepth)
          # stereo.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.HIGH_DENSITY)
          # stereo.initialConfig.setMedianFilter(dai.MedianFilter.KERNEL_7x7)
          stereo.setLeftRightCheck(True)
          stereo.setExtendedDisparity(False)
          stereo.setSubpixel(True)
          
          # Linking
          monoLeft.out.link(stereo.left)
          monoRight.out.link(stereo.right)
          
          xout = pipeline.create(dai.node.XLinkOut)
          xout.setStreamName('out')
          stereo.depth.link(xout.input)
          
          fps = FPSHandler()
          
          # Connect to device and start pipeline
          with dai.Device(pipeline) as device:
              print(device.getUsbSpeed())
              q = device.getOutputQueue(name="out")
              diffs = np.array([])
              while True:
                  imgFrame = q.get()
                  fps.nextIter()
                  # Latency in milliseconds
                  latencyMs = (dai.Clock.now() - imgFrame.getTimestamp()).total_seconds() * 1000
                  diffs = np.append(diffs, latencyMs)
                  print('FPS: {:.1f}, Latency: {:.2f} ms, Average latency: {:.2f} ms, Std: {:.2f}'.format(fps.fps(), latencyMs, np.average(diffs), np.std(diffs)))

          Could you confirm the same on your side?
          Thanks, Erik

            Hi erik,

            400P LR-check + Subpixel:

            FPS: 60.3, Latency: 20.97 ms, Average latency: 20.41 ms, Std: 0.96
            FPS: 60.3, Latency: 20.98 ms, Average latency: 20.41 ms, Std: 0.95
            FPS: 60.3, Latency: 20.92 ms, Average latency: 20.41 ms, Std: 0.95

            Testing the depth output alone yields results similar to yours.
            But what I am more worried about is that when converting to world coordinates, the depth frame and the NN result need to wait for each other, adding to the overall latency.
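
            For context, the host-side pairing we have in mind is roughly the following (the queue objects and the callback are illustrative, not our actual code):

            # Pair depth frames with NN results by sequence number; both streams
            # carry matching sequence numbers for the same source frame.
            def sync_loop(qDepth, qNn, on_pair):
                pending = {}  # seqNum -> {'depth': msg, 'nn': msg}
                while True:
                    for name, q in (('depth', qDepth), ('nn', qNn)):
                        msg = q.tryGet()
                        if msg is None:
                            continue
                        seq = msg.getSequenceNum()
                        pending.setdefault(seq, {})[name] = msg
                        if len(pending[seq]) == 2:
                            # The pair's latency is bounded by the slower stream
                            on_pair(pending.pop(seq))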


              YaoHui yes, that makes sense - what's the latency of your NN model? You could use a similar script, just replace the StereoDepth node with a NeuralNetwork node.
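
              Something along these lines should work (the blob path and the 300x300 input size are placeholders for your model):

              import depthai as dai
              import numpy as np

              pipeline = dai.Pipeline()
              pipeline.setXLinkChunkSize(0)

              camRgb = pipeline.create(dai.node.ColorCamera)
              camRgb.setPreviewSize(300, 300)  # must match your model's input size
              camRgb.setInterleaved(False)
              camRgb.setFps(60)

              nn = pipeline.create(dai.node.NeuralNetwork)
              nn.setBlobPath('model.blob')  # placeholder path to your blob
              camRgb.preview.link(nn.input)

              xout = pipeline.create(dai.node.XLinkOut)
              xout.setStreamName('nn')
              nn.out.link(xout.input)

              with dai.Device(pipeline) as device:
                  q = device.getOutputQueue(name='nn')
                  diffs = np.array([])
                  while True:
                      nnData = q.get()
                      # Latency from frame capture to NN result arriving on the host
                      latencyMs = (dai.Clock.now() - nnData.getTimestamp()).total_seconds() * 1000
                      diffs = np.append(diffs, latencyMs)
                      print('Latency: {:.2f} ms, Average: {:.2f} ms, Std: {:.2f}'.format(
                          latencyMs, np.average(diffs), np.std(diffs)))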
              Thanks, Erik

                Hi erik,

                This is the latency of my NeuralNetwork node:

                FPS: 60.5, Latency: 19.09 ms, Average latency: 20.40 ms, Std: 0.95
                FPS: 60.5, Latency: 20.96 ms, Average latency: 20.41 ms, Std: 0.95
                FPS: 60.2, Latency: 20.95 ms, Average latency: 20.41 ms, Std: 0.95
                FPS: 60.5, Latency: 19.02 ms, Average latency: 20.40 ms, Std: 0.95
                FPS: 60.3, Latency: 20.85 ms, Average latency: 20.41 ms, Std: 0.95
                FPS: 60.3, Latency: 20.87 ms, Average latency: 20.41 ms, Std: 0.95

                But depthai.SpatialLocationCalculatorData() doesn't seem to support getTimestamp(), so I can't measure it.


                  Hi YaoHui,
                  SpatialLocationCalculatorData does support getTimestamp() (see here); it was added 3 months ago, so you might be on an older depthai version which doesn't support it yet.
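
                  On a recent enough version, the same measurement pattern should work for it as well:

                  # spatialQueue here stands for the output queue of your
                  # SpatialLocationCalculator node
                  spatialData = spatialQueue.get()
                  latencyMs = (dai.Clock.now() - spatialData.getTimestamp()).total_seconds() * 1000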
                  Thanks, Erik

                    Hi erik,
                    I updated the depthai version, and these are the latencies I measured.

                    FPS: 60.0, Latency: 36.25 ms, Average latency: 34.76 ms, Std: 11.60
                    FPS: 60.0, Latency: 30.63 ms, Average latency: 34.76 ms, Std: 11.60
                    FPS: 60.0, Latency: 30.45 ms, Average latency: 34.76 ms, Std: 11.60
                    FPS: 60.0, Latency: 38.91 ms, Average latency: 34.76 ms, Std: 11.59
                    FPS: 60.0, Latency: 47.45 ms, Average latency: 34.77 ms, Std: 11.60
                    FPS: 60.0, Latency: 32.94 ms, Average latency: 34.77 ms, Std: 11.59

                    Why is the latency of the first few samples relatively high?


                      Hi YaoHui,
                      I assume it's because everything is still initializing, and it takes some time before the firmware starts processing frames at full speed. At the very start (first ~80 ms) it's faster because there aren't other frames in the queue, but since the device can't keep up yet, the latency keeps increasing until everything is fully initialized.
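
                      If you want cleaner statistics, you could skip the warm-up period in the script above, something like:

                      WARMUP_FRAMES = 30  # arbitrary cutoff; tune for your setup
                      if imgFrame.getSequenceNum() > WARMUP_FRAMES:
                          diffs = np.append(diffs, latencyMs)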
                      Thanks, Erik

                        Hi erik,
                        Thank you for your patient reply.

                        From what I understand, the time currently measured corresponds to the red line in the graph.
                        Is there a way to measure the latency of the whole architecture?


                          YaoHui Latency would be the time between capturing the frame - its timestamp (see where the timestamp is captured here) - and when the message arrives at the host computer. For NN/detections/spatial detections, the result message will have the same timestamp/sequence number as the frame it was inferred upon.
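
                          So you can measure the end-to-end latency the same way on the final message, e.g. for spatial detection results (detQueue stands for your SpatialDetectionNetwork output queue):

                          dets = detQueue.get()
                          # Time from frame capture on device to the result arriving on the host
                          latencyMs = (dai.Clock.now() - dets.getTimestamp()).total_seconds() * 1000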