DepthAI
Impact of Latency on Practical Applications

Hi YaoHui,

and the yellow dots are the X, Y, and Z world coordinates directly output by OAK-D (images are not output).

How are you projecting XYZ (in meters) onto the XY (in pixels) image plane? To measure the latency between the frame and the NN result, or between the image and the NN result arriving at the host computer, you can compare their timestamps via message.getTimestamp().
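
For instance, a minimal sketch of such a comparison on the host (the two message variables are placeholders for whatever your output queues return):

import depthai as dai

def nnLagMs(frameMsg, nnMsg):
    # Both messages carry the device timestamp of their source frame; if
    # frameMsg is the newest camera frame and nnMsg the newest NN result,
    # this is how far the NN output lags behind the camera stream
    return (frameMsg.getTimestamp() - nnMsg.getTimestamp()).total_seconds() * 1000
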
Thanks, Erik

    Hi erik,

    How are you projecting XYZ (in meters) onto the XY (in pixels) image plane?

    In practice we need the world coordinates. To compare accuracy, we convert the world coordinates back to image coordinates on the host side and compare the difference between the two, as sketched below.
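
    A minimal sketch of that pinhole projection (the intrinsics fx, fy, cx, cy are made-up placeholders; on an OAK-D they can be read with device.readCalibration().getCameraIntrinsics()):

    def worldToPixel(X, Y, Z, fx, fy, cx, cy):
        # Pinhole projection: camera-frame coordinates in meters -> pixel coordinates
        u = fx * X / Z + cx
        v = fy * Y / Z + cy
        return u, v

    # Example with made-up intrinsics for a 640x400 image
    print(worldToPixel(0.10, -0.05, 0.60, fx=450.0, fy=450.0, cx=320.0, cy=200.0))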

    In our current tests there is a latency of 170 ms.
    Is there any way to reduce it?

    • erik replied to this.

      Hi YaoHui, is that the latency of the NN inference?

        Hi erik,
        I don't think so, because I have measured the execution time of the NN inference (14 ms).

        • erik replied to this.

          YaoHui, how did you measure that latency? Also, which latency does the 170 ms correspond to?

            Hi erik,
            The 170 ms is the time interval between the yellow and blue dots.
            Latency is estimated by displaying the points on the image and counting the number of frames between them (at 60 FPS each frame is ~16.7 ms, so ~10 frames corresponds to ~170 ms).

            • erik replied to this.

              Hi YaoHui,

              Latency is estimated by displaying the points on the image and counting the number of frames between them.

              How did you arrive at exactly 14 ms? Did you have the camera FPS set to 71 (1 s / 71 fps ≈ 14 ms), with the detections exactly one frame behind the original frame?
              If the latency isn't from the NN, what would be causing it?

                Hi erik,

                How did you arrive at exactly 14 ms? Did you have the camera FPS set to 71 (1 s / 71 fps ≈ 14 ms), with the detections exactly one frame behind the original frame?
                If the latency isn't from the NN, what would be causing it?

                I used the OpenVINO benchmark_app.exe to verify the inference time of the NN.
                In fact, the inference time of the NN is 12.38 ms.
                I set the RGB and stereo cameras to 60 FPS, so the OAK-D should output results every ~16 ms (and it does).
                But the current problem is that those output results have a latency of 170 ms.

                • erik replied to this.

                  Hi YaoHui, the inference time of the NN is not the same as the latency from inference to the host. The OpenVINO benchmark also isn't the same as running the model on the OAK camera with depthai. And 170 ms seems about what I would expect from object detection models.

                    Hi erik,
                    So the conclusion is that the 170 ms latency occurs because the NN is running on the OAK camera with DepthAI, i.e. it is the overall program latency.
                    Do I understand this correctly?

                      Hi YaoHui,
                      Another question: was the latency (12 ms) measured on the actual camera (so the VPU, Movidius Myriad X), or on a CPU/GPU? Could you share the results? DepthAI and transferring through USB definitely add some latency (some docs here), but likely below 10 ms. So my main guess would be that the inference wasn't done on the actual OAK camera.

                        Hi erik,

                        Inference was run on the VPU (Movidius Myriad X).
                        Below is the output from benchmark_app.exe.

                        Loading the model to the device
                        Load network took 1771.79 ms
                        Setting optimal runtime parameters
                        Device: MYRIAD
                        { NETWORK_NAME , torch-jit-export }
                        { OPTIMAL_NUMBER_OF_INFER_REQUESTS , 4 }
                        { DEVICE_THERMAL , 34.6395 }
                        Creating infer requests and preparing input blobs with data
                        No input files were given: all inputs will be filled with random values!
                        Test Config 0
                        image  ([N,C,H,W], u8, {1, 3, 128, 128}, static):      random (image is expected)
                        Measuring performance (Start inference asynchronously, 4 inference requests, limits: 60000 ms duration)
                        BENCHMARK IS IN INFERENCE ONLY MODE.
                        Input blobs will be filled once before performance measurements.
                        First inference took 12.21 ms

                        I have looked at the relevant documentation (some docs here); since I do not stream the image to my computer, are those figures of no relevance to my case?
                        To verify whether the NN really runs on the OAK camera, we ran additional tests.
                        For example, executing the program on a Raspberry Pi or a notebook: in both cases the data is still output every 16 ms.

                        10 days later

                        Hi erik,

                        My conclusion is that this is the overall processing latency of the OAK-D pipeline, and that the total latency is determined by which DepthAI features are used (Script, ImageManip, SpatialLocationCalculatorAlgorithm).

                        • erik replied to this.

                          Hi YaoHui,
                          I haven't been able to repro this issue - OpenVINO is a PITA to work with - so I haven't come to any conclusion. But your DepthAI results seem like what I would expect, and the numbers from OpenVINO are a bit far-fetched, at least as a send-to-receive result (i.e. the time from when you sent an image to the device until the NN results are returned to the host machine).
                          Thanks, Erik

                            Hi erik,

                            Since we need to calculate world coordinates, we wanted to make sure that performing the depth estimation on the OAK-D does not affect the overall speed.
                            So we simulated a fixed depth distance (Z = 600 mm) and calculated the world coordinates on the host side (without using the stereo pair); a sketch of that conversion is below.
                            From this experiment we found that when the depth estimation on the OAK-D is dropped, the overall latency is reduced significantly.
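
                            A minimal sketch of that fixed-Z conversion (the intrinsics fx, fy, cx, cy are placeholder values again):

                            def pixelToWorld(u, v, fx, fy, cx, cy, Z=0.6):
                                # Inverse pinhole projection with the depth fixed at Z = 600 mm
                                X = (u - cx) * Z / fx
                                Y = (v - cy) * Z / fy
                                return X, Y, Z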

                            • erik replied to this.

                              Hi YaoHui,
                              If I understand correctly: instead of computing the 3D coordinates on-device (with a Spatial Detection Network), you are just performing object detection on-device, streaming both the results and the depth to the host, and calculating the spatial coordinates on the host to reduce latency?
                              What stereo depth modes/postprocessing filters do you have enabled? They usually add additional delay.
                              Thanks, Erik

                                Hi erik,

                                If I understand correctly: instead of computing the 3D coordinates on-device (with a Spatial Detection Network), you are just performing object detection on-device, streaming both the results and the depth to the host, and calculating the spatial coordinates on the host to reduce latency?

                                The method proposed before was to complete all processing on the OAK-D and output only world coordinates.
                                The method proposed now is to complete all processing on the OAK-D, output only image coordinates, and convert them to world coordinates (with a fixed depth distance Z) on the host side.

                                Neither method outputs any image data.

                                What stereo depth modes/postprocessing filters do you have enabled? They usually add additional delay.

                                Subpixel + LR-check. But according to the documentation, 96 FPS can be maintained with these two modes enabled.

                                • erik replied to this.

                                  Hi YaoHui,
                                  I have tried locally as well; these are my results for 400P LR-check + Subpixel:

                                  FPS: 80.0, Latency: 24.03 ms, Average latency: 23.87 ms, Std: 0.85
                                  FPS: 80.0, Latency: 23.96 ms, Average latency: 23.87 ms, Std: 0.85
                                  FPS: 80.0, Latency: 23.76 ms, Average latency: 23.87 ms, Std: 0.85

                                  I have used this script:

                                  import depthai as dai
                                  import numpy as np
                                  from depthai_sdk import FPSHandler
                                  
                                  # Create pipeline
                                  pipeline = dai.Pipeline()
                                   # This might help reduce the latency on some systems
                                  pipeline.setXLinkChunkSize(0)
                                  
                                  FPS = 80
                                  # Define source and output
                                  monoLeft = pipeline.create(dai.node.MonoCamera)
                                  monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
                                  monoLeft.setFps(FPS)
                                  monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
                                  
                                  monoRight = pipeline.create(dai.node.MonoCamera)
                                  monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
                                  monoRight.setFps(FPS)
                                  monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT)
                                  
                                  stereo = pipeline.create(dai.node.StereoDepth)
                                  # stereo.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.HIGH_DENSITY)
                                  # stereo.initialConfig.setMedianFilter(dai.MedianFilter.KERNEL_7x7)
                                  stereo.setLeftRightCheck(True)
                                  stereo.setExtendedDisparity(False)
                                  stereo.setSubpixel(True)
                                  
                                  # Linking
                                  monoLeft.out.link(stereo.left)
                                  monoRight.out.link(stereo.right)
                                  
                                  xout = pipeline.create(dai.node.XLinkOut)
                                  xout.setStreamName('out')
                                  stereo.depth.link(xout.input)
                                  
                                  fps = FPSHandler()
                                  
                                  # Connect to device and start pipeline
                                  with dai.Device(pipeline) as device:
                                      print(device.getUsbSpeed())
                                       q = device.getOutputQueue(name="out")
                                      diffs = np.array([])
                                      while True:
                                          imgFrame = q.get()
                                          fps.nextIter()
                                           # Latency in milliseconds
                                          latencyMs = (dai.Clock.now() - imgFrame.getTimestamp()).total_seconds() * 1000
                                          diffs = np.append(diffs, latencyMs)
                                          print('FPS: {:.1f}, Latency: {:.2f} ms, Average latency: {:.2f} ms, Std: {:.2f}'.format(fps.fps(), latencyMs, np.average(diffs), np.std(diffs)))

                                  Could you confirm the same on your side?
                                  Thanks, Erik

                                     Hi erik,

                                    400P LR-check + Subpixel:

                                    FPS: 60.3, Latency: 20.97 ms, Average latency: 20.41 ms, Std: 0.96
                                    FPS: 60.3, Latency: 20.98 ms, Average latency: 20.41 ms, Std: 0.95
                                    FPS: 60.3, Latency: 20.92 ms, Average latency: 20.41 ms, Std: 0.95

                                     Testing depth frames alone yields results similar to yours.
                                     But what I am more worried about is that when converting to world coordinates, the depth frame and the NN result need to wait for each other, which adds to the overall latency; a rough sketch of the pairing I mean is below.
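
                                     A rough sketch of host-side pairing by sequence number (the stream names and the surrounding pipeline are assumed, not my actual code):

                                     pairs = {}  # sequence number -> {"depth": msg, "nn": msg}

                                     def trySync(name, msg):
                                         # Collect messages by sequence number; a pair is complete once
                                         # both the depth frame and the NN result have arrived
                                         seq = msg.getSequenceNum()
                                         pairs.setdefault(seq, {})[name] = msg
                                         if len(pairs[seq]) == 2:
                                             return pairs.pop(seq)
                                         return None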

                                    • erik replied to this.

                                      YaoHui, yes that makes sense. What's the latency of your NN model? You could use a similar script; just replace the StereoDepth node with a NeuralNetwork node.
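
                                      A minimal sketch of that variant (the blob path, input size, and FPS are placeholders, not from this thread):

                                      import depthai as dai

                                      pipeline = dai.Pipeline()
                                      pipeline.setXLinkChunkSize(0)

                                      camRgb = pipeline.create(dai.node.ColorCamera)
                                      camRgb.setPreviewSize(128, 128)  # match your NN input size
                                      camRgb.setInterleaved(False)
                                      camRgb.setFps(60)

                                      nn = pipeline.create(dai.node.NeuralNetwork)
                                      nn.setBlobPath("model.blob")  # placeholder path to your blob
                                      camRgb.preview.link(nn.input)

                                      xout = pipeline.create(dai.node.XLinkOut)
                                      xout.setStreamName("nn")
                                      nn.out.link(xout.input)

                                      with dai.Device(pipeline) as device:
                                          q = device.getOutputQueue(name="nn")
                                          while True:
                                              msg = q.get()
                                              # Time from frame capture on device to result arriving on host
                                              latencyMs = (dai.Clock.now() - msg.getTimestamp()).total_seconds() * 1000
                                              print('NN latency: {:.2f} ms'.format(latencyMs))
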
                                      Thanks, Erik