• DepthAI-v2
  • Impact of Latency on Practical Applications

Hi YaoHui,
If I understand correctly: instead of computing the 3D coordinates on-device (with a Spatial Detection Network), you are just performing object detection on-device, streaming both the results and the depth to the host, and calculating the spatial coordinates on the host to reduce latency?
What stereo depth modes/postprocessing filters do you have enabled? Usually, they add additional delay.
Thanks, Erik

    Hi erik,

    If I understand correctly: instead of computing the 3D coordinates on-device (with a Spatial Detection Network), you are just performing object detection on-device, streaming both the results and the depth to the host, and calculating the spatial coordinates on the host to reduce latency?

    The experimental method proposed before was to complete all processing on the OAK-D and output only world coordinates.
    The experimental method proposed now is to complete all processing on the OAK-D, output only image coordinates, and convert them to world coordinates (with a fixed depth distance Z) on the host side.
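
    For reference, a minimal sketch of such a host-side image-to-world conversion (assuming a simple pinhole model; the intrinsic matrix K would typically come from device.readCalibration().getCameraIntrinsics(), and Z is the fixed depth - the helper below is illustrative, not the exact code used):

    import numpy as np

    def pixel_to_camera(u, v, Z, K):
        # Back-project pixel (u, v) at a fixed depth Z using the pinhole model
        fx, fy = K[0][0], K[1][1]
        cx, cy = K[0][2], K[1][2]
        X = (u - cx) * Z / fx
        Y = (v - cy) * Z / fy
        return np.array([X, Y, Z])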

    Neither of the above methods outputs image data.

    What stereo depth modes/postprocessing filters do you have enabled? Usually, they add additional delay.

    Subpixel + LR-check. But according to the documentation, 96 FPS can be maintained with these two modes enabled.

      Hi YaoHui,
      I have tried locally as well; these are my results for 400P LR-check + Subpixel:

      FPS: 80.0, Latency: 24.03 ms, Average latency: 23.87 ms, Std: 0.85
      FPS: 80.0, Latency: 23.96 ms, Average latency: 23.87 ms, Std: 0.85
      FPS: 80.0, Latency: 23.76 ms, Average latency: 23.87 ms, Std: 0.85

      I have used this script:

      import depthai as dai
      import numpy as np
      from depthai_sdk import FPSHandler
      
      # Create pipeline
      pipeline = dai.Pipeline()
      # This might help reduce the latency on some systems
      pipeline.setXLinkChunkSize(0)
      
      FPS = 80
      # Define source and output
      monoLeft = pipeline.create(dai.node.MonoCamera)
      monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
      monoLeft.setFps(FPS)
      monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
      
      monoRight = pipeline.create(dai.node.MonoCamera)
      monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
      monoRight.setFps(FPS)
      monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT)
      
      stereo = pipeline.create(dai.node.StereoDepth)
      # stereo.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.HIGH_DENSITY)
      # stereo.initialConfig.setMedianFilter(dai.MedianFilter.KERNEL_7x7)
      stereo.setLeftRightCheck(True)
      stereo.setExtendedDisparity(False)
      stereo.setSubpixel(True)
      
      # Linking
      monoLeft.out.link(stereo.left)
      monoRight.out.link(stereo.right)
      
      xout = pipeline.create(dai.node.XLinkOut)
      xout.setStreamName('out')
      stereo.depth.link(xout.input)
      
      fps = FPSHandler()
      
      # Connect to device and start pipeline
      with dai.Device(pipeline) as device:
          print(device.getUsbSpeed())
          q = device.getOutputQueue(name="out")
          diffs = np.array([])
          while True:
              imgFrame = q.get()
              fps.nextIter()
              # Latency in milliseconds
              latencyMs = (dai.Clock.now() - imgFrame.getTimestamp()).total_seconds() * 1000
              diffs = np.append(diffs, latencyMs)
              print('FPS: {:.1f}, Latency: {:.2f} ms, Average latency: {:.2f} ms, Std: {:.2f}'.format(fps.fps(), latencyMs, np.average(diffs), np.std(diffs)))

      Could you confirm the same on your side?
      Thanks, Erik

        Hi erik,

        400P LR-check + Subpixel:

        FPS: 60.3, Latency: 20.97 ms, Average latency: 20.41 ms, Std: 0.96
        FPS: 60.3, Latency: 20.98 ms, Average latency: 20.41 ms, Std: 0.95
        FPS: 60.3, Latency: 20.92 ms, Average latency: 20.41 ms, Std: 0.95

        Simply testing the depth stream gives results similar to yours.
        But what I am more worried about is that, when converting to world coordinates, the depth frame and the NN result need to wait for each other, which adds to the overall delay.
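
        For reference, a minimal host-side sketch of that pairing (assuming two output queues, e.g. named 'depth' and 'nn', obtained from device.getOutputQueue(); messages are matched by sequence number, which is where one stream ends up waiting for the other):

        depth_msgs, nn_msgs = {}, {}

        def get_synced_pair(depth_queue, nn_queue):
            # Drain whatever has already arrived and index it by sequence number
            for msg in depth_queue.tryGetAll():
                depth_msgs[msg.getSequenceNum()] = msg
            for msg in nn_queue.tryGetAll():
                nn_msgs[msg.getSequenceNum()] = msg
            # Return a matched pair as soon as both results for a frame exist
            for seq in sorted(depth_msgs.keys()):
                if seq in nn_msgs:
                    return depth_msgs.pop(seq), nn_msgs.pop(seq)
            return None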

          YaoHui yes, that makes sense - what's the latency of your NN model? You could use a similar script, just replace the StereoDepth node with a NeuralNetwork node.
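
          Something like this sketch (assuming a 300x300 preview and a placeholder blob path 'model.blob'; latency is then measured on the 'nn' queue exactly as in the stereo script above):

          import depthai as dai

          pipeline = dai.Pipeline()
          pipeline.setXLinkChunkSize(0)

          camRgb = pipeline.create(dai.node.ColorCamera)
          camRgb.setPreviewSize(300, 300)  # must match the model's input size
          camRgb.setInterleaved(False)
          camRgb.setFps(60)

          nn = pipeline.create(dai.node.NeuralNetwork)
          nn.setBlobPath("model.blob")  # hypothetical path to your compiled blob
          camRgb.preview.link(nn.input)

          xout = pipeline.create(dai.node.XLinkOut)
          xout.setStreamName("nn")
          nn.out.link(xout.input)
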
          Thanks, Erik

            Hi erik,

            This is the latency of my NeuralNetwork node:

            FPS: 60.5, Latency: 19.09 ms, Average latency: 20.40 ms, Std: 0.95
            FPS: 60.5, Latency: 20.96 ms, Average latency: 20.41 ms, Std: 0.95
            FPS: 60.2, Latency: 20.95 ms, Average latency: 20.41 ms, Std: 0.95
            FPS: 60.5, Latency: 19.02 ms, Average latency: 20.40 ms, Std: 0.95
            FPS: 60.3, Latency: 20.85 ms, Average latency: 20.41 ms, Std: 0.95
            FPS: 60.3, Latency: 20.87 ms, Average latency: 20.41 ms, Std: 0.95

            But depthai.SpatialLocationCalculatorData() doesn't seem to support getTimestamp(), so I can't measure it.

              Hi YaoHui,
              SpatialLocationCalculatorData does support getTimestamp() (see here); it was added 3 months ago, so it might be that you have an older version of depthai that doesn't support it yet.
              Thanks, Erik

                Hi erik,
                I updated the DepthAI version, and these are the latencies I measured:

                FPS: 60.0, Latency: 36.25 ms, Average latency: 34.76 ms, Std: 11.60
                FPS: 60.0, Latency: 30.63 ms, Average latency: 34.76 ms, Std: 11.60
                FPS: 60.0, Latency: 30.45 ms, Average latency: 34.76 ms, Std: 11.60
                FPS: 60.0, Latency: 38.91 ms, Average latency: 34.76 ms, Std: 11.59
                FPS: 60.0, Latency: 47.45 ms, Average latency: 34.77 ms, Std: 11.60
                FPS: 60.0, Latency: 32.94 ms, Average latency: 34.77 ms, Std: 11.59

                Why is the latency of the first few samples relatively high?

                  Hi YaoHui,
                  I assume it's because everything is initializing, and it takes some time before the firmware starts processing frames "at full speed". At the start (80ms) it's faster because there aren't other frames in the queue, but since it can't keep up, the latency starts increasing until everything is fully initialized.
                  Thanks, Erik

                    Hi erik,
                    Thank you for your patient reply.

                    From what I understand, the time currently measured is the time shown by the red line in the graph.
                    Is there a way to measure the latency of the whole architecture?

                      YaoHui Latency would be the time between capturing the frame - its timestamp (see where the timestamp is captured here) - and when the message arrives at the host computer. For NN/detections/spatial detections, the result message will have the same timestamp/sequence number as the frame it was inferenced on.
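
                      A short sketch of measuring that end-to-end latency at the final output (assuming the pipeline's last XLinkOut stream is named 'detections'; since the result message carries the source frame's timestamp, "now - timestamp" covers capture, on-device processing, and transfer to the host):

                      import depthai as dai

                      # 'pipeline' is assumed to be built as in the scripts above
                      with dai.Device(pipeline) as device:
                          q = device.getOutputQueue("detections", maxSize=4, blocking=False)
                          while True:
                              msg = q.get()
                              ms = (dai.Clock.now() - msg.getTimestamp()).total_seconds() * 1000
                              print(f"End-to-end latency: {ms:.2f} ms")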