Hello! While investigating frame timestamping on the OAK-D Pro PoE I discovered something a bit strange.
At 1080p, frames seem to take 24.4ms to arrive at a script node.
The frame sync documentation says at 1080p the IMX378 has a readout time of 16.54ms (which matches that sensor's max FPS of 60) and that the capture timestamp is at the start of the readout of the first row.
I decided to run the simplest program I could think of, to check this. Just a ColorCamera straight into a Script node; no heavy network traffic, no compression or ML to raise CPU loads.
import depthai as dai
import time
pipeline = dai.Pipeline()
camRgb = pipeline.create(dai.node.ColorCamera)
camRgb.setBoardSocket(dai.CameraBoardSocket.CAM_A)
camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
camRgb.setFps(30)
camRgb.initialControl.setAutoExposureLimit(5000)
stampScriptA = pipeline.create(dai.node.Script)
stampScriptA.setScript("""
node.warn("Timestamping script starts.")
i=0
while True:
frame = node.io['in'].get()
nowish = Clock.now()
if i%30 == 0:
readout_ms = (nowish.total_seconds()-frame.getTimestamp().total_seconds())*1000
node.warn(f"Frame {i} at {frame.getTimestamp().total_seconds():.4f} readout_ms={readout_ms:.1f} (nowish={nowish.total_seconds():.4f})")
if i>=100000:
i=0
i += 1
""")
camRgb.video.link(stampScriptA.inputs['in'])
with dai.Device(pipeline) as device:
print('getDeviceName:', device.getDeviceName())
print('getConnectedCameraFeatures:', device.getConnectedCameraFeatures())
print('resolution:', camRgb.getResolution())
#device.setTimesync(False)
time.sleep(30)
print("Test completed")
The output is as follows:
getDeviceName: OAK-D-PRO-POE
getConnectedCameraFeatures: [{socket: CAM_A, sensorName: IMX378, width: 4056, height: 3040, orientation: AUTO, supportedTypes: [COLOR], hasAutofocus: 1, hasAutofocusIC: 1, name: color}, {socket: CAM_B, sensorName: OV9282, width: 1280, height: 800, orientation: AUTO, supportedTypes: [MONO], hasAutofocus: 0, hasAutofocusIC: 0, name: left}, {socket: CAM_C, sensorName: OV9282, width: 1280, height: 800, orientation: AUTO, supportedTypes: [MONO], hasAutofocus: 0, hasAutofocusIC: 0, name: right}]
resolution: SensorResolution.THE_1080_P
[5.592] [Script(1)] [warning] Timestamping script starts.
[5.613] [Script(1)] [warning] Frame 0 at 5.3375 readout_ms=255.1 (nowish=5.5927)
[6.428] [Script(1)] [warning] Frame 30 at 6.4040 readout_ms=24.4 (nowish=6.4284)
[7.428] [Script(1)] [warning] Frame 60 at 7.4039 readout_ms=24.4 (nowish=7.4283)
[8.428] [Script(1)] [warning] Frame 90 at 8.4038 readout_ms=24.4 (nowish=8.4282)
[9.428] [Script(1)] [warning] Frame 120 at 9.4036 readout_ms=24.4 (nowish=9.4280)
[10.428] [Script(1)] [warning] Frame 150 at 10.4035 readout_ms=24.4 (nowish=10.4279)
[11.428] [Script(1)] [warning] Frame 180 at 11.4033 readout_ms=24.4 (nowish=11.4278)
[12.427] [Script(1)] [warning] Frame 210 at 12.4032 readout_ms=24.4 (nowish=12.4276)
[13.427] [Script(1)] [warning] Frame 240 at 13.4031 readout_ms=24.4 (nowish=13.4275)
[14.427] [Script(1)] [warning] Frame 270 at 14.4029 readout_ms=24.4 (nowish=14.4273)
[15.427] [Script(1)] [warning] Frame 300 at 15.4028 readout_ms=24.4 (nowish=15.4272)
[16.427] [Script(1)] [warning] Frame 330 at 16.4026 readout_ms=24.4 (nowish=16.4270)
[17.427] [Script(1)] [warning] Frame 360 at 17.4025 readout_ms=24.4 (nowish=17.4269)
[18.427] [Script(1)] [warning] Frame 390 at 18.4024 readout_ms=24.4 (nowish=18.4268)
[19.426] [Script(1)] [warning] Frame 420 at 19.4022 readout_ms=24.4 (nowish=19.4266)
[20.426] [Script(1)] [warning] Frame 450 at 20.4021 readout_ms=24.4 (nowish=20.4265)
[21.426] [Script(1)] [warning] Frame 480 at 21.4019 readout_ms=24.4 (nowish=21.4263)
[22.426] [Script(1)] [warning] Frame 510 at 22.4018 readout_ms=24.4 (nowish=22.4262)
[23.426] [Script(1)] [warning] Frame 540 at 23.4016 readout_ms=24.4 (nowish=23.4261)
[24.426] [Script(1)] [warning] Frame 570 at 24.4015 readout_ms=24.4 (nowish=24.4259)
[25.426] [Script(1)] [warning] Frame 600 at 25.4014 readout_ms=24.4 (nowish=25.4258)
[26.425] [Script(1)] [warning] Frame 630 at 26.4012 readout_ms=24.4 (nowish=26.4256)
[27.425] [Script(1)] [warning] Frame 660 at 27.4011 readout_ms=24.4 (nowish=27.4255)
[28.425] [Script(1)] [warning] Frame 690 at 28.4009 readout_ms=24.5 (nowish=28.4254)
[29.425] [Script(1)] [warning] Frame 720 at 29.4008 readout_ms=24.4 (nowish=29.4252)
[30.425] [Script(1)] [warning] Frame 750 at 30.4007 readout_ms=24.4 (nowish=30.4251)
[31.425] [Script(1)] [warning] Frame 780 at 31.4005 readout_ms=24.4 (nowish=31.4249)
[32.425] [Script(1)] [warning] Frame 810 at 32.4004 readout_ms=24.4 (nowish=32.4248)
[33.424] [Script(1)] [warning] Frame 840 at 33.4002 readout_ms=24.4 (nowish=33.4246)
[34.424] [Script(1)] [warning] Frame 870 at 34.4001 readout_ms=24.4 (nowish=34.4245)
Test completed
So the readout_ms is consistently 24.4ms - higher than I'd expect.
To confirm the readout_ms doesn't include the exposure time I tested at different exposure times; it made no difference to the readout_ms.
Testing with different resolutions, the readout_ms varied - but none of the options substantially reduce it:-
- 1. THE_12_MP -> 59.9 ms
- 2. THE_4_K -> 44.2 ms
- 3. THE_1080_P -> 24.4 ms
- 4. THE_1352X1012 -> 26.0 ms
- 5. THE_2024X1520 -> 22.3 ms
I also tested downscaling the 1080P image further, using camRgb.setIspScale(1, 2) expecting a quarter of the pixels to take a quarter of the time - but it only reduced the time from 24.4ms to 23.0ms
At a higher level, it's a bit weird there's a supported resolution of 4056x3040 @ 30fps, implying there's 2.9 gigabits per second of bandwidth from sensor to chip (assuming 8 bit bayer images) - one might expect a 1920x1080 image to read out in 5.6ms (assuming the same 8 bits per pixel bayer encoding)
Am I doing something wrong here? These latency figures seem high to me.