• Hardware
  • How to troubleshoot and improve PoE latency

I used the code and got the following results (4K, MJPEG, 30 fps), which seem much slower than the official result (avg 77 ms) reported by Erik (thanks for his support over email).

Latency: 389.93 ms, Average latency: 379.09 ms, Std: 89.25
Latency: 408.26 ms, Average latency: 379.20 ms, Std: 89.11
Latency: 410.29 ms, Average latency: 379.31 ms, Std: 88.97
Latency: 394.89 ms, Average latency: 379.36 ms, Std: 88.82
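
For reference, the measurement is essentially a loop like the sketch below (a rough reconstruction, assuming a 4K MJPEG color stream at 30 fps; the exact code I ran may differ slightly):

import depthai as dai
import numpy as np

pipeline = dai.Pipeline()

camRgb = pipeline.create(dai.node.ColorCamera)
camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_4_K)
camRgb.setFps(30)

videoEnc = pipeline.create(dai.node.VideoEncoder)
videoEnc.setDefaultProfilePreset(30, dai.VideoEncoderProperties.Profile.MJPEG)
camRgb.video.link(videoEnc.input)

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("out")
videoEnc.bitstream.link(xout.input)

diffs = np.array([])
with dai.Device(pipeline) as device:
    q = device.getOutputQueue(name="out", maxSize=4, blocking=False)
    while True:
        imgFrame = q.get()
        # Device and host clocks are synchronized, so frame timestamp vs. host
        # clock gives the capture-to-host latency for each frame.
        latencyMs = (dai.Clock.now() - imgFrame.getTimestamp()).total_seconds() * 1000
        diffs = np.append(diffs, latencyMs)
        print(f"Latency: {latencyMs:.2f} ms, Average latency: {diffs.mean():.2f} ms, Std: {diffs.std():.2f}")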

1. Is there a suggested process to systematically troubleshoot the latency issue and improve the results? Our customer cares a lot about latency.

2. After stopping the above script (with Ctrl+C) and trying to run it again, it could not find the device and failed with the error below. I tried at least 3 times with no luck. We really need something stable for our customer's AI solution, which will be deployed in their production pipeline.

    with dai.Device(pipeline) as device:
RuntimeError: No available devices

Some supplementary info

➜ python oak_bandwidth_test.py

Downlink 670.2 mbps
Uplink 231.0 mbps

➜ python oak_latency_test.py  
Sending buffer 0
Got buffer 0, latency 52.73ms
Sending buffer 1
Got buffer 1, latency 3.31ms
Sending buffer 2
...
Average latency 2.87 ms
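
The latency test above appears to be a small round-trip echo (send a tiny buffer to the device, wait for it to come back), so the 2.87 ms average mainly reflects per-message XLink/TCP overhead rather than the time to move a full 4K MJPEG frame, which is why it is so much lower than the frame latency above. A rough sketch of that kind of round trip, assuming an on-device Script node that simply echoes buffers back (not necessarily what oak_latency_test.py actually does):

import time
import depthai as dai

pipeline = dai.Pipeline()

xin = pipeline.create(dai.node.XLinkIn)
xin.setStreamName("to_device")

# On-device script that echoes every incoming buffer straight back
script = pipeline.create(dai.node.Script)
script.setScript("""
while True:
    buf = node.io['in'].get()
    node.io['out'].send(buf)
""")
xin.out.link(script.inputs['in'])

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("from_device")
script.outputs['out'].link(xout.input)

with dai.Device(pipeline) as device:
    qin = device.getInputQueue("to_device")
    qout = device.getOutputQueue("from_device")
    latencies = []
    for i in range(20):
        buf = dai.Buffer()
        buf.setData([0] * 1024)  # small payload; the size here is an arbitrary choice
        print(f"Sending buffer {i}")
        t0 = time.monotonic()
        qin.send(buf)
        qout.get()  # wait for the echo
        latency_ms = (time.monotonic() - t0) * 1000
        latencies.append(latency_ms)
        print(f"Got buffer {i}, latency {latency_ms:.2f}ms")
    print(f"Average latency {sum(latencies) / len(latencies):.2f} ms")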

PoE Switch spec: PS3108C

Cable (PC to switch) spec: DHC-CAT6-FTP-3M

PoE cable (Oak-D to switch) spec: ZBLZGP M12 8 Pin X Code Male to RJ45 Cat6a Ethernet Cable for Cognex Industrial Camera

Thanks,
Martin


    Hi MartinKu ,

    1. There are a few factors that play a part in PoE latency; we have them documented in the PoE latency docs.
    2. PoE cameras need about 10 seconds before they are available again: after a disconnect the watchdog kicks in and resets the device, so everything (networking stack included) needs to be re-initialized.
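
    On the host side, one way to make this less disruptive is to retry opening the device for a short while instead of failing on the first attempt. A minimal sketch (the retry count and delay are arbitrary values for illustration, not official recommendations):

    import time
    import depthai as dai

    def open_device_with_retry(pipeline, retries=10, delay_s=2.0):
        # Retry dai.Device() while the PoE camera reboots after a disconnect; this
        # covers the ~10 s window where "RuntimeError: No available devices" is raised.
        last_err = None
        for attempt in range(1, retries + 1):
            try:
                return dai.Device(pipeline)
            except RuntimeError as e:
                last_err = e
                print(f"Device not available yet (attempt {attempt}/{retries}), retrying in {delay_s}s...")
                time.sleep(delay_s)
        raise last_err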

    Thoughts?

      18 days later

      Hi erik, thanks for your reply!

      1. By following the docs, I got the following results.


        For poe_test.py,

        Connecting to 169.254.1.222 ...

        mxid: 1844301061CEA20F00 (OK)

        speed: 1000 (OK)

        full duplex: 1 (OK)

        boot mode: 3 (OK)


        For oak_bandwidth_test.py,

        Downlink 482.3 mbps

        Uplink 212.5 mbps

        Press any key to continue...

      2. Understood.

      Additional context: I ran the scripts on an Ubuntu 18.04 server connected to the switch, and the camera is connected to the same switch. Neither the switch nor the server is connected to the Internet, so it's just a local LAN with only 3 devices (camera, switch, server), and I expect no other traffic to interfere with or saturate the link. It seems the bottleneck is the downlink, but why is it so slow? With no other traffic, I would expect bandwidth close to a direct link (e.g., ~800 mbps).


        Hi MartinKu ,
        My guess would be either the switch, the server, or the cable; your downlink is quite low. If you can rule out the cable and switch, one option would be to check the NIC settings and consult GPT-4 about which settings to use. Thoughts?
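
        For example, a quick host-side sanity check could look something like the sketch below (it assumes a Linux host and that your interface is named enp2s0, so adjust as needed). It prints the negotiated link speed/duplex and a few error/drop counters that would point at the NIC, cable, or switch if they keep increasing:

        from pathlib import Path

        IFACE = "enp2s0"  # hypothetical interface name; change to match your host
        base = Path("/sys/class/net") / IFACE

        # Negotiated link speed (Mb/s) and duplex, as reported by the kernel
        print("speed:", (base / "speed").read_text().strip(), "Mb/s")
        print("duplex:", (base / "duplex").read_text().strip())

        # Counters that should stay at (or near) zero on a healthy link
        for counter in ("rx_errors", "rx_dropped", "rx_missed_errors", "tx_errors", "tx_dropped"):
            path = base / "statistics" / counter
            if path.exists():
                print(f"{counter}: {path.read_text().strip()}")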

          erik

          My current NIC settings:

          Features for enp2s0:

          rx-checksumming: on

          tx-checksumming: on

          tx-checksum-ipv4: on

          tx-checksum-ip-generic: off [fixed]

          tx-checksum-ipv6: on

          tx-checksum-fcoe-crc: off [fixed]

          tx-checksum-sctp: off [fixed]

          scatter-gather: off

          tx-scatter-gather: off

          tx-scatter-gather-fraglist: off [fixed]

          tcp-segmentation-offload: off

          tx-tcp-segmentation: off

          tx-tcp-ecn-segmentation: off [fixed]

          tx-tcp-mangleid-segmentation: off

          tx-tcp6-segmentation: off

          udp-fragmentation-offload: off

          generic-segmentation-offload: off [requested on]

          generic-receive-offload: on

          large-receive-offload: off [fixed]

          rx-vlan-offload: on

          tx-vlan-offload: on

          ntuple-filters: off [fixed]

          receive-hashing: off [fixed]

          highdma: on [fixed]

          rx-vlan-filter: off [fixed]

          vlan-challenged: off [fixed]

          tx-lockless: off [fixed]

          netns-local: off [fixed]

          tx-gso-robust: off [fixed]

          tx-fcoe-segmentation: off [fixed]

          tx-gre-segmentation: off [fixed]

          tx-gre-csum-segmentation: off [fixed]

          tx-ipxip4-segmentation: off [fixed]

          tx-ipxip6-segmentation: off [fixed]

          tx-udp_tnl-segmentation: off [fixed]

          tx-udp_tnl-csum-segmentation: off [fixed]

          tx-gso-partial: off [fixed]

          tx-sctp-segmentation: off [fixed]

          tx-esp-segmentation: off [fixed]

          tx-udp-segmentation: off [fixed]

          fcoe-mtu: off [fixed]

          tx-nocache-copy: off

          loopback: off [fixed]

          rx-fcs: off

          rx-all: off

          tx-vlan-stag-hw-insert: off [fixed]

          rx-vlan-stag-hw-parse: off [fixed]

          rx-vlan-stag-filter: off [fixed]

          l2-fwd-offload: off [fixed]

          hw-tc-offload: off [fixed]

          esp-hw-offload: off [fixed]

          esp-tx-csum-hw-offload: off [fixed]

          rx-udp_tunnel-port-offload: off [fixed]

          tls-hw-tx-offload: off [fixed]

          tls-hw-rx-offload: off [fixed]

          rx-gro-hw: off [fixed]

          tls-hw-record: off [fixed]

          GPT4 suggested:

          sudo ethtool -K enp2s0 sg on

          sudo ethtool -K enp2s0 tso on

          sudo ethtool -K enp2s0 gso on

          However, after applying the changes suggested by GPT4, I got a slightly worse result:

          Downlink 479.9 mbps

          Uplink 212.6 mbps


            erik I ran the cable test (a diagnostic feature of the switch), and the results look good (1G link status) for both connections, so I think the switch and cables are fine.

            After I ran the command "sudo ethtool -C enp2s0 rx-usecs 1022" (interrupt coalescing: the NIC waits up to that many microseconds to batch received packets into a single interrupt, reducing per-packet overhead on the host; the driver actually applied 960, as shown below), the downlink improved to 896 mbps:

            Downlink 896.1 mbps

            Uplink 212.5 mbps

            For reference, my coalescing parameters before applying the command:

            Coalesce parameters for enp2s0:

            Adaptive RX: off TX: off

            stats-block-usecs: 0

            sample-interval: 0

            pkt-rate-low: 0

            pkt-rate-high: 0

            rx-usecs: 0

            rx-frames: 1

            rx-usecs-irq: 0

            rx-frames-irq: 0

            tx-usecs: 0

            tx-frames: 1

            tx-usecs-irq: 0

            tx-frames-irq: 0

            rx-usecs-low: 0

            rx-frame-low: 0

            tx-usecs-low: 0

            tx-frame-low: 0

            rx-usecs-high: 0

            rx-frame-high: 0

            tx-usecs-high: 0

            tx-frame-high: 0

            After applying the change:

            Coalesce parameters for enp2s0:

            Adaptive RX: off TX: off

            stats-block-usecs: 0

            sample-interval: 0

            pkt-rate-low: 0

            pkt-rate-high: 0

            rx-usecs: 960

            rx-frames: 0

            rx-usecs-irq: 0

            rx-frames-irq: 0

            tx-usecs: 0

            tx-frames: 1

            tx-usecs-irq: 0

            tx-frames-irq: 0

            rx-usecs-low: 0

            rx-frame-low: 0

            tx-usecs-low: 0

            tx-frame-low: 0

            rx-usecs-high: 0

            rx-frame-high: 0

            tx-usecs-high: 0

            tx-frame-high: 0

            But even though the downlink has greatly improved, running the code again still gives much worse latency than yours (avg 287 ms vs 77 ms):

            Latency: 359.73 ms, Average latency: 287.30 ms, Std: 81.77

            Latency: 359.29 ms, Average latency: 287.34 ms, Std: 81.76

            Latency: 359.55 ms, Average latency: 287.38 ms, Std: 81.76


              MartinKu Unfortunately, I am not familiar enough with NIC / networking to be able to help here.