  • getAllAvailableDevices() reliably unreliable with multiple PoE devices

We're observing that:
auto devInfos = dai::DeviceBootloader::getAllAvailableDevices();
and
auto devInfos = dai::Device::getAllAvailableDevices();
are incredibly unreliable when probing PoE cameras. On a network with only two cameras, the call occasionally finds two, frequently finds one, and sometimes finds none (we've also observed this on networks with just one camera).
In an attempt to address this, we tried jacking the timeout values around:

DEPTHAI_BOOTUP_TIMEOUT=500000
DEPTHAI_SEARCH_TIMEOUT=500000
DEPTHAI_CONNECT_TIMEOUT=500000
DEPTHAI_LEVEL=warn

But tweaking the values doesn't seem to change the behavior.
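
For anyone trying to reproduce this, one way to put numbers on the flakiness is to wrap discovery in a small polling loop and record how many passes it takes to see every camera. What follows is a minimal sketch under assumptions, not DepthAI code: `FoundDevice`, `ProbeFn`, and `probeUntil` are hypothetical names, and the probe is injected as a callable so the block compiles without the library. In real use the callable would wrap one of the getAllAvailableDevices() calls above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for whatever identifies a camera (IP, MX ID, ...).
struct FoundDevice {
    std::string id;
};

// One discovery pass. Assumption: real code would wrap a DepthAI discovery
// call here and copy out the fields it cares about.
using ProbeFn = std::function<std::vector<FoundDevice>()>;

struct ProbeResult {
    std::vector<FoundDevice> devices;  // largest set seen in a single pass
    int attempts = 0;                  // number of passes actually made
};

// Repeats the probe until one pass returns at least `expected` devices or
// `maxAttempts` passes have been made, sleeping `interval` between passes.
ProbeResult probeUntil(const ProbeFn& probe,
                       std::size_t expected,
                       int maxAttempts,
                       std::chrono::milliseconds interval) {
    assert(maxAttempts > 0);
    ProbeResult best;
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        std::vector<FoundDevice> found = probe();
        best.attempts = attempt;
        if (found.size() > best.devices.size()) {
            best.devices = std::move(found);
        }
        if (best.devices.size() >= expected) {
            break;  // every expected camera showed up in one pass
        }
        if (attempt < maxAttempts) {
            std::this_thread::sleep_for(interval);
        }
    }
    return best;
}
```

Logging `attempts` across many runs would show whether the cameras eventually appear (a timing problem) or never appear within the budget (something else, such as the network path).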

The host is running Ubuntu 20.04 and connecting to the target network via WiFi.

A quick look at the group suggests that this particular member function may not be all that robust, but I didn't see anything that looked quite like this one...


    Hi wiley42 ,
    We apologize for the issues. A few points you could try:

    • Are you querying directly after disconnecting? An OAK PoE camera needs about 10 seconds to reboot and become available again.
    • Are you running some other pipeline? getAllAvailableDevices() only returns devices that are ready to connect, so getAllConnectedDevices() might be preferable.
    • WiFi connectivity might not be the best - see documentation here

    Thoughts?
    Thanks, Erik

    Hey Erik,

    As ever, thanks for the support.

    No, I can roll in after they've been idle for hours and the call will still fail. We did tumble to the fact that it takes a while for them to come back to a discoverable state, which actually changed the way we go about probing them. My original plan was to boot them just long enough to discover the type of camera (which only seems to be available via a Device() object), but with configurations of as many as eight cameras, booting all of them twice got unwieldy. So now I just wait until I go to load the pipelines, and if the camera in question doesn't have something that the specific pipeline needs (like an IMU to figure out if the camera is pointing the wrong way), I deal with it asynchronously.

    No, no pipelines running. That said, let me switch to getAllConnectedDevices and see if the observed behavior changes.

    I'd tripped over the doc in question and didn't see anything that we weren't already doing. However, until last night we'd been running these things as they come out of the box using DHCP, or programmatically setting their IPv4 address and netmask using a stand-alone tool we tossed together. It wasn't until I built the Python side of the house to use the configuration tool that the dim bulb went on and I realized these things don't ship with their MAC address set. I can imagine that causing a variety of issues, so I'll be dealing with that today. I guess I'll make them look like Nests or Ring doorbells or somethin' 😉

    Thanks again!


      Hi wiley42 ,
      Interesting, there are some reports of users being unable to connect to their PoE cameras after the cameras had been idle for some time, but I don't think we were able to repro it locally. Let us know how the MAC address setting goes and whether it helps; otherwise we might want to recheck our automated tests as well.
      Thanks, Erik

      Hey Erik,

      So switching to getAllConnectedDevices() didn't change the behavior, nor did setting a static IPv4 address + netmask. However, setting an AC:DE:48:XX:XX:XX MAC address seems to have fixed it, although I need to wait about 40 seconds after exit before trying to re-probe the cameras for it to work reliably. I'm guessing that part of that time is some sort of timeout on the part of the camera, because I'm just killing the process, which isn't going to call the destructor for the Device() class, so shutdown at the XLink level isn't happening nicely.
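
      If the abrupt kill is part of the problem, one thing worth trying is to catch SIGINT/SIGTERM, set a flag, and let the main loop fall out of scope so that destructors run normally. This is only a sketch under assumptions: `installStopHandlers`, `stopRequested`, and `runUntilStopped` are hypothetical helper names, no DepthAI calls are made, and whether a clean destructor run actually shortens the 40-second wait is untested. SIGKILL (kill -9) cannot be caught, so this only helps with a plain kill or Ctrl-C.

```cpp
#include <csignal>

// Written by the signal handler, read by the main loop.
// volatile sig_atomic_t is the type the standard guarantees is safe here.
volatile std::sig_atomic_t g_stopRequested = 0;

// Handler does the minimum possible: record that a stop was requested.
void onStopSignal(int) {
    g_stopRequested = 1;
}

// Installs the handler for Ctrl-C and plain `kill`.
// Returns false if either registration fails.
bool installStopHandlers() {
    bool ok = std::signal(SIGINT, onStopSignal) != SIG_ERR;
    ok = (std::signal(SIGTERM, onStopSignal) != SIG_ERR) && ok;
    return ok;
}

bool stopRequested() {
    return g_stopRequested != 0;
}

// Calls step(iteration) until it returns false or a stop signal arrives.
// Returns the number of iterations run. In a real program the device object
// would live in the caller's scope, so returning from here and leaving that
// scope lets its destructor run instead of the process dying mid-stream.
template <typename Step>
int runUntilStopped(Step step) {
    int iterations = 0;
    while (!stopRequested()) {
        ++iterations;
        if (!step(iterations)) {
            break;
        }
    }
    return iterations;
}
```

      The idea is that the frame-processing loop becomes the `step` callable, and everything that owns a camera connection is declared in the scope that calls `runUntilStopped`, so a normal return unwinds it.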

      I neglected to mention earlier that the inability to find the cameras isn't limited to my code; the example programs fail, and the Python-based configuration tool can't reliably find them, either.

      I'll follow up once I have a better sense of whether this is really a fix, or if the winds just happen to be blowing in a friendly direction at the moment.

      Sooo, demonstrating the dangers of trying to prove a negative: after idling for a while, I found myself back in a state where neither the configuration utility nor my code could reliably find all of the cameras. I've got a few other things to burn through first, but my next move is to drop WiFi and see if that (for some weird reason) changes the behavior.