• RAE
  • No chance of getting a WLAN connection & device gets VERY hot

I just got my rae.

Tried to set it up with RobotHub according to instructions (QR code): QR code is received, display states "connecting to <my SSID>", device flashes first blue, then red, then turns white - but no connection to my WLAN (device doesn't show up in my WLAN or RobotHub).

Tried to set it up manually according to instructions: SSID & password were obviously received correctly, because the correct information is in /etc/wpa_supplicant.conf. BTW, there is an error in the instructions: the quotation marks in the example for the passPHRASE are wrong; use them only if you use the passWORD. /etc/systemd/network/20-wifi.network has the right content, too. Stopping hostapd was never needed; it is always dead after startup. When trying to connect to my WLAN with wpa_supplicant -B -i wlp1s0 (…), I start getting lots of error messages, from missing or invalid devices to already running (or hanging) instances… The error messages seem to depend a bit on the time of day and what I tried before. I can provide them if there is anyone who can help me. This seems to be the root cause of the issue, because I don't get beyond that step in the manual setup, and I guess the "QR code method" is trying to do pretty much the same in the background.
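For anyone tripping over the same quoting question: in wpa_supplicant.conf a quoted psk value is read as the plain-text secret, while an unquoted value of 64 hex digits is read as the already-derived key. That is presumably the distinction the remark above is about. Here is a minimal Python sketch of both forms, using the standard WPA/WPA2 derivation (PBKDF2-HMAC-SHA1, SSID as salt, 4096 iterations, 32 bytes). The SSID and secret below are made-up placeholders, and the helper names are hypothetical.

```python
import hashlib


def derive_psk(ssid: str, passphrase: str) -> str:
    """Return the 256-bit WPA/WPA2 pre-shared key as 64 hex digits."""
    if not 8 <= len(passphrase) <= 63:
        raise ValueError("WPA passphrase must be 8..63 characters long")
    key = hashlib.pbkdf2_hmac(
        "sha1", passphrase.encode("utf-8"), ssid.encode("utf-8"), 4096, 32
    )
    return key.hex()


def network_block(ssid: str, passphrase: str, hashed: bool) -> str:
    """Build a network block; quotes go only around the plain-text secret."""
    if hashed:
        psk_line = f"psk={derive_psk(ssid, passphrase)}"  # hex key, no quotes
    else:
        psk_line = f'psk="{passphrase}"'  # plain text, quoted
    return "network={\n" + f'    ssid="{ssid}"\n' + f"    {psk_line}\n" + "}\n"


if __name__ == "__main__":
    # Placeholder values, not taken from the thread.
    print(network_block("example-ssid", "example-passphrase", hashed=False))
    print(network_block("example-ssid", "example-passphrase", hashed=True))
```

Either form should be accepted by wpa_supplicant. Mixing them up (hex key inside quotes, or plain text without quotes) is what usually breaks the connection attempt.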

I do see wlp1s0 in the device list (ifconfig & co.), but obviously never with a connection to my WLAN.

When trying to scan for WLANs with wpa_cli, I don't get any output (and there are plenty of WLANs around), which strikes me as weird, because that should IMO work, even without any WLAN connection. So maybe THIS is the root cause and something is wrong with the WLAN adapter?

Last observation, not sure if this is related: my device gets VERY hot. You need to be quick when picking it up to shut it down…! I don't think it should be that hot. After some time, I started to get kicked out of the ssh session too, so I suspect the device might freeze/lock up due to overheating.

The display is strange, too: it starts up fine (backlit -> "rae" -> small robot on white background -> "Register…" on white background), but then the display starts to "flicker", constantly changing its brightness until it's gone. Looks a bit like a "fade out" effect at 0.5 Hz … I doubt very much that this is as it should be, either.

Any ideas, tips or recommendations? Any more info I should provide? I have no previous experience with Arch Linux, but I am otherwise fairly proficient on Linux (Ubuntu/Debian), so fire away!

With respect to temperature and display: Might I simply have a HW defect?

    DiMa … I can confirm that my Rae gets slightly warm after an hour's use and charging, but definitely not hot. I suspect a H/W fault ... my Rae's display is constant, no fading etc.

    Have you tried a hard reset with the reset button and then updating to FW v1.13 (using the USB cable connection)?

    Definitely more than "slightly warm"… 😉 No, I did not try a "hard reset" - how do I do that? And how do I install new FW (and where do I get it)? I guess that's somewhere in the Wiki, but I have not yet stumbled upon it…

    EDIT: Found it - just in case I am not the only one looking in the wrong places… Instructions are here.

      There are two versions; which one should I pick - the newer one (from October 2nd)? How does the installation work when I have no internet connection, i.e. have to scp the files onto the rae? I only found instructions for pulling the FW via mender.

      Got one step further after hard resetting: QR scan method etc. works, I triple checked that. I now get a proper "Couldn't connect" message with the red blinking, too. With the "fresh" device, I can now see all my local WLAN APs, too, so the WLAN interface does seem to work, too:

      wpa_cli v2.9
      Copyright (c) 2004-2019, Jouni Malinen <j@w1.fi> and contributors
      This software may be distributed under the terms of the BSD license.
      See README for more details.
      Selected interface 'p2p-dev-wlp1s0'
      Interactive mode
      <3>CTRL-EVENT-SCAN-RESULTS
      scan_results
      > bssid / frequency / signal level / flags / ssid
      82:a7:41:ec:80:f1 5220 -41 [WPA2-PSK-CCMP][ESS] lannisport_iot
      76:a7:41:ec:80:f1 5220 -77 [WPA2-PSK-CCMP][ESS] lannisport
      7e:a7:41:ec:80:f1 5220 -77 [WPA2-PSK-CCMP][ESS] lannisport_restricted
      7e:a7:41:ec:80:f0 2462 -83 [WPA2-PSK-CCMP][ESS] lannisport_iot
      7a:a7:41:ec:80:f0 2462 -44 [WPA2-PSK-CCMP][ESS] lannisport_restricted
      7a:a7:41:ec:80:f1 5220 -77 [ESS] eighteenguests
      76:a7:41:ec:80:f0 2462 -43 [ESS] eighteenguests
      86:a7:41:ec:80:f0 2462 -83 [WPA2-PSK-CCMP][ESS]
      70:a7:41:ec:80:f0 2462 -44 [WPA2-PSK-CCMP][ESS] lannisport
      82:a7:41:ec:80:f0 2462 -83 [WPA2-PSK-CCMP][ESS]
      ea:63:da:a4:f3:fb 2462 -79 [WPA2-PSK-CCMP][ESS] lannisport_restricted
      ee:63:da:a4:f3:fb 2462 -80 [WPA2-PSK-CCMP][ESS] lannisport_iot
      e0:63:da:a4:f3:fb 2462 -82 [WPA2-PSK-CCMP][ESS] lannisport
      e6:63:da:a4:f3:fb 2462 -79 [ESS] eighteenguests

      I have three APs, all providing the same four SSIDs (lannisport_iot, lannisport_restricted, lannisport, and eighteenguests). Could the issue be that the rae doesn't support roaming, or fails to connect to an SSID when multiple APs offer it? It seems to "see" the nearest AP on both the 2.4 & 5 GHz bands, plus one other on 2.4 GHz only. According to the documentation, wpa_supplicant should support roaming, though.
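To see at a glance which BSSIDs announce each SSID and how strong they are, the scan_results listing can be grouped per SSID. This is a rough Python sketch that assumes the whitespace-separated column layout shown above (bssid, frequency, signal level, flags, ssid). The sample is a subset of the lines from this thread, and the `<hidden>` label for entries without an SSID is my own choice.

```python
from collections import defaultdict

# Subset of the scan_results output pasted in this thread.
SCAN_RESULTS = """\
> bssid / frequency / signal level / flags / ssid
82:a7:41:ec:80:f1 5220 -41 [WPA2-PSK-CCMP][ESS] lannisport_iot
7e:a7:41:ec:80:f0 2462 -83 [WPA2-PSK-CCMP][ESS] lannisport_iot
ee:63:da:a4:f3:fb 2462 -80 [WPA2-PSK-CCMP][ESS] lannisport_iot
86:a7:41:ec:80:f0 2462 -83 [WPA2-PSK-CCMP][ESS]
70:a7:41:ec:80:f0 2462 -44 [WPA2-PSK-CCMP][ESS] lannisport
76:a7:41:ec:80:f1 5220 -77 [WPA2-PSK-CCMP][ESS] lannisport
e0:63:da:a4:f3:fb 2462 -82 [WPA2-PSK-CCMP][ESS] lannisport
"""


def parse_scan_results(text):
    """Group scan entries by SSID, strongest signal first."""
    by_ssid = defaultdict(list)
    for line in text.splitlines():
        fields = line.strip().split(None, 4)
        # Skip the header and anything that does not start with a MAC address.
        if len(fields) < 4 or fields[0].count(":") != 5:
            continue
        bssid, freq, signal = fields[0], fields[1], fields[2]
        ssid = fields[4] if len(fields) == 5 else "<hidden>"
        by_ssid[ssid].append((int(signal), int(freq), bssid))
    for entries in by_ssid.values():
        entries.sort(reverse=True)  # a higher (less negative) dBm value is stronger
    return dict(by_ssid)


if __name__ == "__main__":
    for ssid, entries in parse_scan_results(SCAN_RESULTS).items():
        print(ssid)
        for signal, freq, bssid in entries:
            print(f"  {bssid}  {freq} MHz  {signal} dBm")
```

This only summarises what the adapter sees. It says nothing about which BSSID wpa_supplicant would actually pick.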

      Hah! Found it! hostapd is blocking wlp1s0! After stopping the service, the link to my WLAN comes up. This is reproducible, since hostapd is running after each reboot… I am not too familiar with systemd and not at all with hostapd: Quick hint or pointer on how to disable the automatic start on reboot?

      @Mike Do you have any hint which 1.13 to pick? Running on 1.12. And which URL do I need to provide to mender: To the directory, to the *.mender file…(I guess: the latter)? Sorry if these are stupid questions, but "mender -install <link_to_firmware>" isn't that conclusive with respect to what the link should point to 😉

      BTW: I just measured the surface temperature (just on the top of the rae): 54°C after about 30-45 min of fighting with the WLAN config, see above. Doesn't sound right to me. I am now starting to get kicked out of the ssh session again and can't reconnect…

      root@keembay:~# wpa_supplicant -i wlp1s0 -c /etc/wpa_supplicant.conf -B
      Successfully initialized wpa_supplicant
      root@keembay:~# client_loop: send disconnect: Connection reset

      Shutdown via double click stopped working, too, only hard shutdown will work then.

        I'm not getting any further; the device is mostly dead. When starting up, I got the logo and then the LEDs were "blinking" white, but not in a good way. It looked more like something was wrong with the process controlling the RGB LEDs. The device doesn't react to the power button at all - neither double-click, nor pressing for 8 s (or longer). Had to wait until the battery died…

        Sent a mail to support, let's see how they react. So far, rae is a very frustrating experience…

          Hi DiMa
          Very likely it's a hardware issue. Thanks for emailing support and sorry for the inconvenience.

          Regards,
          Jaka

          I can confirm the RAE becomes hot while running the ROS stack. It's "slightly warm" only in idle mode.

          Do you have any bundled scripts to check the temperature, as you can do on raspi?

          There are a few folders around here … this gives a number …

          cat /sys/class/thermal/thermal_zone1/temp
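If you want all zones at once rather than a single number, here is a small Python sketch that walks the standard Linux thermal sysfs tree. It assumes the usual layout, where each thermal_zoneN directory has a `type` and a `temp` file and `temp` is reported in millidegrees Celsius. Which physical sensor each zone corresponds to on the RAE is not something this script can tell you.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return (zone name, zone type, temperature in deg C) for each readable zone."""
    zones = []
    root = Path(base)
    if not root.is_dir():
        return zones
    for zone in sorted(root.glob("thermal_zone*")):  # lexicographic order
        try:
            kind = (zone / "type").read_text().strip()
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone not readable or reporting garbage, skip it
        zones.append((zone.name, kind, millideg / 1000.0))
    return zones


if __name__ == "__main__":
    for name, kind, celsius in read_thermal_zones():
        print(f"{name:16} {kind:24} {celsius:6.1f} °C")
```

Running it in a loop (for example with `watch`) gives a rough temperature log while the stack is running.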

            Ok, I tested it, and here are the results:

            • The first experiment was with a bringup.launch. I've just started the most recent image, w/o load, idle mode. I stopped the process after ~ an hour, and one of the thermal zones reached 60 degrees. You can ignore the errors in the video, as the entire stack is not yet working.
            • The second experiment was triggered several minutes after the previous one. But that time, I ran robot.launch with an additional load on motors (teleop). As a result, the same thermal zone that ended at 60 degrees quickly reached 65 degrees in just 6 minutes. Then, I shut the nodes down to avoid hardware damage.

            One important observation: as you can see in both videos, I also printed the cooling devices' state apart from the thermal data. Zero index device is related to VPU. Its initial state was equal to 5 when I started the nodes. But then it quickly switched to 0 and was never restored. I wonder if 0 means the absence of cooling. But then it seems weird that it goes off when we give some load. Anyway, the second experiment clearly shows that the temperature is constantly increasing under common teleoperation. And it's definitely abnormal.
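On the question of what state 0 means: in the generic Linux thermal sysfs interface, each cooling_deviceN exposes `type`, `cur_state` and `max_state`, and the kernel documentation describes `cur_state == 0` as "no cooling" and `cur_state == max_state` as maximum cooling. Whether the RAE's VPU cooling driver follows that convention is an assumption on my side. Here is a minimal Python sketch to print the states next to their maximum:

```python
from pathlib import Path


def read_cooling_devices(base="/sys/class/thermal"):
    """Return (device name, type, cur_state, max_state) for each cooling device."""
    devices = []
    root = Path(base)
    if not root.is_dir():
        return devices
    for dev in sorted(root.glob("cooling_device*")):
        try:
            kind = (dev / "type").read_text().strip()
            cur = int((dev / "cur_state").read_text().strip())
            max_state = int((dev / "max_state").read_text().strip())
        except (OSError, ValueError):
            continue  # skip devices that cannot be read
        devices.append((dev.name, kind, cur, max_state))
    return devices


if __name__ == "__main__":
    for name, kind, cur, max_state in read_cooling_devices():
        print(f"{name:18} {kind:24} state {cur}/{max_state}")
```

Seeing `cur_state` relative to `max_state` should at least show whether 5 was the top of the range before it dropped to 0.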

            Just played a little bit with RViz, map, laser scans, cameras, etc.

              sskorol … without knowing what the thermal zones represent, it is dangerous to assume 65 degC is outside the operating temps … it is most likely the CPU cores, which can typically go above 60 degC … if it was case temp then I would be concerned … but we should wait until Luxonis publishes some specs on this before we say Rae is about to go into a critical meltdown …

              @Mike yeah, I mean, I know that on an RPi, for instance, the CPU temp can go up to 85 degrees. But when I see 71 (like on the second screen) without motor usage, and still rising, it becomes suspicious. And yes, the whole case is hot. Keeping in mind it’s metal, it’s not comfortable to touch or hold at all. My guess is that the main temperature spike comes from the cameras. If you’ve ever worked with Luxonis devices, you know how hot they can get. I have several OAK-D cameras, and it’s hard to touch them when they are in use. If their radiators touch the case, then I’d say that’s the main reason why it’s so hot. Maybe it was intentional by design. But I don’t really want my kid to accidentally touch it while playing.

              I adjusted the script to print the type of cooling device and thermal zone (idle mode):

              I'm slightly confused because top temperature values (based on the previous experiments) come from the battery gauge and Wi-Fi zones, while VPU cooling drops to zero. Is it even safe?

              As expected… When I reached the following numbers by running a full RAE stack in docker (idle mode), and then ran the teleop node, RAE rebooted in a couple of seconds after triggering the motors.
