Hi there,
I have a device with an M404 running DeviceOS 5.9.0. SYSTEM_THREAD is ENABLED, and SYSTEM_MODE is SEMI_AUTOMATIC. It will occasionally (on the scale of weeks, but not consistently) go offline. When I connect it to my computer to read the serial output, my PC does not recognize the device and will not open the serial port. I believe the last time I saw this, the LED was solid cyan. The only fix seems to be unplugging the device and plugging it back in.
I'm hoping to get advice on how to 1) find the source of the bug (e.g., logging tools, strategies, etc), and 2) try to error-proof this. Is there maybe a way in DeviceOS to make the system reboot itself in the event of a crash or an infinite loop?
I have a similar problem, but with a BRN404X. My device runs for a month, sometimes longer, and then one day it just goes offline. When I finally get to the device, I find it trying to connect to the cloud. Pressing the physical Reset button doesn't work; I have to manually unplug it from power. My device went offline today, and it's been over 10 hours and counting. Tomorrow morning I will get to the device, and hopefully a reboot will get it working again.
I also have a watchdog running. In addition, I reboot the device once a week by halting the watchdog and letting it time out. I tried a soft reset instead, but it wasn't any different. I've been running DeviceOS 5.0.0, 5.8.0, and 6.1.0, and I'm now on 6.1.1.
Just a small example:
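#include "Particle.h"

void setup()
{
    // Start the hardware watchdog with a 60-second timeout.
    Watchdog.init(WatchdogConfiguration().timeout(60s));
    Watchdog.start();
}

void loop()
{
    // Refresh on every pass; the device resets if loop() stalls for 60 seconds.
    Watchdog.refresh();
}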
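One caveat with a loop like this: with SYSTEM_THREAD enabled, loop() generally keeps running while the device is stuck trying to connect, so an unconditional Watchdog.refresh() may never let the watchdog fire in that state. A rough sketch of one possible addition is to track how long the cloud connection has been down and, past a limit, power the modem off before resetting. The `ConnectionMonitor` struct, the 10-minute limit, the system mode, and the recovery sequence are all assumptions for illustration and have not been tested on a BRN404X. The timing logic is kept in plain C++ so it can be checked off-device.

```cpp
#include <cstdint>
#if defined(PARTICLE)
#include "Particle.h"
#endif

// Hypothetical helper: tracks how long the cloud connection has been down.
struct ConnectionMonitor {
    uint32_t limitMs;          // how long to tolerate being offline
    uint32_t lastConnectedMs;  // last time the connection was seen up

    // Returns true once the device has been offline for limitMs or longer.
    // Unsigned subtraction keeps this correct across millis() rollover.
    bool offlineTooLong(bool connected, uint32_t nowMs) {
        if (connected) {
            lastConnectedMs = nowMs;
            return false;
        }
        return (nowMs - lastConnectedMs) >= limitMs;
    }
};

#if defined(PARTICLE)
SYSTEM_THREAD(ENABLED);
SYSTEM_MODE(SEMI_AUTOMATIC);  // assumed; use whatever mode the application already uses

ConnectionMonitor monitor = {10 * 60 * 1000, 0};  // assumed 10-minute limit

void setup() {
    Watchdog.init(WatchdogConfiguration().timeout(60s));
    Watchdog.start();
    Particle.connect();
}

void loop() {
    Watchdog.refresh();
    if (monitor.offlineTooLong(Particle.connected(), millis())) {
        // Turn the modem off before resetting, so it starts fresh after reboot.
        Particle.disconnect();
        Cellular.off();
        waitFor(Cellular.isOff, 30000);  // stays under the 60 s watchdog timeout
        System.reset();
    }
}
#endif
```

If the calls in the recovery branch themselves block (for example, because of a deadlock in the system thread), the hardware watchdog is still the backstop, since loop() stops refreshing it.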
Is your M404 connecting by cellular or Wi-Fi?
If Wi-Fi, it could be related to this issue which is still being investigated. It seems to be related to specific access points that cause this issue.
If the device does not respond to USB, it is either completely locked up, or in mutex deadlock between threads.
Using the hardware watchdog can help in this case.
Note that the hardware watchdog and the reset button do not reset the cellular modem. If you fail to connect for 10 minutes, the modem should be powered down to reset it, however.
The one case where this does not help is if you have run out of RAM. In this case, the modem will be reset, but the underlying problem of insufficient RAM has not been solved, so you will fail to connect again. Using an out of memory handler will catch this case.
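For reference, a minimal sketch of an out of memory handler might look like the following. It uses the `out_of_memory` system event; the helper names (`noteOutOfMemory`, `outOfMemoryPending`) are made up for this example, and the choice to reset from loop() rather than inside the handler is an assumption about what suits the application. The flag logic is plain C++ so it can be checked off-device.

```cpp
#if defined(PARTICLE)
#include "Particle.h"
#endif

// Size of the failed allocation in bytes, or -1 if none has been reported.
volatile int outOfMemorySize = -1;

// Keep this tiny: it is called at the moment an allocation fails,
// so it should not allocate or log on its own.
void noteOutOfMemory(int requestedBytes) {
    outOfMemorySize = requestedBytes;
}

// True once a failed allocation has been recorded.
bool outOfMemoryPending() {
    return outOfMemorySize >= 0;
}

#if defined(PARTICLE)
SerialLogHandler logHandler;

// System event handler: param carries the size of the allocation that failed.
void outOfMemoryHandler(system_event_t event, int param) {
    noteOutOfMemory(param);
}

void setup() {
    System.on(out_of_memory, outOfMemoryHandler);
}

void loop() {
    if (outOfMemoryPending()) {
        Log.info("out of memory, requested size=%d", outOfMemorySize);
        delay(100);      // give the log line a moment to go out
        System.reset();
    }
}
#endif
```

Resetting from loop() keeps the handler itself trivial. If loop() can block for long periods, the hardware watchdog still covers that case.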
Thanks, Rick.
The system in my case is primarily connecting over cellular, since there is no Wi-Fi network present. It does have Wi-Fi credentials saved from a previous test, and when I check the serial log, it looks like it occasionally makes an attempt to connect to that saved SSID, but I believe it keeps the cellular connection.
I've implemented a hardware WDT, and I'll look into the out-of-memory handler.
It might be a while before I can tell if this seems to work, but if nothing happens, that's a good thing!
Does your system use BLE? There are many fixes for BLE-related deadlocks in 6.2 (compared with 5.9).
Cheers,
@Hector56 I've seen similar behavior with an M404 that has had a WiFi network provisioned but no longer has that network available. If you can restore the device (or otherwise clear the WiFi credentials), you might see improved reliability.
I'm seeing the same behavior across my fleet of BRN404X devices. Are you able to capture the logs when this is happening?