Borons locking up

We are seeing some issues with 2G/3G Borons locking up in the field.

We have an external RGB LED connected to the RGB signals on the Boron that mirrors the on-board LED. When the device is locked up, the external LED is off, with no breathing or flashing. I can only assume the on-board LED is doing the same.

We have an external hardware watchdog on these devices, which does not seem to be able to recover the device in this state. Either the watchdog is still being serviced by the application thread, or it is unable to reset the Boron.
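To rule out the first possibility (the application thread still servicing the watchdog while the rest of the system is stuck), one option is to service the watchdog only while the device has recently confirmed a successful publish. The following is a minimal sketch of that idea in plain C++, not our actual firmware: `WatchdogGate`, `noteHealthy`, and `shouldService` are hypothetical names, and the 15-minute limit used in the usage note is an assumption based on the 10-minute worst-case publish interval.

```cpp
#include <cstdint>

// Hypothetical helper (not a Device OS API): decides whether the external
// hardware watchdog may be serviced, based on how long ago the application
// last confirmed a healthy event such as a successful publish.
class WatchdogGate {
public:
    // maxSilenceMs: longest allowed gap between healthy events, in milliseconds.
    explicit WatchdogGate(uint32_t maxSilenceMs) : maxSilenceMs_(maxSilenceMs) {}

    // Call with the current millisecond tick whenever a publish is confirmed.
    void noteHealthy(uint32_t nowMs) { lastHealthyMs_ = nowMs; }

    // Returns true while the last healthy event is recent enough.
    // Unsigned subtraction keeps the result correct across tick-counter rollover.
    bool shouldService(uint32_t nowMs) const {
        return static_cast<uint32_t>(nowMs - lastHealthyMs_) <= maxSilenceMs_;
    }

private:
    uint32_t maxSilenceMs_;
    uint32_t lastHealthyMs_ = 0;
};
```

In the application loop this would be used roughly as: construct `WatchdogGate gate(15 * 60 * 1000)`, call `gate.noteHealthy(millis())` after each successful publish, and toggle the watchdog pin only when `gate.shouldService(millis())` returns true. This only addresses the case where the watchdog is still being serviced; it does nothing if the watchdog fires but its reset has no effect on the Boron.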

If the customer goes to the device, unplugging the external power supply at the wall and plugging it back in does not recover it. Disconnecting the DC end of the power supply and then reconnecting it does restart the device. There is some capacitance on the output of the external power supply, but not much. Usually unplugging at the wall will reboot the devices, but not in this case.

The couple I have looked at are running either Device OS 5.4.1 or 5.5.0.

We have Argons using the same hardware and the same firmware (just rebuilt against Argon) and we are not seeing any issues with them.

Keep alive will be at the default value, but worst case we should be publishing at least every 10 minutes, usually more often than that.

It seems like something at the silicon level. Maybe the cellular modem is getting locked up and then pulling the OS down, which stops the LED.

It's impossible to say what's wrong at this point.

Does pressing the reset button work? I presume it doesn't, and if it does not, that explains why your watchdog is not working.

If the reset button is not resetting the device and the LED is off, it's typically either because you've entered shipping mode in the Boron PMIC (unlikely to do accidentally), or there is a power issue.

The most common power issue occurs when you have a way to remove 3V3 power (either by using the EN pin, or by removing VUSB/VIN when you are not using the LiPo battery) but external circuitry can still leak power into a GPIO. If this occurs, the leakage current will prevent the MCU from resetting until you completely remove all power. This is why the Tracker One added analog switches on the M8 GPIO pins between beta and release.

It looks like it is just the 310 Borons we are seeing this on; the 314 Borons seem OK. Would that be correct?

@GJMorrison did you figure out the root cause for this issue?

We are seeing something similar on LTE Borons. Devices become unresponsive in the field without warning, and they are not recoverable with our external watchdog or by using the RST pin. However, the devices can be recovered by removing all power, after which they return to normal operation. We haven't been able to recreate this in our lab yet.

@rickkas7 I understand the backpower issue (which may happen in our device through comm pins), but do you have any idea why RST isn't working?

When the nRF52 is in the back-fed power state, the hardware reset button and reset pin don't work, either. I'm not sure why, but that's the behavior.