Hardware watchdog timer in the nRF52840 or RTL872x MCU

Hi,

In the specific case of a Boron, am I correct in thinking the following?

The onboard hardware watchdog timer CANNOT save the system from all connectivity failures BECAUSE the most we can do in firmware is System.reset(); and that type of reset does not reset the modem.


Side question: I just read this in the docs:

RTL872x platform (P2, Photon 2): If the onExpired() callback is used, the device will not automatically reset.

On Photon 2s, can we call System.reset() from the callback?


I’ve only just noticed this note, so I suppose that’s a warning to heed:

Due to the limitations of onExpired() and the difference between platforms, we recommend not using this feature.

Thanks

The hardware watchdog is for resetting the system if it locks up, typically due to deadlock, but it could be some other condition that stops the MCU, such as an infinite loop with interrupts disabled.

As long as you wait the suggested 10 minutes on connection failures, Device OS will reset the cellular modem by powering it down.

The only other thing to make sure of is that you have an out-of-memory handler so you can reset the device if you run out of memory. A lack of memory can also prevent connecting, and resetting the modem does not reset the device or free memory. This is not done automatically because some firmware may want to do other things to free memory, or may be doing something critical where a device reset would be a problem.
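
Here is a minimal sketch of the out-of-memory pattern described above, written as portable C++ so the logic can be followed without a device. The idea is that the event handler only records the event, and the main loop decides when to reset. On Device OS the handler would typically be registered with System.on(out_of_memory, ...) and the reset would be System.reset(); the OomGuard class and the resetFn callback are hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper: records an out-of-memory event and defers the
// reset to the main loop instead of resetting inside the handler.
class OomGuard {
public:
    explicit OomGuard(std::function<void()> resetFn)
        : resetFn_(std::move(resetFn)) {
        assert(static_cast<bool>(resetFn_));
    }

    // Call from the out-of-memory event handler. It only stores the size
    // of the failed allocation and does no other work.
    void onOutOfMemory(int failedAllocSize) {
        failedSize_.store(failedAllocSize);
    }

    // Call once per loop() iteration. Invokes the reset callback and
    // returns true if an out-of-memory event has been recorded.
    bool poll() {
        if (failedSize_.load() < 0) {
            return false;
        }
        resetFn_();  // on a device this would be System.reset()
        return true;
    }

private:
    std::function<void()> resetFn_;
    std::atomic<int> failedSize_{-1};  // -1 means no event seen yet
};
```

Deferring the reset to the loop leaves room for firmware that wants to free memory or finish a critical operation first, which is the reason given above for Device OS not resetting automatically.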

Oh, I didn't know that. Can you elaborate or point me to the docs, please?
Thanks

If you fail to connect to cellular or the cloud for 10 minutes (blinking green or blinking cyan), the modem will be reset. This typically involves attempting to gracefully shut down the modem, then power cycling it. The device itself is not reset.

The situation that can cause problems is sleeping with the network active after a shorter period of failing to connect to cellular. In that case it's possible that the modem will never be reset, because you never get to 10 minutes.

If you sleep with cellular off, it also will power down the modem, so that's usually sufficient as well.
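
One way to avoid the "never reaches 10 minutes" case is to track how long the device has been disconnected and choose the sleep style from that. The following is only a sketch in portable C++ under stated assumptions: the DisconnectTracker class, the SleepPlan enum, and the 11-minute default (10 minutes plus a margin) are hypothetical. On a device, the connected flag would come from something like Particle.connected() and the time from System.millis().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sleep choices for firmware that normally sleeps with the
// network left active.
enum class SleepPlan { NetworkActive, StayAwake, CellularOff };

class DisconnectTracker {
public:
    // windowMs: how long to stay awake while disconnected so the
    // 10-minute modem reset can run. 11 minutes is an assumed margin.
    explicit DisconnectTracker(uint64_t windowMs = 11ULL * 60 * 1000)
        : windowMs_(windowMs) {
        assert(windowMs_ > 0);
    }

    // Feed the current connection state once per loop iteration.
    void update(bool connected, uint64_t nowMs) {
        if (connected) {
            disconnected_ = false;
        } else if (!disconnected_) {
            disconnected_ = true;
            sinceMs_ = nowMs;  // remember when the outage started
        }
    }

    uint64_t disconnectedMs(uint64_t nowMs) const {
        return disconnected_ ? nowMs - sinceMs_ : 0;
    }

    SleepPlan plan(uint64_t nowMs) const {
        if (!disconnected_) {
            return SleepPlan::NetworkActive;  // connected: sleep as usual
        }
        if (disconnectedMs(nowMs) < windowMs_) {
            return SleepPlan::StayAwake;      // let the reset timer run
        }
        return SleepPlan::CellularOff;        // sleep with the modem off
    }

private:
    uint64_t windowMs_;
    uint64_t sinceMs_ = 0;
    bool disconnected_ = false;
};
```

Firmware that does not need to stay awake could skip the StayAwake state and go straight to sleeping with cellular off whenever it is disconnected, since that also powers down the modem.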

@rickkas7, does the 10 minute reset apply only to the cell modem or does it reset the Wi-Fi/BLE radios as well?

Only cellular.

On nRF52, BLE is integrated with the MCU so the only way to power that down is to reset the entire MCU.

On the P2/Photon 2 and M-SoM, since the Wi-Fi and BLE are part of the MCU, you'd need to reset the entire MCU.

I've added code on the Photon 1 to reset the device by going into hibernate sleep mode for 30 seconds to completely reset the device and Wi-Fi. It's not usually necessary, but I suppose it wouldn't hurt as a backup.

Coming back to cellular, would you say that sending a Boron into hibernation for 30 seconds would be enough to reset the cellular modem?
Thanks!

This is good to know… but in what firmware version was this introduced? It was my understanding that the external AB1805 RTC and the deepPowerDown() method were created specifically because of the need to power cycle the modem for 30 seconds. Maybe that was an earlier way to do the modem reset, and it was later moved into Device OS?

What’s the current recommendation/best practice regarding the AB1805 external RTC?

@gusgonnet - not sure if you’re trying to simplify hardware design, but since I’ve included the AB1805 external watchdog and the deep power down for 30 seconds if unable to connect, I’ve had minimal issues with devices going offline. The only exception was when the device thinks it’s online but data stops sending. For this I use a software watchdog on the device: I use the ACK of a Particle.publish to pet the software watchdog. The combination of these two things has made things fairly robust.
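
For anyone curious what an ACK-petted software watchdog can look like, here is a rough sketch in portable C++. It is not the code from the post above. The AckWatchdog class and the one-hour timeout in the usage note are hypothetical, and on a device the pet() call would go wherever a publish is confirmed, for example after a Particle.publish made with the WITH_ACK flag succeeds.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical software watchdog that is only petted when a publish is
// acknowledged, so it also catches "online but no data flowing" cases.
class AckWatchdog {
public:
    AckWatchdog(uint64_t timeoutMs, uint64_t startMs)
        : timeoutMs_(timeoutMs), lastAckMs_(startMs) {
        assert(timeoutMs_ > 0);
    }

    // Call when a publish has been acknowledged by the cloud.
    void pet(uint64_t nowMs) { lastAckMs_ = nowMs; }

    // Call from loop(); true once no ACK has arrived for timeoutMs.
    bool expired(uint64_t nowMs) const {
        return nowMs - lastAckMs_ >= timeoutMs_;
    }

private:
    uint64_t timeoutMs_;
    uint64_t lastAckMs_;
};
```

A typical use would be to construct it with a one-hour timeout at startup, call pet() after each acknowledged publish, and trigger the recovery action (a reset or a deep power down) when expired() returns true.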

Hey Jeff, thanks for the insights. I'm asking for several reasons: to know more about the subject, to increase robustness in existing projects and simplify hardware design in future projects.

I have a Boron 404X running in a location that I don't have easy physical access to. I've programmed it to detect loss of cloud access and if gone for more than an hour I call System.reset(RESET_NO_WAIT);

It's been running fine for many months, with a couple of cloud losses a day on average, and it has always eventually reconnected to the cloud. It does the job.

However, a month or so ago it stopped reconnecting and when I eventually had an opportunity to travel to it, I found the modem had stopped responding even after a System.reset(RESET_NO_WAIT);
See a section of trace (I don't have the full trace leading up to the lockup):

Opening serial monitor for com port: " /dev/ttyACM0 "
Serial monitor opened successfully:

0000005144 [ncp.client] TRACE: Modem is not responsive @ 460800 baudrate

0000005145 [ncp.at] TRACE: > AT

0000006145 [ncp.at] TRACE: > AT

0000006328 [app] INFO: PROGRAM STARTUP

0000006337 [app] INFO: *** U-Blox START ***

0000006674 [app] INFO: USB Power ->1

0000007146 [ncp.at] TRACE: > AT

0000008147 [ncp.at] TRACE: > AT

0000009148 [ncp.at] TRACE: > AT

0000010149 [ncp.client] TRACE: Modem is not responsive @ 115200 baudrate

0000010150 [ncp.client] ERROR: No response from NCP

0000010150 [ncp.client] TRACE: Setting UART voltage translator state 0

0000010150 [ncp.client] TRACE: Hard resetting the modem

0000010350 [net.pppncp] TRACE: NCP event 3

0000010351 [net.pppncp] TRACE: NCP power state changed: IF_POWER_STATE_POWERING_UP

0000010351 [system.nm] TRACE: Interface 4 power state changed: 4

0000010352 [net.pppncp] ERROR: Failed to initialize cellular NCP client: -210

0000010452 [ncp.client] TRACE: Powering modem on, ncpId: 0x47

0000010453 [ncp.client] TRACE: Modem already on

0000010453 [net.pppncp] TRACE: NCP event 3

0000010453 [net.pppncp] TRACE: NCP power state changed: IF_POWER_STATE_UP

0000010454 [system.nm] TRACE: Interface 4 power state changed: 2

0000010454 [ncp.client] TRACE: Setting UART voltage translator state 1

0000010554 [ncp.client] TRACE: Setting UART voltage translator state 0

0000010654 [ncp.client] TRACE: Setting UART voltage translator state 1

0000011656 [ncp.at] TRACE: > AT

0000012657 [ncp.at] TRACE: > AT

0000013658 [ncp.at] TRACE: > AT

0000014659 [ncp.at] TRACE: > AT

0000015001 [app] INFO: NOT Connected! TowerLosses 1 (disc for 0.1 mins)

0000015660 [ncp.at] TRACE: > AT

0000016661 [ncp.at] TRACE: > AT

0000017662 [ncp.at] TRACE: > AT

0000018663 [ncp.at] TRACE: > AT

0000019664 [ncp.at] TRACE: > AT

0000020665 [ncp.at] TRACE: > AT

0000021667 [ncp.at] TRACE: > AT

0000022667 [ncp.at] TRACE: > AT

0000023668 [ncp.at] TRACE: > AT

0000024669 [ncp.at] TRACE: > AT

0000025670 [ncp.at] TRACE: > AT

0000026671 [ncp.client] ERROR: No response from NCP

0000026672 [ncp.client] TRACE: Setting UART voltage translator state 0

0000026672 [ncp.client] TRACE: Hard resetting the modem

0000026872 [net.pppncp] TRACE: NCP event 3

0000026873 [net.pppncp] TRACE: NCP power state changed: IF_POWER_STATE_POWERING_UP

0000026873 [system.nm] TRACE: Interface 4 power state changed: 4

0000026874 [net.pppncp] ERROR: Failed to initialize cellular NCP client: -210

0000026973 [ncp.client] TRACE: Powering modem on, ncpId: 0x47

0000026974 [ncp.client] TRACE: Modem already on

0000026974 [net.pppncp] TRACE: NCP event 3

0000026974 [net.pppncp] TRACE: NCP power state changed: IF_POWER_STATE_UP

0000026975 [system.nm] TRACE: Interface 4 power state changed: 2

0000026975 [ncp.client] TRACE: Setting UART voltage translator state 1

0000027075 [ncp.client] TRACE: Setting UART voltage translator state 0

0000027175 [ncp.client] TRACE: Setting UART voltage translator state 1

0000028176 [ncp.at] TRACE: > AT

0000029177 [ncp.at] TRACE: > AT

0000030002 [app] INFO: NOT Connected! TowerLosses 1 (disc for 0.4 mins)

0000030178 [ncp.at] TRACE: > AT

0000031179 [ncp.at] TRACE: > AT

0000032180 [ncp.at] TRACE: > AT

etc....


A full power off reset cured the problem - the Boron connected to a tower and cloud within 10 seconds.

Your previous reply states that I should wait 10 minutes after a cloud loss before attempting to reconnect. I use SYSTEM_THREAD(ENABLED); and SYSTEM_MODE(AUTOMATIC); so it automatically tries to reconnect immediately. Should I stop using AUTOMATIC mode and take control myself to add that delay?

I've now implemented the following watchdog reset method instead of using System.reset(). I've tested it and it resets the Boron correctly, but I have no way of reproducing the modem lockup problem seen previously.

Will this fix that lockup? (Trace states modem was hard reset...)

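    // Arm the hardware watchdog with a 60 second timeout
    Watchdog.init(WatchdogConfiguration().timeout(60s));
    Watchdog.start();
    // Sleep until WKP rises; the intent is for the watchdog to expire
    // during sleep and reset the device
    SystemSleepConfiguration config;
    config.mode(SystemSleepMode::ULTRA_LOW_POWER).gpio(WKP, RISING);
    System.sleep(config);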

Regards,
Terje Nilsen

You should always use SEMI_AUTOMATIC or AUTOMATIC mode, and both work similarly. Some cases where you need to use SEMI_AUTOMATIC are:

  • You want to do wake cycles where you wake but do not connect to cellular
  • You have a situation such as battery and solar and need to be able to check the battery SoC to avoid trying to connect when the battery is very low

It's not entirely clear what happened that required power cycling that device. But if the reset button did not fix it, the built-in hardware watchdog will not fix it either. Also, the watchdog does not stop in sleep mode, so a 60 second watchdog will wake the device every 60 seconds.

Rick,

thanks for your reply.

The purpose of the 60 second watchdog timeout is to reset the Boron after it has spent 60 seconds asleep in ULTRA_LOW_POWER mode. The idea was to hard reset the modem.

I was afraid that the cause of my modem lockup is not easily determined. I've now added a device that cycles the power to the Boron once a day. This is the safest solution, as it is not important for my purpose to always be able to access the Boron, but I can't have it lock up for days.

By the way, in my current Boron app, once an hour I poll the modem for some parameters using a few AT commands via Cellular.command(). Is this the potential problem?

Regards,
Terje Nilsen

In that case you do not need the watchdog. Simply making the Boron sleep with cellular OFF will cause the modem reset to happen.

You could also send the Boron into hibernation for 30 seconds with cellular OFF, so the modem gets reset as well, since MCU resets do not reset the modem as per this below:

cheers,