OS 4.0.1 Overriding Nordic Hardware Watchdog?

Hey All,

I’ve got an issue with a remotely deployed Boron system running OS 4.0.1 (Australian Ag-Tech level remote), where the Boron seems to enter some sort of deadlock condition. The cyan LED stops breathing and locks at a solid intensity, and everything goes completely unresponsive.

Since the lock-up seems to be pretty low level, I don’t trust the OS-level watchdog: if the status LED stops breathing, it looks like the OS-level stuff is getting locked out too. For better reliability I’m trying to make use of the on-board watchdog in the nRF52840 with the following method:

/**
 * @brief Arm the NRF internal watchdog timer with the specified timeout.
 * 
 * @param timeout Desired timeout value in seconds used to calculate the setting for the CRV register - defaults to 10 minutes
 * @param kickRegCount Number of kick registers to activate - defaults to 1.
 * 
 * From the moment this function is called, every armed WDT reload register must be poked through the kickNordicWatchdog function, or
 * a hardware reset will be generated.
 * 
 * The CRV value is calculated according to the following formula.
 * timeout [s] = ( CRV + 1 ) / 32768
 *    CRV      = (timeout * 32768) - 1
 */
void armNordicWatchdog(unsigned int timeout, unsigned int kickRegCount){
  uint32_t crv_target_value = (timeout * 32768) - 1;
  uint32_t kick_reg_mask = (1UL << kickRegCount) - 1; // Bitmask with exactly kickRegCount 1's in the lowest order bits

  NRF_WDT->CONFIG       = 0x09;     // Configure WDT to run when CPU is asleep, and when 'debugger paused' as well
  Log.trace("Arming WDT - Timeout 0x%x", crv_target_value);
  NRF_WDT->CRV          = crv_target_value;
  NRF_WDT->RREN         = kick_reg_mask;     // Enable reload registers RR[0]..RR[kickRegCount-1]
  NRF_WDT->TASKS_START  = 1;        // Start WDT
}
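For anyone checking the arithmetic, here is a minimal host-side sketch of the two calculations in that function. The helper names (crvForTimeout, reloadMask) are made up for illustration and are not part of Device OS or the nRF SDK. It assumes the 32.768 kHz watchdog clock from the formula in the comment above, and that the nRF52840 has eight reload registers, RR[0] to RR[7].

```cpp
#include <cassert>
#include <cstdint>

// The WDT counts 32.768 kHz ticks: timeout [s] = (CRV + 1) / 32768.
const uint32_t kWdtTicksPerSecond = 32768u;

// Largest whole-second timeout whose tick count still fits in the 32-bit CRV.
const uint32_t kMaxTimeoutSeconds = 0xFFFFFFFFu / kWdtTicksPerSecond; // 131071 s

// Hypothetical helper: CRV value for a timeout given in whole seconds.
// Out-of-range inputs are clamped so the multiplication cannot wrap.
inline uint32_t crvForTimeout(uint32_t seconds) {
    if (seconds < 1u) seconds = 1u;
    if (seconds > kMaxTimeoutSeconds) seconds = kMaxTimeoutSeconds;
    return seconds * kWdtTicksPerSecond - 1u;
}

// Hypothetical helper: RREN mask with exactly `count` low-order bits set,
// i.e. RR[0]..RR[count-1] enabled. Valid for count in 1..8.
inline uint32_t reloadMask(unsigned count) {
    assert(count >= 1u && count <= 8u);
    return (1u << count) - 1u;
}
```

With these, a 10 minute timeout gives crvForTimeout(600) == 19660799 (0x12BFFFF), and reloadMask(1) == 0x01, so only RR[0] is armed. Shifting by count + 1 instead would give 0x03 for a count of 1, which also arms RR[1].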

I’m trying to test this by calling this method in setup(), then going into my usual behaviour and waiting for the watchdog reset to happen. The problem is that I’m not seeing the watchdog reset at all. Is there something I’m missing here?

As an aside, it’s on a long timeout (10 minutes), and it’s getting kicked every time through loop(), so it won’t interfere with cloud firmware updates.
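The kick side isn't shown in this thread, so the following is only a sketch of what it might look like, not the poster's actual kickNordicWatchdog. On the nRF52840 the counter is reloaded only once every reload register enabled in RREN has been written with the reload key 0x6E524635. FakeWdt is a stand-in invented here so the logic can be run on a desktop; on the device the same template would presumably be called as kickEnabled(*NRF_WDT), which is an assumption and untested.

```cpp
#include <cassert>
#include <cstdint>

// Reload key defined by the nRF52840 WDT; any other value written to RR[n] is ignored.
const uint32_t kWdtReloadKey = 0x6E524635u;

// Hypothetical stand-in for the two WDT registers this sketch touches.
struct FakeWdt {
    uint32_t RREN;   // bit n set => RR[n] is armed
    uint32_t RR[8];  // reload request registers
};

// Write the reload key to every reload register that is enabled in RREN.
// Templated so the same code can take the fake above or a real register block.
template <typename Wdt>
void kickEnabled(Wdt& wdt) {
    const uint32_t enabled = wdt.RREN;
    for (unsigned i = 0; i < 8; ++i) {
        if (enabled & (1u << i)) {
            wdt.RR[i] = kWdtReloadKey;
        }
    }
}
```

Reading RREN back rather than passing the count again keeps the arm and kick sides from drifting apart: if two registers are armed but only RR[0] is written, the watchdog still expires.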

Does anyone know about a conflict here? I can’t find anything in the docs that specifically says I’m not allowed to play with this module.

We do not recommend using the nRF52840 hardware watchdog. There’s code to use it in Device OS; it’s not currently turned on, but it could be in future versions of Device OS.

The most common cause of deadlock is accessing a mutex-protected resource within a SINGLE_THREADED_BLOCK. This includes things like SPI, I2C, and other hardware peripherals. It also includes Log.info, etc…

The application watchdog is not particularly useful because it also stops running on deadlock.

Hi Rick,

Thanks for the response. I’m not explicitly using ‘SINGLE_THREADED_BLOCK’ anywhere in my code, but it is possible I’ll need to check some of the device driver libs I’m importing for that.

As you say, the application watchdog is not useful for the sort of thing I’m trying to catch. Are there any other recommendations you guys have on this? A field-service trip for me is typically a 3-day affair and costs several thousand dollars in flights and man-hours, so I kind of need a back-stop like a reliable watchdog.

Couple of follow-up questions as well:

Why don’t you recommend it? What’s the failure mode? I’d like to make an informed decision about the risk profile here, as we are already doing a couple of things with higher risk levels than some people would be happy with.

Secondly, is there any reason you can see why the code snippet I’ve shared above may not be performing as expected? I haven’t tried to do direct register manipulation on the Nordic before, so I may be missing something silly?

– DKW

We recommend an external hardware watchdog instead of using the one in the MCU.

If you use the MCU one, you’ll also need to make sure you disable yours before upgrading to a version of Device OS that enables it, since it’s possible that the two could conflict.

You can access nRF52 registers from user code. You might want to try using the nrfx SDK wrapper. That’s what Device OS uses, and you can use many of the nRF52 SDK features from user code.

The tricky part is getting interrupts across the Device OS to user space boundary, but as long as you don’t enable the watchdog event you don’t have to worry about that.

It should work, and it’s not obvious to me from the code why it doesn’t work.

This library uses the nRF52 SDK from a user application. Obviously it’s using the ADC, not the WDT, but it shows how you can use the SDK.
