I'm trying to test the application watchdog

Seems simple enough:

ApplicationWatchdog wd(60000, System.reset);

But to test it, I’m doing this inside the loop() – like 15 seconds after boot:

while (1==1) {
}

Doesn’t that “starve” the loop and cause a reboot like in 60 seconds? I’ve turned off System Thread just in case.

Sort of related, when it downloads new firmware, I get a reset reason of 140 (user reason) instead of 70 (successful firmware update). What am I missing?
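
For anyone comparing the two numbers above, here is a minimal sketch that turns them into readable text. Only the two codes quoted in this post (70 and 140) are taken from the thread. The enum names and the describeResetReason() helper are made up for illustration and are not part of the Particle API. On a device the value would presumably come from System.resetReason(), which as far as I know requires the reset-info feature to be enabled first.

```cpp
#include <cassert>
#include <string>

// Hypothetical names; only the numeric values 70 and 140 come from the thread.
enum ResetReasonCode {
    kResetReasonUpdate = 70,   // "successful firmware update"
    kResetReasonUser   = 140   // "user reason"
};

// Hypothetical helper: maps a raw reset-reason code to a short description.
std::string describeResetReason(int code) {
    switch (code) {
        case kResetReasonUpdate: return "successful firmware update";
        case kResetReasonUser:   return "user reason";
        default:                 return "other (" + std::to_string(code) + ")";
    }
}
```

Logging the description next to the raw code after each boot makes it easier to tell an OTA-triggered reset from one requested by application code.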

Thanks!
Tahl

@Tahl, I just tried a very simple app and had the same problem so I can confirm the issue.

ApplicationWatchdog wd(60000, System.reset);

void setup() {
}

void loop() {
    // Spin forever while still servicing the cloud connection
    while (1) Particle.process();
}

However, I can’t confirm the OTA reset reason issue.

@rickkas7 can you please confirm?

@peekay123, actually your code looks like it would keep things alive and not trigger the watchdog, if I understand it. If you did a tight loop with anything BUT Particle.process(), the watchdog should step in since loop() is not looping.

Tahl

@Tahl, Particle.process() should NOT keep the watchdog going. Removing it, however, will cause the Cloud connection to drop since the background process is not allowed to run. What I see is the Photon comes online (breathing cyan) then goes to breathing green after about 10 seconds. After almost 2 mins, the Photon seems to reset and start over.

I will try with SYSTEM_THREAD(ENABLED).

@Tahl, I ran with SYSTEM_THREAD(ENABLED), with and without Particle.process(). Without it, the Photon resets after about 2 minutes, showing spark/device/last_reset user in the console. After that, the Photon resets every minute, as expected.

With Particle.process(), the Photon doesn’t reset which is perplexing since that should not reset the watchdog.

I use the Watchdog in some Electron testing I have been doing and I planned on adding it to some code I’m using to test the RFM95 radio modules.

What exactly have you figured out here?

I’m a little confused and just want to know which conditions cause the watchdog to not work.

Is this only with the new 6.2 firmware?

I don’t want to rely on something that is not working when I just expect it to do its job.

The docs state this tho'

But it might be that the forceful nature of SYSTEM_MODE(AUTOMATIC) might play a role here.
Try with SYSTEM_MODE(SEMI_AUTOMATIC)

BTW, I agree that the implicit ApplicationWatchdog.checkin() in Particle.process() limits the use of the watchdog as you have no control in your code whether you want to spin a loop keeping the cloud connected but still bail out if the loop takes too long.
One proposal to overcome this would be a void CloudClass::process(bool wdCheckin = true) to get control whether or not you want to reset the watchdog (@rickkas7?)
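
To make the proposal concrete, here is a rough sketch of how an opt-out flag could behave. It is plain C++ with mock classes, not Particle system firmware: MockWatchdog, MockCloud and watchdogExpiresWhileSpinning() are hypothetical names invented for this illustration, and time is passed in explicitly so the behaviour can be checked without hardware.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the application watchdog: it only records the
// time of the most recent check-in and reports whether the timeout elapsed.
class MockWatchdog {
public:
    explicit MockWatchdog(uint32_t timeoutMs) : timeoutMs_(timeoutMs) {}

    void checkin(uint32_t nowMs) { lastCheckinMs_ = nowMs; }

    // True once more than timeoutMs_ has passed since the last check-in.
    bool expired(uint32_t nowMs) const {
        return (nowMs - lastCheckinMs_) > timeoutMs_;
    }

private:
    uint32_t timeoutMs_;
    uint32_t lastCheckinMs_ = 0;
};

// Hypothetical stand-in for the cloud object, modelling the proposed
// process(bool wdCheckin = true) signature.
class MockCloud {
public:
    explicit MockCloud(MockWatchdog& wd) : wd_(wd) {}

    void process(uint32_t nowMs, bool wdCheckin = true) {
        // Cloud housekeeping would happen here in real firmware.
        if (wdCheckin) {
            wd_.checkin(nowMs);  // implicit check-in, now optional
        }
    }

private:
    MockWatchdog& wd_;
};

// Simulates a blocking loop that calls process() every 100 ms for durationMs
// and reports whether the watchdog would have expired along the way.
bool watchdogExpiresWhileSpinning(uint32_t timeoutMs, uint32_t durationMs,
                                  bool wdCheckin) {
    MockWatchdog wd(timeoutMs);
    MockCloud cloud(wd);
    for (uint32_t now = 0; now <= durationMs; now += 100) {
        cloud.process(now, wdCheckin);
        if (wd.expired(now)) {
            return true;
        }
    }
    return false;
}
```

With the default argument, existing code keeps its current behaviour (the watchdog never fires while the loop spins). Passing false keeps the cloud serviced but lets the watchdog expire once the timeout has passed.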

@ScruffR, I’d missed that, partially because I didn’t expect it! An APPLICATION watchdog should only be kept alive by the application, not the system firmware. Calling Particle.process() should not, in any way, update the watchdog IMO. The checkin() call should unequivocally and explicitly be made in the user app. Should we post this as an issue?

@peekay123 and @ScruffR,

I’m a bit confused on the Particle.process() bit. I never call Particle.process() as the equivalent functionality is performed already. Here’s what the doc says:

“Particle.process() is called automatically after every loop() and during delays. Typically you will not need to call Particle.process() unless you block in some other way and need to maintain the connection to the Cloud, or you change the system mode.”

So in order to test the watchdog, I should only have to block on something within the loop() such as with while (1) { } and the watchdog should kick in.

Btw, I’m using SYSTEM_MODE(SEMI_AUTOMATIC) and SYSTEM_THREAD(ENABLED) and running 6.2.

Assuming we can’t get this to work, any advice / suggestions on using a hardware watchdog a la SPARKINTERVALTIMER? In lieu of or in addition to?

@Tahl, in my test where I did NOT call Particle.process() in loop(), the watchdog did reset the processor. Oddly on reboot, it took 2 minutes but subsequently, it rebooted every minute.

Yup, I’d also think the current behaviour is rather counterproductive.
I could live with the implicit checkin when dropping out of loop(), as this inevitably means the application does not block, but Particle.process() should definitely not check in, or should at least allow opting out to avoid breaking existing code.

I got it working. I don’t call Particle.process() of course.

Suggest using hardware watchdog too (SPARKINTERVALTIMER)? Or is that overkill?

@Tahl, any timer based on FreeRTOS or an STM32 hardware timer is subject to hanging if code goes bad. A true external watchdog chip is the most reliable approach. However, the ApplicationWatchdog will work fine for 95% or more of cases.

@peekay123 Exactly, the AppWD() function and performance are not clear, and it should not be kept alive by anything besides one’s own code. Hence I rolled my own watchdog routines.
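
The routines themselves are not shown in this thread, so what follows is only a generic sketch of the idea, not the code referred to above. It is a countdown watchdog that nothing but the application can feed. TickWatchdog, fakeReset and g_resets are hypothetical names. The sketch assumes tick() would be driven by some periodic timer callback, and on real hardware the counter would need to be volatile or atomic if tick() runs in an interrupt.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical countdown watchdog: tick() is meant to be called from a
// periodic timer, kick() only from application code.
class TickWatchdog {
public:
    using Handler = void (*)();

    TickWatchdog(uint32_t timeoutTicks, Handler onTimeout)
        : reload_(timeoutTicks), remaining_(timeoutTicks),
          onTimeout_(onTimeout) {}

    // Explicit check-in: reloads the countdown.
    void kick() { remaining_ = reload_; }

    // One timer period has elapsed. Fires the handler exactly once when the
    // countdown reaches zero without an intervening kick().
    void tick() {
        if (fired_) {
            return;
        }
        if (remaining_ > 0) {
            --remaining_;
        }
        if (remaining_ == 0) {
            fired_ = true;
            if (onTimeout_ != nullptr) {
                onTimeout_();
            }
        }
    }

    bool fired() const { return fired_; }

private:
    uint32_t reload_;
    uint32_t remaining_;
    Handler onTimeout_;
    bool fired_ = false;
};

// Stand-in for a real reset call so the sketch can run on a desktop.
static int g_resets = 0;
void fakeReset() { ++g_resets; }
```

Because no system call ever touches kick(), a loop that keeps the cloud alive but never returns would still let the countdown run out.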

@joost, can you share?

As said above also, .checkin() is implicitly called in Particle.process(). This should:
a) not happen
b) be documented

There is a lot of little stuff like this throughout the Particle platform that causes much pain and gnashing of teeth. I ran into the WD issues not too long ago, since I have problems with the WiFi (as do others). Resetting the unit is needed under certain circumstances; one is if one’s code has run amok. I banged my head on the WD, not understanding why it never got triggered when I created a problem scenario…

frustrating stuff

It is as stated above.

I’d be very interested in looking at your implementations, @joost! I’m currently running an application WD, but as I’m experiencing complete bricks (complete freeze of the system, status LED goes solid green/cyan/off), the WD isn’t able to catch this. I’m thinking a hardware WD (based on the independent STM32 watchdog, I guess?) should catch a complete brick like that, right?

I’m looking into using @peekay123’s SparkIntervalTimer for this purpose. As far as I understand it, if it is based on the STM32 independent WD, it should be able to catch a complete brick of the system?

@ftideman, SparkIntervalTimer uses hardware timer interrupts, not the STM32 watchdog, which I believe is disabled due to FreeRTOS.

Ah, so there’s no way of exposing / getting access to the STM32’s watchdog?

I’m still not sure how a WD implementation based on hardware timers and the application watchdog differ. Will the former be able to run even if you get a complete brick (RTOS crash?)?