Watchdog for Spark firmware

I would like to ensure the STM CPU code keeps running without crashing (halting), since I am using it to turn water on and off based on a timer schedule.

Is there a CPU watchdog available to ensure the “loop()” function is being called on a regular basis? In case of a timeout, the Spark Core could reset.

For instance, if I put a “dead loop call” in loop(), will this be able to trigger a Reset()?

There is a watchdog turned on by default on the Core but not on the Photon, if I remember correctly.

Ping @mdma

That’s correct, kenneth. However, on both platforms, even with the watchdog timer enabled, it is reset every 1 ms via the SysTick interrupt, so a stalled loop() alone will not trigger it. I feel this is something the user firmware should be able to control, and have added this to our backlog. cc: @satishgn

@mdma is this still on the backlog? Or was there a watchdog feature exposed?

@Dilbert did you ever devise a solution?

I too am running a system controlling water so some kind of watchdog is quite important for me.

You could implement the watchdog yourself: have each call to loop() save the current time to a global variable. Set up a timer that checks this global variable periodically and, if it sees that loop() isn’t running, performs a system reset.
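
A minimal sketch of the approach described above, written as portable C++ so the logic can be checked off-device. The names `last_loop_ms`, `LOOP_TIMEOUT_MS`, `note_loop_alive` and `loop_has_stalled` are hypothetical, and the 5-second timeout is an assumption. On a device the timestamp would presumably come from millis(), and a `true` result from the check would lead to System.reset().

```cpp
#include <cstdint>

// Hypothetical names throughout; nothing here is a Particle API.

// Written by loop(), read by the periodic check. volatile because the
// check may run from a timer or interrupt context.
volatile uint32_t last_loop_ms = 0;

// Assumed threshold: how long loop() may stay silent before giving up.
const uint32_t LOOP_TIMEOUT_MS = 5000;

// Call at the top of every loop() pass with the current time in ms.
void note_loop_alive(uint32_t now_ms) {
    last_loop_ms = now_ms;
}

// Call periodically from a timer. Returns true when loop() has been
// silent for longer than LOOP_TIMEOUT_MS. Unsigned subtraction keeps
// the comparison correct when the millisecond counter wraps around.
bool loop_has_stalled(uint32_t now_ms) {
    return static_cast<uint32_t>(now_ms - last_loop_ms) > LOOP_TIMEOUT_MS;
}
```

The check has to run from something that keeps going when loop() hangs, such as a timer. Calling it from loop() itself would never report a stall.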

We avoided the hardware watchdog because it caused more problems than it solved.

I was only able to implement an application-level watchdog (using Matthew’s method), but there is a huge gap in this solution: it assumes that the lower-level SW does not have problems.

But this is not the case. I discovered a crash problem where multiple Cloud commands from an HTML program could trigger a lockup in the core SW. I reported this last year; they acknowledged it but did not have a solution for me. This was on the Spark Core platform. I have not yet tried the Photon with the same SW.

Just last week, when I emptied my sprinkler system (winter is coming), I had the crash, and the water valve stayed open without watchdog protection. Since I did not put this project into production, I did not go further with a better solution for this issue.

@Dilbert, you could use SparkIntervalTimer to create a hardware timer interrupt that would call System.reset() if the timer is not reset regularly. The timer interrupt will fire even if the rest of the system is “locked”. I haven’t tried this but if you do, I would love to hear how it went!
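
A rough sketch of the countdown logic such a timer interrupt could run, in portable C++ so it can be checked off-device. The names `WATCHDOG_TICKS`, `ticks_left`, `reset_requested`, `kick_watchdog` and `watchdog_tick_isr` are hypothetical, and the tick count is an assumption. Registering the handler with SparkIntervalTimer is not shown because the thread does not give that call. On a device the expiry branch would call System.reset() instead of setting a flag.

```cpp
#include <cstdint>

// Hypothetical names and values for illustration only.

// Assumed budget: number of timer ticks loop() may miss before a reset.
const uint8_t WATCHDOG_TICKS = 10;

// Shared between loop() and the timer interrupt, hence volatile.
volatile uint8_t ticks_left = WATCHDOG_TICKS;
volatile bool reset_requested = false;

// Called from loop() to show the application is still alive.
void kick_watchdog() {
    ticks_left = WATCHDOG_TICKS;
}

// Body of the periodic timer interrupt. Kept tiny: no printing and no
// blocking calls. Counts down and requests a reset when the budget is gone.
void watchdog_tick_isr() {
    if (ticks_left > 0) {
        ticks_left = ticks_left - 1;
    }
    if (ticks_left == 0) {
        reset_requested = true;  // on a device: System.reset()
    }
}
```

With a 1-second tick, this budget would give loop() about 10 seconds to call `kick_watchdog()` before the reset path is taken.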

Is “SparkIntervalTimer” a standard Spark API? I could not find it in the regular docs. Yes, a system timer interrupt will be able to do the job. I just was not aware it was available to be used :frowning:

More info on where to find the API would be appreciated :smile:

@Dilbert, it’s a library available in the web IDE :wink:

Thanks, any reason why this basic feature is not embedded as a standard library??
Especially since it uses the HW interrupt callback functions.
Even though some would say these functions are dangerous to use…

Huh!! Is the "some" like the famous "they"? The library was created to allow users to conveniently access timer resources for specific tasks (repetitive interrupts). When I get a chance, I do intend to create a generalized library that could be embedded into the system firmware eventually. Any interrupt service routine, regardless of its origin, needs careful planning and design so as not to unbalance the system it runs within (eg. no Serial.print()). They are not "dangerous". :wink:

Maybe you misunderstood my statement a bit. I do have experience with interrupt callbacks in my past production projects, so I am fully aware of the consequences of interrupt use. Thanks for the reminder.

I am just a bit confused about the support model for the public libraries. My understanding is that a public library is a “use at your own risk” approach and is not fully supported by Particle FW team members. It is up to the individual author to provide support. Maybe I am wrong; please correct me with info :smile:

Hi @Dilbert

Maybe the misunderstanding is that @peekay123 does not work for Particle–he is a volunteer who contributes in a variety of ways including libraries, doc, firmware in the product, etc.

So he is supporting his public library as you said. Someday it could happen that Particle decides to incorporate a library like his into the built-in firmware but that is not certain to happen.

:smiley:

The concept of libraries is here to stay - we don’t want to bloat out the system firmware with every useful feature, but instead provide the foundation so that library writers and app developers can build the exciting things they want to.

The SparkIntervalTimer library is heavily used and tested, and one of the most respected libraries out there.

The library gets a :+1: from me. I’d give it the esteemed “mdma badge of approval” just as soon as it gets a suite of automated tests… :smile:

I used the suggested method: the Interval Timer library as a watchdog that performs System.reset().
However, I noticed that it gets triggered while I am performing a code download.

Is there a way to disable the timer when a download starts?
Or a flag I could check to know the state, so that I could disable the watchdog?

How does the ApplicationWatchdog (https://github.com/spark/firmware/pull/860) in Firmware 0.5
compare to the WatchDog library (which uses the SparkIntervalTimer library)?
http://stasheddragon.com/2015/watchdog-library-for-particle-photon/

Is it something that is embedded now, or is it a completely different feature? The library uses a window watchdog and an independent watchdog.

Also, I’m curious what exactly the stack size parameter does in the ApplicationWatchdog:
ApplicationWatchdog wd(timeout_milli_seconds, timeout_function_to_call, stack_size=512);

I have the same doubt as kasper.

The Application Watchdog that’s built in uses a FreeRTOS thread to run independently from your application code (and the system code, if system threading is enabled).

https://docs.particle.io/reference/firmware/photon/#application-watchdog

The main advantage of this is that the timeout function is called from a thread, not an interrupt service routine. This allows a slightly greater choice of what you can call before, presumably, doing a System.reset(). It’s the same set of operations you can perform from a software timer (software timers also run in a thread, but a different one). Also, it’s very efficient, not using any hardware timer resources and using almost no CPU when not triggered.

The stack size parameter is used because it’s a required parameter for any thread. You want it to be small to avoid using too much RAM, but it has to be big enough for the code you intend to run in your callback function, which is why it’s configurable.

Some questions still pop up.

The documentation says:

Enabling the hardware watchdog in combination with this is recommended, so that the system resets in the event that interrupts are not firing.

How do I enable the hardware watchdog? Is that something implemented in the WatchDog library (and which one is it, WWDG or IWDG)? From their forum post I understand that both of them are hardware watchdogs:

So is my conclusion right that it's best to implement the two library watchdogs and the Application Watchdog, because they each "watch" something different? Or would that be overkill?