Not working after a crash reboot

I am working with an external device and occasionally something will go wrong and the device will crash and reboot.

The reboot looks good and the cyan light begins to “breathe” again but the device isn’t working.
The program doesn’t run and the serial connection doesn’t work.

Does anyone know if this is normal behaviour and if the device contains a watchdog timer which could be used to overcome this problem?

Thanks

Hi @ArthurGuy

To avoid having a big problem and a “bricked” core, the over-the-air update build software stores both a backup firmware image and a factory reset firmware image, so you can reload and not have problems. Sometimes when you have a panic (red light flashing) the core decides to run the backup firmware instead of the most recently loaded firmware. If you hit the reset button it should try to load your most recently flashed program.

I am sure that someone on the Spark team can tell us exactly how it decides, but the behaviour you described is normal in a crash situation.

Sometimes that happens to me too…
But my Spark Cores work standalone… they are supposed to work without human intervention… so this situation where the Spark decides to restart with the backup firmware does not work for me.

Hi @omarojo

If you build locally and use dfu-util to load your firmware, you can have complete control over which firmware versions the core uses for all three possible slots. But from the web IDE, the Spark code plays it safe and gives you something they know will work in the backup and factory reset slots. If you have no good firmware, the over-the-air update cannot work, so they work hard to avoid that situation.

I presume that if I want to carry on using the web IDE and yet build something reliable that doesn’t require human intervention, I would need to use an external watchdog timer?
Something that watches a normally toggling output and then toggles the reset line if the toggling stops?

Hi @ArthurGuy

There is a watch-dog timer that is being used but the core firmware must be deciding to not run your main firmware and run the backup instead. I have certainly seen a panic on top of panic that can cause this in low memory situations.

Obviously the best thing to do is to have as bug-free a program as possible(!), but there can still be things outside our control. Doing a lot of dynamic memory allocation like calling new or malloc or creating String objects and destroying them quickly could all lead to possible problems.

This thread has some info on external watch-dog timers:

Plus there are things like this:

Personally I would use, say, an NPN transistor to discharge the 220uF cap instead of waiting in software.
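
For what it is worth, the logic of an external watchdog (watch a toggling output, pull reset when the toggling stops) can be sketched in a few lines. This is only a model in plain C++ that runs on a desktop, not code for any particular watchdog chip or circuit; the class name `HeartbeatWatchdog` and the timeout value are assumptions for illustration.

```cpp
#include <cstdint>

// Hypothetical model of an external watchdog: it samples a heartbeat
// line and asks for a reset when the line stops changing for too long.
class HeartbeatWatchdog {
public:
    explicit HeartbeatWatchdog(uint32_t timeoutMs, uint32_t nowMs = 0)
        : timeoutMs_(timeoutMs), lastEdgeMs_(nowMs), lastLevel_(false) {}

    // Call with each sample of the heartbeat line. Only a change of
    // level (an edge) counts as a sign of life, so a pin stuck high
    // or stuck low will eventually time out.
    void sample(bool level, uint32_t nowMs) {
        if (level != lastLevel_) {
            lastLevel_ = level;
            lastEdgeMs_ = nowMs;
        }
    }

    // True when no edge has been seen for longer than the timeout.
    // Unsigned subtraction keeps this correct across a counter wrap.
    bool shouldReset(uint32_t nowMs) const {
        return (nowMs - lastEdgeMs_) > timeoutMs_;
    }

private:
    uint32_t timeoutMs_;
    uint32_t lastEdgeMs_;
    bool lastLevel_;
};
```

On the core side, your main loop would toggle an output pin on every pass; if the program hangs or the wrong firmware is running, the pin stops toggling and the watchdog pulls the reset line.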

Any chance you can post your code so we can follow along?