I'm hoping somebody can comment on or provide insight into the possible/likely outcomes if a device were to be reset while performing a Device OS OTA update.
I have some Boron devices that are currently in the field and I would like to upgrade them to 4.2.0 (mostly for this bugfix).
The (potentially) spicy bit: the devices have hardware watchdogs which will toggle the RESET line if loop() doesn't complete at least once every 30s, AND the devices are expensive to access (i.e. cross-country flights, lodging, engineer's time, etc.). Thus, bricking isn't a good option.
My application is compiled with SYSTEM_THREAD(ENABLED), but my understanding is that some/all of the Device OS update occurs in Safe Mode.
My assumption is that Safe Mode doesn't run my loop(), but I haven't found much documentation on Safe Mode to confirm this. Is this an accurate assumption?
So, if my 30s countdown timer starts ticking as soon as loop() is no longer being called, what's the probability of the device becoming bricked if RESET is toggled? (A rough, handwavy answer is fine; I understand this could be difficult to put a figure on.) I hope, for example, that system and user updates have a "double-buffer-like" arrangement where the critical operation(s) are actually quite small and fast, but I haven't looked under the hood to see how it's done. I would also hazard a guess that each device doesn't have two full sets of Device OS + user application, and that the bootloader isn't smart enough to fall back to a known good combination after failing N times if one set were to be corrupted... but gosh, would I love to be wrong!
Thanks.