I am guessing that if you have hundreds of devices you have a product(s) and can use the console to manage firmware OTA updates?
@armor Yes, I do use that. In this case I am testing pre-production firmware that I have not yet formally released to my product, but the flashing process would otherwise be identical to changing firmware versions via the product. Unfortunately this isn't the kind of issue that can be fixed by monitoring properly: it will either succeed (99.9% of the time) or fail catastrophically (in this particular scenario).
The keepAlive is also an issue. However, in theory during a safe mode OTA flash it should all happen quickly, so even a 30-second keepAlive shouldn’t be a problem.
Also, in theory, because there is a constant flow of data back and forth during OTA, the keepAlive isn't required to keep the UDP connection alive; the normal data flow is already fulfilling that function, I believe.
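To make that reasoning concrete, here is a minimal sketch of the idea, not Device OS code: a keep-alive ping only matters once the link has been idle for the whole keep-alive interval. The function name `keepAliveDue` and its parameters are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: returns true when a keep-alive ping is due, i.e. when
// no packet has been sent or received for at least `keepAliveMs`.
// Unsigned subtraction keeps the comparison correct across a millisecond
// counter rollover.
bool keepAliveDue(uint32_t nowMs, uint32_t lastTrafficMs, uint32_t keepAliveMs) {
    return (nowMs - lastTrafficMs) >= keepAliveMs;
}
```

During an OTA transfer `lastTrafficMs` would be refreshed by every chunk, so with a 30-second interval this should never return true unless the transfer stalls.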
The problem with 3rd-party SIM cards is that if this process gets interrupted such that the modem is completely powered off, then the APN settings will be lost.
Gotcha! This at least makes sense to me. I use a hardware watchdog that actually triggers a power reset of my entire hardware (I have some other devices powered via custom PoE by my board). When I watched this happen, I think for some reason the watchdog may have been triggered during the update.
I may be able to find a safe way to kick my watchdog (it's a 10-minute timer, so normally it shouldn't be an issue for OTA) upon successful completion of an OTA update, but before reset. This should give me a statistically reasonable chance of avoiding impact from the OTA update.
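A rough sketch of that watchdog idea, written as plain C++ so it can be tested off-device. Everything here is hypothetical: `OtaEvent` stands in for whatever update notifications the platform provides, and `kick` is whatever actually resets the external watchdog timer. The extra kick at the start of the update is my own assumption, not something described above.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-ins for the OTA lifecycle notifications.
enum class OtaEvent { Begin, Progress, Complete, Failed };

class WatchdogOtaGuard {
public:
    // `kick` resets the external watchdog timer (for example by toggling a
    // GPIO). It is injected so the logic can be exercised without hardware.
    explicit WatchdogOtaGuard(std::function<void()> kick) : kick_(std::move(kick)) {}

    void onOtaEvent(OtaEvent ev) {
        switch (ev) {
        case OtaEvent::Begin:
            // Assumption: start the transfer with a full watchdog window.
            kick_();
            break;
        case OtaEvent::Complete:
            // Kick after a successful download, before the reset that
            // applies the update.
            kick_();
            break;
        case OtaEvent::Progress:
        case OtaEvent::Failed:
            // Deliberately no kick: a stalled or failed transfer should
            // still let the watchdog recover the device eventually.
            break;
        }
    }

private:
    std::function<void()> kick_;
};
```

With a 10-minute timer, kicking on completion means the power reset cannot land in the window right after the download, which is the risky part.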
I think that properly addresses the meat of my question (the mitigation is to avoid power resets during System Firmware OTA like the devil), so I'll mark that as the solution, at least for now. Thanks as always, @rickkas7!
I’ve dealt with this issue in the past, and I’ve used a custom compiled device-os binary to get around this.
Thanks @hwestbrook! Appreciate hearing your experience. This was something I also considered doing. I would be comfortable doing that in theory, but I feel like it may be overkill if I can mostly mitigate the issue by better managing my watchdog. I am generally very conservative when it comes to Device OS updates, so I could probably stomach it, but it sounds difficult to scale.
I use a lot of data per month (250-750 MB), so what I've had communicated to me so far is that there will not be a possibility for me to have a similar deal in the foreseeable future.
One more thought on this: I’d be interested to know if Particle would accept pull requests for IMSI-based APN settings? This would be one step in the direction of making 3rd-party SIMs easier and more reliable.
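For anyone wondering what IMSI-based APN selection could look like, here is a minimal sketch, assuming a simple table keyed by the leading IMSI digits (MCC+MNC) with longest-prefix matching. This is not how Device OS does it; `ApnRule`, `apnForImsi`, and the prefixes and APN strings in the usage note are all placeholders.

```cpp
#include <string>
#include <vector>

// Hypothetical rule: IMSIs starting with `imsiPrefix` use `apn`.
struct ApnRule {
    std::string imsiPrefix;  // leading digits of the IMSI (MCC, or MCC+MNC)
    std::string apn;
};

// Returns the APN of the longest matching prefix, or `fallback` when no
// rule matches the IMSI read from the SIM.
std::string apnForImsi(const std::string& imsi,
                       const std::vector<ApnRule>& rules,
                       const std::string& fallback) {
    const ApnRule* best = nullptr;
    for (const ApnRule& rule : rules) {
        const bool matches =
            !rule.imsiPrefix.empty() &&
            imsi.compare(0, rule.imsiPrefix.size(), rule.imsiPrefix) == 0;
        if (matches &&
            (best == nullptr || rule.imsiPrefix.size() > best->imsiPrefix.size())) {
            best = &rule;
        }
    }
    return best != nullptr ? best->apn : fallback;
}
```

With rules `{"00101", "apn.example-a"}` and `{"001", "apn.example-b"}`, an IMSI beginning `00101` picks the first APN, any other `001` IMSI picks the second, and everything else falls back to the default. Because the lookup depends only on the SIM, it would survive a modem power loss without user firmware re-applying anything.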
Instead of a PR, wouldn't it be easier to provide a way to manage APN settings like Photons do with WiFi credentials? Have some permanent defaults in System Firmware Program Flash, but also have a special spot in flash where an APN setting can be stored that can be loaded in a way where it is simply added to the available options in the System Firmware, if such a setting is set and readable from flash. The normal APN setting in user firmware would then still work as normal.
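A sketch of how that flash-slot idea might work, under some assumptions of mine: the slot layout (`StoredApn`), the magic marker, and the priority order (user firmware APN first, then the stored slot, then the built-in default) are all hypothetical and not taken from Device OS.

```cpp
#include <cstdint>
#include <string>

// Hypothetical layout of the reserved flash slot.
struct StoredApn {
    uint32_t magic;   // marks the slot as deliberately written
    uint8_t length;   // number of valid bytes in `apn`
    char apn[64];
};

constexpr uint32_t kApnMagic = 0x41504E31;  // arbitrary marker, "APN1" in ASCII

// Erased flash reads back as 0xFF bytes, so an unwritten slot fails the
// magic check and is ignored.
bool storedApnValid(const StoredApn& slot) {
    if (slot.magic != kApnMagic) return false;
    if (slot.length == 0 || slot.length > sizeof(slot.apn)) return false;
    return true;
}

// Assumed priority: an APN set by user firmware wins, then the flash slot,
// then the default compiled into system firmware.
std::string selectApn(const std::string& userFirmwareApn,
                      const StoredApn& slot,
                      const std::string& builtInDefault) {
    if (!userFirmwareApn.empty()) return userFirmwareApn;
    if (storedApnValid(slot)) return std::string(slot.apn, slot.length);
    return builtInDefault;
}
```

The point of the validity check is that a blank or corrupted slot silently falls through to the defaults, so the stored setting can only add an option and never break a device that didn't have one.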