Electron (v6.2): flaky OTA FW Update

Right now the OTA FW update is very unpredictable. Sometimes it works, sometimes it doesn’t.

NOTE: running version 6.2.

I have read in many posts that it has to do with the ‘handshake’ with the cloud SW, which is not very reliable.

Right now, the only way we can make sure the latest FW is pushed to a remote device is by physically pushing the RESET button, and even then it sometimes takes several attempts to get it done.

Is there a better and more predictable way to push this FW over the air instead of waiting for some sort of unpredictable handshake event to occur?

@philstrick is the device accessible?

yes of course, or how would I expect to get an OTA FW update…

Oops. I meant physically accessible.

If that’s the case, can you place it in safe mode and perform an OTA? If it works well then the user firmware might be causing the flaky issue

oh ok :slight_smile:

Yes, thankfully some of my system integration and test units are physically accessible in our lab. When put in safe mode, the FW gets updated, and it also gets updated when running our own FW. That’s not the issue we need to figure out.

The problem is that it is not reliable; sometimes it works, sometimes it does not. Some units get updated, some don’t. That’s the real issue I’d like to resolve here.

In short, my question is: is there a procedure to remotely force an update without having to wait for a (random) handshake and never knowing when the update will occur?

That’s what I’d like to understand.

My point is to test whether the “some times it works, some times it doesn’t” issue is due to firmware by having it run in Safe mode or using the tinker app and flashing a few times to see the behavior.

We are off track. I’m not talking about the process to manually flash the FW. That works just fine and I can force an OTA FW upgrade by manually pushing the reset button of the units and waiting for the new FW to be flashed.

What doesn’t work is the ‘automated’ procedure that is controlled by particle.io.

There needs to be a better procedure that would allow the operator to upgrade the units to the new version when we choose to do so, instead of relying on some sort of third-party algorithm which appears to be unpredictable.
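To make the ask concrete, here is a minimal sketch of an operator-driven push that retries until the unit confirms the new version. It is only an illustration under stated assumptions: `trigger_flash` and `read_version` are hypothetical callables the operator would have to supply (nothing here is a particle.io API), and the attempt count and wait time are arbitrary.

```python
import time
from typing import Callable, Optional


def push_until_confirmed(
    trigger_flash: Callable[[], None],          # hypothetical: asks for an OTA flash
    read_version: Callable[[], Optional[int]],  # hypothetical: version the unit reports
    target_version: int,
    max_attempts: int = 5,                      # arbitrary illustration value
    wait_seconds: float = 60.0,                 # arbitrary illustration value
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry a flash until the unit reports target_version.

    Returns the number of attempts used; raises RuntimeError if the unit
    never confirms the new version within max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        trigger_flash()
        # Give the unit time to download, reboot, and report back.
        sleep(wait_seconds)
        if read_version() == target_version:
            return attempt
    raise RuntimeError(
        f"version {target_version} not confirmed after {max_attempts} attempts"
    )
```

The point of the design is that the operator decides when the push starts and gets an explicit success or failure, rather than waiting on an event of unknown timing.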

I might have figured out a way to force a FW rollout manually. A bit hairy but that’s all I could find at my disposal.

The solution is to go into the ‘edit’ mode and force a ‘locked’ state with the ‘flash now’ option checked.

Once done, the device can be unlocked and kept on the new FW version. Life is good! Well, it is supposed to be, but as pointed out earlier, that process is not reliable: we can get into a situation where the device is no longer communicating with the cloud even though everything seems normal and the device is breathing light blue.

We had to physically go out there and push the RESET button to recover that unit. It’s really not good to potentially lose communication with an endpoint in production mode.

That’s an 80% success ratio, which is obviously not acceptable for production.
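Since a unit can go silent after a flash while still looking healthy, one thing that at least shortens the time to notice is a host-side check on when each unit was last heard from. A rough sketch, assuming the operator already records a last-heard timestamp per unit from their own telemetry; the unit names, the `last_heard` mapping and the silence threshold are all hypothetical:

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def find_silent_devices(
    last_heard: Dict[str, datetime],   # hypothetical: unit name -> last time it reported
    max_silence: timedelta,            # how long a unit may stay quiet before it is flagged
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the names of units that have been silent longer than max_silence."""
    if now is None:
        now = datetime.now(timezone.utc)
    return sorted(
        name for name, seen in last_heard.items() if now - seen > max_silence
    )
```

This does not recover a stuck unit; it only flags it so that someone knows a RESET visit is needed.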