OTA Update Fails on Photon when Hardware is Connected

I have a hardware device that runs on the Photon, and OTA updates fail every time the Photon is plugged into that hardware. When I remove the Photon from the board, it updates fine (n = 3). This is an issue because (obviously) I would like to update the firmware from afar. How can I debug this? Why would this happen?

These devices are running in India, and I am based in Boston, so physically making changes is kind of an issue…

No hash or anything is ever generated, as can be seen in the logs:

The only anomaly I have noticed is that in the logs (spark/flash/status), the data reports the correct product_id, but the incorrect (new) version_id. Is that correct? Is it supposed to report the old product_id (currently on the photon)?

You’d need to be more specific than “a hardware device” for us to have even a little chance of guessing a cause.

If you have a Photon controlled Tesla Coil for instance, I’d not be surprised that you have no WiFi :wink:

Also showing some of your code (maybe via PM to keep it private) might help.
If you detect the presence of the hardware and run some code only when present, that would be the place to look at.
Keeping the Photon occupied for several seconds does interfere with OTA but not enough to actually kill the cloud connection.

Fair. Though wi-fi is not the problem, as it works perfectly otherwise…

It’s connected to several analog sensors as well as a couple of mechanical relays and a uSD card. It’s a pretty simple layout that works great, with OTA updates while hooked up being the only issue.

With you mentioning products, I can imagine it might be locked in to a certain version on the console. Could that be the case?

@Moors7 I don’t think so? I am attempting (initially) to do the updates through the console by publishing a new firmware release and then switching the Firmware Version on an individual device and hitting ‘Lock and Flash Now’. Also of note is that a popup says that it was successful, but then if you look at the logs, you see it continuously fails.

To better answer @ScruffR 's question, the device always expects the hardware to be plugged in (as it always is) and is running continuously (They’re environmental monitors).

What I meant with this

is that I suspect your code to delay the background task too much to perform a successful OTA flash (or even interfere with it).
But only in its intended environment (plugged in) while your code will not perform the delaying/interfering tasks when not plugged in.

Hmm. okay.

So does the OTA update on the Photon/Electron not take precedence when in progress? I assumed it would complete the firmware update before going back to the program? I also have threading enabled which might cause issues?

As of now, it is pretty much continuously reading from sensors (1 Hz) via I2C and SPI and then dumping to the server every 30 s or so. Is there a best practice for performing OTA updates? Should I just write a System Event handler to shut off the sensors while performing a firmware upgrade?

That would be the best.

OTA does not immediately interrupt your code, especially not with threading enabled. You could even switch WiFi off or send the device to sleep after magenta flashing has started, which would interrupt the ongoing update.

One other thing you have to be aware of with threading is that your original code keeps running for a few seconds after the magenta flashing has gone back to cyan breathing and only then resets to activate your new code. If you interfere with the process in between, the OTA often doesn’t stick.
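
To make the handler idea concrete, here is a minimal sketch of a gate that pauses sensor work while an update is in flight. `OtaStatus` and `OtaGate` are hypothetical names made up for illustration, not Device OS types; the enum only stands in for the begin/complete/failed parameters that the `firmware_update` system event delivers. It is written so it compiles off-device; how it would be wired up on the device is only outlined in the closing comment and should be checked against the Device OS reference.

```cpp
#include <atomic>

// Hypothetical stand-in for the parameter passed with the firmware_update
// system event (begin / complete / failed). Not a Device OS type.
enum class OtaStatus { Begin, Complete, Failed };

// Hypothetical helper: remembers whether an OTA transfer is in flight so the
// application loop can skip I2C/SPI reads and SD-card writes in the meantime.
class OtaGate {
public:
    // Call this from the firmware-update event handler.
    void onFirmwareUpdate(OtaStatus status) {
        switch (status) {
        case OtaStatus::Begin:
            updating_.store(true);   // transfer started: pause application work
            break;
        case OtaStatus::Failed:
            updating_.store(false);  // transfer aborted: old firmware carries on
            break;
        case OtaStatus::Complete:
            // Stay paused. With threading enabled the old code keeps running
            // for a few seconds before the reset, and it should not interfere.
            break;
        }
    }

    // True when the application loop may touch sensors, relays and storage.
    bool sensorsAllowed() const { return !updating_.load(); }

private:
    std::atomic<bool> updating_{false};
};

// On a device this would be hooked up roughly as follows (assumption, untested):
//   void otaHandler(system_event_t event, int param) { /* map param to OtaStatus */ }
//   System.on(firmware_update, otaHandler);
// and loop() would return early whenever gate.sensorsAllowed() is false.
```

The gate deliberately stays closed after a completed transfer, since that is the window described above in which interfering can stop the OTA from sticking.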

Do you have any estimation how long it may take for the OTA to “get registered” by the Electron? I have some devices that connect for a minute or two at midnight, and for a few seconds once per hour. I notice that my OTA update has about a 50% success rate at midnight, but almost 0% at any other point during the day. In this case in the log I see the OTA fails only about 1 minute in, even though they’re set to wait for 3 minutes if a pending firmware update is registered.

I’m wondering how long I should let my Electrons remain connected throughout the day for the OTA update to stick. Right now it’s 5 seconds, and they seem to disconnect/sleep before the pending update can trigger the otaHandler.

If the device wakes for that time it doesn’t necessarily mean it’s actually connected for the entirety of that time.

I have no hard facts to back the claim, but I’d say you should stay connected for at least 30 sec after you have established a stable cloud connection for the update to be noticed by the device, and once you get the notice, give it at least five extra minutes - with flaky reception some chunks may need to be resent (even multiple times).
However, once an update is actually pulled in, these five minutes are merely academic, since an update would eventually end with a reset once it’s done anyway - the device shouldn’t stay on any longer than needed.
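
The rule of thumb above can be written down as a small decision function. This is only a sketch: `shouldStayAwake` and its parameters are hypothetical names, the 30-second and five-minute figures are the rough estimates from this post rather than documented limits, and the timestamps are assumed to come from a millisecond counter such as `millis()`.

```cpp
#include <cstdint>

// Rough figures from the post above: rules of thumb, not documented limits.
constexpr uint32_t kNoticeWindowMs   = 30UL * 1000UL;       // time to notice a pending update
constexpr uint32_t kTransferWindowMs = 5UL * 60UL * 1000UL; // allowance for chunk resends

// Hypothetical helper: returns true while the device should stay connected.
//   nowMs          current millisecond counter value
//   connectedAtMs  counter value when a stable cloud connection was established
//   updatePending  whether the device has been told an update is waiting
//   noticedAtMs    counter value when that notice arrived (ignored otherwise)
bool shouldStayAwake(uint32_t nowMs, uint32_t connectedAtMs,
                     bool updatePending, uint32_t noticedAtMs) {
    if (updatePending) {
        // Unsigned subtraction stays correct across counter wrap-around.
        return (nowMs - noticedAtMs) < kTransferWindowMs;
    }
    return (nowMs - connectedAtMs) < kNoticeWindowMs;
}
```

The clock starts at the cloud connection, not at wake-up, because being awake does not mean being connected. A successful update ends in a reset, so the five-minute window only bounds the failure case.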

Shouldn’t it usually remain connected though, since the connection remains valid for 23 minutes (I believe)?

Case 1 (once per hour) - Electron connects. Publishes data. 5-second delay before Cellular.off()
Case 2 (once per day) - Electron connects. Spends up to 1.5 minutes obtaining a GPS-fix. Publishes data, 5-second delay before Cellular.off()

30 sec is a lot since it needs to be a low-power device, so I guess I will depend on the longer connection at midnight for the OTA update to kick in. It has been working so far; I was just wondering if there was anything I could do to get the OTA updates hourly without sacrificing a lot on power consumption.

Are you using the SLEEP_NETWORK_STANDBY flag in your sleep call?

However, when you call Cellular.off() how would the connection be kept open?

Are you intending to roll out firmware updates that frequently, or is it just the convenience of not having to wait for a day to go by before you find out whether or not all your devices had the chance to pull in the update?

I’m using Deep Sleep mode.

Sorry, what I mean is that as long as Particle.connected() returns True, and I don’t do anything in particular to make it disconnect etc., it should keep the connection open?

The hourly check-in is not going to work as I can’t afford keeping the device idle for more than a few seconds. For the daily check-in however, given that it attempts to obtain a GPS fix for a minute (or two), this connection should usually be of satisfactory duration for the OTA to kick in?

EDIT: The GPS fix is done with the Google API so it does need to stay connected regardless for this.

And? The SLEEP_NETWORK_STANDBY also - if not foremost - applies to deep sleep.

Oh no - I most definitely am not using SLEEP_NETWORK_STANDBY as the cellular drains far too much power. I apply control over the cellular throughout my code - I use Semi-Automatic mode, I only enable the cellular when a connection needs to be made, and right after publishing the data the cellular is disabled again.

Anyway your answer was clear, thanks - I will settle for OTA updates only during the daily check-in. 5 seconds is clearly far too little for the OTA to stick judging by your answer.

From when to when do you measure the “a minute (or two)”?
In order to get an aGPS “fix” you only need cell connection (Cellular.ready()) but this doesn’t necessarily mean you have a cloud connection.
However, if you are using webhooks, you can be certain to have a valid cloud connection, but then a webhook request may not really take a minute to actually give you a fix but may turn round considerably quicker.

So you need to ensure the minimum wake time by other means, I guess.

After Particle.connected() returns True, it first determines its cell tower info (which can take a while) and then sends this information to the webhook every 20 seconds… I’m not 100% sure if it actually remains connected to the cloud throughout the entire process though.

Thanks - 5 minutes seems to be about the sweet spot here. 3 minutes seemed too little for the OTA to stick (approx. 50% success rate) but with 5 minutes all devices were updated correctly. File size is 33KB - I’m assuming the time necessary for this scales with file size.
