How to notify the application in case of an incoming OTA update?

Firmware updates to Product devices are still giving me grief. Mostly it works OK, but there are always a few units that take ages to upgrade. I released another update 24 hours ago and 3 of 20 units still have not got their new firmware. These devices wake up every 15 minutes to send data to the cloud. Some of the non-upgraded units are sitting in our office, with good solid wifi and internet, right beside units that completed upgrades almost immediately.

I’m capturing the update process like this:

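// Handler for the firmware_update system event; `mode` carries the update stage.
// (Registration, e.g. System.on(firmware_update, FW), is not shown.)
void FW(system_event_t event, int mode)
{
    switch (mode)
    {
        case firmware_update_begin:
        case firmware_update_progress:
            flag_FW = true;               // update running: main code must not sleep
            break;
        case firmware_update_complete:
            flag_FW = false;
            tm_await_resp = Time.now();
            break;
        case firmware_update_failed:
            flag_FW = false;
            break;
    }
}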

and it mostly seems to work: I use the flag to stop the system from sleeping and to do nothing but call Particle.process() while an update is coming in.

Very occasionally I see "SOS: stack overflow" on my dev unit when I try to flash it - but it's fairly rare.

Could we please have a way to prioritize updates over everything else?

Wondering if more is known about this issue. Like @twospoons and @dheerajdake I also run a product, with 50 devices. My latest firmware version was released 5 days ago, and as of now about 20 of them still have not updated (they connect once a day and stay connected for a fixed 60 seconds).

I can’t comment on direct OTA flashing as my devices come online during the night, but product firmware updating is giving me a lot of issues for now.

From my experience 60 seconds is not enough time unless your firmware is very small. I suggest you capture the firmware update system event, as I did, and extend your “on” time when the firmware is coming in.

The firmware right now is about 40kB, but may continue to grow, so it’s big indeed…

I still don’t fully understand how you embedded your update process into your code, but I can simply extend the waiting period for it from 60 to 180 seconds…It usually happens for devices in specific locations so I’m guessing it’s a connectivity issue and they don’t get prioritized by their local cell tower.

The function FW gets triggered by a system_event_t, then I check to see which event has occurred (there's a list in the documentation somewhere). If the system event is a firmware update begin or progress, I flag that to the rest of my code (boolean flag_FW), and I use that to extend my sleep timeout. So if flag_FW is true I wait until it is false before sleeping (with a longer timeout), otherwise I sleep using my normal criteria (a bunch of other stuff has to happen in my code).
The sleep code is not shown.
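
Since the sleep code isn't shown, here is a minimal sketch of what such a sleep gate might look like. It is not the poster's code: maySleep, normalWorkDone and otaTimeoutMs are made-up names, and the idea of giving up after a timeout is an assumption based on the "longer timeout" mentioned above.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the Particle API: decides whether the
// device may go to sleep right now.
//   otaInProgress  - the flag set from the firmware_update handler (flag_FW)
//   normalWorkDone - stands in for the application's "normal criteria"
//   awakeMs        - time since wake-up, e.g. millis() - wakeStart
//   otaTimeoutMs   - assumed upper bound on how long to wait for an update
bool maySleep(bool otaInProgress, bool normalWorkDone,
              uint32_t awakeMs, uint32_t otaTimeoutMs)
{
    if (otaInProgress) {
        // Stay awake while the update is running, but give up eventually so
        // a stalled transfer cannot keep the device awake forever.
        return awakeMs >= otaTimeoutMs;
    }
    // No update running: fall back to the normal sleep criteria.
    return normalWorkDone;
}
```

In loop() you would call something like maySleep(flag_FW, done, millis() - wakeStart, 180000) and only go to sleep when it returns true, calling Particle.process() otherwise.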

@Vitesze You need to add this to the code to disable automatic updates and manually check for updates.

void setup() {
    System.disableUpdates();
    System.on(firmware_update_pending, otaHandler);
}

void loop() {
}

void otaHandler() {
    // When an update is available, this handler gets called.
    // Do what you need to in here, then re-enable updates so the OTA can proceed.
    System.enableUpdates();
}

Thanks, I will test this out this afternoon :slight_smile:

So firmware_update_pending always returns true if the Particle Cloud wants to attempt to upload firmware to the Electron? Is this usually an instant process (e.g. as soon as it connects to the Cloud), or do I need to add in some time for this as well? I need to make extremely energy-efficient devices, so every second I can put them to Sleep earlier counts.

Do you have any recommendations for code to continue if the firmware-download has been completed?

Edit: Does this also work for Product devices?

The above approach is useful if you have any time-sensitive tasks which have to be completed before you begin your OTA update. I believe firmware_update_pending returns true if there is a firmware update available.

As of Mar 2017, @ScruffR confirmed that the API System.updatesPending() is of no use. It’s 2018 now! If that API has been implemented since, you can use it to check whether there are any new firmware updates for your device. This may take a while depending on your internet connection speed.

When you wake the MCU from sleep to check for firmware updates, cancel the sleep timer. Two things can happen at this point:

  1. Particle.connected() returns true.
  2. Particle.connected() returns false.

If it returns false, schedule your wake-up timer and go back to saving energy.

If it returns true, allow some timeout for the firmware update process (I am not sure how long; you should know better). You then have 2 cases:

  1. Successful update
    The firmware update happens within the timeout and the device reboots.

  2. Unsuccessful update
    The firmware update takes longer than the defined timeout. In this case, you can reboot. The device will keep running your old code.
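
The wake-up decision described above could be sketched roughly like this. WakeAction and nextAction are hypothetical names for illustration only, and the timeout value is left to you. A successful update needs no branch of its own because the device reboots by itself.

```cpp
#include <cstdint>

// Hypothetical names, not part of the Particle API.
enum class WakeAction { SleepAgain, KeepWaiting, Reboot };

// cloudConnected  - result of Particle.connected()
// waitedMs        - how long we have been waiting for the update so far
// updateTimeoutMs - assumed timeout for the firmware update process
WakeAction nextAction(bool cloudConnected, uint32_t waitedMs,
                      uint32_t updateTimeoutMs)
{
    if (!cloudConnected) {
        // No cloud connection: schedule the wake-up timer and save energy.
        return WakeAction::SleepAgain;
    }
    if (waitedMs < updateTimeoutMs) {
        // Still inside the window in which the update may arrive.
        return WakeAction::KeepWaiting;
    }
    // Update took longer than the timeout: reboot into the old code.
    return WakeAction::Reboot;
}
```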

If updates are important, you can use a firmware version in your code and store it in EEPROM. Upon reboot, if the version in EEPROM doesn’t match the version in your application, you can force a check for an update.

I’m not sure about product devices. I have used the Photon and P1. Not sure how it goes with the Electron.
Hope this helps.

Thanks
Dheeraj

Are there any possible implications that can occur in this case? I'm asking because I've been testing some solutions for a bit, and after a couple of OTA-attempts, one of my devices started acting very oddly - it runs the same firmware as 50 other sensors in the field (that run fine), but it remained stuck in the same switch state indefinitely. After I updated the firmware over serial, it ran the firmware like normal.

The only thing I can think of is that it must've started the download, but was cut off by the timeout, and was then somehow left with incomplete firmware to run.

Sorry for the late reply. I did experience this sometimes. The CLI says that the update has started, but after the update the device runs the same firmware. As the firmware update code is from Particle, I’m not sure how to fix this.

An "update started" message doesn't tell anything about its chances to succeed.

The actual "update code" may be from Particle, but your currently running code on the device is more likely the reason that prevents the update from actually taking.

@ScruffR, I have a slightly different but related requirement. I have noticed that the automatic OTA triggered when product firmware is upgraded can get stuck because of the sleep and wake cycle my application code is in. Essentially the device wakes every X seconds and publishes an event to say “waiting for remote command to wake”; it then loops for 15 seconds and, if no “wake” command is received, enters normal sleep again. Of course, if the OTA update starts as soon as the device is connected, it is unlikely to have completed within 15 seconds, and thus gets cut off when the sleep function takes the device offline. Before I read the above thread I had consulted the Photon firmware reference and was going to check if (System.updatesPending()) and then not enter sleep and stay looping. However, I understand that this will not tell me if an OTA is in progress - is this correct?

Looking at the above - I would need to have a flag set by the otaHandler() which I could then use in the application thread to stop the application calling System.sleep()?

There are multiple events in relation to OTA updates to which you can subscribe, but in essence setting a flag and checking its state whenever sensible should help.
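
As a rough illustration of the flag approach, the sketch below tracks whether an update is running based on the stage reported by the event. OtaEvent and OtaTracker are hypothetical stand-ins so the logic can be shown on its own; on a device the stages would be the firmware_update_begin, firmware_update_progress, firmware_update_complete and firmware_update_failed values passed to the handler, and the flag would probably want to be volatile since the handler may run outside the application thread.

```cpp
// Hypothetical stand-ins for the firmware_update_* stage values.
enum class OtaEvent { Begin, Progress, Complete, Failed };

// Hypothetical tracker: the application checks inProgress before sleeping.
struct OtaTracker {
    bool inProgress = false;

    void onEvent(OtaEvent e)
    {
        switch (e) {
            case OtaEvent::Begin:
            case OtaEvent::Progress:
                inProgress = true;    // update running: do not sleep
                break;
            case OtaEvent::Complete:
            case OtaEvent::Failed:
                inProgress = false;   // update over, one way or the other
                break;
        }
    }
};
```

The application thread would then skip its System.sleep() call for as long as inProgress is true.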

System.updatesPending()

Just to double check: this does not work at all?

Thanks for the pointer to the firmware_update system event.

[Edit] Just had another thought - is it possible to 'catch' a brownout event using this approach and close open files, etc.?

Hello @armor,
It does work. You should set it up as follows:

void setup() {
    System.disableUpdates();
    System.on(firmware_update_pending, otaHandler);
}

void loop() {
}

void otaHandler() {
    // Save information or finish tasks before the update, then trigger it.
    System.enableUpdates();
}

Hope that helps.

Thanks
Dheeraj

Using the techniques described in this topic (How to force a handshake for OTA updates) to get a firmware update to happen, I could never get the above disableUpdates, System.on(firmware_update_pending, ...), enableUpdates sequence to work. I correctly get notified that updates are pending and re-enable them, but they fail every time:

0000032633 [comm.protocol.handshake] INFO: Sending HELLO message
0000033248 [comm.protocol.handshake] INFO: Handshake completed
0000033250 [system] INFO: Send spark/device/claim/code event
0000033474 [system] INFO: Send spark/device/last_reset event
0000033693 [system] INFO: Send subscriptions
0000033908 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 4
0000033908 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 3
0000033909 [comm] INFO: Sending TIME request
0000034217 [comm.protocol] INFO: Sending 'M' describe message
0000034446 [comm.protocol] INFO: rcv'd message type=1
0000034446 [system] INFO: Cloud connected
0000034446 [app] INFO: sent 2 events
0000034508 [comm.protocol] INFO: rcv'd message type=13
0000034720 [app.state.ModemControl] INFO: Transition: connecting -> syncing
0000034761 [comm.protocol] INFO: rcv'd message type=13
0000034871 [comm.protocol] INFO: Sending 'S' describe message
0000035156 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 4
0000035165 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 3
0000035165 [comm.protocol] INFO: rcv'd message type=1
0000035228 [comm.protocol] INFO: Received TIME response: 1540409570
0000035229 [comm.protocol] INFO: rcv'd message type=12
0000035229 [app.syncTime] TRACE: timeSyncedLast: 35228ms
0000035231 [app.metricsPublisher] TRACE: Added new entry (now length 62 bytes):
1540409570,4.07,83.8,0,0.000530208,0.000125,0,0,0,0,0,0,0,0,0
0000035328 [comm.protocol] INFO: rcv'd message type=13
0000035631 [comm.protocol] INFO: Sending 'A' describe message
0000035863 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 4
0000035863 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 3
0000035864 [comm.protocol] INFO: rcv'd message type=1
0000035864 [app.metricsPublisher] TRACE: Returning next event of 1 entries (61 chars)
0000036513 [app.deferredUpdate] INFO: Firmware update pending!
0000036575 [comm.protocol] INFO: rcv'd message type=13
0000036575 [comm.protocol] INFO: rcv'd message type=5
0000036576 [app.metricsPublisher] INFO: Published 61 chars, 0 bytes remaining
0000036576 [app.deferredUpdate] INFO: Re-enabling firmware updates

My "Firmware update pending!" message is logged when the pending-update system event is received, and I then call System.enableUpdates(). But nothing else is ever received from the cloud, and the update fails every time.

So to handle this, if I get stuck waiting for an update that’s never going to arrive, I call Particle.publish("spark/device/session/end", PRIVATE | WITH_ACK) (to force a handshake on the next reboot) and then System.reset(FRESH_BOOT_RESET_MAGIC). When it reboots, in my setup() code where I normally call System.disableUpdates(), I instead check for System.resetReasonData() == FRESH_BOOT_RESET_MAGIC and if so I don’t disable updates; I connect to the cloud and wait for the update to come. That seems to work.
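
For anyone wanting to try the same workaround, the boot-time decision might be sketched as below. This is not the poster's actual code: the value of FRESH_BOOT_RESET_MAGIC here is an arbitrary placeholder (the post doesn't show it), and shouldDisableUpdatesAtBoot is a made-up helper name. As far as I recall, reset info also has to be enabled (FEATURE_RESET_INFO) before System.resetReasonData() returns anything useful, so check the firmware reference.

```cpp
#include <cstdint>

// Arbitrary placeholder marker; the real value is whatever you pass to
// System.reset() before rebooting.
constexpr uint32_t FRESH_BOOT_RESET_MAGIC = 0x46425254;

// Hypothetical helper: resetReasonData would come from
// System.resetReasonData() in setup().
bool shouldDisableUpdatesAtBoot(uint32_t resetReasonData)
{
    // After a deliberate "fresh boot" reset, leave updates enabled so the
    // pending OTA can arrive right after the new handshake.
    return resetReasonData != FRESH_BOOT_RESET_MAGIC;
}
```

In setup() you would call System.disableUpdates() only when this returns true, and otherwise just connect and wait for the update.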

Hopefully this will save someone else the somewhat painful journey I’ve just been on.

This sounds contradictory :confused:
Either you do get "nothing else" or your "update fails" but not both. The former excludes the latter IMO.

On the Electron after rcv'd message type=5 (which is the beginning of the update) I never receive another message. And that's leaving it waiting for 10 minutes until I timeout and reset.

However, in the cloud I get the "spark/flash/status", "failed" event. That is what I mean by the "update fails". I presume once the "failed" event happens the cloud doesn't try and send the update.

Hi, I have the following code. From what I can tell, it works fine; devices download updates pretty consistently. One quirk I noticed however is that there’s usually a 20-30s delay between “spark/status” - auto-update, and “spark/status” - started (i.e. breathing cyan to blinking magenta). I don’t really understand where this delay comes from - is it normal for it to happen?

SYSTEM_THREAD(ENABLED);
SYSTEM_MODE(MANUAL);

void setup() {
    System.disableUpdates();
    System.on(firmware_update_pending, otaHandler);
}

void loop() {
}

void otaHandler() {
    System.enableUpdates();
    // Keep servicing the cloud connection for up to 120 s so the update can arrive.
    for (uint32_t ms = millis(); millis() - ms < 120000; Particle.process());
}
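
A side note on the wait loop: the millis() - ms < 120000 pattern keeps working when the millisecond counter wraps around (after roughly 49.7 days with a 32-bit counter), because unsigned subtraction wraps too. A minimal sketch of the same comparison as a standalone function, with stillWaiting being a made-up name:

```cpp
#include <cstdint>

// Hypothetical helper mirroring the loop condition above.
// Returns true while fewer than timeoutMs milliseconds have passed since
// startMs, even if the 32-bit counter wrapped in between.
bool stillWaiting(uint32_t nowMs, uint32_t startMs, uint32_t timeoutMs)
{
    // Unsigned subtraction is modulo 2^32, so the difference is the true
    // elapsed time as long as it is less than about 49.7 days.
    return static_cast<uint32_t>(nowMs - startMs) < timeoutMs;
}
```

Comparing absolute times instead (for example nowMs < startMs + timeoutMs) would misbehave around the wrap, which is why the subtraction form is the usual choice.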

You will notice that same delay when calling particle flash with the CLI sometimes. It’s definitely annoying because it lacks clarity on what is happening and why, but it is a normal occurrence in my experience.

As far as I can tell there is some kind of non-essential function call within their API that times out sometimes, delaying the start of the flash. Would love to have that either explained or addressed though…