How to guarantee an OTA check before shutdown?

electron
Tags: #<Tag:0x00007fe21fdaff08>

#1

I am having an issue in a battery-operated Electron system where I am starting, doing a few things, reporting to the cloud, then powering down again.

I have an event attached to firmware_update that is supposed to notify me to delay the shutdown while a firmware update occurs, but I am finding that it does not trigger until after the device is already shutting down!

Is there a way to move this check sooner in the process or is there a (hidden) system variable I can check to say whether the firmware version check has been made? Leaving the device on and just waiting around for a thing to happen (maybe) isn’t good coding practice.

System thread is enabled, I am quite cautious to insert Particle.process() in anything that blocks, and I have also refactored the loop() into a state machine so that it can fall-through a number of times during execution. None of these things seems to affect the timing of the firmware_update event.

(NB4: System.updatesPending() is probably not going to be useful since [its documented function] is to find whether an update has been rejected during a period of System.disableUpdates(). Not an enterprise customer, so this is irrelevant anyway.)


#2

I appreciate the documentation isn’t as clear as could be on this area. Have you tried checking for reset and resetpending?

Something like this:

void handle_restart_events(system_event_t event, int data)
{
    if (event == reset)
    {
        isResetPending = false;
    }
    else if (event == reset_pending)
    {
        System.enableReset();
    }
    else if (event == firmware_update && data == firmware_update_complete)
    {
        isResetPending = true;
    }
    else if (event == firmware_update && data == firmware_update_begin)
    {
        //Software Update In Progress - set a flag to ensure device does not shutdown
    }
}

In my applications I have run states (standby) where software updates are permitted - I ensure that updates are enabled and disabled accordingly going into and out of standby. The isResetPending flag is used to ensure SPI bus access isn’t started when a reset could happen in the middle of a transaction.


#3

I added this code to test

System.on(reset_pending | reset, handle_restart_events); // edited per below
...
void handle_restart_events(system_event_t event, int data) {
  appLog.info("Restart event occurred!!!");
  if (appLog.isInfoEnabled())  delay(3000);
}

but it this message does not appear in the log, probably because System.sleep(SLEEP_MODE_DEEP, long seconds); is what is initiating the disconnection, rather than some soft reset.


#4

I thought the event flags had to be added together - like this?
System.on(reset+reset_pending+firmware_update, handle_restart_events); //register event handler
Maybe that explains why there was no output?


#5

ORing is the more professional and safer way to do it - but it should actually be the binary OR | not logical ||.

These “flags” may not always be single-bit flags but actually combinations of finer granular flags.
While ORing a combinatory flag of 0b1001 | 0b0011 = 0b1011 (both share the least significant bit set) will render all bits of both set, adding them will not 9 + 3 = 12 (0b1100).


#6

@ScruffR Oops nice catch, will retry


#7

Thanks for being corrected - you are 100% correct. I am of course following the example in the reference documentation and in any case the event values are all 2 power n so no two events share use of the same bit. Use of || rather than | is the real issue here!

    // listen for Wi-Fi Listen events and Firmware Update events
    System.on(wifi_listen+firmware_update, handle_all_the_events);

#8

No change


#9

Have you tried the code snippet I posted? The whole thing is a bit more subtle than it seems and I received some help from Rick. I think the problem was that once the event reset is recognised then there is no time to do serial output.

BTW, isResetPending = false; should be set after the System.on();


#10

I already have this code in use, which is supposed to block the turn-off from occurring. The problem is that the firmware_update event doesn’t come in until AFTER the shutdown has begun.

I am checking for it before shutting down.

void firmware_update_handler(system_event_t event, int status) {
  switch(status) {
    case firmware_update_begin:
      gFirmwareUpdateStatus = FW_UPDATING;
      appLog.info("Firmware update beginning event was triggered.");
      break;
    case firmware_update_progress:
      gFirmwareUpdateStatus = FW_UPDATING;
      appLog.info("Firmware update progress event was triggered.");
      break;
    case firmware_update_complete:
      gFirmwareUpdateStatus = FW_DONE;
      appLog.info("Firmware update complete event was triggered.");
      break;
    case firmware_update_failed:
      gFirmwareUpdateStatus = FW_DONE;
      appLog.error("Firmware update failed event was triggered.");
      break;
    default:
      appLog.warn("Unknown firmware_update_handler status %i", status);
      break;
  }
}
  0000016685 [app] TRACE: loop() gLoopState=ST_TURNING_OFF
  0000016690 [app] TRACE: turnOff()
* 0000016693 [app] INFO: Firmware wait/check started with status FW_NONE
* 0000016700 [app] INFO: Firmware wait/check done with status FW_NONE
  0000016696 [comm.protocol] INFO: rcv'd message type=13
  0000016832 [comm.protocol] INFO: rcv'd message type=13
  0000016961 [comm.protocol] INFO: Received TIME response: 1570750344
  0000016961 [comm.protocol] INFO: rcv'd message type=12
  0000017088 [comm.protocol] INFO: rcv'd message type=13
  0000017225 [comm.protocol] INFO: Sending 'S' describe message
  0000017431 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 4
  0000017440 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 3
  0000017440 [comm.protocol] INFO: rcv'd message type=1
  0000017795 [comm.protocol] INFO: Sending 'A' describe message
  0000017941 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 4
  0000017941 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 3
  0000017947 [comm.protocol] INFO: rcv'd message type=1
  0000020706 [app] INFO: Turning off after a 3 second log flush delay--MARK.
* 0000020712 [app] INFO: Firmware update beginning event was triggered.

Versus if I slap a 10 second delay in front of the update check before turnoff (a highly undesirable thing), this happens:

* 0000029426 [app] INFO: Firmware wait/check started with status FW_UPDATING
  0000029427 [app] INFO: Firmware update progress event was triggered.
  0000029433 [app] INFO: Firmware update progress event was triggered.
  ...
  0000049053 [comm] INFO: Update done 44
  0000049263 [comm] INFO: Waiting for Confirmed messages to be sent.
  0000049264 [app] INFO: Firmware update complete event was triggered.
* 0000049270 [app] INFO: Firmware wait/check done with status FW_DONE
  0000050269 [comm] INFO: All Confirmed messages sent: client(yes) server(yes)
  0000050270 [comm.protocol] INFO: rcv'd message type=6
  0000050271 [app] INFO: Restart event occurred!!!
  0000050280 [app] INFO: Restart event occurred!!!

This works in a same or similar way on system firmware 1.0.1 and 1.2.1


Product OTA handler
#11

@rickkas7 - Any thoughts on this? It’s almost as if “firmware_update_begin” simply happens arbitrarily whenever the flag can be set so this only helps to detect when an update begins, which leaves us in an indefinite holding pattern (or arbitrary delay). Is there a method of programmatically determining if this check has been ran?


#13

One additional data point: This issue only seems to arise when SYSTEM_THREAD(ENABLED);

When it is disabled, the update event still comes in after the shutdown has begun but when the update packet comes in, the update process seems to take over the device and force itself into an update. Notably, I am getting an update_failed event, but the update seems to be working anyway.

void turnOff() {
  appLog.trace("turnOff()");
  waitForFirmwareUpdate();
  ledPowerOffAnimation();
  appLog.info("Turning off after a 3 second log flush delay--MARK.");
  if (appLog.isInfoEnabled())  delay(3000);
  while(1) System.sleep(SLEEP_MODE_SOFTPOWEROFF, LONG_MAX);
}

// Disables the LED thread and plays a POWER OFF blink animation
void ledPowerOffAnimation() {
  appLog.trace("ledPowerOffAnimation()");
  os_mutex_lock(ledControlMutex);

  for (uint8_t i = 0; i < 10; i++) {
    appLog.trace("outer loop");
    for (uint8_t j = 0; j < 2; j++) {
      delay(200);
      if (j) digitalWrite(PIN_LED, HIGH);
      else   digitalWrite(PIN_LED, LOW);
    }
  }
  digitalWrite(PIN_LED, LOW);
}

The update is taking over during the short delay in the ledPowerOffAnimation()!

(not highly desirable either)

  0000017129 [app] TRACE: loop() gLoopState=ST_TURNING_OFF
  0000017140 [app] TRACE: turnOff()
  0000017140 [app] INFO: Firmware wait/check started with status FW_NONE
  0000017143 [app] INFO: Firmware wait/check done with status FW_NONE
* 0000017149 [app] TRACE: ledPowerOffAnimation()
* 0000017153 [app] TRACE: outer loop
* 0000017558 [app] TRACE: outer loop
* 0000017958 [app] TRACE: outer loop
  0000018132 [app] INFO: Firmware update beginning event was triggered.
  0000018463 [comm.protocol] INFO: rcv'd message type=5
  0000018538 [app] INFO: Firmware update progress event was triggered.
  0000018540 [comm.protocol] INFO: rcv'd message type=7
  0000018617 [app] INFO: Firmware update progress event was triggered.
  0000018619 [comm.protocol] INFO: rcv'd message type=7
  0000018696 [app] INFO: Firmware update progress event was triggered.
  0000018698 [comm.protocol] INFO: rcv'd message type=7
  0000018775 [app] INFO: Firmware update progress event was triggered.
  ...
  0000068800 [comm] INFO: Update done 80
  0000069270 [comm.protocol] INFO: rcv'd message type=6
  0000069350 [app] INFO: Firmware update progress event was triggered.
  0000069352 [comm.protocol] INFO: rcv'd message type=7
  0000073531 [comm] INFO: Update done 44
  0000073723 [app] ERROR: Firmware update failed event was triggered.
  0000073723 [comm] INFO: Waiting for Confirmed messages to be sent.
  0000074733 [comm] INFO: All Confirmed messages sent: client(yes) server(yes)
  0000074734 [app] INFO: Restart event occurred!!!
  0000074750 [app] INFO: Restart event occurred!!!

Ideally we should have more control over the process than this, even though disabling the system thread does seem to lead to a more aggressively enforced update process.

This has more to do with the fact that the system thread is preventing the application thread from shutting down, and is still down to timing alone rather than a guaranteed version check.

I have major concerns that anything I do will stop working as soon as logging is turned off.


#14

Just giving my two cents on this…

We’ve been trying probably everything possible for that, “planned” updates NEVER worked for us. You may have to wait for minutes before the update happens (unless updateEnabled is called before connecting, but the thing is, you usually DON’T want the update to interrrupt whatever the device is doing (sampling, saving, etc…) and that WILL happen if you let the system decide to update. We even tried to enable updates at the end of our processing, wait for minutes… never happens… another attempt was to force a connection refresh (publishing to some event I can’t remember, as mentioned in another thread, I can find that for you…) but that didn’t work either.

How we ended doing it: easy, we send a remote command (programmed) to put the device on SAFE mode when it connects, and we force the firmware from the Particle console. Pretty lame, to be honest, especially if you deal with more than the 20 or so devices we have on various sites…

We haven’t tried the enterprise update either, because I don’t think there should be an additional cost for a feature that’s supposedly part of the Particle benefits, updating firmware.

There’s a paradox here: Particle is advertised as amazing on remote locations (which it is), but then most of the time you have to put the device on sleep to save battery, and then you can’t benefit of the firmware upgrade feature so efficiently. The way it should work is, you would send a message to Particle if an update is available, to say: let’s do it! Then it would (magically?) happen…

Anyways, as I said, just my two cents. Try SAFE mode if nothing else works :wink: (with a drawback, however: when in safe mode you lose control over your device, especially if connection becomes unreliable…)


#15

@peergum I think you’re right on with a lot of this.

All I need is a way to know that the system firmware hasn’t wrapped up its business yet so I can hold for it. There is still an asynchronous Time Sync messages that comes in after I am ready to shutdown, as well…

There are clearly things that the system (either the device or the server) has queued up but I have no way to know that.

–

I’m not even terribly concerned with the main operation getting interrupted… I just need some kind of view of what the API is trying to hide from me.


#16

You need to make sure that you are calling Particle.process() before you make any decisions based on the status of the firmware update flags. In SYSTEM_THREAD all system event handlers are called during Particle.process() only. There is no push event into your thread.