How to force a handshake for OTA updates


#1

Hi,

I have 50 devices in the field, and from time to time I need them to do OTA updates. However, I have often noticed that an update doesn’t happen even though the device is locked to a new firmware version on the Particle cloud. Why is this?

What I’ve noticed is that if I hard-reset a device, it handshakes with the cloud, and once a handshake occurs the OTA update goes through just fine. What if I want a handshake to happen every night at 8pm, basically to check for and allow OTA updates? How can I do that? I can’t go and hard-reset each device for OTA updates; that kind of defeats the purpose.

Thanks!


#2

You can just Particle.disconnect(), wait a few seconds and Particle.connect() again.


#3

Okay, so to trigger a handshake I just need to have somewhere in the loop() or setup() function

Particle.disconnect(); delay(3000); Particle.connect();

and on the Particle.connect() it will do a handshake?

Currently, this is what I’m doing:

void setup() 
{
    initAll();
    takeMeasurements();
    slowClock();
    sprintf(publishString, "{\"m_min\":%d,\"m_q1\":%d,\"m_med\":%d,\"m_q3\":%d,\"m_max\":%d,\"m_soc\":%f}", m_min, m_q1, m_med, m_q3, m_max, m_soc);
    unsigned long t_now = Time.now();   // seconds since Jan 1 1970 (UTC)
    int Sleep_Duration = getSleepDelay(t_now);
    System.sleep(SLEEP_MODE_SOFTPOWEROFF, Sleep_Duration);
   
    Particle.connect();
    waitUntil(Particle.connected);
    DPRINTLN("--- setup ====================================");
}

void loop() 
{
    DPRINTLN("+++ loop *************************************");
    if (published == false)
    {
        DPRINTLN("loop branch 1; publishing");
        published = Particle.publish(EVENT_NAME, publishString, PRIVATE);
        DPRINTF("Published with return value: %d (1 means success)\n", published);
    }
    else
    {
        DPRINTLN("loop branch 2; delay, disconnect, write to eeprom, go to sleep");
        delay(DELAY_CLOUD);
        Particle.disconnect();
        unsigned long t_now = Time.now();   // seconds since Jan 1 1970 (UTC)
        int Sleep_Duration = getSleepDelay(t_now);
        DPRINTF("End of connected cycle; time read: %s\n", Time.format(Time.now(), TIME_FORMAT_DEFAULT).c_str());
        DPRINTF("Going to sleep for %d seconds\n", Sleep_Duration);
        System.sleep(SLEEP_MODE_SOFTPOWEROFF, Sleep_Duration);
    }
    DPRINTLN("--- loop *************************************");
}

Basically, you’re saying that if I add that Particle.disconnect() / Particle.connect() snippet at the beginning of loop(), it will do a handshake?


#4

Basically yes, but you don’t want to disconnect on each iteration of loop(), and you don’t want to interfere while a connection attempt is ongoing either, so you need to do this only under certain conditions.

And above all, you’d need to remove (or make conditional) the System.sleep() call in setup() in order to ever get into loop() :wink:
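The "certain conditions" above could be expressed as a small gate. This is a minimal sketch in plain C++, not tested on a device: `ReconnectGate`, its fields, and the once-per-day interval are hypothetical names and values chosen for illustration, and the `connected` and `nowEpoch` arguments stand in for what `Particle.connected()` and `Time.now()` would return in real firmware.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the Particle API.
// Decides whether a forced disconnect/reconnect cycle should run now.
struct ReconnectGate {
    uint32_t lastCycleEpoch = 0;          // when the last forced cycle ran (0 = never)
    uint32_t minIntervalSec = 24 * 3600;  // assumption: at most one cycle per day

    // connected: what Particle.connected() would return in firmware
    // nowEpoch:  what Time.now() would return in firmware
    bool shouldCycle(bool connected, uint32_t nowEpoch) const {
        if (!connected) return false;          // do not interfere with an ongoing connection attempt
        if (lastCycleEpoch == 0) return true;  // no forced cycle has run yet
        return nowEpoch - lastCycleEpoch >= minIntervalSec;
    }

    // Record that a forced cycle has just been performed.
    void markCycled(uint32_t nowEpoch) { lastCycleEpoch = nowEpoch; }
};
```

In firmware, a `true` result would be the point to call `Particle.disconnect()`, wait, call `Particle.connect()`, and then `markCycled()`. Because the device restarts after a `SLEEP_MODE_SOFTPOWEROFF` sleep, `lastCycleEpoch` would have to be persisted somewhere such as EEPROM for the gate to survive between wake-ups.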

BTW, I see your DPRINTF() macros, did you know about the Log feature?


#5

@ScruffR Thank you for that. I reworked the code quite a bit and did a lot of testing, but there seem to be some glitches with
Particle.disconnect(); delay(3000); Particle.connect();

What I saw was that it didn’t handshake upon reconnection, but when it was reset again (in the form of System.reset(); or System.sleep();), it would usually handshake on the next reconnect. However, this was very glitchy, because putting this code before or after
Particle.process(); delay(15000); Particle.process();
changed whether it would handshake or not.

Here is my code I used for testing:

/*----------------------------------------------------------------------------------------------------------------------
 *                                      INCLUDES
 */
#include "Particle.h"
#include "cellular_hal.h"
#include <math.h>
#include <HC_SR04.h>

/*----------------------------------------------------------------------------------------------------------------------
 *                                      CONSTANTS
 */
#define EVENT_NAME                  "M"
#define MEASUREMENT_COUNT           21

#define BUFFER_SIZE                 256

#define CELL_APN                    "hologram"
#define CELL_USERNAME               ""
#define CELL_PASSWORD               ""

#define EEPROM_TIME_ADDRESS         0

#define ELECTRON_PRODUCT_ID         ----
#define ELECTRON_PRODUCT_VERSION    1

#define DELAY_CELL                  180000      // Milliseconds
#define DELAY_CLOUD                 180000      // Milliseconds
#define DELAY_SETTLE                10000       // Milliseconds
#define DELAY_24_HOUR               86400       // Seconds
#define DELAY_CONNECT_TIME          500         // Milliseconds
#define DELAY_PIN                   100         // Milliseconds
#define DELAY_MEASUREMENT           75          // Milliseconds
#define FOUR_HOUR_WAIT              14400       // Seconds
#define SEC_IN_HOUR                 3600        // Seconds
#define DELAY_UPDATE_TIME           15000       // Milliseconds

#define PIN_ECHO                    D5          // Connect HC-SR04 Range finder as follows:
#define PIN_RELAY                   D3          // GND - GND, 5V - VCC, D4 - Trig, D5 - VoltageDivider.
#define PIN_TRIG                    D4          // VoltageDivider (470 OHM RESISTORS FROM D5 TO GND AND TO ECHO)

#define RANGE_MAX                   400
#define RANGE_MIN                   0.5

#define DEBUG                       1
#define BAUD_RATE                   9600

/*----------------------------------------------------------------------------------------------------------------------
 *                                      MACROS
 */
#if DEBUG
# define SERIAL_DEBUG_BEGIN(x)      Serial.begin(x)
# define DPRINTF(...)               Serial.printf(__VA_ARGS__)
# define DPRINTLN(x)                Serial.println(x)
#else // do nothing
# define SERIAL_DEBUG_BEGIN(x)
# define DPRINTF(...)
# define DPRINTLN(x)                
#endif

/*----------------------------------------------------------------------------------------------------------------------
 *                                      CONFIGURATION
 */
PRODUCT_ID(ELECTRON_PRODUCT_ID); // Product ID
PRODUCT_VERSION(ELECTRON_PRODUCT_VERSION); 

STARTUP(cellular_credentials_set(CELL_APN, CELL_USERNAME, CELL_PASSWORD, NULL)); 
STARTUP(System.enableFeature(FEATURE_RESET_INFO)); // To enable reading why the system was reset.

SYSTEM_MODE(SEMI_AUTOMATIC); // Cloud connecting at SEMI_AUTOMATIC can provide power savings.

HC_SR04 rangefinder(PIN_TRIG, PIN_ECHO, RANGE_MIN, RANGE_MAX); // initializes HC_SR04 object

/*----------------------------------------------------------------------------------------------------------------------
 *                                      GLOBALS (fix to local variables)
 */
int m_min = -1;
int m_q1 = -1;
int m_med = -1;
int m_q3 = -1;
int m_max = -1;
float m_soc = 0.0;

long hours; // holds hours until wake
long minutes; // holds minutes until wake
long seconds; // hold seconds until wake
long wakeTimer; // combination of hours, minutes, seconds

int WeeklyUpdateAddr = 10; // Holds the address of the Weekly Update bool in EEProm
bool WeeklyUpdate; // check if should update firmware or not
bool publish;

char publishString[BUFFER_SIZE];
/*======================================================================================================================
 *                                      PARTICLE MAIN CYCLE ENTRY POINTS
 */
void setup() 
{
    DPRINTLN("+++ setup ====================================");
    
    initAll();
    takeMeasurements(); // take measurements before stringing together data to publish
    slowClock();
    snprintf(publishString, sizeof(publishString), "{\"m_min\":%d,\"m_q1\":%d,\"m_med\":%d,\"m_q3\":%d,\"m_max\":%d,\"m_soc\":%f}", m_min, m_q1, m_med, m_q3, m_max, m_soc);
    
    Cellular.on(); // explicitly trigger turning cell on - precautionary step
    Cellular.connect(); // explicitly trigger cell connect - precautionary step
    DPRINTLN("Cell is on and connecting");
    delay(DELAY_CONNECT_TIME); // wait half a second
    Particle.connect(); // needs to be called for semi-automatic mode to work
    delay(DELAY_CONNECT_TIME);
    Particle.process(); // explicitly trigger the background task
    DPRINTLN("Particle is connecting");
   
    if(waitFor(Cellular.ready, DELAY_CELL)) // wait up to 180 seconds for cell to connect
    {
        DPRINTLN("Cell has connected successfully");
        
        if(waitFor(Particle.connected, DELAY_CLOUD)) // wait up to 180 seconds for particle to connect
        {
            DPRINTLN("Particle has connected successfully");
            
            //setWakeTimer(20); // call wake time setter
            wakeTimer = 10; // seconds; short test value used in place of setWakeTimer()
            
            //delay(5); // this allows handshake, anything more will not work - reasons unknown

            for(int i = 0; i < 5; i++) // Try to publish up to 5 times
            { 
                publish = Particle.publish(EVENT_NAME, publishString, PRIVATE);
                delay(DELAY_CONNECT_TIME);
                Particle.process();
                if(publish)
                {
                    DPRINTLN("Data has been published to cloud");
                    i = 5;
                }
            }
            
            if(!publish)
            {
                Particle.process();
                delay(DELAY_UPDATE_TIME); // 15 second wait - give time for device to identify new firmware
                Particle.process();
                DPRINTLN("Data has not been published - system reset");
                smartReboot();
            }
            
            
            smartReboot(); // Check to see if firmware should be updated
                            
            
            Particle.process();
            delay(DELAY_UPDATE_TIME); // 15 second wait - give time for device to identify new firmware
            Particle.process();
            
            
            DPRINTLN("Sleep set to wakeup at 8PM, Going to sleep");
            System.sleep(SLEEP_MODE_SOFTPOWEROFF, wakeTimer);

        }
        else // cloud did not connect, sleep for four hours and try again
        {
            DPRINTLN("Particle cloud did not connect within 180 seconds - Going to sleep for 4 hours");
            System.sleep(SLEEP_MODE_SOFTPOWEROFF, FOUR_HOUR_WAIT);
        }
    }
    else // cell did not connect, sleep for four hours and try again
    {
        DPRINTLN("Cell did not connect within 180 seconds - Going to sleep for 4 hours");
        System.sleep(SLEEP_MODE_SOFTPOWEROFF, FOUR_HOUR_WAIT);
    }
    
    DPRINTLN("--- setup ====================================");
}

void loop()
{
    // do nothing
}

/*======================================================================================================================
 *                                          HELPER FUNCTIONS
 */
///=====================================================================================================================
/// <summary>This function is called to initialize everything that is needed</summary>
///=====================================================================================================================
inline void initAll()
{
    //delay(DELAY_SETTLE); // wait 10 seconds for initialization
    DPRINTLN("+++ Initialize +++");
    
    SERIAL_DEBUG_BEGIN(BAUD_RATE);
    
    System.enableUpdates(); // enable system updates (might not work)
    
    pinMode(PIN_RELAY, OUTPUT); // sets relay pin
    
    FuelGauge fuel; // fuel gauge class
    m_soc = fuel.getVCell(); // battery voltage

    DPRINTLN("--- Initialize ---");
}
///=====================================================================================================================
/// <summary>
///  Helper function used to calculate time (in seconds) it takes to wake the device given param targetHour.
/// </summary>
///=====================================================================================================================
void setWakeTimer(int targetHour)
{
    DPRINTLN("+++ Setting Wake Timer +++");
    
    Particle.syncTime(); // Synchronize the time with the Particle Cloud. 
    waitUntil(Particle.syncTimeDone); // wait until Particle sync is complete
    Time.zone(-3.5); // sets time zone to MST (end of daylight savings)
    long localnow = Time.local();
     
    long hournow = Time.hour(Time.local());
    long minutenow = Time.minute(Time.local());
    long secondnow = Time.second(Time.local());
  
    // Wake Time Algorithm
    hours = (targetHour + 24 - (Time.hour(Time.local()))) % 24;
    minutes = (59 - Time.minute(Time.local()));
    seconds = (59 - Time.second(Time.local()));
    wakeTimer = ((hours-1) * 3600) + (minutes * 60) + seconds;
    
    DPRINTLN(" \n--- Wake Timer set ---");
}
///=====================================================================================================================
/// <summary>Helper function used in takeMeasurements to turn on the solid state relay</summary>
///=====================================================================================================================
void turnOnRelay() 
{
    digitalWrite(PIN_RELAY,HIGH);
    delay(DELAY_PIN);
    DPRINTLN("The Relay turned on");
}
///=====================================================================================================================
/// <summary>Helper function used in takeMeasurements to turn off the solid state relay</summary>
///=====================================================================================================================
void turnOffRelay() 
{
    digitalWrite(PIN_RELAY,LOW);
    delay(DELAY_PIN);
    DPRINTLN("The Relay turned off");
}
///=====================================================================================================================
/// <summary>
///  This function slows down the clock from 120MHz to 30MHz in order to be more power efficient.  This function was
///=====================================================================================================================
void slowClock() 
{
    RCC->CFGR &= ~0xfcf0;
    RCC->CFGR |= 0x0090;

    SystemCoreClockUpdate();
    SysTick_Configuration();

    FLASH->ACR &= ~FLASH_ACR_PRFTEN;
    DPRINTLN("Clock was slowed down to 30 MHz");
}
///=====================================================================================================================
/// <summary>
///  Helper function for takeMeasurements that sorts an array of length MEASUREMENT_COUNT using the insertion sort 
///=====================================================================================================================
void sortArray(int myReadings[])
{
    int j, x;
    for(int i = 1; i < MEASUREMENT_COUNT; i++)
    {
        x = myReadings[i];
        j = i - 1;
        while(j >= 0 && (myReadings[j] > x))
        {
            myReadings[j + 1] = myReadings[j];
            j = j - 1;
        }
        myReadings[j + 1] = x;
    }
}
///=====================================================================================================================
/// <summary>
///  This function takes measurements from an ultrasonic sensor and stores the results in 5
///  global variables so that the results are ready to publish
/// </summary>
///=====================================================================================================================
void takeMeasurements()
{
    DPRINTLN("+++ takeMeasurements +++");
    turnOnRelay(); // Turns on relay that allows power to flow to sensor
    int measurements[MEASUREMENT_COUNT];
    for(int i = 0; i < MEASUREMENT_COUNT; i++) // takes MEASUREMENT_COUNT measurements
    {
        measurements[i] = rangefinder.getDistanceCM();
        delay(DELAY_MEASUREMENT);
    }
    DPRINTLN("Finished Taking Measurements.");
    turnOffRelay(); // Stops power from leaking to sensor
    sortArray(measurements); // Sorts the measurements in ascending order
    // Populates global variables
    m_min = measurements[0];
    m_q1 =  measurements[MEASUREMENT_COUNT / 4 - 1];
    m_med = measurements[MEASUREMENT_COUNT / 2]; // middle element (MEASUREMENT_COUNT is odd)
    m_q3 =  measurements[MEASUREMENT_COUNT * 3 / 4 - 1];
    m_max = measurements[MEASUREMENT_COUNT - 1];

    DPRINTLN("--- takeMeasurements ---");
}
///=====================================================================================================================
/// <summary>
///  This function is used to reset the device. We use this because a system reset does not reset the cellular modem.
/// </summary>
///=====================================================================================================================
void smartReboot()
{
    DPRINTLN("+++ Resetting Modem and Sim Card +++");
    
    Particle.disconnect();
    
    Cellular.command(30000, "AT+CFUN=16\r\n");
    Cellular.off();
    delay(1000);
    
    Cellular.on();
    Cellular.connect();
    delay(DELAY_CONNECT_TIME); // wait half a second
    Particle.connect();
    delay(DELAY_CONNECT_TIME);
    Particle.process(); // explicitly trigger the background task
    
    //System.reset();
}    

Basically, what I was seeing in testing is that setup() only forces the handshake if I call smartReboot() before the Particle.process(); delay(15000); Particle.process(); sequence. The problem with that, however, is that it will never do an OTA update, because it doesn’t handshake at that point, check for an update, and try to update (that is my suspicion, at least).
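Separately from the handshake question, the setWakeTimer() helper in the code above computes `((hours-1) * 3600) + (minutes * 60) + seconds`, which goes negative when the current hour equals targetHour (at 20:30:00 with a target of 20 it yields -1801). A minimal sketch of an alternative for the nightly 8pm wake-up follows; `secondsUntilHour` is a hypothetical name, and the hour, minute, and second arguments stand in for `Time.hour()`, `Time.minute()`, and `Time.second()` in local time.

```cpp
#include <cstdint>

// Seconds from the given local time until the next occurrence of targetHour:00:00.
// If it is exactly targetHour:00:00 now, a full day is returned rather than 0.
uint32_t secondsUntilHour(int targetHour, int hourNow, int minuteNow, int secondNow) {
    const int32_t secondsPerDay = 24 * 3600;
    const int32_t nowOfDay = hourNow * 3600 + minuteNow * 60 + secondNow;
    const int32_t targetOfDay = targetHour * 3600;

    // Adding one day before the modulo keeps the result non-negative.
    int32_t delta = (targetOfDay - nowOfDay + secondsPerDay) % secondsPerDay;
    if (delta == 0) {
        delta = secondsPerDay;
    }
    return static_cast<uint32_t>(delta);
}
```

The result could then be passed as the duration to `System.sleep(SLEEP_MODE_SOFTPOWEROFF, ...)`, as wakeTimer is in the posted code.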


#6

I’ve had this issue as well. I regularly do updates of 30-50 devices and I’ll usually have one or two devices that won’t update. When this happens I usually just have to wait, maybe 12 hours or so. If I look at the device on the devices page of the Particle console, it shows a last handshake from sometime in the last few days. It does not show an updated handshake for the current day.

It seems that Particle could solve this on the web side by allowing us to kill the device’s connection. Something like a “Close Connection” button. If my understanding of the Particle cloud is correct, this would just force the device to reconnect.


#7

Yeah, I think they could control that from the cloud side too, but I don’t think they currently are. It’s causing me a lot of problems because my devices won’t update…


#8

The missing re-handshake you mentioned earlier might be caused by some data-saving regime that was incorporated a while ago.
The device builds a hash code of some session info and all variables, functions, subscriptions you set up and (sort of) sends this to the cloud.
On a later reconnect, the cloud hash will be compared against the local version and if they match the full handshake will be skipped.

A possible workaround would be to have some dummy subscription you “mutate” over time to cause a hash mismatch.
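To make the idea concrete, here is a toy model of the comparison in plain C++. It is only an illustration: the thread does not say how the real hash is computed, so the FNV-1a function, the `AppDescription` struct, and `handshakeSkipped` are all assumptions, not the Device OS implementation.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// FNV-1a, used purely as a stand-in hash for this illustration.
uint32_t fnv1a(const std::string& s, uint32_t h = 2166136261u) {
    for (unsigned char c : s) {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

// Hypothetical model of the application state that gets hashed.
struct AppDescription {
    std::vector<std::string> functions;
    std::vector<std::string> variables;
    std::vector<std::string> subscriptions;

    uint32_t hash() const {
        uint32_t h = 2166136261u;
        for (const auto& f : functions)     h = fnv1a("f:" + f, h);
        for (const auto& v : variables)     h = fnv1a("v:" + v, h);
        for (const auto& s : subscriptions) h = fnv1a("s:" + s, h);
        return h;
    }
};

// In this model, a matching hash means the full handshake is skipped.
bool handshakeSkipped(uint32_t cloudHash, const AppDescription& local) {
    return cloudHash == local.hash();
}
```

With an unchanged description the stored hash matches and the handshake is skipped; renaming a single subscription produces a different hash, which is the effect the dummy subscription is meant to achieve.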


#9

I mean, this seems like a pretty glaring issue. Maybe I am missing something.

Most of the people that are using the particle electron have their devices waking up periodically to do some reading, send, then go back to sleep. Most of them want to use as little power as possible. And most of them want to be able to push out OTA firmware updates to a product group.

As far as I can tell, this is impossible currently because of this issue. My devices simply won’t receive OTA updates because they will not re-handshake. Currently, I have mine set up to connect, disconnect, turn off modem, reconnect and stay connected for 10 minutes (not ideal for power saving) and they still will not do a handshake/not receive OTA updates.

How are other people accomplishing this? I really feel like I’m missing something.


#10

Not sure if this helps, but you can wait for 12 hours or so to do the update. Frustrating… but it has worked for me. I haven’t had a super critical update yet, so I might be a little less forgiving in that scenario.

It would be nice to hear from one of the Particle developers on the possibility of killing a cloud connection from the cloud side. This seems like a reasonable feature request, unless someone can say it is impossible?

@ScruffR, I’m not sure I understand what you mean. Does this dummy subscription only work on subscriptions, or would it work for publishing too? When I encounter this bug I can get my devices to sync time and publish data to the particle cloud, but I cannot get them to update. Maybe just publishing and syncing time is not the same as subscribing to a topic?


#11

If the reason for the skipped handshake is due to the necessity-check I mentioned before, then you need to make the hash mutate between sessions. And since you have little control over the hashed session info the easiest way to make the hash-check fail is to unsubscribe and resubscribe to a different set of events.
Particle.publish() does not affect the hash.

BTW, @dcliff9 in addition to power preservation many people also want to limit data usage and hence this kind of data-preservation regime was developed and compromises had to be made.

I haven’t tried, but if once a day you have a sleep period with no SLEEP_NETWORK_STANDBY, or one longer than the set keep-alive (default 23 minutes), and then keep the device awake for 2 minutes after cloud connect, this should allow an OTA update to happen.

But I agree that this needs to be addressed, which is why I opened an issue on GitHub a long time ago.


#12

I’ve tried your suggestion of making the device sleep for longer than my timeout (which is only 2 minutes), without any luck.

I’ll investigate your suggestion on changing the subscriptions. Right now my devices do not have subscriptions.


#13

That makes sense to me.

Is there a generic few lines of code that you could provide to accomplish and test this?


#14

Right now, my devices do a deep sleep for between 2 and 6 hours. When they come alive, I have them connect to the cloud, send a publish and then disconnect. Every 10th time they wake up, they connect, disconnect, connect again, then stay connected for 10 minutes (I set it this long just for this issue).

I can go weeks without showing a handshake on the console. And with no handshake comes no OTA update.


#15

Can you try adding this snippet to your setup() code and check whether my workaround really works?

void dummy(const char* filter, const char* data) {
}

void setup() {
  // all your usual stuff
  // ...

  if (someConditionToOnlyDoWhenNeeded == true) {
    // this is just to add mutating data into the hash
    Particle.subscribe(String(Time.now()), dummy, MY_DEVICES); // <-- correction from previously PRIVATE
  }
}
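One untested variation on the snippet above: instead of a separate condition, derive the subscription name from the day number, so the name (and with it the hash) changes at most once per UTC day. `dailyDummyEventName` and the `dummy/` prefix are hypothetical names for illustration; the argument stands in for `Time.now()`.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: builds a subscription name that changes once per UTC day.
std::string dailyDummyEventName(uint32_t epochSeconds) {
    char buf[32];
    // Days since the Unix epoch; constant for 24 hours, then increments by one.
    std::snprintf(buf, sizeof(buf), "dummy/%lu",
                  static_cast<unsigned long>(epochSeconds / 86400u));
    return std::string(buf);
}
```

In firmware the result would be passed to Particle.subscribe() in place of String(Time.now()). Whether one mismatch per day is enough to trigger the handshake is an assumption that would need the same kind of field test as the original snippet.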

#16

Totally understand the data usage issue. But I think what is confusing to me is why a handshake is required for an OTA update to proceed.
I really wish that System.updatesPending() worked the way it does in my imagination. :frowning:


#17

And that’s why I opened these issues in the past
https://github.com/spark/firmware/issues/1285
https://github.com/spark/firmware/issues/1166


#18

@dcliff9 Can you test @ScruffR’s code to see if it forces your Electrons to handshake when they are otherwise going weeks without updates? Just curious if this code would fix the handshake problem you’re having, which is preventing your updates.


#19

@RWB Yeah. I’m going to try it this weekend and will let you know how it goes, for sure. I make it a rule to always try what @ScruffR says. :slight_smile:


#20

@ScruffR and @dcliff9 I tried it out over the weekend, and it seems to be working. I’ll still keep an eye on it for another couple days and the following weeks, but this may do it!