Issues with firmware 0.7.0

I have had such a problem with a recent product release based on 0.7.0 that I have jumped to 0.8.0-rc.8 to run diagnostics on 5 out of a fleet of 300. I have made all the String class removal changes other than in the handling of Particle function calls.

I am waiting for a reply from Particle Support about exactly how to read the diagnostics download. If anyone on this thread understands the column headings and values it would be great to understand them and especially know what tell-tales to look out for.

Initial impression is that 0.8.0 suffers less than 0.7.0 with cloud disconnections and also with wifi network disconnections. I believe the diagnostics rule out heap fragmentation - about 49K used out of 83K and stable (some fluctuation but no downward slope). No Particle function calls were made during the tests.

Attached here is one download with the most recent report at the top - system has been up for 8754 seconds, 1 network dropout. The system memory used went up to 52K - signal strength and quality no change.

Link to another thread with output from 0.8.0 diagnostics explained. https://community.particle.io/t/remote-diagnostics-feedback/42715/4?u=armor

I just got some time to finish a new version without the use of String objects. The same test devices were flashed; now I need to wait several hours to see if the problem surfaces again.

// FIRMWARE VERSION **********************
#define FIRMWARE_VERSION "cloud_test_2";

const int NUM_SCHEDULES = 25;
const int SCHEDULE_LENGTH = 27;


// FUNCTION DECLARATIONS ********************************
int transmit(String command);
int setSchedule(String command);
int setVariable(String command);
int callTest(String command);


// Variables

char scheduleString[NUM_SCHEDULES * SCHEDULE_LENGTH];
char strcountTcpUsage[20];
char statusVariables[] = FIRMWARE_VERSION;
char networkVariables[200];
char lastSchedule[100];     //schedule + time
char lastTransmit[25];
char locationVar[200];
char lastLostSubscription[26];

IPAddress myLocalIp;
const char* myPublicIP;
const char* ssid;
char rssi[10];
char freeMemory[10];
char bufferPublish[255];
const char* serialName;

unsigned long previousMillis;
unsigned long cycleTimer_ms;


void setup() {
    
    Particle.function("transmit", transmit);
    Particle.function("setschedule", setSchedule);
    Particle.function("setvariable", setVariable);
    Particle.function("calltest", callTest);

    // Exposed Variables - max 20 (max name length is 12 char)
    Particle.variable("status", statusVariables);
    Particle.variable("network", networkVariables);
    Particle.variable("schedulelist", scheduleString);
    Particle.variable("lastschedule", lastSchedule);
    Particle.variable("lastLostSubs", lastLostSubscription);
    Particle.variable("lasttransmit", lastTransmit);
    Particle.variable("location", locationVar);
    Particle.variable("countTcpCall", strcountTcpUsage);
    
    // Subscriptions
    Particle.subscribe("particle/", handler);				// to handle Spark web responses, like the Public IP
    
    updateWiFiInfo();
}

void loop() {
    cycleTimer_ms = millis() - previousMillis;
    
    if(millis() - previousMillis > 250){
        sprintf(bufferPublish, "loop_delay = %dms, free_memory = %d", cycleTimer_ms, System.freeMemory());
        Particle.publish("debug/loop_delay", bufferPublish, 60, PRIVATE );
    }
    
    previousMillis = millis();
}


int transmit(String command){
    Particle.publish("debug/transmit", command, 60, PRIVATE );
    return 1;
}

int setSchedule(String command){
    return 1;
}

int setVariable(String command){
    updateWiFiInfo();
    return 1;
}

int callTest(String command){
    
    if(command.compareTo("reset") == 0){
        System.reset();
    }
    else if(command.compareTo("rssi") == 0){
        sprintf (rssi, "rssi = %d", WiFi.RSSI());
        Particle.publish("debug/rssi", rssi, 60, PRIVATE );
    }
    else if(command.compareTo("memory") == 0){
        sprintf (freeMemory, "free memory = %d", System.freeMemory());
        Particle.publish("debug/free_memory", freeMemory , 60, PRIVATE );
    }
    else if(command.compareTo("ssid") == 0){
        Particle.publish("debug/ssid", ssid, 60, PRIVATE );
    }
    return 1;
}


void updateWiFiInfo(){
   sprintf (rssi, "rssi = %d", WiFi.RSSI());
   ssid = WiFi.SSID();
   myLocalIp = WiFi.localIP();
   if (Particle.connected()){
       Particle.publish("particle/device/ip");
       Particle.publish("particle/device/name");
   }
}


void handler(const char *topic, const char *data) {

    //Spark.publish("received " + String(topic) + ": " + String(data));

    if(strcmp ( "particle/device/ip", topic) == 0){
        myPublicIP = data;
    }
    else if(strcmp ( "particle/device/name", topic) == 0){
        serialName = data;
    }

}

@ScruffR, using the new code, I can see memory degradation. From an original 50K of free memory, some of the test devices are down to 38K and starting to lose the Cloud connection from time to time.

Unfortunately, I needed to restart the test, which means I won't have final results until tomorrow or Monday.


Did you call any of the functions, request variables, or trigger the subscription, or did you just let it do its stuff without any intervention?

@ScruffR noticed, like I did, that you are publishing within a subscribe callback. The two share the same buffer which could be causing problems.
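
A related point, sketched here under the assumption (suggested above, not verified) that the buffer behind `data` is reused once the callback returns: storing the raw pointer, as `myPublicIP = data;` does, would then leave it pointing at stale contents. Copying the payload into storage the sketch owns avoids that. The names `handlerCopying`, `publicIpCopy` and `deviceNameCopy` and the buffer sizes are hypothetical, and the code is plain C++ so it can be tried off-device:

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Own storage for the received values, instead of `const char*` pointers
// into a buffer that the caller may reuse. Sizes are assumptions.
char publicIpCopy[40];
char deviceNameCopy[64];

// Same signature as the thread's handler(); copies the payload while the
// pointer is still known to be valid.
void handlerCopying(const char *topic, const char *data) {
    if (topic == nullptr || data == nullptr) return;
    if (std::strcmp(topic, "particle/device/ip") == 0) {
        // snprintf truncates if needed and always null-terminates.
        std::snprintf(publicIpCopy, sizeof(publicIpCopy), "%s", data);
    } else if (std::strcmp(topic, "particle/device/name") == 0) {
        std::snprintf(deviceNameCopy, sizeof(deviceNameCopy), "%s", data);
    }
}
```

With this shape, overwriting the source buffer after the call no longer changes what the sketch has stored.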


@ScruffR I do use the function transmit() to check whether the console is still working and calltest() to check the free memory. Not frequently, maybe once every 30-60 min.


I don't want to muddy the waters in this great train of thought, but I wanted to ask one question regarding the removal of all String objects. I too have had my share of disconnects since upgrading to 0.7.0. I got rid of all Strings, even in the implementation of Particle functions. Here is an example of a Particle function after my modifications:

int setDisableOverride(const char * cmd)
{
    if (strncmp(cmd, "enable", 6) == 0)
    {
        disableOR = true;
        return 1;
    }
    else if (strncmp(cmd, "disable", 7) == 0)
    {
        disableOR = false;
        return 1;
    }
    else
    {
        return 0;
    }
}

Is there anything wrong with my changes? I added a Freeboard variable to track free memory in the loop() and on the Photon the max is 57,692 and the min is 55,164. This is after days of running.

@bacichetti, am I missing something obvious, or do you have two buffer overflows? From your code, look at the following:

char rssi[10];
char freeMemory[10];

sprintf (rssi, "rssi = %d", WiFi.RSSI());
sprintf (freeMemory, "free memory = %d", System.freeMemory());

Each of your char variables sets up space to hold 9 characters and the null character. For the rssi variable, with a two-digit RSSI (and negative sign), you have a total of 11 characters including the null character. For the freeMemory variable, assuming a 5-digit memory value, you are trying to put approximately 20 characters into the variable. As soon as I loaded your code into my Photon running 0.7.0, it immediately crashed and started flashing red.

Am I missing something obvious?
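
For anyone who wants to check the arithmetic, here is a minimal sketch in plain C++ (no Particle headers) that lets snprintf report how many bytes a format needs. `requiredSize` and `formatChecked` are hypothetical helper names, and the sample values -55 and 50000 are assumptions standing in for WiFi.RSSI() and System.freeMemory():

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: bytes needed, including the terminating '\0',
// to hold `format` rendered with one int argument.
// snprintf with a null buffer and size 0 writes nothing and returns the
// number of characters the complete output would have had.
std::size_t requiredSize(const char *format, int value) {
    int len = std::snprintf(nullptr, 0, format, value);
    return len < 0 ? 0 : static_cast<std::size_t>(len) + 1;
}

// Bounded formatting: returns true only if the whole string fit.
// On truncation the buffer still holds a valid, terminated prefix.
bool formatChecked(char *buf, std::size_t bufSize, const char *format, int value) {
    int len = std::snprintf(buf, bufSize, format, value);
    return len >= 0 && static_cast<std::size_t>(len) < bufSize;
}
```

With those sample values, requiredSize("rssi = %d", -55) gives 11 and requiredSize("free memory = %d", 50000) gives 20, so neither string fits in a char[10].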


That's where snprintf(buffer, sizeof(buffer), ...) helps :wink:

I can't agree enough. Along with checking all char[] buffers for proper size, getting rid of all String variables, and checking variable return types, the other thing I did to improve overall reliability was converting every call to the variant that takes a max size parameter, including:

  • strncat
  • strncmp
  • strncpy
  • snprintf
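
One caveat with the list above: strncpy does not write a terminator when the source is truncated, so the size parameter alone is not enough. A minimal sketch of wrappers that handle this, in plain C++; `copyBounded` and `appendBounded` are hypothetical names, not Particle or standard library functions:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy src into dst (capacity dstSize) and always
// leave dst null-terminated. Returns true if src fit without truncation.
bool copyBounded(char *dst, std::size_t dstSize, const char *src) {
    if (dstSize == 0) return false;
    std::strncpy(dst, src, dstSize - 1);
    dst[dstSize - 1] = '\0';  // strncpy does not terminate on truncation
    return std::strlen(src) < dstSize;
}

// Hypothetical helper: append src to the string already in dst without
// writing past dstSize. Returns true if all of src was appended.
bool appendBounded(char *dst, std::size_t dstSize, const char *src) {
    std::size_t used = std::strlen(dst);
    if (used + 1 >= dstSize) return src[0] == '\0';  // no room left
    std::size_t room = dstSize - used - 1;
    std::strncat(dst, src, room);  // strncat always writes the terminator
    return std::strlen(src) <= room;
}
```

The boolean result makes truncation visible to the caller instead of leaving a silently shortened string.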

Hi, @syrinxtech, you're absolutely right, thanks for the catch. I meant it to be:

sprintf (bufferPublish, "rssi = %d", WiFi.RSSI());
sprintf (bufferPublish, "free memory = %d", System.freeMemory());

My devices keep working even with the error... I'll update the code adding your other suggestions and flash again.

@ScruffR, thanks for the tip!


I haven't gone totally non-String. I am not seeing steady or even accelerating free memory degradation, but there could still be fragmentation.

What do you use/recommend to replace string.readStringUntil('c') ? strtok or strchr

When you receive function calls do you parse appended data string.substring() ? strncpy

Lastly, to convert string to int or float, rather than string.toFloat or string.toInt ? sscanf()

Many Thanks
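
As a rough illustration of the three replacements asked about, here is a sketch in plain C++ that splits a command such as "high=23.5" into a key and a float without any String object. The "key=value" format and the name `parseKeyValue` are assumptions made for the example, not something taken from the thread's firmware:

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical example: split "key=value" into a key and a float value.
// Returns true only if both parts were present and the value was numeric.
bool parseKeyValue(const char *cmd, char *key, std::size_t keySize, float *value) {
    const char *sep = std::strchr(cmd, '=');   // stands in for readStringUntil('=')
    if (sep == nullptr) return false;

    std::size_t keyLen = static_cast<std::size_t>(sep - cmd);
    if (keyLen == 0 || keyLen >= keySize) return false;  // key missing or too long
    std::memcpy(key, cmd, keyLen);             // stands in for substring()
    key[keyLen] = '\0';

    // sscanf returns the number of fields converted, so a non-numeric
    // value is reported instead of silently becoming 0.
    return std::sscanf(sep + 1, "%f", value) == 1;  // stands in for toFloat()
}
```

strchr is used instead of strtok because it leaves the `const char *` argument untouched; strtok writes into the string it tokenises.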

After correcting the issues found by @syrinxtech, I restarted the test. After several hours, all 6 devices started going offline/online. I called a function about 8-10 times during the day, and I requested a variable only a couple of times.

I rebooted my router mid-test for a different reason; I am not sure whether it could have influenced the test.

Here is the latest version. Tomorrow I'll start to cut some parts to see what the core issue is.

// FIRMWARE VERSION **********************
#define FIRMWARE_VERSION "cloud_test_3";

const int NUM_SCHEDULES = 25;
const int SCHEDULE_LENGTH = 27;


// FUNCTION DECLARATIONS ********************************
int transmit(const char * command);
int setSchedule(const char * command);
int setVariable(const char * command);
int callTest(const char * command);


// Variables

char scheduleString[NUM_SCHEDULES * SCHEDULE_LENGTH];
char strcountTcpUsage[20];
char statusVariables[] = FIRMWARE_VERSION;
char networkVariables[200];
char lastSchedule[100];     //schedule + time
char lastTransmit[25];
char locationVar[200];
char lastLostSubscription[26];

IPAddress myLocalIp;
const char* myPublicIP;
const char* ssid;
char rssi[10];
char freeMemory[10];
char bufferPublish[255];
const char* serialName;

unsigned long previousMillis;
unsigned long cycleTimer_ms;


void setup() {
    
    Particle.function("transmit", transmit);
    Particle.function("setschedule", setSchedule);
    Particle.function("setvariable", setVariable);
    Particle.function("calltest", callTest);

    // Exposed Variables - max 20 (max name length is 12 char)
    Particle.variable("status", statusVariables);
    Particle.variable("network", networkVariables);
    Particle.variable("schedulelist", scheduleString);
    Particle.variable("lastschedule", lastSchedule);
    Particle.variable("lastLostSubs", lastLostSubscription);
    Particle.variable("lasttransmit", lastTransmit);
    Particle.variable("location", locationVar);
    Particle.variable("countTcpCall", strcountTcpUsage);
    
    // Subscriptions
    Particle.subscribe("particle/", handler);				// to handle Spark web responses, like the Public IP
    
    updateWiFiInfo();
}

void loop() {
    cycleTimer_ms = millis() - previousMillis;
    
    if(millis() - previousMillis > 250){
        snprintf(bufferPublish, sizeof(bufferPublish), "loop_delay = %dms, free_memory = %d", cycleTimer_ms, System.freeMemory());
        Particle.publish("debug/loop_delay", bufferPublish, 60, PRIVATE );
    }
    
    previousMillis = millis();
}


int transmit(const char * command){
    Particle.publish("debug/transmit", command, 60, PRIVATE );
    return 1;
}

int setSchedule(const char * command){
    return 1;
}

int setVariable(const char * command){
    updateWiFiInfo();
    return 1;
}

int callTest(const char * command){
    
    if(strncmp(command,"reset",5) == 0){
        System.reset();
    }
    else if(strncmp(command,"rssi",4) == 0){
        snprintf (bufferPublish, sizeof(bufferPublish), "rssi = %d", WiFi.RSSI());
        Particle.publish("debug/rssi", bufferPublish, 60, PRIVATE );
    }
    else if(strncmp(command,"memory",6) == 0){
        snprintf (bufferPublish, sizeof(bufferPublish), "free memory = %d", System.freeMemory());
        Particle.publish("debug/free_memory", bufferPublish , 60, PRIVATE );
    }
    else if(strncmp(command,"ssid",4) == 0){
        Particle.publish("debug/ssid", ssid, 60, PRIVATE );
    }
    return 1;
}


void updateWiFiInfo(){
   snprintf (rssi, sizeof(rssi), "%d", WiFi.RSSI());
   ssid = WiFi.SSID();
   myLocalIp = WiFi.localIP();
   if (Particle.connected()){
       Particle.publish("particle/device/ip");
       Particle.publish("particle/device/name");
   }
}


void handler(const char *topic, const char *data) {

    //Spark.publish("received " + String(topic) + ": " + String(data));

    if(strncmp (topic,"particle/device/ip",18) == 0){
        myPublicIP = data;
    }
    else if(strncmp (topic,"particle/device/name",20) == 0){
        serialName = data;
    }

}

Here is what I see on my console:

@armor, here is a code snippet of a Particle function:

int setHighTH(const char * cmd)
{
    float t;
    
    t = atof(cmd);
}

I like all of the atoX() calls, including atof(), atoi(), etc. It depends on the type of data that the string contains.

@bacichetti, I'm not sure I would use the following snippet from your code:

#define FIRMWARE_VERSION "cloud_test_3";

// Variables
char statusVariables[] = FIRMWARE_VERSION;

Call me old fashioned, but I never try setting one char array to another without a strncpy() or snprintf(). Something like this:

#define FIRMWARE_VERSION "cloud_test_3";

// Variables
char statusVariables[strlen(FIRMWARE_VERSION)+1];

strncpy(statusVariables, FIRMWARE_VERSION, sizeof(statusVariables)-1);

You could even make FIRMWARE_VERSION a char array:

char FIRMWARE_VERSION[] = "cloud_test_3";

or

char FIRMWARE_VERSION[13] = "cloud_test_3";

You should not forget the return someInt; statement at the end of your function that's supposed to return an int.

It's perfectly fine what @bacichetti does here since he isn't assigning a string to another, but he is initialising a character array with a #defined macro.

The preprocessor does first translate

#define FIRMWARE_VERSION "cloud_test_3";

// Variables
char statusVariables[] = FIRMWARE_VERSION;

to

// Variables
char statusVariables[] = "cloud_test_3";

and then the compiler sees this as initialisation of statusVariables[] which is just fine.

However, since the size of statusVariables is dictated by the size of the string literal, I'd rather have it as const char[] to avoid any attempt to assign any string longer than allowed and I'd just do away with the FIRMWARE_VERSION macro and rather have a line

const char FIRMWARE_VERSION[] = "cloud_test_3";

at the top of the project (but then the initialisation of statusVariables[] wouldn't work the same way anymore).

BTW, the const modifier also ensures that there are no two copies of the string literal.
Without const the literal will have to be stored in flash and then also be copied to RAM on initialisation.
With const the variable will just be pointed to the place in flash where the string literal lives and no copy is required.

I guess that is a Particle blessing on atox() function - these all fail silently though!

What about the parsing of a command string?

I can assure you atoi() and atof() work just fine.
If they fail, can you show your code?
For integer values you can also use sscanf() which is particularly useful for multi variable parsing.

But this kind of question is actually not related to the original topic of this thread, so please don't derail the 0.7.0 discussion - your particular questions touch on all Device OS versions and have been answered in other threads already.

I am just referring to stackoverflow debate about the pros and cons of say atoi() versus strtol() or sscanf(). I have been using atoi() and atof() but I am careful what I give them. Sorry for getting slightly off track!
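
To make the difference concrete, here is a minimal sketch of a checked conversion built on strtol, in plain C++. `parseInt` is a hypothetical helper name; the point is only that strtol exposes where parsing stopped, which atoi() does not:

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: convert text to int and report failure, instead of
// silently returning 0 as atoi() does for non-numeric input.
bool parseInt(const char *text, int *out) {
    char *end = nullptr;
    errno = 0;
    long v = std::strtol(text, &end, 10);
    if (end == text) return false;    // no digits at all, e.g. "" or "abc"
    if (*end != '\0') return false;   // trailing characters, e.g. "12x"
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX) return false;  // out of range
    *out = static_cast<int>(v);
    return true;
}
```

Whether the extra checks are worth it depends on how much the caller trusts its input; for values typed into the console by hand they catch typos that atoi() would turn into 0.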

Ref Issues with 0.7.0 - I am downgrading all new products to 0.6.3 because I can OTA flash to 0.7.0 or later but can't OTA from 0.7.0 to 0.6.3. I have had diagnostics running with 0.8.0-rc.8 on 5 devices that were exhibiting problems with connections on 0.7.0. Results are with Dave and Rick @ Particle but it is now the 4th of July holidays in the US :tada:! The most obvious reason I could see for disconnection was that CoAP round-trip times were very high. Neither signal strength nor data quality nor memory appeared to be triggers. I am still trying to understand the diagnostics though - I am promised the meanings of the status values in the reports.
