With our firmware, 0.8.0-rc.11 crashes when MODE button is pressed

After upgrading to 0.8.0-rc.11, I noticed a lot of system instability. In particular, pressing the MODE button causes the device to crash immediately, which makes it impossible to enter listening mode.

The SOS code is #10, Assertion failure.

We don’t see this problem on 0.7.0.

Note that this does not occur with the simple example below:

#include "Particle.h"

// Set your 3rd-party SIM APN here
// https://docs.particle.io/reference/firmware/electron/#setcredentials-
STARTUP(cellular_credentials_set("broadband", "", "", NULL));

SYSTEM_MODE(AUTOMATIC);

/* This function is called once at start up ----------------------------------*/
void setup()
{
    // Turn off hardware charging
    PMIC().disableCharging();

    // Set the keep-alive value for 3rd party SIM card here
    // https://docs.particle.io/reference/firmware/electron/#particle-keepalive-
    Particle.keepAlive(60);
}

/* This function loops forever --------------------------------------------*/
void loop()
{
    //This will run in a loop
}

So this appears to be caused by an interaction with our user firmware, rather than the MODE button simply not working in 0.8.0.

Interested in thoughts on how to track down the problem. For info, we are using system threads and MANUAL mode.

Thanks for the report – will try to replicate using an rc.11 device at the office in a couple of hours. Have forwarded your report to our internal team for investigation as well.

More info to come!

@kubark42 If it doesn’t happen on 0.8.0-rc.10, then you are probably dynamically allocating memory in an interrupt. An assert was added in 0.8.0-rc.11 for that. Please give that test a try.
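
For illustration, here is a minimal sketch of the usual way to keep allocations out of interrupt context: the interrupt-side callback only sets a flag, and the heap work happens later from the main loop. It is plain C++ so it can be compiled off-device; `buttonPressed`, `lastDebugMessage`, `onButtonInterrupt()` and `serviceButton()` are hypothetical names, not Device OS APIs, and this is not taken from the firmware being discussed.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical names for illustration only; none of these are Device OS APIs.
std::atomic<bool> buttonPressed{false};  // written from interrupt context
std::string lastDebugMessage;            // heap-backed, touched only from the main loop

// Stand-in for an interrupt or button callback.
// It only sets a flag: no new/malloc, no string building.
void onButtonInterrupt() {
    buttonPressed.store(true);
}

// Stand-in for one pass of loop(). Returns true if a pending press was handled.
bool serviceButton() {
    // Atomically read and clear the flag so each press is handled once.
    if (!buttonPressed.exchange(false)) {
        return false;
    }
    // Dynamic allocation is safe here because we are no longer in the interrupt.
    lastDebugMessage = std::string("MODE button pressed; handled outside the interrupt");
    return true;
}
```

On a device, the same idea would mean calling something like `serviceButton()` from `loop()` and keeping debug strings out of the callback itself.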

I could not reproduce this with 0.8.0-rc.10. I tried -rc.10 and -rc.11 twice each.

I haven’t written any interrupt handlers myself, although there are MQTT and timer callbacks. My understanding was that those were xTasks. There’s also the Serial lib @rickkas7 wrote, https://github.com/rickkas7/SerialBufferRK. When I look through the code there, it also looks like there’s no interrupt. I am certainly allocating some dynamic memory in those callbacks because I have some debug strings. Should I remove all dynamic allocations from non-main tasks?

So it just occurred to me that you specifically mentioned you are pressing the MODE button when this happens. Are you using the system event handler for button presses and doing some dynamic memory allocation in that handler?

Pressing the MODE button in general on the Electron with 0.8.0-rc.11 still works fine: 1 press flashes the number of bars of signal strength and 2 presses puts the Electron in soft power off mode.

Are you using an Electron, Photon, or P1? On the Photon/P1, the MODE button doesn’t do anything by default.

Using an Electron, not using the button in the software. I only noticed the problem because I wanted to put the Electron into listening mode so that I could query the firmware to make sure the upgrade went through correctly.

P.S. This was part of a lab experiment to see how we can OTA upgrade firmware in the field. I love that all I have to do is push firmware compiled against 0.8.0-rc.X and the safe mode will automatically upgrade the OS firmware!

Oh hmm... how much free memory do you have before you enter Listening Mode? I believe it needs about 21KB now, but that really shouldn't have changed much since 0.7.0. The SOS 10 is happening after a 3 second count of holding the MODE button?

The SOS happens immediately when I press the MODE button.

The RAM has around 69k free when I looked yesterday.

P.S. Happy to share the firmware if this becomes an interesting problem. We have an NDA in place with Particle.

That is quite interesting. Are you using any system events? If so, try to disable all of them first.

Would it be possible for you to chop out large parts of your firmware until you narrow it down to the thing (or a limited number of things) that could be causing it? That would be a big help, since you know your application firmware best. Another thing to try is running your application firmware on an Electron without any of your external hardware connected. Does the problem still persist?

Other than that, the easiest way to figure out what's going on is a JTAG/SWD debugging session where you cause the issue, and then run a backtrace to figure out what called the SOS 10. I could link you here to some good tutorials, but it's a rabbit hole to go down if you've never done it before. I'm really looking forward to this being made easier with Workbench and the new Particle debugger.