Currently I’m running more than ten Electrons. All of them perform data acquisition and send data to both the Easy IoT cloud and my private database using a REST API.
Some of my Electrons occasionally stop sending the REST API messages (to both databases). In most cases it requires a push on the reset button to get it up and running again but sometimes it recovers by itself.
This issue is most frequent when a device is moved over some distance or is in an area with a weak or poor signal.
I’m using 3rd party SIM cards (Vodafone Iceland).
The problem seems to be unrelated to the firmware version. The versions that I’m running on these devices are:
0.4.8
0.5.3-rc.1
0.5.3-rc.2
0.5.3.
I haven’t tried version 0.6.0 yet.
I think that the problem might be related to cellular connectivity.
I was hoping some of you might have some input in what might be causing this.
Below is an example of the code running on an Electron (0.5.3) that has been showing some problems:
Just a general suspicion: when problems occur only after a very long time, your String objects might slowly fragment the heap to the point where network operations may not succeed anymore.
If you replace all your String instances with char[] strings, we can at least tick this off as a possible cause.
@brtefla, just a couple of observations besides @ScruffR’s suggestion:
You may want to move your time variable initialization to AFTER you get the correct synchronized time.
You don’t test for a timeout condition after the while().
Perhaps this would serve you better:
int year, month, day, hour, minute, sec;
int sign = 0;
int batt = 0;

void setup()
{
    while (Time.year() < 2000 && millis() < 10000) // Wait for the time to synchronize
    {
        Particle.process();
        delay(200);
    }
    if (millis() >= 10000) // Check if the timeout occurred
    {
        // This is just a suggestion. How you deal with the timeout is your choice
        System.reset(); // Reset system and try again
    }
    // Read the time only after it has been synchronized
    year = Time.year();
    month = Time.month();
    day = Time.day();
    hour = Time.hour();
    minute = Time.minute();
    sec = Time.second();
}
It doesn’t interfere with the network directly, but since we are dealing with a µC with limited resources and no OS like you would find on a computer, memory management is something the programmer has to take into consideration. String objects use dynamic (heap) memory, and if they grow beyond their pre-allocated space (16 bytes by default) they may get relocated, leaving a gap in their previous spot. Over time these gaps can become so numerous but so small that other objects (including the system) that need dynamic memory can’t find a free block big enough for their data and fail to work. This is when the ever-growing heap fragmentation causes trouble.
Full-fledged OSs regularly perform garbage collection and defragmentation to “compress” the data and get the memory back, but no such thing happens here.
Thank you very much for this detailed info!
So basically fragmented Strings are screwing up the memory and the microcontroller, which in the end affects the whole device (including the cellular chip).
I’m not saying that is what happens, but it can happen, and it is often the case when code keeps running for a long time and then unexpectedly starts misbehaving.
@ScruffR, I’m only using String objects locally in functions, so the memory allocations for the objects should be returned at the end of the function call. I have had a device running the same code above, except for the System.reset(), for two months without a problem.
A few months back I added a call to System.reset() to my code, one hour after the code starts running. It does not seem to work in the same way as pressing the physical reset button. A few times it has been necessary to press the reset button on a device that runs System.reset() to get it running again after it stopped sending messages.
When these stops occur the device seems to stop running my script. The status LED on the Electron can show fast blinking blue during these stops, but most often it just shows breathing cyan.
Is there any way to keep the script running no matter what cellular connectivity issue the device is dealing with? Assuming that is the problem.
System.reset() does not affect the cellular modem, which sounds like where your intermittent issues are. There are a few other threads on here that go into more detail.
I had the same problem until I incorporated this example into my program. https://github.com/rickkas7/electronsample I now use that for a template for all my electrons.
I wish there was more documentation/awareness about that potential failure mode. I have programs that work great for months sitting on my home workbench, but at my cabin, which actually has a stronger cell signal, they would only stay running for a few weeks before I added the modem reset.
The short story is that it has not been successful: my devices sometimes stop when the smartReset function is called.
Here comes the long story!
What I implemented was the following:
A smartReset function:
void smartReset()
{
    infoToSDCard("smartReset()", "");
    Particle.disconnect();
    // 16:MT silent reset (with detach from network and saving of NVM parameters), with reset of the SIM card
    Cellular.command(30000, "AT+CFUN=16\r\n");
    Cellular.off();
    delay(1000);
    System.reset();
}
A function that checks various things and records the results to an SD card. The smartReset function is also called in some cases. What I find interesting is that I never get RESP_OK from the ping command. As one can see below, I’m relying on Particle.connected() to detect when the connection is down.
    if(Cellular.command(PING_TIMEOUT, "AT+UPING=api.particle.io\r\n") != RESP_OK)
    {
        infoToSDCard("Ping api.particle.io failed.", "");
    }
    if(Cellular.command(PING_TIMEOUT, "AT+UPING=8.8.8.8\r\n") != RESP_OK)
    {
        infoToSDCard("Ping 8.8.8.8 failed.", "");
    }
    if(Cellular.listening())
    {
        infoToSDCard("Cellular.listening()", "1");
        smartReset();
    }
    if(!Particle.connected())
    {
        infoToSDCard("Particle.connected()", "0");
        if(cloudCheckStart == 0) // This runs smartReset if connection to the particle cloud has been lost for CLOUD_WAIT_FOR_REBOOT time
        {
            cloudCheckStart = millis();
        }
        long int connLostTime = millis() - cloudCheckStart;
        infoToSDCard("Time from connection was lost", String(connLostTime));
        if(cloudCheckStart > 0 && (connLostTime >= CLOUD_WAIT_FOR_REBOOT)) {
            smartReset();
        }
    }
    else
    {
        cloudCheckStart = 0; // reset the timer once the connection is re-established
        infoToSDCard("Particle.connected() successful", "");
    }
}
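The connection-loss timing in the snippet above can be separated from the Particle calls so it can be checked off-device. What follows is only a sketch under assumptions: the ConnectionWatch type and its update() method are hypothetical names, not part of the Particle API or of the original code. It uses an explicit flag instead of treating 0 as "not timing", and unsigned subtraction so the comparison still holds when the millisecond counter rolls over.

```cpp
#include <cstdint>

// Hypothetical helper: tracks how long the cloud connection has been down.
struct ConnectionWatch {
    uint32_t timeoutMs;       // how long to tolerate a lost connection
    bool     timing = false;  // true while a loss is being timed
    uint32_t lostAt = 0;      // counter value when the loss was first seen

    explicit ConnectionWatch(uint32_t timeout) : timeoutMs(timeout) {}

    // Call on every check with the current connection state and the current
    // millisecond counter. Returns true once the connection has been down
    // for at least timeoutMs, i.e. when a reset would be due.
    bool update(bool connected, uint32_t nowMs)
    {
        if (connected) {
            timing = false;   // connection is back, stop timing
            return false;
        }
        if (!timing) {        // first check that sees the loss
            timing = true;
            lostAt = nowMs;
        }
        // Unsigned subtraction stays correct across counter rollover
        return static_cast<uint32_t>(nowMs - lostAt) >= timeoutMs;
    }
};
```

On the device this would be fed with something like `watch.update(Particle.connected(), millis())`, with the reset routine called when it returns true.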
A watchdog that calls smartReset after one minute of an inactive loop() function. This actually works: if I unplug the antenna, the device restarts after a minute of blinking green LED.
Additionally, I’m calling smartReset every two hours.
At first after I implemented this functionality it seemed to be more stable than before. But over the past weeks it has turned out to be just as bad as it was before.
I noticed after the implementation that the Electrons were sometimes blinking red, but for a maximum of one minute. So it’s like the watchdog is working when this happens. Sometimes it recovers by itself, but not always. The system log on the SD card tells me that the last thing that happened in my code before the Electron stopped is a call to the smartReset function.
So, my questions to you guys are the following:
-Has anyone experienced a similar problem with the smartReset function?
-Has anyone tried the ping function?
-Are you in general monitoring the cellular connectivity and taking some action in case of no connection?
-Has anyone tried to keep the modem turned off except when it is needed? I know it does not make sense for all applications, but when you are sending measurements every ten minutes it might help.
I’m not 100% certain the smart reset function will work properly in system-thread-enabled mode. I wrote it before I was aware of some of the difficulties with that. My suggestion is to try putting the Electron in SLEEP_MODE_DEEP for maybe 10 seconds instead and see if that works more reliably.