Recurring failures of individual Electrons

Hi community,

I have been fighting several problems for many days, and I hope you can help me find the cause. The symptom is that no events from these Electrons reach my application or the Particle console. I have a Python app that waits for events using SSE. This interface works, so the problem must be on the Electron side. I have 4 Electrons running in a field test, so I cannot really produce logs that could help, and I cannot watch the devices the whole time. I only see the physical status once I notice that there has been no event for some hours. Sometimes the status LED is off, sometimes it breathes white, and sometimes it flashes green forever (I waited at most 10 minutes before I tried a reset). After a reset, the system works again for a few hours. Then the same issue is back.

First, I thought it could be the current problem with a Particle service (topic “Electron won’t publish events”), because I sometimes saw fast cyan flashing with red flashes. After Particle’s fix this disappeared, but the main problem is still there.

Second, I thought it could be the “String - heap” problem, because the problems appear irregularly after a few hours or some days. So I replaced all Strings with char arrays, but this didn’t change anything.

Third, I thought the problem could be that I don’t always call Cellular.on() before my System.sleep(…DEEP…), and that I use semi-automatic mode. So I put Cellular.on() before the sleep command, but this didn’t help either. I have Particle SIM cards.

The code on my Electrons does the following:

  • Every 15 minutes it measures four values with two sensors and the onboard power unit: 2x temperature, 1x humidity, 1x SOC
  • The data is stored in the EEPROM in four queues
  • After the 8th round of measurements, the queued data is published to the Particle cloud
  • The code also has a cloud function that receives borders (thresholds) which trigger an earlier publish.

On the whole, it is a temperature sensor that publishes its measurements at 2-hour intervals. The Python app generates alerts and does other stuff. The alert thresholds are saved on the Electron via the cloud function, so the Electron publishes immediately (before the 2-hour interval is over) when a threshold/border is crossed.
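
As a rough illustration of the threshold handling described above, here is a minimal sketch of how a '|'-separated threshold string could be parsed with a bounds check, so that a message with too many values cannot write past the end of the array. The name parseThresholds and the exact input format are assumptions made for illustration; the sketch uses only the standard C library and no Particle API.

```cpp
#include <cstdlib>

// Hypothetical helper (not part of the posted firmware): parses up to
// maxCount '|'-separated floats from input into out[].
// Returns the number of values stored, or -1 if the input contains more
// than maxCount values, an empty field, or a non-numeric field.
int parseThresholds(const char* input, float* out, int maxCount) {
    int count = 0;
    const char* p = input;
    while (*p != '\0') {
        if (count >= maxCount) return -1;   // another value would overflow out[]
        char* end = nullptr;
        float v = std::strtof(p, &end);
        if (end == p) return -1;            // nothing numeric at this position
        out[count++] = v;
        if (*end == '|') {
            p = end + 1;                    // skip the separator
        } else if (*end == '\0') {
            p = end;                        // reached the end of the input
        } else {
            return -1;                      // unexpected character after a number
        }
    }
    return count;
}
```

A cloud function handler could call this with json.c_str() and reject the message unless exactly six values come back, before anything is written to the EEPROM.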

My English is not the best, so maybe the terms are wrong :wink:

This is my code (without Serial prints):

#include "SparkJson/SparkJson.h"
#include "SHT1x/SHT1x.h"
#include "OneWire/OneWire.h"
#include "math.h"

PRODUCT_ID(XXXX);
PRODUCT_VERSION(X);
SYSTEM_MODE(SEMI_AUTOMATIC);

OneWire ds = OneWire(D4);
FuelGauge fuel;

const unsigned long timeToSleep = 60 * 15; // 15 minutes
const int size_of_queue   = 8; // reserve var * 4 bytes for addresses = 2 hours
const int steps_to_force = size_of_queue * 1; // force sending after 2 hours

int address_counter_force = 10;
int address_tempDS     = 20;
int address_tempSH     = 70;
int address_humiSH     = 120;
int address_soc        = 220;
int address_tempDS_max = 270;
int address_tempDS_min = 320;
int address_tempSH_max = 370;
int address_tempSH_min = 420;
int address_humiSH_max = 470;
int address_humiSH_min = 520;

float tempDS_array[size_of_queue];
float tempSH_array[size_of_queue];
float humiSH_array[size_of_queue];
float soc_array[size_of_queue];

// default values of borders to avoid RTEs
float ds_temp_max = 0;
float ds_temp_min = 0;
float sh_temp_max = 0;
float sh_temp_min = 0;
float sh_humi_max = 0;
float sh_humi_min = 0;
int current_counter_force = 0;
bool border_crashed  = false;

char publishDString[80];
char publishSString[80];
char publishHString[80];
char publishPString[80];

char* latitude;
char* longitude;

SHT1x sht1x(0, 1); // dataPin, clockPin

void setup() {
    Cellular.off();
    Particle.function("borders", saveBorders);
}

int saveBorders(String json) {    
    // parse string
    int lastFound    = -1; // needed to get first char
    int currentFound = 0;
    int step         = 0;
    float borders[6];
    while(currentFound != -1) {
        currentFound = json.indexOf("|", lastFound + 1);
        if(currentFound != -1) {
            borders[step] = json.substring(lastFound + 1, currentFound).toFloat();
            lastFound = currentFound;
            step++;
        }
    }
    
    if(currentFound == -1 && step == 0) return -1; // if no | is found, return error.
    
    ds_temp_max = borders[0];
    ds_temp_min = borders[1];
    sh_temp_max = borders[2];
    sh_temp_min = borders[3];
    sh_humi_max = borders[4];
    sh_humi_min = borders[5];

    EEPROM.put(address_tempDS_max, ds_temp_max);
    EEPROM.put(address_tempDS_min, ds_temp_min);
    EEPROM.put(address_tempSH_max, sh_temp_max);
    EEPROM.put(address_tempSH_min, sh_temp_min);
    EEPROM.put(address_humiSH_max, sh_humi_max);
    EEPROM.put(address_humiSH_min, sh_humi_min);
  
    return 1;
}

float average(float * array, int len) {
    float sum = 0.0 ;
    for (int i=0; i<len; i++) {
        sum += array[i];
    }
    return sum / len;
}

void loop(void) {
    byte i;
    byte present = 0;
    byte type_s;
    byte data[12];
    byte addr[8];
    float tempDS, tempSH, humiSH, batterySOC;

    if (!ds.search(addr)) {
        ds.reset_search();
        delay(250);
        return;
    } 

    if (OneWire::crc8(addr, 7) != addr[7]) {
        return;
    }

    type_s = 0;

    ds.reset(); // first clear the 1-wire bus
    ds.select(addr); // now select the device we just found
    ds.write(0x44, 0); // or start conversion in powered mode (bus finishes low)

    delay(1000); // maybe 750ms is enough, maybe not, wait 1 sec for conversion

    present = ds.reset();
    ds.select(addr);
    ds.write(0xB8,0); // Recall Memory 0
    ds.write(0x00,0); // Recall Memory 0

    present = ds.reset();
    ds.select(addr);
    ds.write(0xBE,0); // Read Scratchpad

    for (i = 0; i < 9; i++) { // we need 9 bytes
        data[i] = ds.read();
    }
    int16_t raw = (data[1] << 8) | data[0];
    byte cfg = (data[4] & 0x60);

    if (cfg == 0x00) raw = raw & ~7; // 9 bit resolution, 93.75 ms
    if (cfg == 0x20) raw = raw & ~3; // 10 bit res, 187.5 ms
    if (cfg == 0x40) raw = raw & ~1; // 11 bit res, 375 ms

    tempDS = (float)raw * 0.0625;

    // remove random errors
    if(tempDS > 84.0) {
        tempDS = tempDS_array[size_of_queue-2]; // take the last measured value instead of the error
    }

    // Read values from the SHT10 sensor
    tempSH = sht1x.readTemperatureC();
    humiSH = sht1x.readHumidity();

    // Read values of battery
    batterySOC = fuel.getSoC();
    
    // routine to check a border crash
    EEPROM.get(address_tempDS_max, ds_temp_max);
    EEPROM.get(address_tempDS_min, ds_temp_min);
    EEPROM.get(address_tempSH_max, sh_temp_max);
    EEPROM.get(address_tempSH_min, sh_temp_min);
    EEPROM.get(address_humiSH_max, sh_humi_max);
    EEPROM.get(address_humiSH_min, sh_humi_min);
    
    if(tempDS > ds_temp_max || tempDS < ds_temp_min || tempSH > sh_temp_max || tempSH < sh_temp_min || humiSH > sh_humi_max || humiSH < sh_humi_min) {
        border_crashed = true;
    }

    // read EEPROM
    EEPROM.get(address_counter_force, current_counter_force);
    EEPROM.get(address_tempDS, tempDS_array);
    EEPROM.get(address_tempSH, tempSH_array);
    EEPROM.get(address_humiSH, humiSH_array);
    EEPROM.get(address_soc, soc_array);
    
    // move every entry one position back
    for(int n=0; n<size_of_queue; n++) {
        tempDS_array[n] = tempDS_array[n + 1];
        tempSH_array[n] = tempSH_array[n + 1];
        humiSH_array[n] = humiSH_array[n + 1];
        soc_array[n]    = soc_array[n + 1];
    }
    
    tempDS_array[size_of_queue - 1]  = tempDS;
    tempSH_array[size_of_queue - 1]  = tempSH;
    humiSH_array[size_of_queue - 1]  = humiSH;
    soc_array[size_of_queue - 1]     = batterySOC;
  
    // store new values in EEPROM
    EEPROM.put(address_tempDS, tempDS_array);
    EEPROM.put(address_tempSH, tempSH_array);
    EEPROM.put(address_humiSH, humiSH_array);
    EEPROM.put(address_soc, soc_array);
    
    // Need to do this to go in the correct sleep mode: https://community.particle.io/t/electron-sleep-mode-deep-tips-and-examples/27823
    Cellular.on();

    // dont send data every time. Store in EEPROM and send whole array sometimes.
    if ((current_counter_force >= (steps_to_force - 1)) || border_crashed) { // "- 1" because counter starts with 0
        EEPROM.put(address_counter_force, 0);

        Cellular.connect();
        waitUntil(Cellular.ready);
        Particle.connect();
        waitUntil(Particle.connected);
        Particle.process();
      
        // build transfer string
        publishDString[0] = '[';
        publishSString[0] = '[';
        publishHString[0] = '[';
        publishPString[0] = '[';

        for(int i=0; i<size_of_queue; i++) {
            if(i == (size_of_queue - 1)) { // last iteration -> closing bracket
                sprintf(publishDString, "%s%.2f]", publishDString, tempDS_array[i]);
                sprintf(publishSString, "%s%.2f]", publishSString, tempSH_array[i]);
                sprintf(publishHString, "%s%.2f]", publishHString, humiSH_array[i]);
                sprintf(publishPString, "%s%.2f]", publishPString, soc_array[i]);
            } else {
                sprintf(publishDString, "%s%.2f,", publishDString, tempDS_array[i]);
                sprintf(publishSString, "%s%.2f,", publishSString, tempSH_array[i]);
                sprintf(publishHString, "%s%.2f,", publishHString, humiSH_array[i]);
                sprintf(publishPString, "%s%.2f,", publishPString, soc_array[i]);
            }
        }
        
        Particle.publish("tempSMS", String::format("{\"d\":\"%s\",\"s\":\"%s\",\"h\":\"%s\",\"p\":\"%s\",\"t\":\"%s\"}", publishDString, publishSString, publishHString, publishPString, String::format("%d", timeToSleep).c_str()), PRIVATE); // data max 255 bytes
        
        Particle.process();
        
        delay(10000); // wait 10sec for new borders
    } else {
        EEPROM.put(address_counter_force, current_counter_force + 1);
    }
    delay(1000); //can be removed with firmware 0.6.1
    System.sleep(SLEEP_MODE_DEEP, timeToSleep - (millis() / 1000));
}

Can someone help me? I’m very frustrated because I want a stable version of my application. At this point, the Electrons stop working after some hours, or sometimes after 2 days :frowning: Maybe it is a very simple mistake of mine that causes a lot of chaos :wink:

Best regards,
Niklas

for(int n=0; n < size_of_queue; n++) {
        tempDS_array[n] = tempDS_array[n + 1];
        tempSH_array[n] = tempSH_array[n + 1];
        humiSH_array[n] = humiSH_array[n + 1];
        soc_array[n]    = soc_array[n + 1];
    }

Isn’t that loop supposed to go only to size_of_queue - 1?
But I’d go for a circular buffer anyway rather than shifting the values through the array.

I’d also suggest using snprintf(buf, sizeof(buf), ...) rather than sprintf(buf, ...), especially when dealing with recursive string building :wink:


Thanks for your response.

isn't that loop supposed only to go to size_of_queue - 1?

Yes, you're right :slight_smile:

But I'd rather go for a circular buffer anyway rather than having to shift the values through the array.

I had this in my previous versions, but the problem is that I have to tell my Python app which value is the most recent one. With the queue, it is always the last element. With a circular buffer, I need to publish this information, too.

I still have two String operations in the Particle.publish line. Could this cause heap problems? If yes, how can I replace them with char arrays? The publish function needs a String as parameter, doesn't it?

What I have changed:

[...]
for(int n=0; n<size_of_queue - 1; n++) {
        tempDS_array[n] = tempDS_array[n + 1];
        tempSH_array[n] = tempSH_array[n + 1];
        humiSH_array[n] = humiSH_array[n + 1];
        soc_array[n]    = soc_array[n + 1];
    }
[...]
for(int i=0; i<size_of_queue; i++) {
    if(i == (size_of_queue - 1)) { // last iteration -> closing bracket
        snprintf(publishDString, sizeof(publishDString), "%s%.2f]", publishDString, tempDS_array[i]);
        snprintf(publishSString, sizeof(publishSString), "%s%.2f]", publishSString, tempSH_array[i]);
        snprintf(publishHString, sizeof(publishHString), "%s%.2f]", publishHString, humiSH_array[i]);
        snprintf(publishPString, sizeof(publishPString), "%s%.2f]", publishPString, soc_array[i]);
    } else {
        snprintf(publishDString, sizeof(publishDString), "%s%.2f,", publishDString, tempDS_array[i]);
        snprintf(publishSString, sizeof(publishSString), "%s%.2f,", publishSString, tempSH_array[i]);
        snprintf(publishHString, sizeof(publishHString), "%s%.2f,", publishHString, humiSH_array[i]);
        snprintf(publishPString, sizeof(publishPString), "%s%.2f,", publishPString, soc_array[i]);
    }
}
[...]

I will flash this onto my devices and hope :wink: I will report back.

Thanks,
Niklas

Why? If you start building your transfer string from the tail, the head value will always be the last one added to the string.
You just do the indexing like this

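  // val[] is assumed to hold the already formatted values as C strings
  strcpy(buf, "[");
  for(int i=0; i<size_of_queue; i++) {
    // strncat's third argument is the space left in buf, not its total size
    strncat(buf, val[(tail + i) % size_of_queue], sizeof(buf) - strlen(buf) - 1);
    strncat(buf, (i < size_of_queue-1) ? "," : "]", sizeof(buf) - strlen(buf) - 1);
  }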

Nope, Particle.publish() also has an overload that takes const char*.
So you'd just write

   char pubData[256];
   snprintf(pubData, sizeof(pubData)
           ,"{\"d\":\"%s\",\"s\":\"%s\",\"h\":\"%s\",\"p\":\"%s\",\"t\":\"%lu\"}" // %lu because timeToSleep is unsigned long
           ,publishDString
           ,publishSString
           ,publishHString
           ,publishPString
           ,timeToSleep
           );
   Particle.publish("tempSMS", pubData, PRIVATE); // data max 255 bytes

Why? If you start building your transfer string from the tail the head value will always be the last to be added to the string.

Yes, you're right. I didn't think in this direction. So I will replace the queue with a circular buffer once I have solved the other problems.

I flashed the new firmware without any String on 4 Electrons to run the test. Two work perfectly so far :wink: I'll wait another day.

The other two have a new, strange issue. When the Electron tries to connect to the mobile network and the cloud, it fails. It flashes green for a long time. After roughly 5 minutes, a white flash appears (it looks like a reboot) and the green flashing continues. In 50% of the cases, after a manual reset or 1-3 of these self-reboots, it starts flashing cyan after 1 minute of green flashing.

This cyan flashing lasts 1 minute, then fast cyan flashing starts and lasts for varying durations. It is interrupted by red flashes: sometimes only one red flash, sometimes many (~10), but never an SOS pattern. After the red flash, the cyan flashing (fast or slow) starts again.

I also flashed a simple firmware that only clears the EEPROM and does nothing else (automatic mode) -> same issue pattern :frowning: So I don't think my software is the root of the errors?

I hope you understand the steps. I tried to make a video, but the flashes are recorded badly.

-Niklas

Flashing cyan with red blips sounds like a keys issue. Have you tried running particle keys doctor using the Particle CLI?


ScruffR gave me the hint that my green-then-white issue seems to indicate a problem with the power supply. The SOC was ~40%, but maybe this reading was wrong, because I sometimes get an SOC higher than 100% from some Electrons. I have the 2G version of the Electron (SARA-G350).

Following the hints of @will and @ScruffR I did the following:

  1. Charge the LiPo until the red charging LED goes off.
  2. Run particle keys server (DFU mode)
  3. Run particle keys doctor [ElectronID] (DFU mode)
  4. Press the reset button

Then it flashes green for ~30s, flashes cyan, fast-flashes cyan for ~10s, shows one red flash, fast-flashes cyan for ~5s, and then breathes cyan. :+1: The second and third tries work, too.

What could be the reason that the keys got confused? What can I do so that this won’t happen again?

@Niklas Are you using any code to prevent the Electron from running the battery down past 10 - 20% SOC? That can prevent various issues that can happen when the Electron runs on a low or dead battery.

@RWB
Yes, that is one of my lessons learned, and it is now on the to-do list. Maybe this is possible with the onboard power management unit and its functions. But first I have to read the docs.

What I am also wondering about is that a fully charged LiPo only shows an SOC of 80-90% and not 100%. I know how hard it is to calculate an SOC, but my feeling is that there could be a default option that prevents charging up to 100% because that could shorten the LiPo’s lifetime. Has anybody else seen a maximum SOC of 80-90%?

Here is the code I'm using after months of testing an Electron running off the 2000mAh battery + a 3W / 5V solar panel outside. You can use the same code to prevent your Electrons from crashing from low battery voltage.

SYSTEM_MODE(SEMI_AUTOMATIC);
//SYSTEM_THREAD(ENABLED);
// This #include statement was automatically added by the Particle IDE.
#include "Ubidots/Ubidots.h"

#define TOKEN "Token Here"  // Put here your Ubidots TOKEN
#define DATA_SOURCE_NAME "ElectronSleepNew"

SerialLogHandler logHandler(LOG_LEVEL_ALL);  //This serial prints system process via USB incase you need to debug any problems you may be having with the system.

Ubidots ubidots(TOKEN); // A data source with particle name will be created in your Ubidots account


int button = D0;         // Connect a Button to Pin D0 to Wake the Electron when in System Sleep mode. 
int ledPin = D7;         // LED connected to D7
int sleepInterval = 60;  // This is used below for sleep times and is equal to 60 seconds of time. 

ApplicationWatchdog wd(660000, System.reset); //This Watchdog code will reset the processor if the dog is not kicked every 11 mins which gives time for 2 modem resets. 

void setup() {
 //Serial.begin(115200);
 pinMode(button, INPUT_PULLDOWN);  // Sets pin as input
 pinMode(ledPin, OUTPUT);          // Sets pin as output

 ubidots.setDatasourceName(DATA_SOURCE_NAME); //This name will automatically show up in Ubidots the first time you post data. 
 
 PMIC pmic; //Initalize the PMIC class so you can call the Power Management functions below. 
 pmic.setChargeCurrent(0,0,1,0,0,0); //Set charging current to 1024mA (512 + 512 offset)
 pmic.setInputVoltageLimit(4840);   //Set the lowest input voltage to 4.84 volts. This keeps my 5v solar panel from operating below 4.84 volts.  
}

void loop() {
    
FuelGauge fuel; // Initalize the Fuel Gauge so we can call the fuel gauge functions below. 
 
    
if(fuel.getSoC() > 20) // If the battery SOC is above 20% then we will turn on the modem and then send the sensor data. 
  {
   
   float value1 = fuel.getVCell();
   float value2 = fuel.getSoC();
   
  ubidots.add("Volts", value1);  // Change for your variable name
  ubidots.add("SOC", value2);    

  Cellular.connect();  // This command turns on the Cellular Modem and tells it to connect to the cellular network. 
  
   if (!waitFor(Cellular.ready, 600000)) { //If the cellular modem does not successfuly connect to the cellular network in 10 mins then go back to sleep via the sleep command below. After 5 mins of not successfuly connecting the modem will reset.  
    
    System.sleep(D0, RISING,sleepInterval * 2, SLEEP_NETWORK_STANDBY); //Put the Electron into Sleep Mode for 2 Mins + leave the Modem in Sleep Standby mode so when you wake up the modem is ready to send data vs a full reconnection process.  
    
}  
  
     ubidots.sendAll(); // Send fuel gauge data to your Ubidots account. 

     digitalWrite(ledPin, HIGH);   // Sets the LED on
     delay(250);                   // waits 250 ms
     digitalWrite(ledPin, LOW);    // Sets the LED off
     delay(250);                   // waits 250 ms
     digitalWrite(ledPin, HIGH);   // Sets the LED on
     delay(250);                   // waits 250 ms
     digitalWrite(ledPin, LOW);    // Sets the LED off
  
     System.sleep(D0, RISING,sleepInterval * 2, SLEEP_NETWORK_STANDBY); //Put the Electron into Sleep Mode for 2 Mins + leave the Modem in Sleep Standby mode so when you wake up the modem is ready to send data vs a full reconnection process.  
    
  }
  else //If the battery SOC is below 20% then we will flash the LED 4 times so we know. Then put the device into deep sleep for 1 hour and check SOC again. 
  {
      
  //The 6 lines of code below are needed to turn off the Modem before sleeping if your using SYSTEM_THREAD(ENABLED); with the current 0.6.0 firmware. It's a AT Command problem currently. 
  //Cellular.on();
  //delay(10000);
  //Cellular.command("AT+CPWROFF\r\n");
  //delay(2000);
  //FuelGauge().sleep();
  //delay(2000);
  
  
  digitalWrite(ledPin, HIGH);   // Sets the LED on
  delay(150);                   // Waits 150 ms
  digitalWrite(ledPin, LOW);    // Sets the LED off
  delay(150);                   // Waits 150 ms
  digitalWrite(ledPin, HIGH);   // Sets the LED on
  delay(150);                   // Waits 150 ms
  digitalWrite(ledPin, LOW);    // Sets the LED off
  delay(150);                   // Waits 150 ms
  digitalWrite(ledPin, HIGH);   // Sets the LED on
  delay(150);                   // Waits 150 ms
  digitalWrite(ledPin, LOW);    // Sets the LED off
  delay(150);                   // Waits 150 ms
  digitalWrite(ledPin, HIGH);   // Sets the LED on
  delay(150);                   // Waits 150 ms
  digitalWrite(ledPin, LOW);    // Sets the LED off
  
  System.sleep(SLEEP_MODE_DEEP, 3600);  //Put the Electron into Deep Sleep for 1 Hour. 
  
  }
}    
    

This is normal and has to do with the fuel gauge chip wanting a higher charge voltage before it reads 100%. Normally I get 82-86% when the battery is fully charged to 4+ volts. You can use the scale function to scale the battery SOC from 0-86 to 0-100% if you desire.
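
The rescaling described above is a simple linear mapping. Here is a minimal sketch, assuming the raw reading at full charge is known (86 is just the figure quoted in this post, so measure it on your own device); scaleSoc is a hypothetical helper name and not a Particle API. The result is clamped, which also covers raw readings above the assumed maximum.

```cpp
// Hypothetical helper: rescales a raw fuel gauge reading so that
// fullScaleRaw (the raw value seen at full charge) is reported as 100%.
// The result is clamped to the range 0..100.
float scaleSoc(float rawSoc, float fullScaleRaw) {
    if (fullScaleRaw <= 0.0f) return 0.0f;      // invalid calibration value
    float scaled = rawSoc * 100.0f / fullScaleRaw;
    if (scaled < 0.0f) scaled = 0.0f;
    if (scaled > 100.0f) scaled = 100.0f;
    return scaled;
}
```

In the firmware this would be called as scaleSoc(fuel.getSoC(), 86.0f) before the value is stored or compared against a low-battery limit.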


Hello @RWB

"The 6 lines of code below are needed to turn off the Modem before sleeping if your using SYSTEM_THREAD(ENABLED); with the current 0.6.0 firmware. It's a AT Command problem currently."

Has this already been corrected in version 0.6.1?

Not sure, you will have to check the 0.6.1 firmware notes.

@RWB
Thanks for your information. I will test parts of your code in my system.

@will @ScruffR
My systems have been running for two days now. It seems to be stable now :smile:
You are all great!


@RWB

I only just now see what cool stuff you did in your posted code. It helps me very much in my dev process, for example debugging without Tinker or the intelligent way of handling connection failures.

Thank you for this :wink:
