I wrote code to control a water pump, based on feedback from some buoys (monitored by another Photon, in a different location/network). Both Photons run the same firmware version (‘stable’ 0.7.0). The Photon running the buoy code has been fine since installation one week ago (AUTOMATIC mode), but the one in the field controlling the pump keeps hanging after some hours of running, and only comes back to life when reset by pressing the button. I’m using an Application Watchdog (calling System.reset(), with a checkin() call at the end of loop()). I also monitor WiFi.ready() and Particle.connected() and call System.reset() if the cloud or WiFi is disconnected for some time (180 s). I tried the code in both MANUAL and AUTOMATIC modes, and the same happened in both. The code is quite simple. I almost never use delay(); it just has some FSMs, occasional access to EEPROM (to store the number of reconnections to WiFi/cloud and pump activations), and Particle.subscribe() and Particle.publish(), both at a 30-second rate.
Is there any known issue related to that?
I forgot to mention that I tried two different Photon units, and the internet connection is not very stable at the pump (unlike at the buoys).
Could you test a device running Tinker, to see if that makes a difference in that location?
There’s a good likelihood that something in your code is causing issues, but without seeing that, it’s nearly impossible to tell.
The code is much simpler than Tinker. There are 3 state machines that control which message is published and whether the pump is activated. Other than that, there is only monitoring of the wireless and cloud connections and access to EEPROM. I don’t use any while loop or other blocking statement in loop(); I only use while() inside setup():
ApplicationWatchdog wd(120000, System.reset);
void setup()
{
WiFi.on();
WiFi.connect();
Particle.process();
delay(1000);
while (!WiFi.ready())
{
if (millis() > DISCONNECT_TIMEOUT)
{
System.reset();
}
Particle.process();
delay(250);
}
wifiConected = true;
EEPROM.get(EEPROM_CON_WIFI, conexWifi);
EEPROM.put(EEPROM_CON_WIFI, ++conexWifi);
while (!Particle.connected())
{
if (millis() > DISCONNECT_TIMEOUT)
{
System.reset();
}
Particle.connect();
Particle.process();
delay(250);
}
cloudConected = true;
EEPROM.get(EEPROM_CON_CLOUD, conexCloud);
EEPROM.put(EEPROM_CON_CLOUD, ++conexCloud); // cloud counter goes to its own EEPROM address, not the WiFi one
Particle.publish(pubId, "RESETS:" + String(numResets) + "-" + String(WiFi.SSID()), PRIVATE);
delay(1000);
Particle.publish(pubId, "PUMP_ONLINE", PRIVATE);
}
void loop()
{
currMs = millis();
/*
...
no whiles, no delays, no blocking statements
...
*/
Particle.process();
if (!WiFi.ready())
{
wifiConected = false;
// WiFi.on();
if (currMs - particle_conn_currMs > DISCONNECT_TIMEOUT)
{
digitalWrite(PIN_PUMP, LOW);
delay(1000); // OK, only this delay
System.reset();
}
WiFi.connect();
// delay(250);
}
if (WiFi.ready())
{
if (!wifiConected)
{
wifiConected = true;
EEPROM.put(EEPROM_CON_WIFI, ++conexWifi);
}
if (!Particle.connected())
{
cloudConected = false;
if (currMs - particle_conn_currMs > DISCONNECT_TIMEOUT)
{
digitalWrite(PIN_PUMP, LOW);
delay(1000); // And this one
System.reset();
}
Particle.connect();
}
if (Particle.connected())
{
if (!cloudConected)
{
cloudConected = true;
EEPROM.put(EEPROM_CON_CLOUD, ++conexCloud);
}
particle_conn_currMs = currMs;
}
}
wd.checkin();
}
The WiFi.on(), WiFi.connect() and Particle.connect() calls are still there because I also tried running it in MANUAL mode (and I kept them; now it’s in AUTOMATIC mode). Inside loop(), I only have digitalRead()s, digitalWrite()s and Particle.publish()s.
I noticed that it always happened after an internet connection issue. But yesterday I noticed it stopped publishing messages right after this: Last Handshake: Jul 21st 2018, 10:42 pm, and there was no connection problem at the pump location.
This is obviously not all your code, or where does DISCONNECT_TIMEOUT get defined?
You may want to use SYSTEM_THREAD(ENABLED) and instead of your while() try waitFor() and you should not call Particle.connect() repeatedly while a connection attempt is still running (same goes for WiFi.connect()).
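On the Photon, the call to use is Device OS's own waitFor(condition, timeout). The block below is only a host-side C++ sketch of the same bounded-wait idea, so the behavior can be tried off-device. The name waitForCondition and the poll interval are assumptions made for illustration, not part of any Particle API.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not a Device OS function): polls `condition` until it
// returns true or `timeout` has elapsed. Returns true if the condition became
// true in time, false on timeout. The caller decides what to do on timeout,
// instead of spinning forever in a bare while() loop.
bool waitForCondition(const std::function<bool()>& condition,
                      std::chrono::milliseconds timeout,
                      std::chrono::milliseconds pollInterval =
                          std::chrono::milliseconds(10)) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!condition()) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // gave up: condition never became true
        }
        std::this_thread::sleep_for(pollInterval);
    }
    return true;
}
```

In the sketch from this thread, the two while() loops in setup() would then collapse to something like `if (!waitFor(WiFi.ready, DISCONNECT_TIMEOUT)) System.reset();`, with no repeated Particle.connect() or WiFi.connect() calls inside the wait.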
Do you use String inside this block?
/*
...
no whiles, no delays, no blocking statements
...
*/
If so, don’t.
Hi ScruffR, thanks for reply.
Sure, it’s not all my code. I didn’t include everything since I have 3 FSMs which are very simple but have quite long switch() statements. I’m already using system thread enabled. The code begins with:
STARTUP(WiFi.selectAntenna(ANT_AUTO)); // ANT_INTERNAL, ANT_EXTERNAL, ANT_AUTO
SYSTEM_MODE(AUTOMATIC); // MANUAL, AUTOMATIC, SEMI_AUTOMATIC
SYSTEM_THREAD(ENABLED);
As you suggested replacing the while() loops with waitFor(), I’ll remove the while() loops from setup() and only test the connection conditions (with if()) at the end of my loop(). Also, the watchdog checkin() will occur at the end of loop(), inside a condition for an active cloud connection. I won’t reset the device myself; I’ll only ‘log’ the disconnection condition, and the application watchdog should be responsible for resetting after a connection timeout. I also commented out every *.connect() call (yeah, it’s done automatically in the background). Something like this (I changed some var names):
// Particle.process();
if (!WiFi.ready())
{
wifiConnected = false;
// WiFi.on();
// if (currMs - particle_conn_currMs > DISCONNECT_TIMEOUT)
// {
// digitalWrite(PINO_BOMBA_ACION, LOW);
// delay(1000);
// System.reset();
// }
// WiFi.connect();
// delay(250);
}
// if (WiFi.ready())
else
{
if (!wifiConnected)
{
wifiConnected = true;
EEPROM.put(EEPROM_CON_WIFI, ++wifiConnQty);
}
}
if (!Particle.connected())
{
cloudConnected = false;
// if (currMs - particle_conn_currMs > DISCONNECT_TIMEOUT)
// {
// digitalWrite(PINO_BOMBA_ACION, LOW);
// delay(1000);
// System.reset();
// }
// Particle.connect();
}
// if (Particle.connected())
else
{
if (!cloudConnected)
{
cloudConnected = true;
EEPROM.put(EEPROM_CON_CLOUD, ++cloudConnQty);
}
wd.checkin();
// particle_conn_currMs = currMs;
}
// wd.checkin();
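The edge detection in the snippet above (count a reconnection only on a disconnected-to-connected transition) can be pulled into a small helper that has no Particle dependencies and can be tested on a desktop. This is a minimal sketch; LinkMonitor and its members are hypothetical names, and the timeout check is an optional extra that the snippet above leaves to the watchdog.

```cpp
#include <cstdint>

// Hypothetical helper (not a Device OS API): tracks one connection flag,
// counts reconnections, and can report a disconnect timeout.
struct LinkMonitor {
    bool     connected  = false;  // last observed state
    uint32_t reconnects = 0;      // number of false -> true transitions
    uint32_t lastSeenMs = 0;      // time of the last "connected" sample

    // Feed one sample per loop() pass. Returns true only on a reconnection
    // edge, which is the moment the counter would be written to EEPROM.
    bool update(bool isConnected, uint32_t nowMs) {
        const bool edge = isConnected && !connected;
        if (edge) {
            ++reconnects;
        }
        if (isConnected) {
            lastSeenMs = nowMs;
        }
        connected = isConnected;
        return edge;
    }

    // Unsigned subtraction stays correct when the millisecond counter wraps.
    bool timedOut(uint32_t nowMs, uint32_t timeoutMs) const {
        return !connected && (nowMs - lastSeenMs) > timeoutMs;
    }
};
```

On the device, one instance per link would be fed with `WiFi.ready()` or `Particle.connected()` and `millis()`, and EEPROM.put() would run only when update() returns true, which keeps EEPROM writes to one per reconnection.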
Also, why shouldn’t I use String inside loop()? Or do you mean in the comment block?! I wrote that comment block as a ‘substitute’ for my real code. Or should the String object not be used inside loop()?
I don’t get it.
The Application Watchdog is automatically checked in between iterations of loop() and with Particle.process().
Repeated use of String can, over time, cause heap fragmentation which will eventually lead to problems like inability to connect or system crashes.
If the people you’re asking help from request to see your code, it’s usually a good idea to post it all and let them decide what is, or isn’t, ‘simple’. Though you mention that this code is much simpler than Tinker, the latter is a known working piece of code, whereas yours might not be, which is what we’re trying to find out. “Blink an LED” works just as well, if you want something really simple.
Again, unless there’s something in there you don’t want to share, please don’t ‘hide’ your code because you don’t think it’s important. Let those helping you be the judge of that, since there might be issues in there that you aren’t aware of. The usage of String, like @ScruffR pointed out, is a good example of that. Something you might deem harmless, but if used improperly can cause the symptoms you describe.
Ideally, if using the Web IDE, give us the “share link” of your project. If not, give us the entire, unredacted code so we can have a look.
I appreciate your help, Mr. Moors7. I’m re-coding based on your thoughts. I guess that these Particle and WiFi calls may be causing unexpected behavior. I will assume that the background system code handles the connection (perfectly) in an automatic manner. Also, I’ll take a deeper look at my logic. In the meantime, I’ll check whether the new code causes any problems with this Photon. If needed, I can surely share the whole code with you.
Regarding String, I only use it in the subscribeHandler() function and inside two Particle.function() handlers. So I assume they are local objects, and therefore destroyed once the function returns.
Oh, there is also conversion of some variables to String inside the Particle.publish() call.
That’s not the entire truth. Yes, the object itself is local/automatic and hence will be removed when it gets out of scope, but the object internally allocates dynamic memory on the heap and consequently will run the risk of fragmenting the heap if there are other chunks allocated on the heap while the object lives.
Also each and every temporary String object that may be created on the fly (e.g. when concatenating, casting, resizing, …) can contribute to the fragmentation.
Next, 0.7.0 seems to have a bug in connection with 802.11g networks which should be fixed in 0.8.0-rc.4 and later. You could either try one of these RC versions or go with 0.6.3.
I just updated both Photons to 0.8.0-rc.4. Since yesterday, after I removed the connection calls (to cloud and WiFi), the system hasn’t crashed. There have been around 20 reconnections to the cloud (and 15 to WiFi). No problems so far.
Thanks ScruffR and Moors7.