Particle.connect() blocking main loop permanently, even with SYSTEM_THREAD(ENABLED)

Yes, Particle.connect() was made blocking in SEMI_AUTOMATIC mode, as it was always meant to be and as it was documented from the beginning, although in some versions it slipped through as non-blocking. This is also irrespective of SYSTEM_THREAD(ENABLED): you are calling the function from the application thread, so the application thread will be blocked until the function succeeds or times out.
One way to prematurely interrupt the ongoing connection attempt is to have a timer (SW or HW) which will issue a Particle.disconnect().

You can read the discussion with Particle regarding this change here
https://github.com/particle-iot/firmware/issues/1399
https://github.com/particle-iot/firmware/issues/1449

The reason for that is that the system internally tickles the Application Watchdog which keeps it from timing out. A discussion about that can be found here
https://github.com/particle-iot/firmware/issues/1382

Unfortunately, Particle.connect() is never timing out though.

Also, it was my understanding from one of your other comments (Particle.disconnect does not interrupt Particle.connect) that doing Particle.disconnect() from a software timer wouldn’t work either. Is that no longer the case?

Thanks for the comments!

In that statement I didn’t say whether it would or wouldn’t work. I merely pointed out that the docs mention calling Particle.disconnect() from an interrupt, and a test case that calls it from a software timer isn’t the same thing, so the docs’ statement may still be true, in the sense in which it was written, until proven wrong.

From that statement one can’t deduce whether the primary assertion of the OP that it doesn’t work from software timers is either true or false. And I must admit, I haven’t tested it either.

However if Particle.disconnect() actually didn’t do what it’s meant to do, you could still pull the plug by issuing a WiFi.disconnect() or even a WiFi.off(). That should work in any case.

Gotcha, thanks. That’s actually how I have it implemented now, with a software timer and a few system event monitors to move it back through wifi.off -> wifi.on -> wifi.connect -> particle.connect.

I’ll keep watch on the serial logs to see if/when it gets disconnected, and see if my reconnection logic will interrupt Particle.connect(), and hopefully eventually get reconnected.

I still think there’s some bug here somewhere. I shouldn’t have to jump through so many hoops just to keep this thing connected :). Ideally I’d just set it to automatic mode and let 'er rip.

Wait... is that also true of MANUAL??

Nope, not for MANUAL
But WiFi.connect() will be blocking for the most part in any mode.

@BDub

Please add this to the list of things that need to be clearly and unambiguously documented.

Right–that doc is here:

https://docs.particle.io/reference/firmware/photon/#system-modes

Some excerpts:

Semi-automatic mode

The semi-automatic mode will not attempt to connect the device to the Cloud automatically. However once the device is connected to the Cloud (through some user intervention), messages will be processed automatically, as in the automatic mode above.

Once the user calls Particle.connect(), the user code will be blocked while the device attempts to negotiate a connection. This connection will block execution of loop() or setup() until either the device connects to the Cloud or an interrupt is fired that calls Particle.disconnect().

This document does not hint at the behavior that @ScruffR has described for WiFi.connect():

connect()

Attempts to connect to the Wi-Fi network. If there are no credentials stored, this will enter listening mode (see below for how to avoid this.). If there are credentials stored, this will try the available credentials until connection is successful. When this function returns, the device may not have an IP address on the LAN; use WiFi.ready() to determine the connection status.

// SYNTAX
WiFi.connect();

Since 0.4.5 It's possible to call WiFi.connect() without entering listening mode in the case where no credentials are stored:

// SYNTAX
WiFi.connect(WIFI_CONNECT_SKIP_LISTEN);

If there are no credentials then the call does nothing other than turn on the Wi-Fi module.

This document never says explicitly that Particle.connect() does not block while in MANUAL mode:

Particle.connect()

Particle.connect() connects the device to the Cloud. This will automatically activate the Wi-Fi connection and attempt to connect to the Particle cloud if the device is not already connected to the cloud.

void setup() {}

void loop() {
  if (Particle.connected() == false) {
    Particle.connect();
  }
}

After you call Particle.connect(), your loop will not be called again until the device finishes connecting to the Cloud. Typically, you can expect a delay of approximately one second.

In most cases, you do not need to call Particle.connect(); it is called automatically when the device turns on. Typically you only need to call Particle.connect() after disconnecting with Particle.disconnect() or when you change the system mode.

Manual mode

The "manual" mode puts the device's connectivity completely in the user's control. This means that the user is responsible for both establishing a connection to the Particle Cloud and handling communications with the Cloud by calling Particle.process() on a regular basis.

SYSTEM_MODE(MANUAL);

void setup() {
  // This will run automatically
}

void loop() {
  if (buttonIsPressed()) {
    Particle.connect();
  }
  if (Particle.connected()) {
    Particle.process();
    doOtherStuff();
  }
}

When using manual mode:

  • The user code will run immediately when the device is powered on.
  • Once the user calls Particle.connect(), the device will attempt to begin the connection process.
  • Once the device is connected to the Cloud (Particle.connected() == true), the user must call Particle.process() regularly to handle incoming messages and keep the connection alive. The more frequently Particle.process() is called, the more responsive the device will be to incoming messages.
  • If Particle.process() is called less frequently than every 20 seconds, the connection with the Cloud will die. It may take a couple of additional calls of Particle.process() for the device to recognize that the connection has been lost.
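
As a rough illustration of the 20-second rule above, the gap between Particle.process() calls can be tracked with a small helper. This is only a sketch in plain C++ so that it can be tried off-device: ProcessGapMonitor and its methods are hypothetical names, not part of the Particle API, and the timestamps are passed in where a device would use millis().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Particle API): remembers when the cloud was last
// serviced and reports whether the allowed gap has been exceeded.
class ProcessGapMonitor {
public:
    explicit ProcessGapMonitor(std::uint32_t limitMs) : limitMs_(limitMs) {
        assert(limitMs > 0);
    }

    // Call right after servicing the cloud; nowMs would be millis() on a device.
    void markServiced(std::uint32_t nowMs) { lastMs_ = nowMs; }

    // Unsigned subtraction stays correct when the millisecond counter wraps.
    std::uint32_t gapMs(std::uint32_t nowMs) const { return nowMs - lastMs_; }

    // True once the gap has reached the configured limit.
    bool overdue(std::uint32_t nowMs) const { return gapMs(nowMs) >= limitMs_; }

private:
    std::uint32_t limitMs_;
    std::uint32_t lastMs_ = 0;
};
```

On a device, markServiced(millis()) would follow each Particle.process() call, and overdue(millis()) with a limit of 20000 could flag a loop() that is at risk of letting the connection die.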

This is heavily misleading:

Under System Threading Behavior,

System modes SEMI_AUTOMATIC and MANUAL behave identically

which is not true, except in this narrow aspect:

both of these modes do not start Networking or a Cloud connection automatically

There is nothing that brings the material together and presents it in a cohesive way. There are bits and pieces of information scattered throughout the API reference, but even if you collect all the information in there together, there are enough gaps that you will be led astray even if you're reading carefully.

This is an API reference, but the "Reference Manual" is missing. There is no Theory of Operation discussion or examples showing recommended ways to handle common scenarios-- leaving new users (including experienced engineers) to trial-and-error their way through building an application.

I’m experiencing similar kinds of things with the Electron in areas of poor signal reception. However, I am in AUTOMATIC mode. I do call Particle.connect() in code called by loop() if Particle.connected() returns false. I am also using SYSTEM_THREAD(ENABLED).

I understand that Particle.connect() is blocking in SEMI_AUTOMATIC, but is the same true for AUTOMATIC, or does it just set the flag for reconnection later?

I have reset timeouts set for cloud connectivity being off for 20 minutes, but they would only get triggered in loop so my code has a chance to finish what it’s doing, but if Particle.connect() was blocking I could see how that wouldn’t reset anything, though I’d tentatively expect the ApplicationWatchdog to trigger.

@JesusFreke - have you discovered anything further in this?

I haven’t discovered anything further, but after I added my software timer-based timeout for Particle.connect(), which calls WiFi.off(), I haven’t experienced a hang yet. Although, I’m not sure if the timeout has actually been triggered yet. I had to disconnect the serial monitor due to some other development I’m doing.

Interesting thread…

Like others here, my code runs with SYSTEM_THREAD(ENABLED), on both Electron and Photon. This mode means the product will function regardless of whether WiFi credentials exist or a network connection is present. That still leaves the fringe case of a poor signal causing multiple disconnects/reconnects, which interfere with the operation of loop() and, if I understand correctly by inference, with anything in loop() that uses millis().

This potentially means (for example) that a pump stays running and floods a greenhouse, the keypad/display stops working, etc. It would be nice if, when operating in threaded automatic/semi-automatic mode, a callback was made to the application thread before such a reconnect attempt. This could give the application a chance to notify the user, set that pump to a safe state, or other such niceties.

I’ve read quite a few bits and pieces on the forum today, but with the changes made over time it’s not always easy to establish what’s current and what’s not. It would appear that there are several functions (e.g. Cellular.connected) that currently do not quite behave as one might expect, making crafting solutions to problems like this … interesting. It would be nice to have an official blog post/example or just some more detail in the reference.

The partial solution to the problem as you’ve stated it is to leave all networking-related calls in the context of loop(), and then to make a separate thread or two to manage any tasks that require realtime responsiveness. I wouldn’t necessarily expect your above problems to be addressed in the System firmware anytime soon, since there are some cases where that is probably the preferred behavior (it is generally simpler to use if you are OK with blocking your primary thread).

While I personally have had a number of issues with connectivity and freezeups, my IO thread for Serial1 or CAN input has worked flawlessly up until the very moment I trigger a restart on the device. Same goes for my watchdog / reset management thread. You can pretty easily have the main thread be networking only, and move all other tasks to a secondary thread. Obviously, make sure you are using libraries and such that are threadsafe. As an example, the MQTT libraries or anything that uses TCP are NOT fully threadsafe, and must be in the main loop() thread. Anything that smells like networking probably should stay in the main thread, but for example I have my IO and soon my SD card operations running on independent threads.

See rickkas7’s tutorial for more details on threads that aren’t otherwise particularly documented yet

fwiw, for my specific issue, I haven’t had any problems after I implemented a timeout for Particle.connect(), which resets wifi. e.g.

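// Note: CONNECT_TIMEOUT (in ms) is not defined in this snippet.
void doTimeout();

// One-shot software timer: fires doTimeout() if Particle.connect() takes too long.
Timer watchdogTimer(CONNECT_TIMEOUT, doTimeout, true);
// Set in the timer callback and cleared in an event handler, hence volatile.
volatile bool reconnecting = false;

void loop() {
  if (!Particle.connected()) {
    if (!WiFi.ready()) {
      if (!WiFi.connecting()) {
        WiFi.connect();
      }
    } else {
      // Arm the timeout around the potentially blocking call.
      watchdogTimer.start();
      Particle.connect();
      watchdogTimer.stop();
      waitFor(Particle.connected, 10000);
    }
  }
}

// The three handlers below step through Wi-Fi off -> on -> connect.
void handleWifiOff(system_event_t event, int param, void *blah) {
  if (reconnecting) {
    WiFi.on();
  }
}

void handleWifiOn(system_event_t event, int param, void *blah) {
  if (reconnecting) {
    WiFi.connect();
  }
}

void handleWifiConnected(system_event_t event, int param, void *blah) {
  if (reconnecting) {
    // Recovery finished: clear the flag and unhook the handlers.
    reconnecting = false;
    System.off(handleWifiOff);
    System.off(handleWifiOn);
    System.off(handleWifiConnected);
  }
}

// Timer callback: hook the handlers, then power Wi-Fi off to start the sequence.
void doTimeout() {
  System.on(network_status_off, handleWifiOff);
  System.on(network_status_on, handleWifiOn);
  System.on(network_status_connected, handleWifiConnected);
  reconnecting = true;
  WiFi.off();
}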
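
The recovery sequence in the sketch above (Wi-Fi off, then on, then connect) can be read as a small event-driven state machine. The following is a hedged, off-device model of that logic in plain C++. NetEvent, Action and ReconnectSequencer are hypothetical names used for illustration only; the actual sketch performs these actions through WiFi.on() and WiFi.connect() inside its System.on() handlers.

```cpp
#include <cassert>

// Hypothetical stand-ins for the network_status_* events used in the sketch.
enum class NetEvent { Off, On, Connected };

// What the caller should do in response to an event.
enum class Action { None, TurnOn, Connect, Done };

// Models the `reconnecting` flag plus the three event handlers.
struct ReconnectSequencer {
    bool reconnecting = false;

    // Corresponds to doTimeout(): the caller would power Wi-Fi off afterwards.
    void begin() { reconnecting = true; }

    // Maps each event to the next step; events are ignored outside a recovery.
    Action onEvent(NetEvent e) {
        if (!reconnecting) {
            return Action::None;
        }
        switch (e) {
            case NetEvent::Off:
                return Action::TurnOn;    // like handleWifiOff -> WiFi.on()
            case NetEvent::On:
                return Action::Connect;   // like handleWifiOn -> WiFi.connect()
            case NetEvent::Connected:
                reconnecting = false;     // like handleWifiConnected
                return Action::Done;
        }
        return Action::None;
    }
};
```

Because events are ignored unless begin() has been called, ordinary Wi-Fi status changes outside a timeout recovery do not trigger any of the steps, which mirrors the `if (reconnecting)` guards in the handlers.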

Hi there,

I never used watchdogs or interrupts, but am right now facing a similar problem to the one you were having.

My Photon works great until some problem occurs with WiFi. Then it tries to reconnect, but if that doesn’t work, it looks like it is stuck. After a reboot everything is great again.

Could you please give some more information about how your solution works, for noobs like me? I didn’t find any “tutorial” for this issue, and it would be great if you could elaborate on this :slight_smile:

Do you use SparkIntervalLibrary?

Where in your code do these functions (handleWifiOff etc.) get executed? I don’t see them being called anywhere in setup() or loop().

How long is/should CONNECT_TIMEOUT be? I understand that it should be long enough to establish a connection?

Is your solution suitable if I want my device to work even if there is no internet connection? I don’t see any System.reset() being called.

If providing a more detailed solution or a “working example” instead of pseudo-code is too much work, please point me in the right direction, since my own research didn’t turn up the right info.

As a first step try SYSTEM_MODE(MANUAL)

Also make sure that your device does not run out of sufficiently large chunks of heap space (e.g. by avoiding the use of String).

These instructions (probably also executed during setup()) hook up the respective functions to system events; whenever one of these events occurs, the system will call the hooked function(s).

I’m still a bit confused as what is said in this thread doesn’t seem to match the docs.

In the docs under System Threads / System Functions it says in SYSTEM_THREAD(ENABLED) mode only the following system functions block the caller:

  • WiFi.hasCredentials(), WiFi.setCredentials(), WiFi.clearCredentials()
  • Particle.function()
  • Particle.variable()
  • Particle.subscribe()
  • Particle.publish()

But the thread above talks about Particle.connect() blocking even with SYSTEM_THREAD(ENABLED). Can someone in the know be specific about under what circumstances (and if specific on what boards) Particle.connect() will not block.

Little late to the game but was hoping for a point of clarification - the docs seem to hint at it, as do most of the forum post, but I figured in case it was just asked differently and I or someone else just wasn’t asking the question the right way etc…

If both SYSTEM_MODE(SEMI_AUTOMATIC) and SYSTEM_THREAD(ENABLED) as in @ScruffR’s post awhile back with this bit of code:

SYSTEM_MODE(SEMI_AUTOMATIC)
SYSTEM_THREAD(ENABLED)
void setup()
{
  //pinMode(D7, OUTPUT); //Wasn't sure if that was here for a specific reason
  Particle.connect();
}
void loop()
{
  //do *stuff* - I simplified :)
}

Does the above mean that if WIFI & Internal nets are up but the interwebs are down, that whatever’s going on in the loop will keep on trucking till the webs come back up - and then automatically re-connect to the cloud?

Or should the code read more along the lines of:

SYSTEM_MODE(SEMI_AUTOMATIC)
SYSTEM_THREAD(ENABLED)

long timeout = 2000;
long timer = 0;

void setup()
{
  //pinMode(D7, OUTPUT); //Wasn't sure if that was here for a specific reason
  Particle.connect();
  timer = millis();
}
void loop()
{
  //do *stuff* - I simplified :)
  if(millis() - timer >= timeout)
  {
    //I might have got the timer wrong, but you know what I'm going for there
    Particle.connect();
  }
}

Or would both work, but the second one would just be redundant?
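
One way to avoid calling Particle.connect() on every pass through loop() once the timeout has elapsed is to throttle the retries. A minimal sketch, assuming plain C++ with the clock and connection state passed in (on a device these would be millis() and Particle.connected()). RetryThrottle is a hypothetical helper, not a Particle API, and whether Particle.connect() itself blocks still depends on the Device OS version.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: allows a new connection attempt only when the device is
// disconnected and at least intervalMs has passed since the previous attempt.
class RetryThrottle {
public:
    explicit RetryThrottle(std::uint32_t intervalMs) : intervalMs_(intervalMs) {
        assert(intervalMs > 0);
    }

    // Returns true when the caller should start a connection attempt now.
    bool shouldAttempt(bool connected, std::uint32_t nowMs) {
        if (connected) {
            return false;  // nothing to do while the cloud connection is up
        }
        // Unsigned subtraction keeps the comparison valid across counter wrap.
        if (triedBefore_ && (nowMs - lastAttemptMs_) < intervalMs_) {
            return false;  // too soon since the last attempt
        }
        triedBefore_ = true;
        lastAttemptMs_ = nowMs;
        return true;
    }

private:
    std::uint32_t intervalMs_;
    std::uint32_t lastAttemptMs_ = 0;
    bool triedBefore_ = false;
};
```

In loop(), the call would look like `if (throttle.shouldAttempt(Particle.connected(), millis())) { Particle.connect(); }`, so the rest of loop() keeps running between attempts as long as the connect call itself returns.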

From what I gather, in this setup, ‘Particle.connect()’ would not block anything because of the threading mode and system mode(?)

FWIW, the docs, while fairly comprehensive, don’t always cover things in a one size fits all kind of way - I for example need to have 5 different people tell me the same thing before I realize I asked the wrong question :smiley:

I think ultimately the reason this line of questioning keeps coming up is that there’s (and maybe I’m off my rocker - I accept this) this perception that this platform is more plug and play than it really is (that’s probably true of most things, come to think of it…)

For a long time I just could not wrap my mind around a for loop; now I don’t understand why I had trouble with it… I think if I could understand what happened in between, it would open doors to figuring out how to meet people where they’re at, so to speak.

I digress. a lot… thoughts on the more relevant bits?

First, once a connection has been established with SYSTEM_MODE(SEMI_AUTOMATIC) there is no difference between this and default SYSTEM_MODE(AUTOMATIC).
Consequently with your single Particle.connect() in setup() the behaviour of loop() is (supposed to be) exactly the same in both cases: The device OS will try its best to keep the connection stable and in case of a disconnect will try to reconnect.

However, the automatic reconnection scheme may not always fit your needs, and hence you may opt for another layer of control over the connection in your application code.

This question has not got a single answer. It depends on what device OS version you are using.
There were versions where Particle.connect() (in SEMI_AUTOMATIC, SYSTEM_THREAD(ENABLED) mode) wasn't blocking at all, and versions where it was mostly blocking and blocking under various conditions. I've lost track of how it behaves or how it should behave, especially when the chosen path seems less intuitive and flexible than what was previously available.
Hence I've given up arguing, but always test the behaviour of the device OS version I want to target or opt for the most restrictive interpretation and build the code around that.
Having said that, I've found SYSTEM_MODE(MANUAL) plus SYSTEM_THREAD(ENABLED) to give me most flexibility and predictability of behaviour.

Thanks @ScruffR for the insight - at the moment I’m on 0.8.0-rc.11 (I’m a glutton for punishment when it comes to the bleeding edge…), though from what it sounds like, I’m well past those versions in which it wasn’t blocking… even if I wasn’t though, I wouldn’t want to rely long term on a ‘feature’ resulting from a bug (been there, done that :stuck_out_tongue:)

Based on what I’ve read and what you’ve written, manual/threading should do what I want, though @rickkas7’s tutorial on it makes me a bit nervous… Not that there’s a lot at stake - if it breaks, I build it again - and it’s not like auto prevents me from getting stuck in SOS mode either. I just hate spending a whole bunch of time learning something and coding something (as a relative novice) only to find out, after it worked once and died, that I missed a spot and can’t actually do said thing etc. etc. I suppose that’s always a risk…

Ok ok, it’s less convenient to put something in DFU mode when it goes south. What can I say, I’m lazy like that :smiley:

Appreciate you taking the time!
