Failures on handling WiFi reconnects ( v0.4.5 / v0.4.6 )

I’ve been having some issues with my Photon reconnecting to WiFi after the connection is lost. I’ve been testing with both v0.4.5 and v0.4.6 in MANUAL mode.

I want to post my outcomes here to see if anyone else has noticed this and/or can help in finding a solution / workaround.

I call the following code (snagged from @indraastra - thanks) from my loop (my loop is outputting serial data every second to show it’s alive):

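// _cloudConnecting is assumed to be a bool flag declared elsewhere in the sketch.
if (!Particle.connected()) {
  if (!WiFi.ready()) {
    // No network yet: start a WiFi connection unless one is already in progress.
    if (!WiFi.listening() && !WiFi.connecting()) {
      Serial.println("Connecting to network!");
      WiFi.connect();
    }
  } else {
    // Network is up: request the cloud connection once.
    if (!_cloudConnecting) {
      Serial.println("Connecting to cloud!");
      Particle.connect();
      _cloudConnecting = true;
    }
  }
} else {
  if (_cloudConnecting) {
    Serial.println("Connected to cloud!");
  }
  _cloudConnecting = false;
  Particle.process();
}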
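
To make the branches above easier to reason about (and to test off-device), the same decisions can be written as a pure function. This is only a sketch: `LinkState`, `Action` and `nextAction` are hypothetical names that are not part of the Particle API, and the struct fields are assumed to be filled in from the calls noted in the comments.

```cpp
// Hypothetical snapshot of the values the posted code reads on each pass.
struct LinkState {
    bool cloudConnected;  // would come from Particle.connected()
    bool wifiReady;       // would come from WiFi.ready()
    bool wifiListening;   // would come from WiFi.listening()
    bool wifiConnecting;  // would come from WiFi.connecting()
};

enum class Action { None, ConnectWiFi, ConnectCloud, ProcessCloud };

// Mirrors the branches of the posted loop code. cloudConnecting plays the
// role of _cloudConnecting and is updated in place.
Action nextAction(const LinkState& s, bool& cloudConnecting) {
    if (!s.cloudConnected) {
        if (!s.wifiReady) {
            // Only start a WiFi connection if none is already under way.
            if (!s.wifiListening && !s.wifiConnecting) {
                return Action::ConnectWiFi;
            }
            return Action::None;
        }
        if (!cloudConnecting) {
            cloudConnecting = true;
            return Action::ConnectCloud;
        }
        return Action::None;
    }
    cloudConnecting = false;
    return Action::ProcessCloud;
}
```

On the device, loop() would build a `LinkState` from the real calls and then act on the returned value, so the decision logic itself can be checked without touching an AP.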

v0.4.5 Happy Path

Most of the time, I get a happy path:

  • Device breathing cyan, running as normal, loop’s serial output showing every second
  • Kill the AP, device begins to flash green, loop output shows every 10 secs (I assume WiFi.connect() is blocking here?)
  • Restart AP, device flashes more rapidly green, changes to cyan, starts breathing again, loop output returns to 1 sec.

v0.4.5 Sad Path

Sometimes (it can be after 10 successful reconnects, sometimes after 2; it’s random) I get the sad path, which leads to complete failure:

  • Device breathing cyan, running as normal, loop’s serial output showing every second
  • Kill the AP, device begins to flash green, loop output stops completely as if device has hung.
  • Restart AP, device flashes more rapidly green but never recovers until reset.

It’s worth noting that I get the sad path only about 10% of the time, but that is still enough to worry me as I go to deploy my application…

As mentioned, I also tried this with v0.4.6 to see whether enabling the separate System Thread would help. Here I see different behaviour, with a ratio of about 50-50 between the happy and sad paths:

v0.4.6 Sad Path 1

  • Device breathing cyan, running as normal, loop’s serial output showing every second
  • Kill AP, device flashes green as if it’s on a happy path, running loop output every second.
  • Restart AP, device starts to flash rapidly as if it is about to succeed and then SOS’s - seems to be the SOS pattern plus 1 single blip which the documentation says is #1 Hard Fault.

v0.4.6 Sad Path 2 - [ Note I have only observed this one twice in total ]

  • Device breathing cyan, running as normal, loop’s serial output showing every second
  • Kill AP, loop output continues fine but device goes to breathing blue on the LED
  • Device will not ever connect until a reset.
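
In a case like this one, where loop() keeps running but the device never reconnects, one possible stopgap is a loop-side timeout that resets the device after it has been offline for too long. The following is a minimal sketch, not a tested fix: `offlineTooLong` is a hypothetical helper, the timeout value is arbitrary, and it cannot help when loop() itself stops running (as in the v0.4.5 sad path).

```cpp
#include <cstdint>

// Returns true once the device has been offline for at least timeoutMs.
// The unsigned subtraction keeps the elapsed time correct even after the
// millisecond counter wraps around.
bool offlineTooLong(std::uint32_t nowMs, std::uint32_t lastOnlineMs,
                    std::uint32_t timeoutMs) {
    return static_cast<std::uint32_t>(nowMs - lastOnlineMs) >= timeoutMs;
}

// On the device (not compiled here), loop() might use it roughly like this,
// with lastOnline being a variable kept by the sketch:
//   if (Particle.connected()) {
//       lastOnline = millis();
//   } else if (offlineTooLong(millis(), lastOnline, 120000)) {
//       System.reset();  // same effect as pressing the reset button
//   }
```

The timeout needs to be comfortably longer than a normal reconnect, otherwise the device would reset itself in the middle of a happy path.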

v0.4.6 Happy Path

  • Device breathing cyan, running as normal, loop’s serial output showing every second
  • Kill the AP, device begins to flash green, loop output shows every 1sec
  • Restart AP, device flashes more rapidly green, changes to cyan, starts breathing again

I completely understand that v0.4.6 is brand new and introduces a huge piece of functionality in the System Thread, so I appreciate there may be issues. I just wanted to share my experience in case anyone can advise / help!

Just a quick one:
Are you regularly dropping out of loop() too?
If not, you might want to call Particle.process() even when not yet connected to the cloud.
At least before 0.4.6, some WiFi-only jobs were done in there too, if you didn’t drop out of loop().

Particle.process() was pruned back quite a bit in 0.4.6, so this might not help there, but it doesn’t harm either.
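
A rough sketch of that suggestion: the reconnect branch stays as it is, and process() moves out of the else branch so it runs on every pass. `loopStep` is a hypothetical name, `Cloud` stands for any object with connected() and process() (on the device, a thin wrapper forwarding to the Particle object is assumed), and `reconnect` stands for the WiFi / cloud reconnect branch from the first post.

```cpp
// Cloud: any type providing connected() and process().
// Reconnect: any callable that runs the existing reconnect branch.
template <typename Cloud, typename Reconnect>
void loopStep(Cloud& cloud, Reconnect reconnect) {
    if (!cloud.connected()) {
        reconnect();  // unchanged WiFi / cloud reconnect logic
    }
    cloud.process();  // now called on every pass, connected or not
}
```

Whether this changes anything on 0.4.6 is an open question, as noted above; the sketch only shows where the call would move.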

Thanks I will try that in my testing and see if it helps :+1:

Issues like this are exactly why I filed a request for some canonical code to replace my connection flow with, since it hasn’t been thoroughly tested until your efforts here. I just ran your test and 100% of the time so far, I end up on something similar to your v0.4.6 Sad Path 1, except the device SOSes as soon as the AP is killed. FWIW, nothing terrible happens when I simply disconnect the AP from the internet.

For anyone building from develop branch, I pushed a fix that addresses the SOS issue with wifi disconnect. Please test and report back the hopefully positive improvement!

I have a Core that is exhibiting similar behavior. It was running fine for weeks on the old 0.3.x firmware. Now it can sometimes run for hours, or only minutes, before it stops. Whenever I (or my coworkers) check it, it’s either blinking green or blinking cyan. A quick press of the reset button gets it back online until it chokes again. Unfortunately, I haven’t done much testing to see if I can hunt down the problem or try any of the system modes.

For everyone following, there’s a Github issue tracking this or something very similar here:

@mdma, I recompiled this code (from "Photon is not running or maintaining its connection to WiFi for more than 20 hours") on develop 2-3 hrs ago and I’ve had no red flash resets since (on 0.4.6 they occurred up to a few times per hour, though it was fine on 0.4.5).

Thanks @mdma, will try later and report back.

Replying here as well as on GitHub, as I'm not sure where best to reach you...

@mdma I tested that earlier and got no SOS - looks like you got that.

Have you got a PR or commit for that fix anywhere? I'm not sure I want to run develop in the field this week, so I might run a local 0.4.6 with that fix until 0.4.7 comes out. Or will that fix be going onto the 0.4.6 release branch for an 0.4.6.2?

Can also confirm, wearing my test router’s reset switch raw and no SOSes so far.