Most stable Photon fw for WiFi disconnections?

I’ve been mostly using Electrons for a product, and have had no meaningful issues with handling reconnections and bad signal. My firmware trucks along, and any disconnections are generally in the seconds.

I’m putting together a Photon-based WiFi variant for a specific location, but I’m seeing consistent disconnections that are not resolved until my internal watchdog thread or external watchdog resets the device (6min+). These disconnections hit all 5 devices on my workbench within 20 seconds of each other, at random times of day or night (though not regularly). I suspect that a router-driven channel change or a DHCP renewal happening at that moment is what forces the disconnection, but I would expect the devices to recover better.

I use:
Sys FW v0.6.3
System Thread Enabled (handle critical incoming local data IO separately from network)
MQTT (this is over TCP, using TCPClient)
Automatic Mode for connectivity management (and I never mess with WiFi/cloud state)

Based on the release notes for 0.8.0RC firmware and 0.7.0 notes, I’ve gathered that there have been some issues with TCP not being thread-safe in some cases, as well as not handling disconnections well.

Has anyone else using System Thread and a TCP client found a firmware version that handles WiFi consistently? Or any workarounds that help it reconnect? I could always just reset faster if WiFi goes offline, but that feels like a last resort. It seems to reconnect after a reset most of the time, but I’m concerned about creating related edge cases by over-constraining that. I’ve also had problems in the past with mixing Automatic Mode and turning things off and on with System Thread enabled, so I’m hesitant to try mixing in any manual connection management.

Any and all thoughts are welcome!

I’ve whipped this up for now. It’s based on some code I used to manage reconnects on v0.6.3 in the past.

It might get you headed in the right direction.

Thanks, looks pretty similar to what I have, but I’ll try making the wifi off-on toggling more aggressive to match what you have. Maybe that will be enough of a difference to be workable medium term.
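
For anyone following along, here is a minimal sketch of what more aggressive off-on toggling could look like. It is not the code from the post above. WifiRecoveryPolicy, WifiAction, and the two timeout values are hypothetical names chosen for illustration. The decision logic is plain C++, so it can be checked off-device. The Particle calls it would drive (WiFi.ready(), WiFi.off(), WiFi.on(), WiFi.connect(), System.reset()) appear only in the comments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the Particle API): decides when to
// power-cycle the WiFi radio and when to give up and reset the device.
enum class WifiAction { None, ToggleRadio, ResetDevice };

class WifiRecoveryPolicy {
public:
    WifiRecoveryPolicy(uint32_t toggle_after_ms, uint32_t reset_after_ms)
        : toggle_after_ms_(toggle_after_ms), reset_after_ms_(reset_after_ms) {
        // Toggling must be the earlier, cheaper step; reset is the last resort.
        assert(toggle_after_ms > 0 && toggle_after_ms < reset_after_ms);
    }

    // Call once per loop() pass with millis() and WiFi.ready().
    WifiAction update(uint32_t now_ms, bool wifi_ready) {
        if (wifi_ready) {
            down_ = false;  // link is back, forget the outage
            return WifiAction::None;
        }
        if (!down_) {
            // First pass that sees the outage: start both timers.
            down_ = true;
            down_since_ms_ = now_ms;
            last_toggle_ms_ = now_ms;
            return WifiAction::None;
        }
        // Unsigned subtraction stays correct across millis() rollover.
        if (now_ms - down_since_ms_ >= reset_after_ms_) {
            return WifiAction::ResetDevice;
        }
        if (now_ms - last_toggle_ms_ >= toggle_after_ms_) {
            last_toggle_ms_ = now_ms;
            return WifiAction::ToggleRadio;
        }
        return WifiAction::None;
    }

private:
    uint32_t toggle_after_ms_;
    uint32_t reset_after_ms_;
    uint32_t down_since_ms_ = 0;
    uint32_t last_toggle_ms_ = 0;
    bool down_ = false;
};

// Assumed wiring inside loop() on the device:
//   switch (policy.update(millis(), WiFi.ready())) {
//       case WifiAction::ToggleRadio: WiFi.off(); WiFi.on(); WiFi.connect(); break;
//       case WifiAction::ResetDevice: System.reset(); break;
//       default: break;
//   }
```

Keeping the timing logic separate from the radio calls makes it easy to tune the two thresholds without reflashing. Whether toggling the radio under Automatic Mode with System Thread enabled is safe on 0.6.3 is exactly the open question in this thread, so treat the wiring as something to test rather than a known-good recipe.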

I noticed that if I lose Wi-Fi while a UDP socket is open, the Wi-Fi never reconnects. For example, this sequence:

udp.begin ()
lose Wi-Fi
udp.stop ()

will never reconnect. The module will be stuck with a blinking cyan light forever. If this happens though:

udp.begin ()
udp.stop ()
lose Wi-Fi

the module will blink green for a while, reconnect to Wi-Fi, then breathe blue once it’s reconnected to the cloud.

Seems like for a cloud connected device, one of the top priorities should be to keep the network connected and the device connected to the cloud.

I may add nrobinson2000’s code to my code and see if that is sufficient to get it to reconnect. I’m more concerned about staying connected to the Wi-Fi than the Particle cloud in my application. Thanks.

I don’t use UDP myself (beyond implicitly on the Electron), but I think you might be able to do some more sophisticated detection of that specific error condition.

If you use the System Event “cloud_status”, that will give you the fastest reaction to the loss of connectivity, generally. Then if you are expecting udp to have been connected up to that point, you can take mitigating action.

When you are experiencing the forever blinking cyan light, have you confirmed that the WiFi is also disconnected? You can check this by logging either the WiFi.ready() result or the WiFi.RSSI() result. Sometimes I get that condition when I am in fact disconnected from WiFi; other times it’s a problem with the cloud connection only.
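
As a rough sketch of that check, the two cases can be told apart by combining the WiFi and cloud status flags. LinkState, classify_link, and describe are hypothetical names, not Particle APIs. On the device, the inputs are assumed to come from WiFi.ready() and Particle.connected().

```cpp
// Hypothetical helper (not part of the Particle API): separates
// "WiFi itself is down" from "WiFi is up but the cloud session is down".
enum class LinkState { Online, CloudDown, WifiDown };

LinkState classify_link(bool wifi_ready, bool cloud_connected) {
    if (!wifi_ready) {
        // Without WiFi the cloud flag cannot be trusted, so report WiFi first.
        return LinkState::WifiDown;
    }
    return cloud_connected ? LinkState::Online : LinkState::CloudDown;
}

// Short label suitable for a log line.
const char* describe(LinkState state) {
    switch (state) {
        case LinkState::Online:    return "online";
        case LinkState::CloudDown: return "wifi up, cloud down";
        case LinkState::WifiDown:  return "wifi down";
    }
    return "unknown";
}

// Assumed usage on the device, logged periodically from loop():
//   LinkState s = classify_link(WiFi.ready(), Particle.connected());
//   Serial.println(describe(s));
```

Logging this once every few seconds while the LED is blinking cyan should show which of the two failure modes is actually occurring.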

Regardless, I think the code below might either solve that problem for you or get you closer to a solution. Using the cloud_status event instead of network_status is more reliable because there is no WiFi disconnection event (to my knowledge). Since you are generally using the Particle Cloud when it is available, the cloud events are a much better indicator of when internet is actually available and when disconnections occur. Give it a shot and let us know if it helps you out. This code would work best and most safely in conjunction with @nrobinson2000’s code above, which handles the more general case and adds the timeout-based logic.

// LISTENING_PORT is assumed to be defined elsewhere in your firmware
UDP udp_client;
bool udp_connected = false;
bool udp_wifi_error_flag = false;

void cloud_status_handler(system_event_t event, int param)
{
    switch(param) {
        case cloud_status_connected:
            // need to reconnect UDP since it is now probably broken
            udp_client.begin(LISTENING_PORT);
            udp_connected = true;
            udp_wifi_error_flag = false;
            break;
        case cloud_status_disconnected:
            // need to mitigate lost connection during open UDP case 
            if (!WiFi.ready() && udp_connected) {
                // WiFi is also not connected, so UDP is actually broken
                udp_client.stop();
                udp_connected = false;
                udp_wifi_error_flag = true;
            }
            break;
        default:
            // do nothing
            break;
    }
}

void setup() {
    System.on(cloud_status, cloud_status_handler);
    // other setup stuff
}

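void loop() {
    // do stuff
    if (udp_wifi_error_flag) {
        // clear the flag first so the radio is toggled once per outage,
        // not on every pass through loop()
        udp_wifi_error_flag = false;
        // reset the WiFi connection to mitigate the inability to reconnect
        WiFi.off();
        WiFi.on();
        Particle.connect();
    }
}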