Long Term Photon Connection Stability


#122

I’d be curious if that helped in your case too, please report back :+1:


#123

Not sure if this will help, but I wanted to allow 1 minute for Particle.connect() to succeed before putting the Photon back to sleep. You can run any code you want if the Photon does not connect within your preset time.

void loop()
{
  Particle.connect();

  // Wait up to 60 s for the cloud connection; if it fails, deep sleep for 300 s
  if (!waitFor(Particle.connected, 60000)) {
    System.sleep(SLEEP_MODE_DEEP, 300);
  }
}
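
For anyone wanting to try the same "wait, then give up" pattern off-device, here is a minimal sketch in plain C++. It is not Particle's waitFor() implementation; waitForCondition and its pollInterval parameter are hypothetical names used only for illustration, and the polling approach is an assumption about how such a helper could work.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-in for a waitFor()-style helper: polls `condition` until
// it returns true or `timeout` elapses, then reports whether it was met.
bool waitForCondition(const std::function<bool()>& condition,
                      std::chrono::milliseconds timeout,
                      std::chrono::milliseconds pollInterval = std::chrono::milliseconds(10)) {
    assert(pollInterval.count() > 0);  // a zero interval would busy-spin

    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (std::chrono::steady_clock::now() < deadline) {
        if (condition()) {
            return true;  // condition met before the deadline
        }
        std::this_thread::sleep_for(pollInterval);
    }
    return condition();  // one last check once the deadline has passed
}
```

On a device, the false branch is where the fallback (deep sleep in the post above) would go.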


#124

@ScruffR
I have removed SYSTEM_MODE(MANUAL); and left the cloud connection logic to the system. I am also using SYSTEM_THREAD(ENABLED); so that my loop() code does not get blocked by the cloud connection logic.

Hope this fixes my problem.


#125

Hi, now my Photon is breathing cyan but shows as offline in my dashboard and in the Android Particle app.
The last connection time shown is Feb 28th 2016 at 6:56am (I am in India).

One more thing: Particle.connected() is returning true, because only then does my Photon connect to my MQTT broker, and it is connected right now.

EDIT: I tried flashing firmware OTA and it worked. It seems the Photon is connected, but the dashboard and Android app are not showing the correct state.


#126

@ScruffR The Particle cloud connection auto-retry is still not working.


#127

I’ve lost track of what your code would look like after all your changes.

But I’d let SYSTEM_THREAD(ENABLED) take care of all the cloud connection stuff and merely check WiFi.ready() and/or Particle.connected() before attempting anything that is bound to fail without a connection.

On the other hand, if I wanted to take care of the cloud/WiFi myself, I’d not use SYSTEM_THREAD(ENABLED).


#128

I’ve been using this

SYSTEM_MODE (MANUAL);
SYSTEM_THREAD (ENABLED);

and

if (!Particle.connected()) {
  if (!cloudConnecting) {
    Serial.println("Connecting to cloud!");
    Status::SetDeviceStatus(DEVICE_CLOUD_CONNECTING);
    Serial.println("Particle.connect()");
    Particle.connect();
    cloudConnecting = true;
  }
}
else {
  if (cloudConnecting) {
    Serial.println("Connected to cloud!");
  }
  cloudConnecting = false;
}

This manages my connection with complete success.
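
The same "request a connection once per outage" logic can be pulled into a small class and checked off-device. This is only a sketch in plain C++: CloudConnectTracker and its requestConnect callback are hypothetical names, and on a real device the callback would presumably wrap Particle.connect().

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper: issues one connect request per outage instead of
// calling connect() on every pass through loop().
class CloudConnectTracker {
public:
    explicit CloudConnectTracker(std::function<void()> requestConnect)
        : requestConnect_(std::move(requestConnect)) {
        assert(requestConnect_ != nullptr);  // a callback must be supplied
    }

    // Call once per loop() pass with the current connection state.
    // Returns true only on the pass where the connection is first seen restored.
    bool update(bool connected) {
        if (!connected) {
            if (!connecting_) {
                requestConnect_();  // first pass of this outage: ask to connect
                connecting_ = true;
            }
            return false;
        }
        const bool justConnected = connecting_;
        connecting_ = false;
        return justConnected;
    }

private:
    std::function<void()> requestConnect_;
    bool connecting_ = false;
};
```

The return value marks the moment a "Connected to cloud!" message would be printed.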

I have also seen the problem you have outlined with green flashing LED. I found it was my use of a UDP socket that was causing it - I had to add a timer to rebuild my socket 5 seconds after a reconnection to the network.

This problem has been outlined here if you want to have a read.


#129

@mhazley, can it be that your else block might be a bit misleading?
It will report "Connected to cloud!" although obviously Particle.connected() must be false to get there.

I guess you just set the curly braces wrong in this snippet.


#130

Thanks @ScruffR - total bracket fudge there :smile:

Fixed now.


#131

Have you taken down your document? I would be interested to see the details you outlined.

Thank you


#132

Just fixed the link, should work now :slight_smile:


#133

I’m still having trouble keeping my little fleet of 27 devices continuously connected. These devices take measurements from sensors and then send the measurements to a 3rd party server over TCP, every 15 seconds. The devices are all connected to a corporate WiFi network and have static IP addresses.

I upgraded all the devices to 0.4.9 two weeks ago. Since the upgrade, 4 devices have lost their connection to the Internet. I say this because they stopped streaming data to the 3rd party server, they show up as offline with particle list in the CLI, and when I try to upgrade their firmware it’s not successful. They all went down at different times. Unfortunately it’s difficult to get physical access to these devices, so I can’t see their LEDs.

One thing that puzzles me is that I have a piece of code to reset the devices if they are disconnected for more than 2 minutes. I’ve seen it work when I test it at home, but for some reason it’s not reliably saving my fleet from disconnection. The disconnected devices connect again right away on a hard power cycle, so I would think that a soft power cycle would also help. Here’s the code:

void loop() {
  static unsigned long lastRunTime = 0;
  unsigned long time = millis();

  // Contents of this if statement are run every 15 seconds
  if(time - lastRunTime >= 15000) {
    lastRunTime = time;

    // Make sure the device is still online
    checkConnection();

    // ... remaining code to read sensor values and send to 3rd party server
  }
}

void checkConnection() {
  // reset device after this many seconds pass without a connection
  static const int reconnectTimeSecs = 120;
  static long lastFailTime = 0;

  long currentTime = millis();
  bool connected = Particle.connected();

  if(connected) {
    // connected, all is well
    lastFailTime = 0;
  }
  else if(!connected && lastFailTime == 0) {
    // connection has just gone down, note the time
    lastFailTime = currentTime;
  }
  else if(!connected && ((currentTime-lastFailTime) >= (reconnectTimeSecs*1000))) {
    // connection has been down for too long
    System.reset();
  }
}

Any advice is welcome - thanks in advance!
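
A side note on the timing arithmetic in checkConnection() above. millis() is an unsigned 32-bit tick count that wraps after roughly 49 days. The code stores it in a signed long and uses 0 as the "no failure" marker. This is probably not the cause of the disconnects, but a version that uses unsigned subtraction and a separate flag avoids both edge cases. The sketch below is plain C++ for illustration only; OutageTimer and shouldReset are hypothetical names, and on a device the true result would be the point to call System.reset().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: tracks how long a connection has been down using
// unsigned 32-bit arithmetic, so the elapsed time stays correct when the
// millisecond counter wraps around.
class OutageTimer {
public:
    explicit OutageTimer(std::uint32_t limitMs) : limitMs_(limitMs) {
        assert(limitMs > 0);
    }

    // Call periodically with the current tick count and connection state.
    // Returns true once the connection has been down for at least limitMs.
    bool shouldReset(std::uint32_t nowMs, bool connected) {
        if (connected) {
            down_ = false;  // connected, clear any outage in progress
            return false;
        }
        if (!down_) {
            down_ = true;   // outage just started, note the time
            downSinceMs_ = nowMs;
            return false;
        }
        // Unsigned subtraction gives the right elapsed time across a wrap.
        return static_cast<std::uint32_t>(nowMs - downSinceMs_) >= limitMs_;
    }

private:
    std::uint32_t limitMs_;
    std::uint32_t downSinceMs_ = 0;
    bool down_ = false;
};
```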


#134

A possible reason for your reset code not triggering is that the device is still connected to the cloud.
particle device list, the dashboard, and Build are not reliable indicators.
If you have a Particle.function() or Particle.variable(), you might still be able to access them via the CLI to check whether the device is still alive.


#135

Thanks for the response - I think I tried Particle.function() a couple months ago when I was trying to troubleshoot this issue, and got the “Timed out” response. But I’ll try it again with the latest firmware next time a device goes down.

The biggest problem for me is that the device stops sending data to the 3rd party server. The fact that this always coincides with the device becoming unreachable via Particle tools, and that other devices on the same WiFi continue to send data to the same server, makes me think that I should look for a problem with Internet connectivity on the device.


#136

One thing to note for Particle.function() vs. Particle.variable() is that variables are serviced via system code while functions are user code.
At least with SYSTEM_THREAD(ENABLED) this means that functions can be stalled by blocking user code, but variables will still be serviced.


#137

Another one of my devices just went down. I confirmed that I can’t reach it through one of its exposed functions, I get the “Timed out.” response. Unfortunately I don’t have any exposed variables, so I can’t check on that at the moment.

I can update the firmware to start using SYSTEM_THREAD(ENABLED) and to expose a variable, if you think that would help track down the source of the problem. It will take a few days to do that because I need to test the firmware thoroughly before deploying it.


#138

One way for your device to test its connection status, and report what it thinks that status is, might be a Particle.publish("pingMyself", String::format("LastHeard %s", lastHeardString), 60, PRIVATE) and “self-subscribing” to it. In the subscription handler you would record a timestamp (e.g. millis()) and update lastHeardString, then check that timestamp in your reset function too.
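
The bookkeeping behind that self-ping idea could look roughly like the sketch below. It is plain C++ and leaves out the publish/subscribe calls themselves; SelfPingMonitor, onEcho and isStale are hypothetical names, and the timeout value is whatever suits your publish interval.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for the "self ping" idea: remembers when the device
// last received its own published event and reports when that echo is overdue.
class SelfPingMonitor {
public:
    SelfPingMonitor(std::uint32_t startMs, std::uint32_t timeoutMs)
        : lastEchoMs_(startMs), timeoutMs_(timeoutMs) {
        assert(timeoutMs > 0);
    }

    // Call from the subscription handler when the echoed event arrives.
    void onEcho(std::uint32_t nowMs) { lastEchoMs_ = nowMs; }

    // Call from the reset check: true if no echo has arrived for longer than
    // the timeout. Unsigned subtraction keeps this valid across a tick wrap.
    bool isStale(std::uint32_t nowMs) const {
        return static_cast<std::uint32_t>(nowMs - lastEchoMs_) > timeoutMs_;
    }

private:
    std::uint32_t lastEchoMs_;
    std::uint32_t timeoutMs_;
};
```

A stale result would then be treated like a lost connection in the reset logic, even when Particle.connected() still reports true.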


#139

Using system threading would be a good start - that should keep your application running if the system becomes blocked.

Current theory is that the WiFi.connect() call is blocking indefinitely, possibly due to a bug in WICED firmware. I will add a system timer to cancel the connection after 1 minute, which should help free the device from this lockup state.


#140

Thanks @ScruffR and @mdma. The theory about WiFi.connect() blocking would neatly explain the symptoms that I’m seeing. I’ll start by switching to system threading and if that doesn’t help I’ll explore some of the other ideas. @mdma, feel free to message me if there’s anything I can do to help you track down the bug you’re chasing.


#141

Thanks. I intend to add a timeout to the WiFi.connect() call this week. https://github.com/spark/firmware/issues/893