Check cloud or ethernet connection before publish

I am having an issue where my devices lose their connection to the cloud (rapidly blinking cyan). This sometimes happens on all devices at the same time. I suspect it happens during a publish event.

I am using the Ethernet FeatherWing with the following:

SYSTEM_THREAD(ENABLED);
SYSTEM_MODE(SEMI_AUTOMATIC);

Then in my setup:

	System.enableFeature(FEATURE_ETHERNET_DETECTION);

	WiFi.off();  // this is an argon device
	Ethernet.on();
	Ethernet.connect();
	Particle.connect();
	waitFor(Particle.connected, 1000); 

	client.connect(MQTT_CLIENT_NAME, "xxx", "xxxx"); // connect to mqtt server

Then in my loop:

  if (currentMillis - previousMillis > Publishinterval) {
    previousMillis = currentMillis;

    if (Particle.connected()) {
      publishQueue.publish(eventName, payload, PRIVATE, WITH_ACK);
    }

    // send to mqtt broker
    if (client.isConnected()) {
      client.publish(eventName, payload);
    }
  }

I looked up the device’s connection status on my pfSense router, and it is showing as offline, which leads me to believe it may not be a cloud connection issue?

Should I be checking the Ethernet connection and/or the Particle cloud connection before pushing data to the cloud and the MQTT server? Here is what I am thinking of doing now in my loop. I’d appreciate some help cleaning this up, perhaps by adding the check for the Ethernet connection and merging it all into a clean, non-blocking if/else statement.

  // this keeps the mqtt connection live
  if (!client.isConnected()) {
    client.connect(MQTT_CLIENT_NAME, "xxx", "xxx");
  } else {
    client.loop();
  }

  // check cloud connection
  if (Particle.connected() == false) {
    Particle.connect();                   // will this block loop?
  }

  unsigned long currentMillis = millis();

  if (currentMillis - previousMillis > Publishinterval) {
    previousMillis = currentMillis;

    if (Particle.connected()) {
      publishQueue.publish(eventName, payload, PRIVATE, WITH_ACK);
    }

    // send to mqtt broker
    if (client.isConnected()) {
      client.publish(eventName, payload);
    }
  }

If you are losing cloud connectivity (fast blinking cyan) on publish on Ethernet, the most likely cause is that you need to lower the keep-alive using Particle.keepAlive. Try setting it to 30 seconds and see if the problem goes away.

When a Gen 3 device connects to the cloud it uses DTLS over UDP, and your network router/firewall sets up a temporary port mapping to allow return packets from the cloud back to the device. How long this temporary port mapping stays active is site-dependent.

If this is the problem, then it probably won’t help to check if Particle.connected is true, because it will be true until the publish is tried and fails, because the port mapping went away. Likewise, you probably still have Ethernet link (Ethernet.ready), so that won’t help either.
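The timing relationship can be made concrete with a small check. This is a minimal sketch in plain C++, not Device OS code: the helper name keepAliveHoldsMapping and the 5-second margin are hypothetical, and the timeout values mentioned afterwards are only illustrative. It captures one idea: the keep-alive interval has to be comfortably shorter than the router's UDP mapping timeout, otherwise the mapping can expire between packets.

```cpp
#include <cstdint>

// Hypothetical helper: returns true when a keep-alive sent every
// keepAliveSec seconds should refresh a UDP port mapping that the
// router drops after udpTimeoutSec seconds of silence.
// marginSec is an assumed safety allowance for jitter and packet loss.
bool keepAliveHoldsMapping(uint32_t keepAliveSec,
                           uint32_t udpTimeoutSec,
                           uint32_t marginSec = 5) {
    // The keep-alive plus the margin must be shorter than the timeout.
    return keepAliveSec + marginSec < udpTimeoutSec;
}
```

With these assumptions, a 30-second keep-alive against a 60-second mapping timeout passes the check, while the same keep-alive against a 30-second timeout does not. On the device, the interval itself is what Particle.keepAlive sets.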

Hello @rickkas7, I will use Particle.keepAlive and report back.

But I am wondering why the device loses connectivity entirely, including to the local MQTT server. Isn’t keepAlive only related to the connection to the cloud?

It depends on what else you have in your loop. If you have any other Particle.publish() calls in the loop itself (the async one is OK), or calls like Cellular.RSSI() or Cellular.command(), those calls will block when the cloud connection is lost. That stops the loop from running, so your MQTT code would not run.
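Because anything that blocks stalls the whole loop, one option is to rate-limit reconnect attempts so that a connect call which can block while it times out is tried at most once per interval rather than on every pass. The following is a minimal sketch in plain C++ under that assumption. RetryGate is a hypothetical helper, not part of Device OS or any MQTT library.

```cpp
#include <cstdint>

// Hypothetical helper: allows an action at most once per intervalMs.
// Unsigned subtraction keeps the comparison correct when a 32-bit
// millisecond counter such as millis() wraps around.
class RetryGate {
public:
    explicit RetryGate(uint32_t intervalMs) : intervalMs_(intervalMs) {}

    // Returns true when an attempt is allowed at time nowMs and
    // records that time; returns false while still inside the interval.
    bool due(uint32_t nowMs) {
        if (started_ && (nowMs - lastMs_) < intervalMs_) {
            return false;
        }
        started_ = true;
        lastMs_ = nowMs;
        return true;
    }

private:
    uint32_t intervalMs_;
    uint32_t lastMs_ = 0;
    bool started_ = false;
};
```

In a sketch like the one in this thread, the reconnect branch would then run only when the client is disconnected and the gate's due(millis()) returns true. The interval (for example 5000 ms) is an assumption to tune per site.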

@rickkas7 I have nothing like that going on in my loop (see below), and some devices just went offline again after setting keepAlive. I am wondering if the UDP timeout settings on my pfSense firewall are also contributing to this. They are currently set to:

udp.first                    60s
udp.single                   30s
udp.multiple                 60s

I am going to set them to the values below, in addition to setting keepAlive, to see what happens.

udp.first                   300s
udp.single                  150s
udp.multiple                900s

My loop:

int counter = 0;
char buf[64];

void loop() {

  MeasurementData sample;

  // SHT31D
  readTemperatureHumidity(sample);
  // SGP30
  setHumidityCompensation(sample);
  readAirQuality(sample);
  readBaseline(sample);
  refreshBaseline(sample);


  char payload[512];
  float tempF = (sample.temperature* 9) /5 + 32;
  long readtime = (long)Time.now(); // Unix time in seconds; long matches the %ld specifier below

  snprintf(payload, sizeof(payload),"{\"readtime\": %ld000,\"deviceID\":\"%s\",\"deviceLocation\":\"%s\",\"deviceName\":\"%s\",\"deviceType\":\"%s\",\"tempC\": %.1f,\"tempF\": %.1f,\"relative_humidity\": %.1f,\"tvoc\":%.1f,\"eco2\":%.1f}",readtime,(const char*)System.deviceID(),deviceLocation,deviceName,deviceType,sample.temperature,tempF,sample.humidity,sample.voc,sample.co2);


  if (millis() - step_timer > 4000) {
    step_timer = millis();

    oled.clearDisplay();
    oled.setCursor(0, 0);

    switch (step) { // Run the current "step"
      case 0:
        oled.setCursor(0, 0);
        oled.println(Time.format(Time.now(), "%a %b,%e"));
        oled.setCursor(0, 18);
        oled.println(Time.format(Time.now(), "%l:%M %p"));
        step = 1;  // Change "step" to the next step
        break;

      case 1:
        oled.setCursor(0, 10);
        snprintf(buf, sizeof(buf), "%.1f ", tempF);  // temperature
        oled.print("Temp "); oled.println(buf);
        step = 2;  // Change "step" to the next step
        break;

      case 2:
        oled.setCursor(0, 10);
        snprintf(buf, sizeof(buf), "%.1f ", sample.humidity);  // humidity
        oled.print("RH "); oled.println(buf);
        step = 3;  // Change "step" to the next step
        break;

      case 3:
        oled.setCursor(0, 10);
        snprintf(buf, sizeof(buf), "%.1f ", sample.voc);  // total volatile organic compounds
        oled.print("tVOC "); oled.println(buf);
        step = 4;  // Change "step" to the next step
        break;

      case 4:
        oled.setCursor(0, 10);
        snprintf(buf, sizeof(buf), "%.1f ", sample.co2);  // co2
        oled.print("eCO2 "); oled.println(buf);
        step = 0;  // Reset "step" to the first step to start over
        break;
    }
    oled.display();
  }

  // this keeps the mqtt connection live
  if (!client.isConnected()) {
    client.connect(MQTT_CLIENT_NAME, "xxx", "xxxxx");
  } else {
    client.loop();
  }

  // check cloud connection
  if (Particle.connected() == false) {
    Particle.connect();
  }

  unsigned long currentMillis = millis();

  if (currentMillis - previousMillis > Publishinterval) {
    previousMillis = currentMillis;

    if (Particle.connected()) {
      publishQueue.publish(eventName, payload, PRIVATE, WITH_ACK);
    }

    // send to mqtt broker
    if (client.isConnected()) {
      client.publish(eventName, payload);
    }
  }

}

During the day, a Wi-Fi connection can disconnect and reconnect several times. You need to give the Particle.connect() statement time to accomplish its task before moving on in the loop().

Instead, try this:

    // check cloud connection
    if (Particle.connected() == false) {
      Particle.connect();   
      //  After you call Particle.connect(), your loop will not be called again until the device finishes 
      //  connecting to the Cloud. Typically, you can expect a delay of approximately one second.
      waitFor(Particle.connected, 60000);
    }

hello @rickkas7 @robc

I implemented the suggested changes above and gave it a couple of days to see if it made a difference. I am still experiencing the random device restarts. All the devices occasionally go into a rapid cyan blink and then sometimes reconnect, or in some cases don’t reconnect at all. I have also now observed an occasional flash of red in between the rapid cyan. The device loses connectivity, even on the local LAN, and doesn’t publish the MQTT payload.

I have done the following:

  1. Increased the keepalive settings on my pfSense router from the defaults to something more generous.

  2. Changed the code as suggested by @robc:

  // this keeps the mqtt connection live
  if (!client.isConnected()) {
    client.connect(MQTT_CLIENT_NAME, "xx", "xxx");
  } else {
    client.loop();
  }

  // check cloud connection
  if (Particle.connected() == false) {
    Particle.connect();
    waitFor(Particle.connected, 60000);
  }

I also don’t have any of the things in my loop that @rickkas7 suggested could cause blocking. Here is the entirety of my code (GitHub gist). I hope I can resolve this.

PS @robc, I am using the Particle Ethernet FeatherWing, not Wi-Fi.

Hey, sorry, I missed that.