Particle Boron Uplink Issue in Off-Grid Prototype

Hello everyone,

I'm working on a prototype device for an off-grid application, using Particle's Boron platform. Reliability is critical, as the device is in a remote location and difficult to access.

Here's the setup: The Particle Boron acts as a failover unit, monitoring key functions of a Control Electronics box. During initial testing, the system performed well for the first three days. However, on the third day, I encountered an unexpected issue.

Issue Description:

  • The Boron stopped sending uplink data to my dashboard for about 19 hours.
  • When inspected, the Boron's LED was flashing green (no white or other colors), indicating a problem with cellular network connectivity.
  • Power cycling the device resolved the issue within 15 seconds, and it resumed normal operation.

My Questions:

  1. Potential Causes: What could cause the Boron to suddenly stop sending uplink data and show a flashing green LED? Are there known issues with cellular connectivity or any other factors that might lead to this behavior?
  2. Automatic Reset Solution: What measures can I implement to ensure the device automatically resets or power cycles in case this issue recurs? I am looking for a way to minimize downtime, especially given the remote nature of the installation.

Any insights, suggestions, or similar experiences would be greatly appreciated. This is a crucial aspect of my project, and I want to ensure maximum uptime and reliability.

Device OS 5.6.0
Particle Boron LTE (~2019 edition)

This isn't the most elegant solution (I'd rather find the root cause), but I went ahead and added a watchdog check to my loop that tests whether the device is connected. If it stays disconnected for 30 minutes, the device resets automatically.

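// Declarations were not shown in the original post; these are assumed.
// 'retained' keeps the flag in backup RAM so it survives System.reset();
// a plain global would be re-initialized to false on every boot.
retained bool resetDueToConnectionLoss = false;
unsigned long lastConnectedTime = 0;
const unsigned long connectionTimeout = 30UL * 60UL * 1000UL; // 30 minutes in ms

void setup() {
    // Report the previous watchdog reset once the device is back up
    if (resetDueToConnectionLoss) {
        Particle.publish("ResetAlert", "Device was reset due to lost connection", PRIVATE);
        resetDueToConnectionLoss = false; // Clear the flag
    }
    .....
}

void loop() {
    if (Particle.connected()) {
        lastConnectedTime = millis(); // Remember when the cloud was last reachable
    } else if (millis() - lastConnectedTime > connectionTimeout) {
        // Set the flag before resetting
        resetDueToConnectionLoss = true;
        System.reset();
    }

    .....
}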
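
A side note on the comparison in loop(): millis() is an unsigned 32-bit millisecond counter that wraps after roughly 49.7 days, and the subtraction form used above keeps working across that wrap. Here is a minimal host-side sketch of why, in plain C++ with no Particle APIs. The name timedOut is a hypothetical helper for illustration, not something from the firmware.

```cpp
#include <cstdint>

// Hypothetical helper: true once more than timeoutMs has elapsed since lastMs.
// Unsigned subtraction wraps modulo 2^32, so the elapsed time stays correct
// even when the counter has rolled over between the two readings.
bool timedOut(uint32_t nowMs, uint32_t lastMs, uint32_t timeoutMs) {
    return static_cast<uint32_t>(nowMs - lastMs) > timeoutMs;
}
```

Comparing absolute timestamps instead (for example, now > last + timeout) would misbehave near the wrap, which matters for a device that is meant to run unattended for months.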

@hansA ,

I have run into this issue a few times myself.
Sometimes the Boron fails to realize that the modem is connected and keeps "flashing green" forever, even though the modem has a connection.

I need my devices to recover from these issues autonomously, so I got some advice from Particle that seems to be working well for me. When my devices try to connect, they start with a 3-minute timeout. If they don't connect within 3 minutes, they reset and then try for 5 minutes, and then for 7 minutes. This gives the devices ample opportunity to connect, and I have seen connection rates improve in my remote devices.
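
For readers who want to picture the escalating timeout, here is a minimal sketch in plain C++. It is not the code I actually run: connectTimeoutMs and the constants are hypothetical names, and only the 3, 5 and 7 minute windows come from the description above.

```cpp
#include <cstdint>

// Connection windows from the post: 3, then 5, then 7 minutes.
constexpr uint32_t kConnectWindowsMin[] = {3, 5, 7};
constexpr int kNumWindows = 3;

// Hypothetical helper: timeout in milliseconds for a given attempt (0-based).
// Returns 0 once every window has been used, meaning "stop trying for now".
uint32_t connectTimeoutMs(int attempt) {
    if (attempt < 0 || attempt >= kNumWindows) {
        return 0;
    }
    return kConnectWindowsMin[attempt] * 60u * 1000u;
}
```

On the device, the attempt counter would presumably need to live in a retained variable so that it survives the reset between windows.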

One caveat: this was the fix about a year ago. The issue may have been resolved since deviceOS@4.0.2, but it is relatively easy to try. I can provide sample code if you like.

Hope this helps,

Chip

Thanks for the quick reply. In my case the device was not connected, as it wasn't sending Particle.publish events to the Particle cloud. I'm curious about your watchdog: why do you use different time windows (3, 5, 7)? What happens after the third attempt?

@hansA ,

These were recommended to me by Particle support as many of my devices are in locations where there is poor connectivity. They are also remote so sending someone out to reset the device is problematic.

I also use an external watchdog but it is focused on resetting the device in case the main loop is blocked.

Here is the escalation:

  • Attempt to connect for 3 mins - reset on failure
  • Attempt to connect for 5 mins - reset on failure
  • Attempt to connect for 7 mins - reset on failure
  • Give up attempting to connect for this hourly reporting period. The report is stored in a queue and will be sent once connected. I took this approach so that a service outage at Particle or the carrier would not be as disruptive.
  • Repeat all the above for the second hourly period
  • After three hours, my back end service will notice the device's absence and trigger an alert to me
  • Also at three hours, the device will perform a power cycle reset for the Boron and its peripherals.
  • Then the hourly and three hourly pattern will repeat
  • After a day of not connecting, my back-end service opens a ticket on the device in Freshdesk which also triggers notification of the client that someone will need to physically inspect the device.
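
The device-side part of this escalation can be sketched as a small decision function. This is a simplified illustration in plain C++, not my production code. The names Action and afterFailedWindow are hypothetical, and it assumes the device simply waits (without a further reset) after the third failed window, which the list above does not spell out. The back-end alert and the Freshdesk ticket happen server-side and are left out.

```cpp
enum class Action {
    ResetAndRetry,  // soft reset, then try the next (longer) connection window
    QueueAndWait,   // keep the report queued until the next hourly period
    PowerCycle      // power cycle the Boron and its peripherals
};

// failedWindows: connection windows that have failed in this hourly period (1..3)
// failedHours:   consecutive hourly periods without a connection, including this one
Action afterFailedWindow(int failedWindows, int failedHours) {
    if (failedWindows < 3) {
        return Action::ResetAndRetry;   // 3 min -> 5 min -> 7 min ladder
    }
    if (failedHours > 0 && failedHours % 3 == 0) {
        return Action::PowerCycle;      // every third failed hour
    }
    return Action::QueueAndWait;        // give up for this reporting period
}
```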

It seems like a lot, but it has made the system more resilient to poor connectivity and outages without wasting too much battery life.

Hope this helps,

Chip

One thing I wanted to add after the stellar advice from Chip.
You can use the new hardware watchdog for resetting your device:

Best
