Particle cloud disconnects every 2 minutes

I just started using Particle Cloud today. Previously I was using my own MQTT connection to AWS and not connecting to Particle Cloud at all. I need to switch my devices to using Particle Cloud until I resolve a bug in my own code.

I’m using SYSTEM_MODE(MANUAL) and a third party SIM (Hologram) on Boron LTE. In setup() I connect to cellular, and then call Particle.connect(). That works fine, and events publish as expected. Approximately every 2 minutes, I get this error:

0001125075 [comm.protocol] ERROR: Event loop error 1
0001125075 [system] WARN: Communication loop error, closing cloud socket

The Boron rapidly flashes cyan and reconnects. I have keepalive set at 100 seconds, am using SYSTEM_THREAD(ENABLED), and do not call Particle.process() anywhere in my code.

My loop runs quickly, except when it’s reading sensors, when it takes about 3-4 seconds to complete. Sensors are read every 20 seconds, and the disconnects are very regular, at 2 minutes and 10-15 seconds apart.

I doubt this is expected behavior. Any ideas what would cause this? I haven’t found anything yet on the forum that seems similar, but I’ll keep searching.

[Screenshot from the console]

EDIT: Forgot to say I’m using DeviceOS 1.4.0.

@picsil, with 3rd party SIMs you need to set the KeepAlive to a short time, most likely less than 2 minutes as you have observed. The default for Particle partners is 23 minutes and some 3rd party providers need values down to 30 seconds!
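
As a rough illustration of the rule of thumb above: the keepalive interval has to be shorter than the carrier's idle timeout for UDP traffic, with some margin. The sketch below is plain C++ with hypothetical helper names (keepAliveSeconds, survivesIdleTimeout). The 60-second timeout and 30-second margin used with it are example values, to be replaced with whatever your SIM provider documents.

```cpp
#include <cassert>

// Hypothetical helper: choose a keepalive interval (seconds) that stays below
// the carrier's UDP idle timeout by a safety margin.
unsigned keepAliveSeconds(unsigned idleTimeoutSec, unsigned marginSec) {
    assert(marginSec < idleTimeoutSec);  // margin must leave a positive interval
    return idleTimeoutSec - marginSec;
}

// True if pings sent every keepAliveSec arrive before the idle timeout expires.
bool survivesIdleTimeout(unsigned keepAliveSec, unsigned idleTimeoutSec) {
    return keepAliveSec < idleTimeoutSec;
}
```

With these example numbers, keepAliveSeconds(60, 30) gives 30, which is the value that would then be passed to Particle.keepAlive(). A 100-second keepalive would fail the survivesIdleTimeout check against a 60-second timeout.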

I set keepalive to 60 seconds and still got the disconnects. Worse, it now stops publishing completely and stops responding to pings. The device is still running, as I can see in the serial monitor, and Particle.connected() still returns true even though it’s unresponsive. It’s breathing cyan while it is unresponsive. I’ve also disabled SYSTEM_THREAD and started calling Particle.process() in my loop. No change in behavior.

I’ve just set the keepalive to 30 seconds. I’ll know in about 30 minutes if it still has the same behavior.

Should keepalive be set before or after Particle.connect()?

@picsil, you should set KeepAlive as early as possible in your code, since it is what keeps the cellular link to the Particle Cloud open. Also, you may want to upgrade to 1.4.2 and test with just Tinker running. If that doesn’t drop the connection, then add SYSTEM_MODE(MANUAL) and your connection code to see if that works.
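
Since the thread does not settle whether a keepalive set before Particle.connect() survives a reconnect, one defensive option is to set it early and also re-apply it each time the cloud connection comes up. The sketch below is plain C++ and only detects the disconnected-to-connected transition. ConnectEdge is a hypothetical name; on the device, the flag passed in would come from Particle.connected(). Whether re-applying is actually needed on a given Device OS version is not established here.

```cpp
// Hypothetical helper: reports the moment the cloud link goes from
// disconnected to connected, so a setting can be re-applied once per connect.
class ConnectEdge {
public:
    // Call once per loop() pass with the current connection state.
    // Returns true only on the first pass after a (re)connect.
    bool update(bool connectedNow) {
        const bool rising = connectedNow && !wasConnected_;
        wasConnected_ = connectedNow;
        return rising;
    }

private:
    bool wasConnected_ = false;
};

// On a device (assumption, not compiled here):
//   static ConnectEdge edge;
//   if (edge.update(Particle.connected())) Particle.keepAlive(30);
```

The helper fires once per reconnect rather than on every loop pass, so the keepalive call is not repeated needlessly.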

Thanks. I’ll give that a try.

I made a few changes that seem to have helped. I’m not sure which one helped, or if it’s a combination: I’m no longer using the system thread, I set the keepalive time to 30 seconds, and I now set the keepalive before the cellular connection is made. I’ve now had three successful publishes 20 minutes apart without any cloud disconnects. I’ll keep monitoring.

I was planning on deploying this batch of devices on Monday, so it looks like I’ll be delayed in order to test a bit longer. It’s unfortunate (and completely my fault) that I didn’t catch my bug sooner. This does give me some hope that I can delay the deployment by a day or two, and not indefinitely. The devices are going to be installed 500 miles away, so I want to be sure they’ll work consistently and accept OTA updates. I’m not entirely comfortable with only testing for a day, but I can’t delay much longer. If I have to make another drive out to the pilot site soon, so be it. Hopefully this fixes things in the short term.


I found a similar thread on the Hologram forum, confirming that their UDP timeout is 60 seconds. The device can still send UDP packets after the timeout, but can’t receive them. I need to figure out a good way to test that out, and not feed my hardware watchdog if it doesn’t have bidirectional communication.
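
One way to approach the watchdog idea above is to feed the watchdog only while inbound traffic has been seen recently. The sketch below is plain C++ with a hypothetical InboundGate class. It assumes that something on the device records each message received from the cloud (for example, the handler for an event the device subscribes to) and that timestamps come from a millisecond counter such as millis(). The silence limit is an arbitrary value you would choose yourself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: allows watchdog feeding only while a message from the
// cloud has been received within the last maxSilenceMs milliseconds.
class InboundGate {
public:
    InboundGate(std::uint32_t startMs, std::uint32_t maxSilenceMs)
        : lastInboundMs_(startMs), maxSilenceMs_(maxSilenceMs) {
        assert(maxSilenceMs > 0);
    }

    // Call whenever a message from the cloud actually arrives.
    void noteInbound(std::uint32_t nowMs) { lastInboundMs_ = nowMs; }

    // True while the last inbound message is recent enough to trust the link.
    // Unsigned subtraction keeps this correct across counter wraparound.
    bool mayFeedWatchdog(std::uint32_t nowMs) const {
        return static_cast<std::uint32_t>(nowMs - lastInboundMs_) <= maxSilenceMs_;
    }

private:
    std::uint32_t lastInboundMs_;
    std::uint32_t maxSilenceMs_;
};
```

In loop(), the hardware watchdog would be fed only when mayFeedWatchdog(millis()) returns true. If the inbound path dies silently while outbound publishes still appear to work, the gate closes and the watchdog eventually resets the device.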
