Excessive number of disconnect events on multiple devices

Robbie · May 10, 2019, 11:05am

We brought this issue to Particle’s attention about three months ago (having realized it had been going on since January). Nothing could be found. Last week’s cache problems didn’t cause this, but they might have exposed more devices showing this distinct behavior. We have over 20 Photons offline, because eventually they go offline and need a power cycle. Then we have 90+ Photons that show this online/offline behavior, sometimes 75-95 times in a two-hour period. It is also impacting the ability of our firmware to work as designed.

mstanley · May 10, 2019, 7:01pm

Hey everyone,

We’ll be looking into this more next week now that things are settling down on the UDP front.

Based on the reports so far, I am noticing a mix of behaviors. Some incidents indicate the Photon is actually going offline, whereas in other cases the Photon is reported offline but is still behaving as intended.

I anticipate there are a couple of different issues occurring here. For those whose Photons are still functioning despite being reported offline, I suspect this is just a reporting bug and may not indicate any issue with the device or application code itself. This is only a preliminary assessment, however.

We will be looking into all connectivity and status issues for TCP devices soon, though!

thrmttnw · May 10, 2019, 8:15pm

If it is useful to know: I have a Photon on 1.0.0 that has been doing the same thing for weeks. It runs on a breadboard with nothing connected, in order to monitor cloud connection stability.

It constantly reports offline/online in the console, e.g. twice per hour, but the console disconnect counter shows only 7 after more than 24 hours.

mstanley · May 10, 2019, 10:56pm

Hey everyone,

We identified the cause of the online and offline event issue and are testing a fix now. We are shooting to have it released early next week.

In some situations, Photons may incorrectly publish spark/status offline events despite the device being healthy and online.

This issue is an unintended side effect of work being done to improve the overall reliability of the online indicator across the Particle Cloud.

We are working on a more public announcement about the online indicator improvements that will include additional details about these events.
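For anyone who wants to gauge how often a device is being flagged offline, a rough tally of status events over a time window can help when comparing against what the device is actually doing. The snippet below is only a minimal sketch: it assumes you have already exported the status events into a list of `(timestamp, device_id, status)` tuples, and the function name `count_offline_events`, the tuple layout, and the device names are all hypothetical, not part of any Particle API.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_offline_events(events, window_start, window_length=timedelta(hours=2)):
    """Count "offline" status events per device inside one time window.

    `events` is an iterable of (timestamp, device_id, status) tuples.
    This layout is an assumption for illustration; adapt it to however
    you actually export your event log.
    """
    window_end = window_start + window_length
    counts = Counter()
    for timestamp, device_id, status in events:
        # Only count offline events that fall inside [window_start, window_end).
        if status == "offline" and window_start <= timestamp < window_end:
            counts[device_id] += 1
    return counts


if __name__ == "__main__":
    start = datetime(2019, 5, 10, 9, 0)
    sample = [
        (start + timedelta(minutes=5), "device_a", "offline"),
        (start + timedelta(minutes=6), "device_a", "online"),
        (start + timedelta(minutes=40), "device_a", "offline"),
        (start + timedelta(hours=3), "device_b", "offline"),  # outside the window
    ]
    for device, total in count_offline_events(sample, start).items():
        print(device, total)
```

A high count here does not say whether the device really dropped its connection or was only reported offline; that still has to be checked against the device's own behavior during the same window.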

arklabs_josh · May 11, 2019, 12:55am

I’m sorry, Stan, but that doesn’t explain our issues. During this time we are unable to send commands to or receive updates from our Photons, so an “incorrect status” doesn’t really explain away our problem. Our devices should be reporting their measurements, and they stop… and then resume after the new online status is updated.

mstanley · May 11, 2019, 1:08am

Hi @arklabs_josh

Please refer to the post just before my latest:

More than one type of issue is being acknowledged here. One issue concerns the status indicator for users whose devices continue to operate while reported offline. There are other, yet-to-be-determined issues that are acknowledged to be more than just improper status reporting.

As stated, we will be looking into all status and connectivity issues in the next week.

The status issues were quick to identify and quick to fix, which is why they are being addressed already. The question of why devices such as yours are actually going offline will take a bit more investigation. The intent is to focus on this at the start of next week.

We recognize there's still work left to be done. So no worries, this case isn't closed yet. We just need a bit of patience so that engineering has time to dig into this issue in-depth.

Robbie · May 14, 2019, 12:51am

Stan, are there any updates on this major issue?

mstanley · May 14, 2019, 2:15am

It’s my understanding that a fix for the online/offline status indicator was tested and is working. I am not yet certain whether it has been rolled out from staging to production. It should be soon if it has not been already.

Engineering is still investigating the other Photon connectivity issues. As I get more information, I’ll be sure to update.

mstanley · May 14, 2019, 2:45am

Would you be able to send me specific device IDs that are experiencing this issue?

If you are able, sharing your user application would also be helpful in this case. You may feel free to share both with me in a private, direct message.

arklabs_josh · May 14, 2019, 12:02pm

We’ve been working directly with Dave Blevins on this issue. If you can’t get the IDs from him, I’ll be happy to supply them.

ParticleD · May 14, 2019, 7:38pm

I sent Matt the list of devices a little while ago.

mstanley · May 14, 2019, 9:49pm

As mentioned above, Dave was able to share these with me. Much appreciated.

Mjones · May 26, 2019, 2:14am

Has there been any movement on this? I keep getting emails from Particle asking if my problem has been resolved, but replying doesn’t get me an actual response from them. I still have several devices with several hundred disconnect events, and some with over one thousand. My customers are starting to be affected, as I have one who can’t use two of their devices.

mstanley · May 26, 2019, 3:13am

Hi Michael. Would you be able to provide me with a few sample device IDs so that I can look into these on our end? Changes have gone out in the past few weeks to handle online status indication, as well as a fix for our 5/3 Redis crash incident. I'd like to dig into this a bit and see what might be going wrong here.

Mjones · May 26, 2019, 4:22am

Sent a few of them. I do have others; if you would like them all, I’ll sit down tomorrow and go through them.
