Mysterious Resets on Photon and Electron

@aguspg Sweet!

So the new library was written specifically for the Electron to minimize data usage, and the same library is now also the default for the Photon?

@aguspg: just to be clear, the old library used HTTP and not UDP? Also, I second @RWB in that I'd like to know what the device-cloud protocol differences are between Photon and Electron.

@BobG yes, the old library used HTTP; the new one sends a custom format over a plain TCP connection.
@RWB yes, it's the same library for both the Electron and the Photon. In the background it opens a TCP connection, sends the request data in this custom format, and then closes the connection:

Particle/1.0|POST|your-token|core-id=>variable_name:value$context=context_value,variable_name:value$context=context_value,variable_name:value$context=context_value|end

For example, a device sending 2 variables would take 105 bytes per update:

Particle/1.0|POST|xxxxxIfa29Rk1mthb7CUAmFWXPJkBg|1a0035000347343138333038=>temperature:22,humidity:33|end
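For anyone who wants to check the byte count, here is a minimal Python sketch that builds a message in the layout quoted above and measures it. The function names `build_update` and `send_update` are made up for illustration and are not part of the library; the host and port are left to the caller because the thread does not state them, and nothing is assumed about what the server sends back.

```python
import socket


def build_update(token, device_id, values):
    """Format one update in the pipe-delimited layout quoted above.

    `values` is a list of (name, value, context) tuples; context may be None.
    """
    parts = []
    for name, value, context in values:
        item = f"{name}:{value}"
        if context is not None:
            item += f"$context={context}"
        parts.append(item)
    return f"Particle/1.0|POST|{token}|{device_id}=>{','.join(parts)}|end"


def send_update(host, port, payload, timeout=10.0):
    """Open a TCP connection, send the payload, then close the connection.

    Hypothetical helper: host and port are not given in the thread.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload.encode("ascii"))


payload = build_update(
    "xxxxxIfa29Rk1mthb7CUAmFWXPJkBg",
    "1a0035000347343138333038",
    [("temperature", 22, None), ("humidity", 33, None)],
)
print(payload)
print(len(payload.encode("ascii")), "bytes")  # 105 bytes
```

This reproduces the two-variable example exactly, so the 105 bytes is the application payload only, before any transport overhead.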

In the future, we could optimize this further with a binary format, or MQTT, to save even more traffic.

@aguspg: Thank you. This information is very helpful, although the device reset mystery remains a mystery. I am wondering: is this data sent because of a Particle.publish(), in response to a REST request to the Cloud for the values of cloud variables, as the return from a cloud function, or some combination of these? I am asking because there is another mystery that I posted on another thread.

Twice I encountered a situation where my Electron was reported to be online and the Particle Cloud exposed the device's Particle.variable()s and Particle.function()s, but a REST request for a function execution or for the value of one of these variables returned nothing. The Electron is at a remote site, so the first time I encountered this, I forced the device to reset by flashing code to it OTA, and then it worked correctly (but I lost data stored in a buffer in RAM). The second time it happened, I went to the site and observed that the device was breathing cyan, that the battery was fully charged and the external power was OK, and that the Electron was running my code just fine (I know because I have the D7 LED blinking some diagnostic information when I exercise the system), but it was not responding to REST requests for any cloud variables or function calls. I manually reset the Electron (via the RESET button) and all was well again (losing the usual buffered data, of course).

I don't believe in coincidences, so it is logical to assume that these mysteries are related: that something happened to the Particle Cloud over a period of about one week that caused all of this. I note that the resets occurred on both Photons and Electrons, but the inability to access cloud variables and functions happened only on an Electron. So perhaps the protocol change will help with the latter problem, or perhaps it was some server issue that is now (hopefully) resolved. I am happy to report no further issues since the reset on the evening of 3/28.

This is a known issue and has been reported and answered in the same way all over the place:
Since the Electron uses the connectionless UDP transport, the online-testing strategy that worked for the WiFi devices doesn't work for the Electron, and a new scheme hasn't been activated yet.
So don't "rely" on the status report (sometimes it even fails for WiFi devices).

The thing about the mystery resets has also been reported to Particle guru @bryce for a log check, but I haven't heard anything back yet.

@BobG: What timezone is that time posted in? If you PM me your device ids I can filter the logs even more to see what is going on.

@chipmonk Can you PM me a device id or two?

@aguspg: It is important to note that, for the Electron, data usage includes the IP and TCP overhead in addition to the payload data. With TCP there is a three-way handshake up front, plus acknowledgements, which together drastically increase the data usage.
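To put a rough number on that, here is a back-of-the-envelope Python sketch. The 20-byte figures are the minimum IPv4 and TCP header sizes without options; the count of nine packets (three for the handshake, one data segment, one ACK, four for teardown) is an assumption for a clean open/send/close cycle, not a measurement. Real connections usually carry TCP options on the SYN, may retransmit, and include whatever the server replies, so treat the result as a floor. `estimate_tcp_bytes` is a hypothetical helper name.

```python
IPV4_HEADER = 20  # bytes, minimum IPv4 header without options
TCP_HEADER = 20   # bytes, minimum TCP header without options


def estimate_tcp_bytes(payload_len, packets=9):
    """Rough bytes on the wire for open, send one payload, close.

    packets is an assumed count: 3 handshake + 1 data + 1 ACK + 4 teardown.
    """
    return payload_len + packets * (IPV4_HEADER + TCP_HEADER)


total = estimate_tcp_bytes(105)
print(total)                  # 465
print(round(total / 105, 1))  # 4.4
```

Under these assumptions the 105-byte update costs about 465 bytes per cycle, so the headers outweigh the payload several times over.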

@bryce: I am in the Pacific Timezone. I PM'ed you my device IDs.

I did some digging and I definitely see connection resets around that time frame. The device will reset itself if it cannot connect to the cloud several times in a row, and I'm guessing this is what happened. The root cause of the connection resets is not known yet, but it is likely load related, which we are addressing today by adding more capacity.

I apologize for the lack of detailed information here. We are constantly working to increase visibility into our systems so we can understand root causes when issues like this happen.

The Electron is more resilient to these kinds of issues since there is no persistent connection. The cloud can actually reset without the device knowing, as long as the device is not trying to communicate. Sessions and state information are restored, so the Electron is none the wiser.

Bryce,

thanks for looking into this issue.

ChipMonk

You're right @bryce, thanks. We'll do some tests with TCP to measure the actual data consumption.

The library doesn't use Particle.publish() or REST requests. It was written from scratch to keep it as light as possible and for us to be able to provide support for it.

@bryce: thanks for reporting this. My Electron did reset at this time as well. I also suspect that the previously reported lack of access to cloud functions and variables was a server issue, perhaps capacity related too. In any event, I have not seen any cloud problems since 3/28, and this is very encouraging.