Particle Cloud to Electron communication breaks down when network is slow

jaza_tom · February 5, 2018, 1:53pm

We have a few dozen Electrons operating on a 3rd party (Zantel) network off the coast of Tanzania.

They are all running firmware version 0.6.1.

The Zantel network (like every other network in these areas, for that matter) is heavily overburdened, which results in moderate to severe network latency.

It would seem that when the latency is on the more severe end of the spectrum, the Particle Cloud API stops working properly with our fleet.

I have posted about this issue in another thread, but thought that this phenomenon should exist in its own dedicated thread.

Particle.function() calls from cloud to Electron "failing" (but not actually failing)

One example of how latency breaks cloud-to-Electron communication is calling a Particle.function() that has been exposed to the cloud. In my app, I have a cloud function which, when called, sets a flag that queues a Particle.publish(). This lets me tell whether a cloud function call succeeded, because a successful call triggers a subsequent Particle.publish() that is sent to the Particle servers.

Often (~10% of the time) when I call that function, the Particle console responds after about 10 seconds saying that the device could not be reached. Then, 5-10 seconds later, my device sends the triggered publish, which means the function call did in fact go through.

My app needs to know whether a function call was successfully executed, and the Particle API reports failure even for calls that succeeded, so I cannot rely on the API to report the outcome. I have therefore had to implement an ACK publish protocol that acknowledges Particle.function() calls. That costs program size (a big deal, since my program has already reached its maximum size) and extra cellular data for every Particle.function() call.

I think the solution here is to make the Particle Cloud API wait longer before giving up on a function call. I would recommend 20-30 seconds minimum based on the latency I have been observing.

Is this Particle.function() timeout something that can be changed for my product at the product level? If not, is this something the Particle development team is going to address at a system-wide level?
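For readers who want to see the caller's side of such an ACK scheme, here is a minimal sketch. It is not the code used in this fleet. The names `call_with_ack`, `call_function` and `ack_received` are hypothetical: `call_function` stands for whatever issues the cloud function call and returns True if the API reported success, and `ack_received` stands for whatever checks that the device's acknowledgement publish has arrived. The 30-second default simply mirrors the 20-30 seconds suggested above.

```python
import time
from typing import Callable


def call_with_ack(
    call_function: Callable[[], bool],
    ack_received: Callable[[], bool],
    ack_timeout_s: float = 30.0,
    poll_interval_s: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call a cloud function and decide whether it really ran.

    Returns one of:
      "confirmed_by_api"  - the API itself reported success
      "confirmed_by_ack"  - the API reported failure, but the ACK publish arrived
      "unconfirmed"       - neither happened within ack_timeout_s
    """
    if call_function():
        return "confirmed_by_api"

    # The API gave up, but on a slow network the call may still have reached
    # the device. Treat the outcome as unknown until the ACK window closes.
    deadline = clock() + ack_timeout_s
    while True:
        if ack_received():
            return "confirmed_by_ack"
        if clock() >= deadline:
            return "unconfirmed"
        sleep(poll_interval_s)
```

The point of the design is that an API error is treated as "unknown" rather than "failed". Blindly re-sending the call after a timeout could run the function twice on the device.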

Particle.variable get requests "failing"

Similar to Particle.function() calls timing out, many times (~50% of the time) when I try to query the value of a Particle.variable() on one of my devices, the result is an error message saying the device can't be reached. I believe this is the same problem: the Particle.variable() GET request is timing out. I have not yet been able to gather Serial debug logs proving that the variable GET request made it to the device, since I am not in Tanzania at the moment. I will try to do so.

Is this Particle.variable() timeout something that can be changed for my product at the product level? If not, is this something the Particle development team is going to address at a system-wide level?
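Until the timeout can be adjusted, one possible client-side stopgap is to retry the variable read. Unlike a function call, reading a variable has no side effect on the device, so repeating it is safe. The sketch below is only an illustration: `read_variable_with_retries` and `fetch` are hypothetical names, with `fetch` standing for whatever performs the GET and returns None when the API says the device could not be reached. Whether retries actually help under severe latency is an assumption that has not been tested on this network.

```python
import time
from typing import Callable, Optional


def read_variable_with_retries(
    fetch: Callable[[], Optional[str]],
    attempts: int = 3,
    pause_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Try a variable read up to `attempts` times.

    Returns the first non-None value from `fetch`, or None if every
    attempt failed. Pauses `pause_s` seconds between attempts only.
    """
    for attempt in range(attempts):
        value = fetch()
        if value is not None:
            return value
        # No pause after the final attempt; there is nothing left to wait for.
        if attempt < attempts - 1:
            sleep(pause_s)
    return None
```

Each retry costs additional cellular data on the device side if the request does reach it, so the attempt count is worth keeping small.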

OTA updates failing due to network latency

(See link to other thread, above)

@zachary

KyleG · February 7, 2018, 1:12am

Let me ping someone who might be able to help. @rickkas7 or @blave, are you able to assist?

zachary · February 7, 2018, 9:00pm

I’m actually really happy you mentioned this. We often theorize about the possibility of slow connections to devices when trying to optimize for API responsiveness, but we have almost never had a real use case to point to. Thank you!

FYI @ctarwater per our recent conversation on long blocking API calls.

@jaza_tom we don’t currently have the ability to override these timeouts for particular products or devices, but that’s a great feature request. cc @jeiden @jberi

For now, your pub-sub workaround is the way to go. We’ll try to get this on the roadmap in 2018. As far as I can imagine, it will be a cloud-only feature, not dependent on a firmware version. It will likely be something you adjust in the console or, for a first MVP, just with an API call.

jaza_tom · February 8, 2018, 1:52pm

Awesome!

I’ll post back here if/when I catch my development device exhibiting the function call success/failure phenomenon and am able to invoke the test function. That should tell me if the timeout is due to my function handler or if it is purely a network latency phenomenon.

