Electron on the move

Here’s a fun one to think about. I have two versions of a product that I am working on: a PoC using an Electron and an Alpha using a Boron. I have about 10 Electrons out in the field under test.

  • A PoC with an Electron was local and working flawlessly. Data was transmitted to the Particle cloud and then to other services via webhooks once every 5 minutes (the default sampling rate, which could increase to every 30 sec if warranted). This Electron PoC is located in an RV trailer.

  • The trailer moved to another metropolitan area about 60 miles away. From the console, I observed that the Electron was no longer communicating with the cloud.

  • Called the user. Electron was breathing Cyan. Had user reset the Electron (couple of times) and it handshaked its way to breathing Cyan each time.

  • Console revealed that the Electron was not communicating (no events recorded and vitals not updated)

  • Could not debug over the phone so drove to test site (60 miles each way). Observed breathing Cyan.

  • Reset Electron, breathing Cyan again but console did not acknowledge connection

  • Removed power from Electron and re-powered it. Electron started breathing Cyan and this time the console acknowledged that the Electron was connected

  • From vitals after reconnect: Good cellular signal, Operator: AT&T Wireless Inc., Access Technology: 3G. The operator when the PoC is local is also AT&T.

  • Possibility that during transit the Electron switched carriers because of different cell coverage (not confirmed, just an assumption, since I have experienced this myself with a mobile phone).

  • Latest Device OS: 1.2.1

So, the questions are

  • Why was the Electron breathing Cyan but the console did not acknowledge that the Electron was connected?
  • Why did the Electron need to be power-cycled? Was there something in memory (connection params) that was kept across reset operations and prevented further secure communication?
  • How can one easily predict and prevent this from happening? It is not easy to remove power in the PoC’s case to correct the problem, and power should not have to be removed if this is a device that is battery powered and runs 24/7. Electron F/W can check for a transmit error and power cycle (not reset) the device, but this is a clumsy patch and requires added external hardware.
  • Is this a bug in the OS F/W, or a reflection that Electrons cannot move about freely without having communication errors (roaming)?

This has raised a yellow flag for me since I can’t trust the Electron (and perhaps the Boron) to connect reliably when the device is mobile (i.e., able to move about). The original design was done using a Photon to minimize connection costs, and I never saw this issue with the Photon. I haven’t done sufficient testing with the Boron … testing gets $$$ when communicating via cellular.

Any ideas?

I have reported similar behaviour with Photons not realising connection loss (under some "obscure" circumstances) but 1.3.1-rc.1 did solve that issue (maybe even try 1.4.0-rc.1)

What SYSTEM_MODE() are you running?
Are you using SYSTEM_THREAD(ENABLED)? (you should)
Do you have any software checks for connection stability?
If not, you may want to add some soft power-cycling of the cellular module for when the device cannot actively talk to the cloud (e.g. Particle.publish() an event for which the device itself has set up a Particle.subscribe(), and check that it comes back).

While this should not happen nor be required, it's always best to have your own safety net in place.
Currently I can't find it, but @rickkas7 has once provided some snippet to deal with connection problems and make any project more resilient against lost connections.
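
To illustrate the loopback idea above (this is not the @rickkas7 snippet, just a minimal sketch): the timing logic can be kept separate from the Particle calls so it is easy to reason about. The class name `LoopbackMonitor`, its methods, and the interval values are all hypothetical, and the timestamps are assumed to come from `millis()`.

```cpp
#include <cstdint>

// Hypothetical helper: tracks whether an event published by the device
// comes back through its own subscription within a timeout.
class LoopbackMonitor {
public:
    LoopbackMonitor(uint32_t pingIntervalMs, uint32_t timeoutMs)
        : pingIntervalMs_(pingIntervalMs), timeoutMs_(timeoutMs) {}

    // Call from loop(). Returns true when it is time to publish a ping.
    bool shouldPing(uint32_t nowMs) {
        if (!started_) {
            // Start the echo timeout from the first ping.
            started_ = true;
            lastEchoMs_ = nowMs;
            lastPingMs_ = nowMs;
            return true;
        }
        // Unsigned subtraction stays correct across millis() rollover.
        if (nowMs - lastPingMs_ >= pingIntervalMs_) {
            lastPingMs_ = nowMs;
            return true;
        }
        return false;
    }

    // Call from the subscribe handler when the ping is echoed back.
    void onEcho(uint32_t nowMs) { lastEchoMs_ = nowMs; }

    // True when no echo has been seen for timeoutMs or longer.
    bool linkLooksDead(uint32_t nowMs) const {
        return started_ && (nowMs - lastEchoMs_ >= timeoutMs_);
    }

private:
    uint32_t pingIntervalMs_;
    uint32_t timeoutMs_;
    uint32_t lastPingMs_ = 0;
    uint32_t lastEchoMs_ = 0;
    bool started_ = false;
};
```

In firmware, `shouldPing()` would gate a Particle.publish() of a private event, `onEcho()` would be called from the matching Particle.subscribe() handler, and `linkLooksDead()` would trigger the modem power cycle. Every ping and echo uses cellular data, so the interval is a trade-off against cost.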

Something I do is that if my firmware detects network related issues, I perform a full modem power down via the firmware and then self-reset to bring it all back online. I’ve found that it fixes many cellular issues. That said, it seems like in a small number of cases, I struggle to get that resiliency without a watchdog power reset as well.
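
The escalation described here (modem power down plus self-reset first, watchdog power reset as the last resort) could be sketched as a small decision helper. This is an assumption-laden illustration, not the poster's actual firmware: `RecoveryLadder`, the `Recovery` enum, and the attempt limit are hypothetical names and values.

```cpp
#include <cstdint>

// Hypothetical recovery steps, ordered from least to most drastic.
enum class Recovery {
    None,                     // cloud reachable, nothing to do
    ModemPowerCycleAndReset,  // power the modem down in firmware, then self-reset
    ExternalPowerCycle        // let an external watchdog cut power
};

// Chooses the next step from the number of soft recoveries already tried.
class RecoveryLadder {
public:
    explicit RecoveryLadder(uint8_t maxSoftAttempts)
        : maxSoftAttempts_(maxSoftAttempts) {}

    Recovery next(bool cloudReachable) {
        if (cloudReachable) {
            failedSoft_ = 0;  // a healthy link clears the history
            return Recovery::None;
        }
        if (failedSoft_ < maxSoftAttempts_) {
            ++failedSoft_;
            return Recovery::ModemPowerCycleAndReset;
        }
        // Soft recovery has been exhausted; escalate to a real power cycle.
        return Recovery::ExternalPowerCycle;
    }

private:
    uint8_t maxSoftAttempts_;
    uint8_t failedSoft_ = 0;
};
```

Because the soft step ends in a reset, the failure counter would have to survive that reset (for example in retained memory) for the escalation to ever trigger. For `ExternalPowerCycle`, the firmware would typically just stop servicing the external watchdog and let it cut power.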