How to retain control of an Electron using Cellular.on and Cellular.off

Hi,

In my application I need to be able to turn the Cellular connection on and off, but I can never get it to work reliably. I've narrowed the code down to a small, easy-to-understand program that should just turn Cellular on, connect, and turn it off. The problem is that when there is no reliable cell tower to connect to, the Electron becomes uncontrollable and hangs. I can't tolerate that. I need it to fail gracefully; even if the failure takes many minutes, I need control back.

Below is the code I've been using for testing. It seems each time I run it by hitting the reset button, I get different behavior.

Behavior A:
The most common behavior is to get caught forever in "Cellular.connect()". The output looks like this:

Connect in 2
Connect in 1
Connect in 0
Connect
Cellular.on()
On complete in 7350
Cellular.connect()

The green light blinks forever, or at least for the hours I've left it blinking.

Behavior B:
It gets through "Cellular.connect()" and then prints some gibberish like this:

Cellular.connect()
Connect complete in 39492
Ceoff
...connection lost to /dev/tty.usbmodem641 ...

The LED goes green breathing, and then it hangs and never prints out a thing. If you look at the code, the line that says "Ceoff" should say "Cellular off". Somehow it looks like memory was corrupted, or calling Cellular.off while the Serial output buffer was still being sent screwed that up. The "connection lost" line comes from my terminal emulator which noticed that the USB port was dropped by the Electron.

Behavior C:
It gets completely around the cycle, completes the .on, .connect, and .off, and will even do it 2 or 3 cycles every 20 seconds, but then it starts doing either A or B until it hangs again.

BTW: I am using version 0.5.3-rc.2, although I've tried a handful of other firmware versions and got an equally bewildering set of behaviors that are never the same.

Thank you for any help. Please tell me what I'm doing wrong, or how to keep this program responsive. Or perhaps some other things I can try.

#include "application.h"

SYSTEM_MODE(MANUAL);

#define TRY_FREQUENCY (20UL * 1000UL)
uint32_t timeOfLastSend;

void setup()
{
  Cellular.off();
  Serial.begin(9600);       // USB
  delay(4000);
  timeOfLastSend = millis() - TRY_FREQUENCY + 10000; // Start 10 seconds behind
  Serial.printf("Starting celltest\n");
}

// Runs one on/connect/off cycle and prints how long each step took.
void connect()
{
  unsigned long timer;  // same type that millis() returns

  Serial.println("Cellular.on()");
  timer = millis();
  Cellular.on();
  Serial.printf("On complete in %lu\n", millis()-timer);
  delay(1000);

  Serial.println("Cellular.connect()");
  timer = millis();
  Cellular.connect();
  Serial.printf("Connect complete in %lu\n", millis()-timer);
  delay(1000);

  Serial.println("Cellular off");
  timer = millis();
  Cellular.off();
  Serial.printf("Off complete in %lu\n", millis()-timer);
  delay(1000);
}

int countdown = 0;

void loop()
{
  int now = millis();
  if (now - timeOfLastSend > TRY_FREQUENCY) {
    Serial.println("Connect");
    connect();
    Serial.println("Connect complete");
    timeOfLastSend = millis();
  } else {
    int secsLeft = (TRY_FREQUENCY - (now-timeOfLastSend)) / 1000;
    if (secsLeft != countdown) {
      countdown = secsLeft;
      Serial.printf("Connect in %d\n", countdown);
    }
  }
}

Have you seen this thread?


It looks like a TX buffer overrun, which should not happen, since the default behaviour of Serial.print() is to block when there is no free space left:
https://docs.particle.io/reference/firmware/electron/#blockonoverrun-

Unless some buggy code messes up the head/tail pointers.

Just for completeness, your int now = millis() should actually be unsigned.
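
For anyone wondering why the type matters: below is a minimal host-side sketch, assuming a 32-bit millisecond counter like the one millis() returns. The helper names elapsedSince and periodElapsed are made up for this illustration and are not part of the Particle API. It shows that unsigned subtraction keeps an interval check correct even when the counter rolls over.

```cpp
#include <cstdint>

// Hypothetical helper (not a Particle API): elapsed milliseconds between
// two readings of a 32-bit tick counter. Unsigned subtraction is modulo
// 2^32, so the result is still correct if the counter rolled over between
// 'start' and 'now'.
uint32_t elapsedSince(uint32_t start, uint32_t now) {
    return now - start;
}

// True once at least 'periodMs' milliseconds have passed since 'start'.
bool periodElapsed(uint32_t start, uint32_t now, uint32_t periodMs) {
    return elapsedSince(start, now) >= periodMs;
}
```

With a signed int the same subtraction can go negative or overflow once the counter gets large, which is why the unsigned form is the safer habit.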


As a side note:
Since the actual TX doesn't happen inside the Serial.print() call but asynchronously to your code, Cellular.off() (and its effects) might be the source of the TX corruption, which might be something to investigate (@BDub ?)

@ScruffR, just a note - millis() returns an unsigned long (aka uint32_t).

The biggest issue I see with @rvnash’s code is that there is no waiting for actions to complete. So when doing a Cellular.connect(), there should be a loop waiting for Cellular.ready() to return true. As it stands, the delay will most likely not be enough. The sample code posted by @rickkas7 is an excellent set of tools.
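
As a rough sketch of what "waiting for the action to complete" could look like with an upper bound on the wait: the helper below is hypothetical (pollUntil is not a Particle function) and is written so it can run on a desktop with a fake clock. On the device, the ready callback would wrap Cellular.ready() and the clock callback would wrap millis().

```cpp
#include <cstdint>
#include <functional>

// Hypothetical helper: poll 'ready' until it returns true or until
// 'timeoutMs' has elapsed according to 'nowMs'.
// Returns true on success and false on timeout.
bool pollUntil(const std::function<bool()>& ready,
               const std::function<uint32_t()>& nowMs,
               uint32_t timeoutMs) {
    const uint32_t start = nowMs();
    while (!ready()) {
        if (nowMs() - start >= timeoutMs) {
            return false;  // give control back to the caller
        }
    }
    return true;
}
```

Note that this only bounds the waiting done in user code. It cannot help if a firmware call such as Cellular.connect() blocks internally and never returns.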

Thanks @peekay123 and @ScruffR,

@peekay123, the only call I make after Cellular.connect is Cellular.off. Are you suggesting that you can’t call Cellular.off until Cellular.ready is true? If so, that’s not documented anywhere that I read.

Even so, I’ve actually tried that. I’ve tried THREADED mode. I’ve tried putting waitFor and/or waitUntil loops in there, as you suggest. In fact I’ve tried dozens of variations as I’ve been working on this for a few weeks. I’ve disconnected the battery many times. This example I gave here was basically the simplest thing I could think of to isolate the problem. None of my attempts have resulted in predictable stable behavior. The same code behaves differently in each run. However it always eventually crashes (disconnects the USB) or hangs (never returns) at some point. The corruption of the Serial USB output is the least of my concerns, but I thought it might be indicative of some firmware memory corruption going on.

My conjecture is that either my Electron is faulty (I only have one to try), or the firmware is buggy, or the documentation is incomplete. All I want to do is be able to turn off and on the Cellular functionality, and not have the code hang or crash.

I’ve taken a look at the sample code from @rickkas7, but it seems to be about connecting after waking. I’m not dealing with the sleep/wake problem, nor am I using Particle.connect, so much of it seems to unnecessarily complicate my very simple on/off problem.

Like I said, thanks for the reply, but it doesn’t seem to be getting me anywhere. I’m willing to do the experiments, but is anyone at Particle willing to run my code and tell me why it crashes and/or hangs?

Yes, and unsigned int is exactly that, isn't it? (It's a 32-bit µC.)

There actually is no difference between int and long on these processors.

I tried this code and it works fine for handling timeouts during connecting. I used an Electron with a u.fl to SMA connector on the antenna, so I could easily disconnect the antenna and it seems to work fine for me.

#include "Particle.h"

SYSTEM_MODE(MANUAL);
SYSTEM_THREAD(ENABLED);

// How often to check in milliseconds
const int CONNECT_CHECK_PERIOD_MS = 120000;

// How long to wait before considering a connect to have failed
const int CONNECT_TIMEOUT = 60000;

// How long to stay connected once we successfully connect
const int STAY_CONNECTED_TIME_MS = 2000;

// Finite state machine states
enum {
	STATE_START,
	STATE_CONNECTING,
	STATE_READY,
	STATE_DISCONNECT_WAIT,
	STATE_DISCONNECT,
	STATE_RETRY_WAIT
};

// Current state and the timestamps used for the timeouts
int state = STATE_START;
unsigned long stateTime = 0;
unsigned long startTime = 0;

void setup() {
	Serial.begin(9600);

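	// This waits 10 seconds before doing the first connect so you have time to enable the serial monitor.
	// stateTime is backdated so that STATE_RETRY_WAIT expires 10 seconds from now (unsigned arithmetic wraps safely).
	state = STATE_RETRY_WAIT;
	stateTime = millis() - (CONNECT_CHECK_PERIOD_MS - 10000);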
}

void loop() {
	switch(state) {
	case STATE_START:
		Serial.println("connecting...");
		startTime = stateTime = millis();
		Cellular.on();
		Cellular.connect();

		state = STATE_CONNECTING;
		break;

	case STATE_CONNECTING:
		if (Cellular.ready()) {
			// Cellular connection established and have an IP address
			state = STATE_READY;
		}
		else
		if (Cellular.connecting()) {
			if (millis() - stateTime >= CONNECT_TIMEOUT) {
				Serial.println("timeout connecting");
				state = STATE_DISCONNECT;
			}
		}
		else
		if (Cellular.listening()) {
			// This usually happens if you have no SIM card
			Serial.println("listening mode");
			state = STATE_DISCONNECT;
		}
		break;

	case STATE_READY:
		Serial.printlnf("connected successfully %lu", (millis() - startTime));
		state = STATE_DISCONNECT_WAIT;
		stateTime = millis();
		break;

	case STATE_DISCONNECT_WAIT:
		// In this state, the cellular connection is up, you can do IP stuff here
		if (millis() - stateTime >= STAY_CONNECTED_TIME_MS) {
			state = STATE_DISCONNECT;
		}
		break;

	case STATE_DISCONNECT:
		Serial.println("disconnecting...");
		Cellular.disconnect();
		Cellular.off();
		state = STATE_RETRY_WAIT;
		stateTime = millis();
		break;

	case STATE_RETRY_WAIT:
		if (millis() - stateTime >= CONNECT_CHECK_PERIOD_MS) {
			state = STATE_START;
		}
		break;
	}
}



Hi @rickkas7,

That’s exciting to hear that this code works! I will definitely try it on my Electron when I get the chance. Perhaps the difference is that you let loop() return while waiting for Cellular.ready(), whereas I was holding it in a waitFor() loop.

Can someone confirm that Cellular.off() will not work if the Electron is in green flashing “Looking for Internet” mode?
The first time I try it (in flashing green mode), I can get it to white breathing (cellular off) mode for a few seconds.
Then it goes back to flashing green by itself after a couple of seconds and won’t respond to Cellular.off() at all from then on.

I’m trying to put some fault tolerance in my app to reset the cellular module completely, but it seems that once it is in “Looking for Internet” mode, the app has no control over the cellular module.

I have
SYSTEM_MODE(MANUAL);
SYSTEM_THREAD(ENABLED);
in my code. I’m not using the Particle cloud, so I’m ultimately just looking for breathing green.

Update: after a bit more testing, the issue is that while in the connecting state, any Cellular.off() and Cellular.disconnect() calls are blocked and actually seem to be queued up, which makes for even messier results once connected. We really need a way to keep the cellular connecting state from blocking the app.

@rvnash I would be highly interested in knowing if and how you resolved this issue with cellular connection cycling. I am considering using the Electron in production, but I’m a little concerned about what you have experienced.

Thank you,

@TimHockley did you manage to solve this issue somehow? I’m experiencing similar problems: when the Electron loses connectivity due to poor signal and stays in that area for a longer period of time, it just keeps flashing green after going back into an area with good signal and is unable to reconnect to the cloud. I’m also unable to call Cellular.off to try and reboot the GSM module. The only solution I have found so far is manually resetting the device, which really is sub-optimal for my application.

When your device is flashing green, it is trying to connect. So it would be advisable to first end any of these connection attempts via Cellular.disconnect(), and only call Cellular.off() once Cellular.connecting() == false. Since the module needs some time to comply with the network demands, some blocking behaviour is to be expected.

All right. I’ll try adding

Cellular.disconnect();
waitUntil(notConnecting);
Cellular.off();

with

bool notConnecting() {
    return !Cellular.connecting();
}

Will let you know if it works.
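
The snippet above waits with no upper limit. A variant with a bounded wait might look like the sketch below. Everything here is an assumption for illustration: ModemOps and shutdownModem are hypothetical names, and the interface only stands in for Cellular.disconnect(), Cellular.connecting(), Cellular.off() and millis() so that the sequence can be exercised on a desktop.

```cpp
#include <cstdint>

// Hypothetical interface standing in for the modem calls discussed above.
// On the device these would forward to Cellular.disconnect(),
// Cellular.connecting(), Cellular.off() and millis().
struct ModemOps {
    virtual void disconnect() = 0;
    virtual bool connecting() = 0;
    virtual void off() = 0;
    virtual uint32_t nowMs() = 0;
    virtual ~ModemOps() = default;
};

// Disconnect, wait (bounded) for the connection attempt to end, then power
// off. Returns true if connecting() cleared before the timeout. off() is
// called either way, so the caller always gets control back.
bool shutdownModem(ModemOps& m, uint32_t timeoutMs) {
    m.disconnect();
    const uint32_t start = m.nowMs();
    bool cleared = true;
    while (m.connecting()) {
        if (m.nowMs() - start >= timeoutMs) {
            cleared = false;
            break;
        }
    }
    m.off();
    return cleared;
}
```

The return value lets the caller decide what to do when the modem never left the connecting state, for example logging it or trying a reset. Whether the firmware honours off() in that situation is a separate question.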

Nope. I’ve reduced my issue down to just the Cellular.off() command, which doesn’t seem to work as advertised. It looks like background processes turn the modem back on instead of waiting for a Cellular.on() command. Even worse, it seems to do strange things if you keep calling Cellular.off() multiple times. The folks in support admit there’s an issue and are working on it. So for now it appears there’s no way to do a manual cellular reset, which leaves me with an occasional lock-up that the background process is not able to handle and recover from. It seems Cellular.off() just needs to shut down any background cellular processes, which doesn’t happen currently. I think I tried waiting on the not-connecting status as well, and that didn’t do any good for me.


Thanks for your reply @TimHockley. I did try the solution above, which didn’t really solve anything. Have you looked into bypassing Particle’s own functions and using AT commands (Cellular.command) instead?

Well, this might be a stupid question, but it’s been a while since I’ve looked at this.
Would System_Thread(disabled) disable all the background cellular processing, i.e. would the device boot with cellular off?
Would I then be able to control the cellular module manually without background system processes interfering?

If anyone is searching, here are my two cents on how I worked around this issue.

It seems the backend system firmware gets confused when you turn on, connect, disconnect, and reconnect really quickly, to the point of making the system useless.

From observation, I found that Cellular.connecting() doesn’t truly reflect whether you are actually connected. It only connects once a signal-strength check with AT+CSQ shows that you have actually got a signal.

So check for that.
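
A minimal sketch of such a check, assuming the modem answers AT+CSQ with the standard "+CSQ: <rssi>,<ber>" line, where an rssi of 99 means "not known or not detectable". The function name hasSignal is made up for this example. On the device, the line would come from the response callback passed to Cellular.command(), which is not shown here.

```cpp
#include <cstdio>

// Hypothetical helper: extract <rssi> and <ber> from a "+CSQ: <rssi>,<ber>"
// response line. Returns true only if the line parsed and rssi is not 99,
// the value that means "not known or not detectable".
bool hasSignal(const char* line, int& rssi, int& ber) {
    if (line == nullptr) {
        return false;
    }
    // The leading space in the format skips any "\r\n" before the response.
    if (std::sscanf(line, " +CSQ: %d,%d", &rssi, &ber) != 2) {
        return false;
    }
    return rssi != 99;
}
```

The caller could then hold off on connecting or on data transfer until hasSignal() has returned true.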

Another thing I noted is that, for some reason, the backend system firmware wants to resume rather than reinitialize the MDMParser, so it tries to reconnect to the network when the modem is switched on instead of waiting for you to tell it to. This needs to be fixed so that if you power off and power on again, it reinitializes everything rather than resuming where you left off. If we want to resume, shouldn’t we have a hibernate or sleep method instead?
