Blocking Particle.connect() and publish()

You can find watchdog chips with longer pet intervals if you do not want to wake up every 5 minutes to signal it.

My understanding from the comments in your other post was that this would best be accomplished with MANUAL mode:

But you have reasons to not want to modify your code for Manual Mode.

So, if you use an external watchdog, the Electron can still burn through battery for the full length of your timer interval after a code crash (a 1-hour cycle would be too long, especially considering the next connection attempt could also fail or lock up).

If you perform 5 min Deep Sleep with an external watchdog as @RWB suggests, the only thing the Electron would do every wake cycle would be to Pet the H/W watchdog, increment a counter, and enter another 5 min Deep Sleep until it's time to publish. That seems realistic without requiring too much modification of your Code and shouldn't take too many milliseconds of precious battery power.
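The per-wake bookkeeping described above could look roughly like this. It is only a sketch of the counting logic: `WakeDecision` and `onWake` are made-up names, the watchdog pet and the Deep Sleep call are left out, and on a real device the counter would have to live in memory that survives Deep Sleep.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper types; not part of any Particle API.
struct WakeDecision {
    bool publishNow;       // true when the long publish interval has elapsed
    uint32_t nextCounter;  // value to store before going back to sleep
};

// Called once per short wake, right after petting the external watchdog.
// cyclesPerPublish = publish interval / wake interval, e.g. 60 min / 5 min = 12.
WakeDecision onWake(uint32_t counter, uint32_t cyclesPerPublish) {
    assert(cyclesPerPublish > 0);
    const uint32_t next = counter + 1;
    if (next >= cyclesPerPublish) {
        return {true, 0};   // time to connect and publish; restart the count
    }
    return {false, next};   // nothing to do except sleep again
}
```

With a 5 min wake and a 1 h publish interval, every 12th wake reports `publishNow` and the other 11 go straight back to sleep.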

I guess you have to decide what's the power cost of waking at a short interval to kick the external watchdog, versus the power lost during a LONG interval while your Electron is "locked-up". But either way, the external watchdog "should" eliminate the need to manually reset the Electron.
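To put rough numbers on that trade-off, something like the following could be used. The struct, the function names and every current figure are placeholders, not measured Electron values; plug in your own measurements.

```cpp
#include <cassert>

// All fields are assumptions to be replaced with measured values.
struct WatchdogBudget {
    double wakeCurrent_mA;    // current drawn while awake to pet the watchdog
    double wakeDuration_s;    // time spent awake per pet
    double lockupCurrent_mA;  // current drawn while the device is stuck
};

// Charge spent per day just on waking up to pet the watchdog.
double petOverheadPerDay_mAh(const WatchdogBudget& b, double intervalMin) {
    assert(intervalMin > 0.0);
    const double wakesPerDay = 24.0 * 60.0 / intervalMin;
    return wakesPerDay * b.wakeCurrent_mA * b.wakeDuration_s / 3600.0;
}

// Charge lost in the worst case: the device locks up right after a pet and
// stays stuck until the watchdog interval expires.
double worstCaseLockup_mAh(const WatchdogBudget& b, double intervalMin) {
    assert(intervalMin > 0.0);
    return b.lockupCurrent_mA * intervalMin / 60.0;
}
```

A shorter interval raises the first number and lowers the second, so the interval to pick depends on how often lock-ups actually happen in the field.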

Or go Manual & Threaded using waitFor(Particle.connected, connectionFail) and avoid a hardware revision :sunglasses:

Switching to Manual mode is easier than adding more HW to my design, so I would favour that, but I wasn’t 100% sure whether Manual mode is a completely failsafe protection against blocking code. Reliability is key for my application. I believe Particle.connect() and Particle.publish() are the only two blocking commands that cause issues for me right now, but it’s hard to be sure.

I give my devices up to 3 min. to connect, so a 5 min. timer seems appropriate to me. A timer of 1 h is too long if the code blocks more than just a few times per year. Like you said, my idea would be to have 2 timers in my FW: one that fires every x min. to pet the HW watchdog, and one that fires every 1 h to publish.

But 5 min. is an arbitrary number right now. I’ll have to do the calculations to see what the optimal number is power-wise.

Some of the Elites would have to speak to that, but my guess is that an external watchdog would be the only way to approach 100%. For me, though, Manual/Threading has worked for a few projects that were extremely sensitive to wasting precious battery (primary, no recharging available). But then again, I don't have 50 Electrons running Manual/Threaded for a decent sample size.

I don't know what all your code does, but what impact would using Manual/Threading with no timers, no ApplicationWatchdog and no external watchdog have? It would just be a 1 hour deep sleep, wake up, allow up to 3 minutes for a successful connection, then go back to deep sleep for 1 hour no matter what (the 1-shot approach). You could throw in a System.reset() every 24 hours or once a week.

But there will always be a tiny chance that the cell modem can get stuck in a funny state and continue to waste power. H/W watchdog would be the best thing that I can think of to mitigate that, but you will need to recognize that situation first (I'm not sure how to).

That would work great. The 1h Deep Sleep is exactly what my devices do right now, so if Manual can address the blocking issue essentially nothing else will need to change about my design.

I will definitely start switching my FW to Manual, and will likely implement the HW Watchdog too, for the sake of being 100% bulletproof (+ it looks like a fun mini-project).

So I did some calculations, and want to go with a setup that resets my device every 20mins (unless the dog was petted). The type of reset doesn't matter much, because my device should go into Deep Sleep mode right away anyway (unless the accelerometer actually detects movement).

These are the latest schematics from the thread you linked:

Schematic part 1: Circuit Protection - I already have this, using the TPS61099.
Schematic part 2: Power Control - I don't need extra Power buttons.
Schematic part 3: Carrier Board - I don't need a Temperature sensor or FRAM. Just the Watchdog.

Am I correct in thinking that the only part from this entire schematic I need would be the TPL5010 (Watchdog) with its 3 resistors (values appropriate for a 20 min. timer)?

Looks like all you need is this with the resistor combo that gets you the delay you're looking for.

Thanks, confirmed what I was thinking :slight_smile:

Just make sure you read the data sheet since there are usually some pretty good tips in them that you may miss otherwise.

I was curious if you (or anyone else) could explain to me if there’s any difference between these two pieces of code? With my original code, I would still have the occasional blocking Electron that required a manual reset to be done.

Original code:

SYSTEM_THREAD(ENABLED);
SYSTEM_MODE(MANUAL);

void loop() {
....
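    // Start the connection attempt only once
    if (!connecting) {
        Particle.connect();
        connecting = true;
    }
    // Publish as soon as the cloud connection is up
    if (Particle.connected()) {
        Particle.publish(publish, data, PRIVATE);
        ...
    }
    // Give up once 180000 ms (3 minutes) have passed since stateTime
    else if (millis() - stateTime >= 180000) {
        Cellular.off();
        delay(2000);
        trueReset();
        break;
    }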
...

Revised code:

SYSTEM_THREAD(ENABLED);
SYSTEM_MODE(MANUAL);

void loop() {
....
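    // Start the connection attempt only once
    if (!connecting) {
        Particle.connect();
        connecting = true;
    }
    // Wait up to 180000 ms (3 minutes) for the cloud connection
    if (waitFor(Particle.connected, 180000)) {
        Particle.publish(publish, data, PRIVATE);
        ...
    }
    // No connection within the timeout: power down and reset
    else {
        Cellular.off();
        delay(2000);
        trueReset();
        break;
    }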
...

The waitFor will time out and move to the else branch after 3 minutes.
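A simplified way to picture that timeout behaviour is the polling loop below. This is a model for discussion only, not Particle's actual waitFor implementation: `waitForModel` is a made-up name, and `nowMs` stands in for millis() so the sketch can run off-device.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Simplified model of a "wait with timeout" helper (assumption, not the
// Device OS source). The clock is injected so no hardware is needed.
bool waitForModel(const std::function<bool()>& condition,
                  const std::function<uint32_t()>& nowMs,
                  uint32_t timeoutMs) {
    assert(condition && nowMs);
    const uint32_t start = nowMs();
    // Unsigned subtraction keeps the comparison valid across counter rollover.
    while (nowMs() - start < timeoutMs) {
        if (condition()) {
            return true;   // condition met before the deadline
        }
    }
    return condition();    // one last check once the time is up
}
```

In a loop like this the deadline is only checked between polls, so the timeout helps when the condition simply stays false, but it cannot rescue a call that never returns in the first place.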

So when Particle.connect() ends up blocking, it still wouldn’t time out after 3 minutes would it? Seems like both pieces of code effectively do the same thing. I’ve had devices block with my Original code (System Threading + Manual) so I’m guessing FW-side there isn’t much else to do here to reduce the issue?

I used to just have the Electron go back to sleep until the next publish event when the connection timed out.

I was sending data every 5 mins, so the wait until the next wake-up would not be too long.

I’m probably missing something here, but I would think your “Revised” Code should work.
If a Publish was missed using the Revised Code, maybe it was just that the Electron couldn’t connect to the cellular network during the 3 minutes (poor signal, etc) ?

I would guess you would rather go to sleep in that case versus Reset.
It was suggested to me in another Thread to skip the “else” and go to sleep after the waitFor, since it either Published or Not (didn’t really matter). Again, I’m not sure if this helps in your Project.

When the connection attempt fails, I reset the device up to 3 times, before I put it back into Sleep mode. I did this, because my devices don’t publish much but when they do it’s pretty important they do so successfully.
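The retry policy described above boils down to a small counter. The sketch below uses made-up names (`RetryState`, `onConnectFailure`) and assumes the counter is kept somewhere that survives the reset; it is not the code from my firmware.

```cpp
#include <cassert>
#include <cstdint>

enum class FailureAction { ResetAndRetry, Sleep };

// Hypothetical state; on a device it would need to survive the reset.
struct RetryState {
    uint8_t failedAttempts = 0;
};

// Decide what to do after a failed connection attempt:
// reset and retry up to maxResets times, then give up and sleep.
FailureAction onConnectFailure(RetryState& s, uint8_t maxResets) {
    assert(s.failedAttempts <= maxResets);
    if (s.failedAttempts < maxResets) {
        ++s.failedAttempts;
        return FailureAction::ResetAndRetry;
    }
    s.failedAttempts = 0;   // start fresh after the next wake-up
    return FailureAction::Sleep;
}
```

With `maxResets` set to 3, the first three failures trigger a reset and the fourth sends the device back to sleep.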

Anyway, missed publishes aren’t the real issue here for me, as that’s simply related to overall product constraints. The real problem I have is that after I call Particle.connect() in my Original Code, the device occasionally blinks green endlessly for hours, sometimes until the battery goes dead altogether. Usually the 3-minute timer described in my code kicks in correctly and prevents this, but not always.

As far as I understand, since Particle.connect() is blocking, timers aren’t failproof and there isn’t a good way (?) to mitigate the issue FW-side. That’s why I’m already adding the HW Watchdog to my next version.

Note: trueReset() in my code puts the device into Deep Sleep for 30 seconds before waking it up. Any connection failures are therefore always followed by Deep Sleep.

Check the battery and voltage converter voltages while the device is stuck flashing green. I saw that happen when the battery voltage was low, and not when it was higher.

Plus it’s a cold time of year, and colder temps usually cause battery voltages to drop vs warmer temps. Cellular RSSI will help determine if the device has a weak cellular signal, as weather can affect signal strength too.

It’s definitely an RSSI issue. I have my devices deployed in a few locations. At our main site I’m getting RSSI values of -50 to -65 and things are going great; the couple of sites I’m having issues at show values of -80 to -95.

I’d rule out cold temperature as I had a few of them in a freezer (-18C) for months, and the voltage output remained satisfactory. In the field we usually don’t see the temperature drop below -5C.

The RSSI is one of the reasons I decided to go with Bell SIMs over the Particle ones. Can’t wait to test those and see whether it improves the situation.

Sounds like an RSSI issue then.

I would save the date and time of the data you are trying to send in retained memory and only clear it after a successful publish. That way you can just go back to sleep if it does not connect and know the data will eventually make it on the next successful connection.
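A minimal sketch of that "clear only after a successful publish" idea is below. `Reading` and `PendingQueue` are hypothetical names, the capacity of 8 is arbitrary, and the publish call is passed in as a callback so the logic can be tried off-device. On the Electron the queue would have to be placed in memory that survives sleep.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record: when the data was taken and what it was.
struct Reading {
    uint32_t timestamp;
    float value;
};

class PendingQueue {
public:
    // Returns false when the queue is full and the reading was not stored.
    bool push(const Reading& r) {
        if (count_ == kCapacity) return false;
        items_[count_++] = r;
        return true;
    }

    // Tries to publish the oldest entries first and removes only the ones
    // that were sent. Stops at the first failure so order is preserved.
    template <typename PublishFn>
    std::size_t flush(PublishFn publish) {
        std::size_t sent = 0;
        while (sent < count_ && publish(items_[sent])) ++sent;
        assert(sent <= count_);
        for (std::size_t i = sent; i < count_; ++i) items_[i - sent] = items_[i];
        count_ -= sent;
        return sent;
    }

    std::size_t size() const { return count_; }

private:
    static constexpr std::size_t kCapacity = 8;  // arbitrary for this sketch
    std::array<Reading, kCapacity> items_{};
    std::size_t count_ = 0;
};
```

If the connection fails, `flush` removes nothing and the device can go straight back to sleep; the stored readings go out on the next successful connection.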

Sounds like you have ruled out cold weather as the issue which is good.

I already save my data for the next publish :wink:

The issue really is when the Electron sits there for 1h+ blinking green before finally (a) deciding to go to Sleep or (b) running out of battery juice.

@Vitesze, have you experienced an Electron doing this when using
if (waitFor(Particle.connected, 180000)) in Threaded + Manual Modes?

I'm asking because I have Electrons deployed in a similar situation (very sensitive to wasting battery power) and haven't noticed this. But as I said before, I don't have near as many deployed as you do.

I step through the modem connect/disconnect process with "safe" delays (seen here), but I've also read where that shouldn't be necessary.
Maybe you need the "safe" delays with the Particle.process() calls?
1 Sec Delay after each action => for (uint32_t ms = millis(); millis() - ms < 1000; Particle.process());

Here's my Generic Flow for 1-shot Manual Mode:

 Cellular.on / 1 Sec / Particle.connect / 10 Sec /
      / waitFor Timeout / Publish / 5 Sec /
 Particle.disconnect / 1 Sec /  Cellular.off / 1 Sec /  Deep Sleep
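For power budgeting, the fixed delays in that flow can be added up as in the sketch below. The step table just restates the flow above; `FlowStep`, `fixedDelayMs` and `worstCaseAwakeMs` are made-up names, the waitFor timeout is a parameter because the flow does not state it, and the time taken by the calls themselves is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One action in the 1-shot flow and the "safe" delay that follows it.
struct FlowStep {
    const char* action;
    uint32_t delayAfterMs;
};

const FlowStep kOneShotFlow[] = {
    {"Cellular.on", 1000},
    {"Particle.connect", 10000},
    {"waitFor timeout / Publish", 5000},
    {"Particle.disconnect", 1000},
    {"Cellular.off", 1000},
};

// Sum of the fixed delays only; excludes the waitFor timeout itself.
uint32_t fixedDelayMs() {
    uint32_t total = 0;
    for (std::size_t i = 0; i < sizeof(kOneShotFlow) / sizeof(kOneShotFlow[0]); ++i) {
        total += kOneShotFlow[i].delayAfterMs;
    }
    return total;
}

// Upper bound on awake time when the connection never comes up.
uint32_t worstCaseAwakeMs(uint32_t waitForTimeoutMs) {
    const uint32_t fixed = fixedDelayMs();
    assert(waitForTimeoutMs <= UINT32_MAX - fixed);  // guard against overflow
    return fixed + waitForTimeoutMs;
}
```

The fixed delays come to 18 s, so with a 3 minute waitFor timeout the bound would be 198 s per wake cycle.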

I even tried disconnecting the Cellular antenna during the connection process many times and could never get the Electron to hang and fail to go to sleep.
This Code Flow "should" mitigate your RSSI issues at the problem sites (similar to no antenna).
