How to avoid Loop() blocking if wifi connection is lost?

MarkusL · December 27, 2014, 1:28am

Hello,

I discovered that if the internet connection is lost, the main loop blocks for up to 7 seconds. To reproduce, run the code below and turn your router off and on again.
This is the code I used to test this:

void setup()
{
  Serial.begin(115200);
  Serial.println(millis());
}

void loop()
{
  Serial.println(millis());
  delay(200);
}

Is there a way to avoid this? Maybe with a timer task?
I have pieces of code that need to be called at least every 16ms to perform a check.

Markus

MarkusL · December 28, 2014, 12:25am

.............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................2334
.........1178
............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................2334
.................2334
....20813
.....2333
.....................2334
1152
........................2334
............1144
......................................................2334
1131
.........................................................................2334
.1126

I made a 'recording' of the delays loop() experiences. This time I was not playing with the router, and nothing else was interrupting internet access. Each dot means no delay within 2 seconds; each number is a delay in ms. At one point the block lasted 20 seconds!
The data collection started at about 11:15 am PST, and the output above represents the first hour. The remaining 5 hours were without any blocking.
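
For anyone who wants to make a similar recording, here is a minimal sketch of a gap recorder that produces output in the same shape. It is not the code used for the log above. The `GapRecorder` name, the `poll()` method and the 1000 ms threshold are all assumptions made for illustration. It is written as plain C++ so it can be checked on a PC; on the Core you would call `poll(millis())` at the top of loop() and print the log over Serial.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: logs a '.' for every quiet interval and the gap
// length in ms whenever two consecutive polls are too far apart.
struct GapRecorder {
    uint32_t thresholdMs;    // assumed: gaps longer than this count as a block
    uint32_t dotIntervalMs;  // a dot is logged after this long without a block
    uint32_t lastCallMs;
    uint32_t lastMarkMs;
    bool started;
    std::string log;

    GapRecorder(uint32_t threshold, uint32_t dotInterval)
        : thresholdMs(threshold), dotIntervalMs(dotInterval),
          lastCallMs(0), lastMarkMs(0), started(false) {}

    // Call once per loop() pass with the current time in milliseconds.
    void poll(uint32_t nowMs) {
        if (!started) {
            started = true;
            lastCallMs = lastMarkMs = nowMs;
            return;
        }
        uint32_t gap = nowMs - lastCallMs;  // unsigned math survives rollover
        lastCallMs = nowMs;
        if (gap > thresholdMs) {
            log += std::to_string(gap) + "\n";  // loop() was blocked
            lastMarkMs = nowMs;
        } else if (nowMs - lastMarkMs >= dotIntervalMs) {
            log += '.';                         // quiet interval
            lastMarkMs = nowMs;
        }
    }
};
```

The threshold has to be larger than the longest normal loop() pass, otherwise ordinary iterations would be logged as blocks.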

BDub · December 28, 2014, 1:21am

Interesting that one repeats a bit. The 20 second one is likely the Cloud losing connection all on its own and timing out after 20 seconds.

The check that happens every 16ms: how long does it last, and is it required to run continuously? The more info you can share, the better I'll be able to come up with a solution for you.

MarkusL · December 28, 2014, 7:41pm

What I am building is described here.

It’s basically a UPS controller that has 2 AC line monitors. I am detecting zero crossings of the AC signal, which is done with 2 interrupts. But the code that decides what to do if one power line is dead does not run during the interrupt, and waiting 2-20 seconds to switch to another power source is definitely too long.
As I see it now, this means using one of the hardware timers to run some stripped-down ‘emergency’ code inside the interrupt. I am just not sure how much time I can spend inside the interrupt.

harrisonhjones · December 28, 2014, 11:20pm

If I recall correctly you can essentially spend as much time as you want in an interrupt if you wish. The reason people are warned about running lots of code in an interrupt varies but my personal reason is that I would be worried I might accidentally trigger the interrupt inside of the interrupt’s routine and, if that occurred enough times, it might crash the stack/heap and hard fault the processor.

So, if you have a good way to keep yourself from constantly retriggering the interrupt while you are in the interrupt (should be do-able) then you should be fine. Other processing might suffer but if you keep the interrupt level low enough priority I imagine it would work

peekay123 · December 29, 2014, 2:27pm

@MarkusL, IMO all power control should be done locally and not depend on the wifi/cloud connection. The latter should be used for reporting, manually switching modes, etc.

If I understand correctly, you want to switch supplies when an AC line goes dead, which means, I assume, within a half or full cycle of the AC signal. That is, within 8 to 16ms, correct? Am I correct in assuming that you want to use a hardware timer as a “watchdog” that gets reset on every zero crossing and, when it is not reset, fires an interrupt that you can use for switching to backup power?

MarkusL · December 29, 2014, 7:27pm

@peekay123, yes that is the goal - no control of the unit via cloud is planned, only transmitting the status. But I am currently not sure this can be realized in a secure way, but that is another topic.
Your assumptions are correct about the response time. It seems the timer is now the only option to avoid the random delays I experienced.

peekay123 · December 29, 2014, 7:39pm

@MarkusL, so if I assume correctly, your zero crossing detect is done in an ISR and in that ISR you would keep resetting a hardware timer. If that timer runs down, then it means you have stopped receiving zero crossings and the timer interrupt fires, allowing you to do a switch of the power assuming it is a simple I/O operation. The timer can then disable itself so it won’t retrigger and set a flag for loop() to read and act upon. Once zero crossings are detected again and loop() runs, it can decide to switch the power back if desired. Since both zero-crossing and timer are interrupt driven, they will operate regardless of any blocking of the user loop() code. I am assuming that power to the Core will not be interrupted during the switch!
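
The scheme described above can be sketched as a small state machine. This is a host-side model in plain C++, not Core firmware: the `LineWatchdog` struct, its member names and the 16-tick timeout (1 ms ticks assumed) are hypothetical, and the `onBackup` flag merely stands in for the I/O pin that would switch the power. On the Core, `onZeroCrossing()` would be called from the pin interrupt and `onTimerTick()` from the hardware timer interrupt.

```cpp
#include <cstdint>

// Hypothetical model of a timer watchdog that is reset by zero crossings.
struct LineWatchdog {
    static constexpr uint32_t kTimeoutTicks = 16;  // assumed: 1 ms per tick

    volatile uint32_t remaining = kTimeoutTicks;
    volatile bool armed = true;        // timer handler only acts while armed
    volatile bool lineFailed = false;  // flag for loop() to read
    volatile bool onBackup = false;    // stands in for the power-switch pin

    // Call from the zero-crossing interrupt.
    void onZeroCrossing() {
        if (armed) remaining = kTimeoutTicks;
    }

    // Call from the periodic timer interrupt, once per tick.
    void onTimerTick() {
        if (!armed) return;
        if (remaining > 0) {
            remaining = remaining - 1;
            return;
        }
        onBackup = true;    // simple I/O operation: switch to backup
        lineFailed = true;  // tell loop() what happened
        armed = false;      // do not retrigger until loop() re-arms
    }

    // Call from loop() once it has dealt with the failure.
    void rearm(bool switchBack) {
        if (switchBack) onBackup = false;
        lineFailed = false;
        remaining = kTimeoutTicks;
        armed = true;
    }
};
```

With these numbers the switch happens on the 17th tick after the last zero crossing: 16 ticks to count down to zero and one more to act on it.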

MarkusL · December 29, 2014, 7:52pm

@peekay123, thinking more about your timer solution, it creates another problem: I would need 2 timers to identify which of my 2 lines lost power. Looking at the HW specs for the timers, I see that I only have pins D0 and D1 unused, so I can only use TMR4. Is that correct?

BDub · December 29, 2014, 8:11pm

There are a bunch of timers and you can implement HW interrupts on a bunch of different pins, and timer interrupts with a few timers… I’m going to look into making a simple example. Please do try to beat me to it though

peekay123 · December 29, 2014, 8:17pm

@MarkusL that is correct IF you don’t need timer functions on the other pins. Otherwise you can use any of the three timers.

You can use a single timer for two AC lines by letting that timer interrupt fire all the time but keeping two counter variables that get reset by the zero crossing ISR. The timer ISR would decrement each counter and as one reached zero it would trigger the necessary event and set a flag for loop(). It could also not retrigger by reading that flag so the only way it would restart is by having loop() reset it. The same would apply with the zero crossing ISR where it would not reset the counter until the flag is reset by loop().
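
A minimal sketch of this single-timer, two-counter idea, written as plain C++ so the logic can be checked on a PC. The function names, the `failed` flags and the 16-tick timeout are hypothetical; on the Core, `timerIsr()` would be the timer callback and `zeroCrossingIsr()` would be called from the two pin interrupts.

```cpp
#include <cstdint>

constexpr int kLines = 2;
constexpr uint32_t kTimeoutTicks = 16;  // assumed: 1 ms per timer tick

volatile uint32_t counters[kLines] = {kTimeoutTicks, kTimeoutTicks};
volatile bool failed[kLines] = {false, false};  // latched flags for loop()

// Zero-crossing ISR for one line: reload its counter, unless a failure
// is latched and loop() has not acknowledged it yet.
void zeroCrossingIsr(int line) {
    if (!failed[line]) counters[line] = kTimeoutTicks;
}

// Free-running timer ISR: services both counters on every tick.
void timerIsr() {
    for (int line = 0; line < kLines; ++line) {
        if (failed[line]) continue;  // already latched, do not retrigger
        if (counters[line] > 0) {
            counters[line] = counters[line] - 1;
        } else {
            failed[line] = true;     // trigger the event for this line
        }
    }
}

// Called from loop() after it has handled the failure.
void acknowledge(int line) {
    counters[line] = kTimeoutTicks;
    failed[line] = false;
}
```

Because each line has its own counter and flag, loop() can tell which of the two lines lost power even though only one timer is running.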

MarkusL · December 29, 2014, 10:38pm

@peekay123, I realized now that I can also use timer TMR2 since I am not doing PWM on it. But there is still this uncertainty about how long - in the worst case - loop() could be blocked.
Running the line monitor is only one aspect, I am also monitoring the temperature and the current that is flowing thru the unit. In case the ‘safety margins’ are exceeded I need to shut down the unit. I am now more inclined to do all processing in a timer interrupt.
Is the cloud or any other firmware using the interrupts?

BDub · December 30, 2014, 12:15am

Here’s a dual watchdog timer example that you should be able to tweak to suit your needs as described above. It uses @peekay123’s Interval Timer library which is available in the Spark IDE Libraries as well:

You can copy/paste the following code into a new Spark IDE app, then add the Spark IntervalTimer library to it.

// Spark Interval Timer Library - Watchdog Demo
// BDub - 12-29-2014
//
// This demo will create two Interval Timers to count down individual 
// variables every 1000us (1ms).  If either ever gets to 0, the respective 
// watchdog toggles an LED (D7 or D2) and reloads its counter.  Switches to 
// GND on D0 and D1 will reset the respective watchdog counter every time 
// they go low.  With a count of 10000 and 1ms ticks the timeout is about 10s.
// The counters were used because the watchdog timers were firing during 
// creation, so the counters effectively give a bit of wiggle room during setup.

#include "application.h"

#include "SparkIntervalTimer/SparkIntervalTimer.h"

// fast pin access
#define pinLO(_pin) (PIN_MAP[_pin].gpio_peripheral->BRR = PIN_MAP[_pin].gpio_pin)
#define pinHI(_pin) (PIN_MAP[_pin].gpio_peripheral->BSRR = PIN_MAP[_pin].gpio_pin)
#define pinSet(_pin, _hilo) (_hilo ? pinHI(_pin) : pinLO(_pin))

#define kickMyWatchdog1() { \
  __disable_irq(); \
  myCount1 = 10000UL; \
  __enable_irq(); \
} 

#define kickMyWatchdog2() { \
  __disable_irq(); \
  myCount2 = 10000UL; \
  __enable_irq(); \
} 

// Create 2 IntervalTimer objects to be used as Watchdog Timers
IntervalTimer myWatchdog1;
IntervalTimer myWatchdog2;

// Pre-define ISR callback functions
void myWatchdogISR1(void);
void myWatchdogISR2(void);
void myButtonISR1(void);
void myButtonISR2(void);

volatile int turnOff1 = LOW;
volatile int turnOff2 = LOW;
volatile unsigned long myCount1 = 10000UL; // use volatile for shared variables
volatile unsigned long myCount2 = 10000UL; // use volatile for shared variables

// functions called by IntervalTimer should be short, run as quickly as
// possible, and should avoid calling other functions if possible.

// ISR for myWatchdog1
void myWatchdogISR1(void) {
  if (myCount1 > 0) myCount1--;	  // decrease count
  else {
    if (turnOff1 == LOW) {
      turnOff1 = HIGH;
      pinSet(D7, HIGH); // fast pinset high
    }
    else {
      turnOff1 = LOW;
      pinSet(D7, LOW); // fast pinset low
    }
    myCount1 = 10000UL;
  }
}

// ISR for myWatchdog2
void myWatchdogISR2(void) {
  if (myCount2 > 0) myCount2--;	  // decrease count
  else {
    if (turnOff2 == LOW) {
      turnOff2 = HIGH;
      pinSet(D2, HIGH); // fast pinset high
    }
    else {
      turnOff2 = LOW;
      pinSet(D2, LOW); // fast pinset low
    }
    myCount2 = 10000UL;
  }
}

// ISR for D0 FALLING HW input
void myButtonISR1(void) {
    kickMyWatchdog1();
}

// ISR for D1 FALLING HW input
void myButtonISR2(void) {
    kickMyWatchdog2();
}

void setup() {
  pinMode(D7, OUTPUT);
  pinMode(D2, OUTPUT);
  pinMode(D0, INPUT_PULLUP);
  pinMode(D1, INPUT_PULLUP);
  pinSet(D7, LOW);
  pinSet(D2, LOW);
  
  // allocate myWatchdogISR1 to run every 1000us (1000 * 1us period)
  myWatchdog1.begin(myWatchdogISR1, 1000, uSec, TIMER3); // could be TIMER2, TIMER3 or TIMER4

  // allocate myWatchdogISR2 to run every 1000us (1000 * 1us period)
  myWatchdog2.begin(myWatchdogISR2, 1000, uSec, TIMER4); // could be TIMER2, TIMER3 or TIMER4
  
  // allocate myButtonISR1 to D0 FALLING edge
  attachInterrupt(D0, myButtonISR1, FALLING);
  
  // allocate myButtonISR2 to D1 FALLING edge
  attachInterrupt(D1, myButtonISR2, FALLING);
}

void loop() {
  // nothing ... or ..

  // you could use this code instead of the button inputs to kick the watchdog
  if (Spark.connected()) {
    kickMyWatchdog1(); // If user code stops running or we lose a cloud connection, the watchdog ISR will reach 0 and you can execute whatever code you need to
  }
}

MarkusL · December 30, 2014, 12:32am

@BDub Thanks for writing this down! Where do I find a list of the functions you are calling, like pinSet() and __disable_irq()?
I am not able to find any documentation about them.

BDub · December 30, 2014, 1:18am

pinSet() is a macro that’s defined at the top of the code. It works exactly like digitalWrite() but super fast.

__disable_irq(); globally disables ALL interrupts, not just user interrupts. noInterrupts() only disables the user interrupts, and it was unclear to me if the Spark IntervalTimer Library used interrupts that were defined in the Core firmware as “user” so I just disabled them all to modify the counters.

kickMyWatchdog1() and kickMyWatchdog2() are also just macros.

Please let us know if you can make this work. I want to test this when the loop() blocks as well to make sure these do in fact run all of the time.

EDIT: With some code in the loop() this watchdog can be pretty useful to detect if the Wi-Fi/Cloud is still connected or not

void loop() {
  if (Spark.connected()) {
    kickMyWatchdog1(); // If user code stops running or we lose a cloud connection, the watchdog ISR will reach 0 and you can execute whatever code you need to
  }
}

peekay123 · December 30, 2014, 2:28pm

@BDub, you could use a single timer and service both watchdog counters since the timer itself is just free running

BDub · December 30, 2014, 4:22pm

@peekay123 Yeah, for sure. I did it that way specifically for @MarkusL’s dual AC setup. I didn’t know if he would need two independent and slightly different zero crossing detectors or not. If the granularity of the timer can be used to create different counters, you can jam it all into one interrupt… but now they get processed in the order of how the code is written, instead of the order in which they are firing. Probably doesn’t matter much for any longish counts. All that said, optimization is ideally up to the end application

peekay123 · December 30, 2014, 4:31pm

@BDub, you are optimal my friend

MarkusL · December 30, 2014, 6:31pm

@peekay123, I think @BDub is right because I have 2 AC lines and don’t see how that could be solved using one interrupt. I think I did not mention that I am not only interested in sensing whether there is power but also in what the phase shift between the phases is.

peekay123 · December 30, 2014, 6:46pm

@MarkusL, the timer is only used to “watch” the watchdog counters. Even if both AC lines are out of phase by 180 degrees, that is still 1/2 cycle or 8ms. You will need at least 1/2 to a full cycle to know if the power has failed. So if your watchdog counter is set to 16, the watchdog counter will go to zero within 17ms (1ms timer).

The watchdog is not looking at the phase. In theory, though, if the watchdog counters are reset on the zero crossing event of each AC line, you can compare them to get the phase difference in milliseconds, which then translates to degrees! One or two timers is your choice, but I believe that a single timer can be used.
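
To make the milliseconds-to-degrees step concrete, here is a small sketch in plain C++. The `phaseDegrees()` function and its parameters are hypothetical. It assumes both timestamps come from same-direction zero crossings (one per full cycle), that the line period is known (about 16667 µs for the 60 Hz mains implied by the 8 to 16ms figures above), and that the timestamp counter has not rolled over between the two readings.

```cpp
#include <cstdint>

// Phase of line B relative to line A in degrees, in the range [0, 360).
// crossingA_us / crossingB_us: times of the most recent zero crossing on
// each line, in microseconds. period_us: length of one AC cycle.
double phaseDegrees(uint32_t crossingA_us, uint32_t crossingB_us,
                    uint32_t period_us) {
    int64_t delta = static_cast<int64_t>(crossingB_us) -
                    static_cast<int64_t>(crossingA_us);
    int64_t offset = delta % static_cast<int64_t>(period_us);
    if (offset < 0) offset += period_us;  // B crossed before A
    return 360.0 * static_cast<double>(offset) / period_us;
}
```

Note that the resolution depends on how the time is measured. If the offset is derived from counters that tick once per millisecond, one tick at 60 Hz corresponds to 360 × 1 / 16.667 ≈ 21.6 degrees, so a finer time base would be needed for a precise phase reading.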
